One thing I wish for is for standard library trees to present a way to take advantage of a trees structure, even if they don't directly expose the tree itself. For instance, C++ map could expose a way to start find or e.g., lower_bound from some node.
mrorigo 6 minutes ago [-]
I vibe coded with Claude to turn the tree traversal primitive concept into a Rust crate called "arboriter" (arborist + iterator): https://crates.io/crates/arboriter
The syntax follows Tylers design closely:
```rust
for_tree!(node = root; condition; branch_function => {
// Normal code here
if some_condition {
prune!(); // Skip children
}
if found_target {
break_tree!(); // Exit traversal
}
I agree with the author, we need better primitives, if you need functionality now:
Major tools that exist today for partial structure traversal and focused manipulation:
- Optics (Lenses, Prisms, Traversals)
Elegant, composable ways to zoom into, modify, and rebuild structures.
Examples: Haskell's `lens`, Scala's Monocle, Clojure's Specter.
Think of these as programmable accessors and updaters.
- Zippers
Data structures with a "focused cursor" that allow local edits without manually traversing the whole structure.
Examples: Huet’s original Zipper (1997), Haskell’s `Data.Tree.Zipper`, Clojure’s built-in zippers.
- Query Languages (for semantic traversal and deep search)
When paths aren't enough and you need semantic conditionals:
- SPARQL (semantic web graph querying)
- Datalog (logic programming and query over facts)
- Cypher (graph traversal in Neo4j)
- Prolog (pure logic exploration)
These approaches let you declaratively state what you want instead of manually specifying traversal steps.
Traversable and lenses are very closely linked. If you go to the original paper leading to Traversable [1] and read through it, it feels basically identical to reading through the parts of the lens library that lay down the core abstractions and the laws implementations must follow if you want to be able to blindly manipulate them. In fact, the traverse function is a Traversal, and so fits trivially into the lens ecosystem.
These are what I think the author is looking for. But it shouldn't be a "primitive" in terms of code automatically generated by the compiler, but an interface or typeclass like your examples (in a language advanced enough to have them.)
The problem is that 'lens', 'monocle', etc. are famously abstract and difficult for people to apply to their actual problems. IMO, the solution would be for standard libraries to specify interfaces called 'BreadthFirstTraverse', 'DepthFirstTraverse', etc.
naasking 3 hours ago [-]
> These are what I think the author is looking for. But it shouldn't be a "primitive" in terms of code automatically generated by the compiler
I think people are often too enamored by general purpose languages that can express such abstractions natively. I don't see an issue with a language that provides this as a primitive without being able to express it itself, constraints can be useful for other properties. Once you can traverse trees, most programming problems can be tackled even in such constrained languages, eg. SQL with CTE.
rebeccaskinner 5 hours ago [-]
Don’t forget recursion schemes. The last example in the article was just asking for a hylomorphism.
larodi 1 hours ago [-]
how about we also get regex-parsable streams (IO::Async in perl has something like it, suboptimal perhaps) and regex-parsable treestructures (totally possible)? seems like just having the ~= work on structures (or whatever the API is called in other languages, this being Perl5)?
postepowanieadm 2 hours ago [-]
SQL recursive CTE maybe?
CyberDildonics 3 hours ago [-]
This seems like dramatically over complicated iterators. Do they really need to be called 'optics' and 'lenses' and 'prisms' ?
Think of these as programmable accessors and updaters.
How is iterating through something already not 'programmable' ?
9. It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures.
"Let's move on."
nimih 2 hours ago [-]
In addition, clojure.core has the handy tree-seq function:
(defn tree-seq
"Returns a lazy sequence of the nodes in a tree, via a depth-first walk.
branch? must be a fn of one arg that returns true if passed a node
that can have children (but may not). children must be a fn of one
arg that returns a sequence of the children. Will only be called on
nodes for which branch? returns true. Root is the root node of the
tree."
{:added "1.0"
:static true}
[branch? children root]
(let [walk (fn walk [node]
(lazy-seq
(cons node
(when (branch? node)
(mapcat walk (children node))))))]
(walk root)))
That's "just" a particular kind of fancy iterator that you should be able to implement in any language with iterator support. Here's one in Python:
# Expects:
# Tuple of iterateOn where iterateOn can be None
def fancyIt(init, *options):
if init != None:
yield(init)
for f in options:
newVal = f(init)
yield from fancyIt(newVal, *options)
class Tree:
def __init__(self, val, left = None, right = None):
self.val = val
self.left = left
self.right = right
def left(self):
return self.left
def right(self):
return self.right
myTree = Tree(
1,
Tree(2,
Tree(3),
Tree(4, None, Tree(5))),
Tree(6, None, Tree(7)))
for node in fancyIt(myTree, Tree.left, Tree.right):
print(node.val)
which prints the numbers 1 through 7 in order.
Breadth-first is slightly trickier, but only slightly trickier one time.
imglorp 3 hours ago [-]
Yes, it seems easy to implement.
Yes, students should absolutely implement the classic algorithms to learn.
Yes, there are some occasions when you need to home grow one at $work.
BUT, in my opinion, most of the time, professional code should use a battle tested, vuln hardened library or builtin version. These things are VERY HARD to get exactly right. Jon Bently's Programming Pearls famously had a latent bug in its binary search for 20 years before someone caught it.
So yeah, it looks easy but don't do it. Stand on some giant's shoulders instead.
jerf 2 hours ago [-]
Your reply is not relevant to my reply. The original poster is asking for this functionality and appears to believe it is something other than an iterator and requires some sort of special language support. However, it is completely implementable as an iterator, in a reasonably usable manner, with no additional language support. My specific code is written only to show that fact off.
Anyone who copies and pastes it is welcome to both pieces when it breaks. Others have already alluded to possible improvements that could be made, and I already have my own analysis in a grandchild reply as to why I don't think this is a terribly pressing need or necessarily even a good idea.
The reason I provide code is that it gets past the "oh, you say it's just an iterator, but I still don't believe you, since you haven't spelled it out to the n'th degree". When code is provided, belief ceases to be an issue. It is clearly something an iterator can implement, in existing languages, with existing iterator support.
Unless you're going to claim it is somehow impossible to provide this functionality in a tested manner, you're completely changing the topic in an uninteresting direction, since it is always true that functionality generally needs testing and bits of code slammed into an HN conversation just to make a particular point probably shouldn't be copied wholesale into your production code.
Amadiro 3 hours ago [-]
Sure but all pre-made, battle-tested tree datastructures you'd use in production in all languages already come with some form of iterator that you can just for-loop over, so the original articles point is still moot.
thayne 3 hours ago [-]
Sure, but it can be done by a library. There's no reason it needs to be built into the language.
packetlost 4 hours ago [-]
Yeah, tree traversal is really easy and implementing it as an iterator is natural. Maybe don't use a recursive technique if you plan on working with non-toy datasets, Python's default stack limit is pretty small (1000), but this is otherwise a very flexible API.
While easy, I think bisect would be a good addition to every stdlib too.
jerf 4 hours ago [-]
I did the tree just to match the author's example. I would agree that a bespoke iterator for breadth- and depth-first iteration for any given tree is probably a better way to go. As long as we're in a language like Python, build in something that allows you to examine a branch and decline to descend into it while you're at it.
I don't think this is a large problem in practice because you shouldn't be using dozens of tree types in a given code base, so adding iterators to a tree is no big deal. In general there aren't enough types of iteration available to a given data structure that you need to describe how to iterate on it from the "outside". (Generally when you are doing that, it's too non-trivial to fit into this pattern anyhow; see the Visitor pattern in general.) This strikes me as maybe the sort of default tool you might slap in a library somewhere, but it should be a niche tool. If you're using it all the time you're probably doing something wrong. By default your data structures should be providing iteration packaged with them and it should generally be what you need. And your language should support aborting iteration, in whatever that looks like normally. I'm not sure I know a language that doesn't, it's a fairly basic element of iterator support when you get into implementation.
There are also many cases where a tree iterator will perform significantly better, including CPython. I don't have enough experience with PyPy to know if it could inline the Tree.left and Tree.right calls down to zero penalty at JIT time. Rust and C++ and the other static languages with sophisticated compilers might be able to get that down to fully inlined and zero-cost, but even if they can it's probably better not to push that on to the optimizer as the optimizers will eventually give up if this is composed with enough other stuff. Better to just have an efficient implementation in the first place.
Well, the whole point of the blog post is to argue for a new kind of syntactic sugar and the author explicitly mentions iterators.
> Well a range based for loop requires that your tree exist in memory AND that you have an iterator defined for your tree. With for_tree you could operate on an entirely imperative tree, without needing to define any iterators or generator functions. Here's an example where I'm checking every single string composed of "a", "b", and "c" of length 8 or less.
for_tree(string x = ""; x.size() <= 8; x : {x+"a", x+"b", x+"c"}){
print(x);
}
You could definitely find every string composed of "a", "b", and "c" of length 8 or less by defining a custom iterator but it would be a verbose and unpleasant way of writing it:
class StringIterator {
public:
using iterator_category = std::forward_iterator_tag;
using value_type = std::string;
using difference_type = std::ptrdiff_t;
using pointer = const std::string*;
using reference = const std::string&;
StringIterator(bool begin = false) : is_end_(!begin) { if (begin) s_ = ""; }
const std::string& operator*() const {
if (is_end_) throw std::out_of_range("End iterator");
return s_;
}
StringIterator& operator++() {
if (is_end_) return *this;
if (s_.size() < 8) return s_.push_back('a'), *this;
while (!s_.empty() && s_.back() == 'c') s_.pop_back();
if (s_.empty()) is_end_ = true;
else s_.back() = s_.back() == 'a' ? 'b' : 'c';
return *this;
}
StringIterator operator++(int) { auto tmp = *this; ++(*this); return tmp; }
bool operator==(const StringIterator& other) const {
return is_end_ == other.is_end_ && (is_end_ || s_ == other.s_);
}
bool operator!=(const StringIterator& other) const { return !(*this == other); }
private:
std::string s_;
bool is_end_;
};
int main() {
StringIterator begin(true), end;
int count = 0;
for (auto it = begin; it != end; ++it) ++count;
std::cout << (count == 9841 ? "Pass" : "Fail") << std::endl;
return 0;
}
jerf 1 hours ago [-]
def itWithStop(init, stop, *options):
if init is not None and not stop(init):
yield(init)
for f in options:
newVal = f(init)
yield from itWithStop(newVal, stop, *options)
for s in itWithStop("",
lambda x: len(x) > 2,
lambda x: x + "a",
lambda x: x + "b",
lambda x: x + "c"):
print(s)
yields the combinations of 0 - 2 length strings with a, b, and c.
Python has a number of ways to achieve this depending on exactly how you want to pass the arguments; multiple functions, optional arguments, etc. How nice the final call looks is more about your local language's closures look.
The main point here is that this will happily iterate on things that don't "exist".
module Tmp where
iter :: forall a. (a -> Bool) -> [a -> a] -> a -> [a]
iter p opts x = if p x then x:concatMap (iter p opts) (opts <*> [x]) else []
ghci> :l tmp.hs
[1 of 1] Compiling Tmp ( tmp.hs, interpreted )
Ok, one module loaded.
ghci> iter (\x -> length x < 3) [(++ "a"), (++ "b"), (++ "c")] ""
["","a","aa","ab","ac","b","ba","bb","bc",
"c","ca","cb","cc"]
(Since things are lazy in Haskell, functions that return lists effectively are iterators. There's probably something in the standard library somewhere for (opts <*> [x]) to avoid the wrapping x in an unnecessary list, but my Haskell is rusty.)
porphyra 1 hours ago [-]
The iterator in my example will also happily iterate on things that don't exist. We all agree that it's possible to do in any language. But the main point is that the blog post is talking about syntactic sugar for an easier way to do it.
And yes, Haskell is amazing at this sort of thing.
jerf 51 minutes ago [-]
The syntactic sugar being asked for in this case is awfully thin. I definitely put this in the class of "stop pining for it and just use what's there".
If the poster wants to particularize this to C++ because C++'s syntax can't support it in any reasonable manner, that's fine, but that's a C++ problem, not a "Programming languages..." problem. Which would be perfectly understandable and I'm not really complaining, more clarifying that most of the rest of the world can just rub together three or four existing constructs in a pretty reasonable manner to get this.
4 hours ago [-]
jerf 2 hours ago [-]
Whoops, my edit window closed, but that first comment derives from a previous version that had a different signature for the functions in the "options" list. Ignore it.
pgt 28 minutes ago [-]
Nathan Marz' talk on Specter (Clojure Library that decouples navigation from transformation) is must-watch if you deal with data: https://www.youtube.com/watch?v=VTCy_DkAJGk
I use it in every project for data navigation and transformation, and it's more performant than standard Clojure data manipulation, while retaining types (instead of coercing back from seqs).
E.g. if you have a map and you want to increment every value in the map:
(require '[com.rpl.specter :as S])
Now let's say you have a map of vectors and want to increment all of those?
(->> {:a 5, :b 6, :c 7}
(S/transform [S/MAP-VALS S/ALL] inc)) ;; note the navigator juts got another level of nesting
=> {:a [2 3], :b [4 5], :c [6 7]}works for all clj data types, and of course it has navigators for recursive walking .
It took me a while to get good at Specter, but it was worth it. I hear Rama uses Specter navigators internally.
cdrini 1 hours ago [-]
I'm not sure if it needs to be at the syntax level, but I think having built in standard library support for graphs (general trees) would help make graphs more common in programming. And seeing how powerful they are, I think that would be a good thing!
I explored this idea with gstd, a standard library for graphs inspired by the JS Array interface: https://github.com/cdrini/gstd/
Folding and traversing can be trivially done in all relevant programming languages if you have an iterator
trealira 1 hours ago [-]
I feel like programming an iterator like this akin to a state machine is just not that convenient or trivial, though. Below is pseudo-Java.
class InOrderTreeIterator {
Stack stack;
TreeNode cursor;
InOrderTreeIterator(TreeNode root) {
cursor = root;
s = new Stack;
}
bool hasNext() {
return cursor != null || !stack.empty();
}
TreeNode next() {
if (cursor != null) {
while (cursor.left != null) {
stack.push(cursor);
cursor = cursor.left;
}
} else if (!stack.empty()) {
cursor = stack.pop();
} else {
throw new NoSuchElementException();
}
TreeNode ret = cursor;
cursor = cursor.right
return ret;
}
}
munchler 2 hours ago [-]
Agreed. It’s amusing to see procedural programmers slowly rediscovering what functional programmers have known for years. Paul Graham called this “The Blub Paradox”.
yxhuvud 5 hours ago [-]
No, that seems like language bloat. Better to make your language have internal iterators, as then data structures can add their own logic that matches their need for iteration. Then special cases don't need additional syntax.
naasking 3 hours ago [-]
Trees aren't really a "special case". Sequences, trees and graphs are core abstractions to most programming, why try to flatten all trees and graphs to sequences?
smallnamespace 2 hours ago [-]
Because your code is actual running serially so no matter what there is an iteration order, and the order matters for performance even if your code does not care.
For example if you literally don’t care about the order then your code can map over a default iterator that is efficient (DFS).
naasking 1 hours ago [-]
Not true due to parallelism. Also, SQL demonstrates that you can describe the desired result declaratively without being concerned about iteration order. I see SQL's CTEs as a good example of the kind of primitive the article is talking about.
pmontra 5 hours ago [-]
I remember that I was using trees quite often last century, when I was writing programs in C. I seldom use tree or comparably complex data structures nowadays, when I almost only write web apps (mostly backend in Ruby or Python but some frontend in JS.) I'm bet that both Ruby and Python have plenty of trees written in C inside their interpreters. I do everything with arrays (lists) and hash tables (dicts) in Ruby and Python. Maybe a language construct for trees would be nice for lower level languages and almost out of scope for higher level ones.
The same from another angle: there are a lot of trees in the indices of SQL databases (example [1]) but we don't zoom in to that level of detail very often when defining our tables.
User interface widgets form a forest - windows contain layouts that contain widgets which also can be layouts, etc, etc.
To implement Brown's algorithm to optimize class-based language models I had to implement a complex forest (DAG, actually) in Python using lists of fixed length. That was not especially nice to work with.
walleeee 4 hours ago [-]
Do you keep your dicts fastidiously flat?
pmontra 3 hours ago [-]
Usually yes. Web apps are simple. There isn't much to do inside a request params to db to response call. But sometimes, not every year, there is something to really reason about. I still remember some project from 10 or 20 years ago.
Retr0id 4 hours ago [-]
> "Why not just use recursive functions"
One great reason not to use recursive functions for traversing trees is that you can allocate your own stack data structure rather than relying on the call stack itself. In most languages/runtimes, the call stack has a maximum depth which limits the depth of trees you can process, usually on the order of thousands of stack frames.
Managing your own stack usually produces weirder looking code (personally I find "naive" recursive approaches more readable) - but having it as a first-class language feature could solve that!
cobbal 2 hours ago [-]
If we're fixing the language, may as well fix the real problem and not artificially limit stack space.
Retr0id 2 hours ago [-]
The language doesn't get much of a say, stack limits are usually inherited from the OS itself. It's fixable by not using the OS-provided stack but that's a much more invasive change than a new syntax/stdlib feature.
soegaard 5 hours ago [-]
Pick a language that allows users to define their own control structures and test your idea in practise.
Candidates: Racket, Scheme, Rust.
k__ 4 hours ago [-]
You don't even have to got that far.
Defining your own iterator would be enough for most cases.
demarq 32 minutes ago [-]
I would think iterators are exactly this. And they exist in almost every language.
hyperhello 1 hours ago [-]
This is interesting, but the syntax doesn't seem to have the right expressiveness for such a large change.
for recursive (Node t = tree.root; t != NULL;) {
puts(t.value);
if (t.value == target) break;
if (t.value == dontfollow) continue;
if (t.left) continue t.left;
if (t.right) continue t.right;
}
return t;
Regular 'break' is to really break out of the structure like a regular for, as regular 'continue' is to do the next iteration. But if continue has a value to recurse on, it reenters the for loop like a subroutine.
As a bonus, I think this is tail-call-optimization friendly.
toxik 6 minutes ago [-]
Now allow it to do BFS or DFS, suddenly you're not far from just a vanilla BFS or DFS if you have a sensible stack/queue API.
monkeyelite 4 hours ago [-]
This is solved by iterators in C++. The idea of an iterators is to generalize the concept of a pointer —- something which refers to a location in your data structure.
For example the most basic operations of a pointer are to advance and dereference.
std::map is actually implemented as a tree. To iterator its members you can do
for (cost auto &pair : map)
The only requirement for your custom data structure to work is to implement begin() and end() which return iterators - “pointer like” objects.
The simplifying assumption that all tree nodes have the same type and the same function should be called on them seems unrealistic.
This kind of imperative iteration seems better served by the traditional visitor design pattern: more verbose (more explicit, not more complex) and more general.
vinceguidry 3 hours ago [-]
When I went looking for this in Ruby, I eventually landed on RubyTree[1]. Being able to subclass the TreeNode makes everything so much friendlier. Sure, I could implement them myself, but getting converters to/from primitives for free is a lot off my mind. Would be nice to have it built into the language but honestly Ruby has a whole lot of stdlibs and default gems already.
How in the world is this getting a HN hug of death when it's a substack page?
kelseyfrog 2 hours ago [-]
I've written about this before, but suggesting recursion for tree traversal is equivalent to suggesting goto is a replacement for if/for/while.
We don't have syntax and semantics for recursion schemes in any programming language - it's always deferred to library support at best. As far as I'm concerned, this is an open problem in programing language design where we finally replace the vestigial recursion hack with a proper structured programming solution.
lucasoshiro 4 hours ago [-]
This is heavy biased to C++, which is a language that already has too many features that people just want to use. This is a very specific case to be placed as a language feature, and could be just a lib.
If this becomes a C++ feature, imagine how many data structures we would need to support?
Many other languages, specially the FP languages, allow to do that as a library. Even the languages that are only inspired by FP. Example, Ruby:
class BinTree
include Enumerable
def initialize v, l, r
@v, @l, @r = v, l, r
end
def each &block
@l.each(&block) unless @l.nil?
yield @v
@r.each(&block) unless @r.nil?
end
end
Using the Enumerable mixin includes many FP-based methods, such as map, filter and reduce by only defining each, which in this case is DFS.
And so on. The same can be done in Python, Kotlin and many others.
monkeyelite 4 hours ago [-]
> If this becomes a C++ feature, imagine how many data structures we would need to support?
C++ already solved that problem. Iterator are designed so that an algorithm can be written once and used with multiple data structures.
lucasoshiro 3 hours ago [-]
So we can go back to just use them and just write a new library
monkeyelite 2 hours ago [-]
You don’t need the standard library to make iterators.
meltyness 5 hours ago [-]
For perspective, I'm a novice with algorithm design, I've been grinding leetcode for the past 6 months or so, almost exclusively in Rust. I was bewildered by the same concern since I had initially set out to, not only focus on Rust, but to primarily maximize use of the Iterator construct, since I was not intricately familiar with it. A few months in I discovered that there was an appropriate Iterator construct which accomplishes the same thing.
// Comments for the non-Rust native reader, regarding this Function declaration:
// successors is a function that accepts an `Option` container for some Value of type T, called `first`
// and a Closure called `succ`, constrained below:
pub fn successors<T, F>(first: Option<T>, succ: F) -> Successors<T, F> ⓘ
where
// `succ` must receive the iterated state, and return the next iterated state
F: FnMut(&T) -> Option<T>,
// Each time the `next()` function is called on the returned Iterator (a Successors-flavored iterator),
// the state of `first` is yielded, and then
// `succ` is called to progress
// until a `None` type is reported by `succ`
I'm not sure where the concept came from, but it's not dissimilar to the author's implementation, but instead of the ControlFlow enum, it relies simply on the Option enum. I know though, that it was initially built in the Itertools crate as unfold and then upstreamed some time later.
Essentially you use `first` to contain a Queue, Stack, or Level for the different traversals, and define traversal or activities from there.
It's fairly ergonomic in practice, ergonomic enough for Leetcode.
Couldn't you just do this with a regular for loop and a few datatypes/functions? (This pseudocode is an Elm/Haskell/Rust/TypeScript-inspired abomination, but pretty portable to any language...)
type Node = { value: Any, left: Node, right: Node }
type Direction = Left | Right
type TreePosition = { root: Node, currentNode: Node = root, position: Direction[] = [] }
# Implementation left as an exercise but should be obvious and run in O(1), I believe. Returns Nothing when we're out of nodes.
function nextPosition(position: TreePosition): Option<TreePosition>
# The tree you want to iterate through
const myTree: Node = ...
# The loop
for(let position: TreePosition? = TreePosition(root: myTree); position != Nothing; position = nextPosition(position) {
node = position!.currentNode
# Your loop code
}
I'd argue this doesn't belong as a language-level feature, but maybe an API/stdlib-level feature.
xxs 5 hours ago [-]
Indeed, I don't see any need to have a tree as a language primitive along with a traversal function (iterator). Tree traversal is not really different than iterating over a vector/list. E.g. even java has stuff like: TreeSet.forEach(consumerRef) or for(Type val : tree) doStuffWith(val)
mubou 6 hours ago [-]
I bet you could do something generic like this in languages that have deferred execution like C#'s IEnumerable. Something like
foreach (Node node in EnumerateNodes(root, x => x != null, x => [x.Left, x.Right]))
where EnumerateNodes uses `yield return` (i.e. is a generator) and calls itself recursively. Though it'd probably be easier / better performance to write an implementation specific to each node type.
5 hours ago [-]
JonChesterfield 3 hours ago [-]
This implementation recurses on the call stack and calls that out as a feature. A built into the language tree walk which overflows the stack on large trees is perfect for C++, get a paper over to WG21.
Or for a possibly more useful comment, constant space tree traversal is genuinely quite difficult to implement. I don't know a general solution to it (other than walk from the start N times, quadratic style), would be interested to hear of one.
HelloNurse 3 hours ago [-]
Constant space tree traversal is impossible and unnecessary. On the other hand predictable space, including adding parent pointers to cheat, is easy to achieve by counting the total number of nodes or the depth of the tree.
kevinventullo 3 hours ago [-]
If nodes have pointers to their parent, constant space tree traversal is easy enough.
bjourne 4 hours ago [-]
I'm 100% sure smug Haskellers have something very important to say here. :)
Joel_Mckay 4 hours ago [-]
Normally someone would respond, but they are prone to lazy evaluation...
Thank you... I'll see myself out... lol =3
bee_rider 5 hours ago [-]
What is this note before the code about?
> This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Also I’m slightly confused by this example.
for_tree(string x = ""; x.size() <= 8; x : {x+"a", x+"b", x+"c"}){
print(x);
}
So, our “next node” operation is to concatenate to x. Won’t we either have to have a method for modifying x to go “up” a node, or we’ll have to keep a record of what x was upon entering each node? Like in this example we’ll end up with x=“aaaaaaaa” and then go up a node, over to a “b” node, and get x=“aaaaaaaab”, right?
voidUpdate 5 hours ago [-]
The author said it could be implemented recursively, so a call with x="aaaaaaa" would call three more functions with x="aaaaaaaa", "aaaaaaab" and "aaaaaaac". you never need to go up a node
dmurray 5 hours ago [-]
I think the intention is that you keep a record of what x is in the current node, and at every node above the current node. In the recursive implementation which the author describes as equivalent, these values are kept in the stack.
bee_rider 5 hours ago [-]
Seems a bit inefficient…
I guess we can delete the a node’s copy of x after all of a node’s child nodes are visited, at least.
dmurray 4 hours ago [-]
Well, it doesn't need to be any worse than the stack-based implementation of a recursive tree traversal, and it can't be significantly better. You have to store that state somewhere.
Perhaps it can be optimized to be a little better than the recursive version, depending on how much overhead your language uses for a stack frame that it won't need for this special case.
pbohun 6 hours ago [-]
I think this kind of control mechanism is not a great idea and could lead to easy bugs.
I think it's a better idea to write a function that applies a function to the elements of a tree. Ideally you'd make a function for each traversal order. This makes it obvious what is happening.
map_tree_pre(my_tree, my_func);
map_tree_in(my_tree, my_func);
map_tree_post(my_tree, my_func);
map_tree_bfs(my_tree, my_func);
tbrownaw 3 hours ago [-]
Standard tree traversal loop (yes, missing some obvious conditionals I didn't feel like typing out):
var todo = new List<T>();
todo.append(root);
while (var item = todo.pop_front()) {
todo.append(item.left); // or .prepend for depth-first
todo.append(item.right); // or .prepend()
// do stuff...
}
Not as ergonomic as a direct tree-iterator, but I can't see of an elegant way to introduce that in an imperative language while keeping the forking/recursion aspect clear
MontagFTB 5 hours ago [-]
Sean Parent developed ‘forest’ for C++ that automatically maintains the invariant that its structure is always a hierarchy. It includes full-order iteration through its default iterator: https://stlab.cc/2020/12/01/forest-introduction.html
usrnm 4 hours ago [-]
Don't need to go that far, std::map is a tree (the standard does not dictate it, but it is in all ilmplementations), and you could always traverse it. The whole idea of iterators was popularized by C++ and its STL.
injidup 4 hours ago [-]
That's what c++20 coroutines paired with c++23 generators are for. It's easy to define whatever traversal you want and then expose it as generic code.
Though Haskell's Traversable is similar in name, then depending on what the developer intends, both Functor, Foldable or Monad could also help with the traversal. I.e, Haskell already has what the blog post asks for.
systems 5 hours ago [-]
well, functional languages with recursive types, are very good at representing
binary trees
Once you have algebraic data-types in a language, writing a recursive visitor pattern is pretty simple.
Encoding the semantics of a tree traversal operator likewise is difficult in the general case. What exactly would the order be, what if I want to traverse in a non-standard ordering, what about skipping branches; all would be difficult to cleanly represent.
I have seen it done where you return actions with key ones being recurse, stop, replace, and replace & then carry out some function, but again, this is pretty simple to implement.
MJGrzymek 6 hours ago [-]
sticking to dfs, there is still a difference between pre-order/in-order/post-order
3 hours ago [-]
lutusp 5 hours ago [-]
The article's thesis relies on the idea that a genuinely primitive traversal action exists, in the way that a for-loop is primitive and widely applicable, or adding two floats is common enough to justify building it into the language (or processor).
But tree traversal doesn't have this universal property. There are too many methods and purposes for traversing a tree, sufficient that IMHO no single primitive embodiment could materially improve a language. Also, modern compilers efficiently break down high-level traversal code so well that expressing the idea at a high level incurs no serious penalty compared to having a primitive for that purpose, or a series of them.
fngjdflmdflg 3 hours ago [-]
I read a similar argument for graphs in general,[0] where this is more obviously true.
I worked for Guidance Software in the aughts and their main product, EnCase, had its own proprietary scripting language built in, called EnScript. Everything in EnCase inherited from “NodeClass”, a linked list for creating trees.
EnScript had a forall(NodeClass n in tree.GetRoot(){} construct that was very easy. It was essentially a depth-first iterator.
Mikhail_K 2 hours ago [-]
Ordering pizza is much more important and frequent part of a developer's work than tree traversal. Therefore, programming languages need pizza ordering primitive even more, than tree traversal ones.
chess_buster 3 hours ago [-]
Nonsense.
Programming languages should have less primitives like this and instead we should have better foundation libraries for the languages, i.e., containing iterator/-interfaces like Rust and Python (or Smalltalk).
specialist 3 hours ago [-]
FWIW, I use Null Objects to eliminate null checks. Makes recursion (tree climbing) more concise, legible.
Bonus: Java's HotSpot magick will NOP (most?) methods of Null Objects, making this a zero cost abstraction.
I should probably write a concrete example for a blog or something.
TLDR: For every base class such as TreeNode, create a NullTreeNode that does nothing, then replace all uses of null with NullTreeNode. Voilá, no more null checks or NPEs.
meindnoch 4 hours ago [-]
Horrible idea.
1. BFS is not supported.
2. Traversal type (inorder, preorer, postorder, or even mixed order!) is also not handled.
3. The syntax doesn't make it apparent that stack overflow can occur, e.g. by doing DFS on a linked list.
danieloj 57 minutes ago [-]
As a fullstack web engineer I've never had to implement a tree structure at work. I'd love to hear examples of what kinds of companies/platforms people are writing these structures regularly. If anyone is willing to share where they use them I'd appreciate it
Philpax 1 minutes ago [-]
The (V)DOM is a tree. Knowing that is useful for manipulating and composing it.
mvc 9 minutes ago [-]
You might not implement them but as a web engineer you're using them all the time. So all these tools that you use will have tree implementations in them.
- Every html document is a tree structure. And css documents have special syntax to address nodes within those trees
- If it's any good, you're routing framework probably uses some kind of tree to quickly match the request to it's handler
- The database you write to uses a tree to quickly find the location it needs to write to
The syntax follows Tylers design closely:
```rust
for_tree!(node = root; condition; branch_function => { // Normal code here
}); ```I wrote about the development process here if you're interested: https://origo.prose.sh/prompt-engineering-the-perfect-vibe-c...
Thanks for the interesting idea!
Major tools that exist today for partial structure traversal and focused manipulation:
- Optics (Lenses, Prisms, Traversals)
- Zippers - Query Languages (for semantic traversal and deep search)[1] https://www.cs.ox.ac.uk/jeremy.gibbons/publications/iterator...
The problem is that 'lens', 'monocle', etc. are famously abstract and difficult for people to apply to their actual problems. IMO, the solution would be for standard libraries to specify interfaces called 'BreadthFirstTraverse', 'DepthFirstTraverse', etc.
I think people are often too enamored by general purpose languages that can express such abstractions natively. I don't see an issue with a language that provides this as a primitive without being able to express it itself, constraints can be useful for other properties. Once you can traverse trees, most programming problems can be tackled even in such constrained languages, eg. SQL with CTE.
Think of these as programmable accessors and updaters.
How is iterating through something already not 'programmable' ?
https://clojuredocs.org/clojure.zip/zipper
Breadth-first is slightly trickier, but only slightly trickier one time.
Yes, students should absolutely implement the classic algorithms to learn.
Yes, there are some occasions when you need to home grow one at $work.
BUT, in my opinion, most of the time, professional code should use a battle tested, vuln hardened library or builtin version. These things are VERY HARD to get exactly right. Jon Bently's Programming Pearls famously had a latent bug in its binary search for 20 years before someone caught it.
https://research.google/blog/extra-extra-read-all-about-it-n...
So yeah, it looks easy but don't do it. Stand on some giant's shoulders instead.
Anyone who copies and pastes it is welcome to both pieces when it breaks. Others have already alluded to possible improvements that could be made, and I already have my own analysis in a grandchild reply as to why I don't think this is a terribly pressing need or necessarily even a good idea.
The reason I provide code is that it gets past the "oh, you say it's just an iterator, but I still don't believe you, since you haven't spelled it out to the n'th degree". When code is provided, belief ceases to be an issue. It is clearly something an iterator can implement, in existing languages, with existing iterator support.
Unless you're going to claim it is somehow impossible to provide this functionality in a tested manner, you're completely changing the topic in an uninteresting direction, since it is always true that functionality generally needs testing and bits of code slammed into an HN conversation just to make a particular point probably shouldn't be copied wholesale into your production code.
While easy, I think bisect would be a good addition to every stdlib too.
I don't think this is a large problem in practice because you shouldn't be using dozens of tree types in a given code base, so adding iterators to a tree is no big deal. In general there aren't enough types of iteration available to a given data structure that you need to describe how to iterate on it from the "outside". (Generally when you are doing that, it's too non-trivial to fit into this pattern anyhow; see the Visitor pattern in general.) This strikes me as maybe the sort of default tool you might slap in a library somewhere, but it should be a niche tool. If you're using it all the time you're probably doing something wrong. By default your data structures should be providing iteration packaged with them and it should generally be what you need. And your language should support aborting iteration, in whatever that looks like normally. I'm not sure I know a language that doesn't, it's a fairly basic element of iterator support when you get into implementation.
There are also many cases where a tree iterator will perform significantly better, including CPython. I don't have enough experience with PyPy to know if it could inline the Tree.left and Tree.right calls down to zero penalty at JIT time. Rust and C++ and the other static languages with sophisticated compilers might be able to get that down to fully inlined and zero-cost, but even if they can it's probably better not to push that on to the optimizer as the optimizers will eventually give up if this is composed with enough other stuff. Better to just have an efficient implementation in the first place.
(this is for iterating over nested JSON-like objects, which are just weird trees)
There are a lot of ways you could avoid the recursion, but that's a particularly nice way!
[1] https://doc.rust-lang.org/std/vec/struct.Vec.html#method.bin...
> Well a range based for loop requires that your tree exist in memory AND that you have an iterator defined for your tree. With for_tree you could operate on an entirely imperative tree, without needing to define any iterators or generator functions. Here's an example where I'm checking every single string composed of "a", "b", and "c" of length 8 or less.
You could definitely find every string composed of "a", "b", and "c" of length 8 or less by defining a custom iterator but it would be a verbose and unpleasant way of writing it:Python has a number of ways to achieve this depending on exactly how you want to pass the arguments; multiple functions, optional arguments, etc. How nice the final call looks is more about your local language's closures look.
The main point here is that this will happily iterate on things that don't "exist".
(Since things are lazy in Haskell, functions that return lists effectively are iterators. There's probably something in the standard library somewhere for (opts <*> [x]) to avoid the wrapping x in an unnecessary list, but my Haskell is rusty.)And yes, Haskell is amazing at this sort of thing.
If the poster wants to particularize this to C++ because C++'s syntax can't support it in any reasonable manner, that's fine, but that's a C++ problem, not a "Programming languages..." problem. Which would be perfectly understandable and I'm not really complaining, more clarifying that most of the rest of the world can just rub together three or four existing constructs in a pretty reasonable manner to get this.
I use it in every project for data navigation and transformation, and it's more performant than standard Clojure data manipulation, while retaining types (instead of coercing back from seqs).
E.g. if you have a map and you want to increment every value in the map: (require '[com.rpl.specter :as S])
``` (->> {:a 5, :b 6, :c 7} (S/transform [S/MAP-VALS] inc)) => {:a 6, :b 7, :c 8} ```
^ try to write that in normal clojure.
Now let's say you have a map of vectors and want to increment all of those? (->> {:a 5, :b 6, :c 7} (S/transform [S/MAP-VALS S/ALL] inc)) ;; note the navigator juts got another level of nesting => {:a [2 3], :b [4 5], :c [6 7]}works for all clj data types, and of course it has navigators for recursive walking .
It took me a while to get good at Specter, but it was worth it. I hear Rama uses Specter navigators internally.
I explored this idea with gstd, a standard library for graphs inspired by the JS Array interface: https://github.com/cdrini/gstd/
For example if you literally don’t care about the order then your code can map over a default iterator that is efficient (DFS).
The same from another angle: there are a lot of trees in the indices of SQL databases (example [1]) but we don't zoom in to that level of detail very often when defining our tables.
[1] https://www.postgresql.org/docs/current/btree.html
To implement Brown's algorithm to optimize class-based language models I had to implement a complex forest (DAG, actually) in Python using lists of fixed length. That was not especially nice to work with.
One great reason not to use recursive functions for traversing trees is that you can allocate your own stack data structure rather than relying on the call stack itself. In most languages/runtimes, the call stack has a maximum depth which limits the depth of trees you can process, usually on the order of thousands of stack frames.
Managing your own stack usually produces weirder looking code (personally I find "naive" recursive approaches more readable) - but having it as a first-class language feature could solve that!
Candidates: Racket, Scheme, Rust.
Defining your own iterator would be enough for most cases.
As a bonus, I think this is tail-call-optimization friendly.
For example the most basic operations of a pointer are to advance and dereference.
std::map is actually implemented as a tree. To iterator its members you can do
The only requirement for your custom data structure to work is to implement begin() and end() which return iterators - “pointer like” objects.This kind of imperative iteration seems better served by the traditional visitor design pattern: more verbose (more explicit, not more complex) and more general.
1: https://github.com/evolve75/RubyTree
We don't have syntax and semantics for recursion schemes in any programming language - it's always deferred to library support at best. As far as I'm concerned, this is an open problem in programing language design where we finally replace the vestigial recursion hack with a proper structured programming solution.
If this becomes a C++ feature, imagine how many data structures we would need to support?
Many other languages, specially the FP languages, allow to do that as a library. Even the languages that are only inspired by FP. Example, Ruby:
Using the Enumerable mixin includes many FP-based methods, such as map, filter and reduce by only defining each, which in this case is DFS.Then we can proceed to define a binary tree:
Iterate over all the elements: Iterate over the even elements: Stop iteration when finding a value: And so on. The same can be done in Python, Kotlin and many others.C++ already solved that problem. Iterator are designed so that an algorithm can be written once and used with multiple data structures.
Essentially you use `first` to contain a Queue, Stack, or Level for the different traversals, and define traversal or activities from there.
It's fairly ergonomic in practice, ergonomic enough for Leetcode.
Here's a BFS: https://leetcode.com/problems/course-schedule-iv/solutions/6...
[0] https://doc.rust-lang.org/std/iter/fn.successors.html
[1] https://docs.rs/itertools/latest/itertools/fn.unfold.html
Or for a possibly more useful comment, constant space tree traversal is genuinely quite difficult to implement. I don't know a general solution to it (other than walk from the start N times, quadratic style), would be interested to hear of one.
Thank you... I'll see myself out... lol =3
> This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Also I’m slightly confused by this example.
}So, our “next node” operation is to concatenate to x. Won’t we either have to have a method for modifying x to go “up” a node, or we’ll have to keep a record of what x was upon entering each node? Like in this example we’ll end up with x=“aaaaaaaa” and then go up a node, over to a “b” node, and get x=“aaaaaaaab”, right?
I guess we can delete the a node’s copy of x after all of a node’s child nodes are visited, at least.
Perhaps it can be optimized to be a little better than the recursive version, depending on how much overhead your language uses for a stack frame that it won't need for this special case.
I think it's a better idea to write a function that applies a function to the elements of a tree. Ideally you'd make a function for each traversal order. This makes it obvious what is happening.
map_tree_pre(my_tree, my_func);
map_tree_in(my_tree, my_func);
map_tree_post(my_tree, my_func);
map_tree_bfs(my_tree, my_func);
Not as ergonomic as a direct tree-iterator, but I can't see of an elegant way to introduce that in an imperative language while keeping the forking/recursion aspect clear
https://godbolt.org/z/fnGzszf3j
https://cs3110.github.io/textbook/chapters/data/trees.html
Encoding the semantics of a tree traversal operator likewise is difficult in the general case. What exactly would the order be, what if I want to traverse in a non-standard ordering, what about skipping branches; all would be difficult to cleanly represent.
I have seen it done where you return actions with key ones being recurse, stop, replace, and replace & then carry out some function, but again, this is pretty simple to implement.
But tree traversal doesn't have this universal property. There are too many methods and purposes for traversing a tree, sufficient that IMHO no single primitive embodiment could materially improve a language. Also, modern compilers efficiently break down high-level traversal code so well that expressing the idea at a high level incurs no serious penalty compared to having a primitive for that purpose, or a series of them.
[0] https://www.hillelwayne.com/post/graph-types/ and https://news.ycombinator.com/item?id=39592444
EnScript had a forall(NodeClass n in tree.GetRoot(){} construct that was very easy. It was essentially a depth-first iterator.
Programming languages should have less primitives like this and instead we should have better foundation libraries for the languages, i.e., containing iterator/-interfaces like Rust and Python (or Smalltalk).
Bonus: Java's HotSpot magick will NOP (most?) methods of Null Objects, making this a zero cost abstraction.
I should probably write a concrete example for a blog or something.
TLDR: For every base class such as TreeNode, create a NullTreeNode that does nothing, then replace all uses of null with NullTreeNode. Voilá, no more null checks or NPEs.
1. BFS is not supported.
2. Traversal type (inorder, preorer, postorder, or even mixed order!) is also not handled.
3. The syntax doesn't make it apparent that stack overflow can occur, e.g. by doing DFS on a linked list.
- Every html document is a tree structure. And css documents have special syntax to address nodes within those trees
- If it's any good, you're routing framework probably uses some kind of tree to quickly match the request to it's handler
- The database you write to uses a tree to quickly find the location it needs to write to