External Visitation

Your visitor API is probably hurting you


2025-12-09

If you’re already convinced that the traditional visitor pattern has problems, skip ahead to the solution.

Imagine you’re writing an optimising compiler.

At one end you have source code, and on the other, some sort of machine code. In the middle you have an IR (Intermediate Representation). This is the optimiser’s bread and butter, and most non-trivial optimisations are implemented as functions that take in some IR and emit some equivalent, but better optimised, IR.

Here’s our example IR that represents a simple functional language. There’s nothing special about functional languages in this example, but it makes the examples more concise.

#[derive(Clone, PartialEq, Debug)]
enum Expr {
    // An integer constant
    Int(i64),
    // A variable
    Var(String),
    // `x + y`
    Add(Box<Expr>, Box<Expr>),
    // `x * y`
    Mul(Box<Expr>, Box<Expr>),
    // `let v = x; y`
    Let(String, Box<Expr>, Box<Expr>),
    // `|v| x`
    Func(String, Box<Expr>),
    // f(x)
    Call(Box<Expr>, Box<Expr>),
}

Implementing a pass

Let’s implement a simple constant-folding optimisation pass for this IR. Constant folding simplifies trivial constant expressions (like 5 + 7 * 3) into their result (26).

fn const_fold(expr: Expr) -> Expr {
    match expr {
        Expr::Int(x) => Expr::Int(x),
        Expr::Var(name) => Expr::Var(name),
        // If both operands are constants, compute their sum
        Expr::Add(x, y) => match (const_fold(*x), const_fold(*y)) {
            (Expr::Int(x), Expr::Int(y)) => Expr::Int(x + y),
            (x, y) => Expr::Add(Box::new(x), Box::new(y)),
        }
        // If both operands are constants, compute their product
        Expr::Mul(x, y) => match (const_fold(*x), const_fold(*y)) {
            (Expr::Int(x), Expr::Int(y)) => Expr::Int(x * y),
            (x, y) => Expr::Mul(Box::new(x), Box::new(y)),
        }
        Expr::Let(v, x, y) => Expr::Let(
            v,
            Box::new(const_fold(*x)),
            Box::new(const_fold(*y)),
        ),
        Expr::Func(v, x) => Expr::Func(
            v,
            Box::new(const_fold(*x)),
        ),
        Expr::Call(f, x) => Expr::Call(
            Box::new(const_fold(*f)),
            Box::new(const_fold(*x)),
        ),
    }
}

Not bad. Our const_fold function recursively walks each node of the IR, folding any additions or multiplications that have constant operands. Let’s make sure it works:

#[test]
fn const_fold_test() {
    let expr = Expr::Add(
        Expr::Int(5).into(),
        Expr::Mul(Expr::Int(7).into(), Expr::Int(3).into()).into(),
    );

    let expr = const_fold(expr);

    assert_eq!(expr, Expr::Int(26));
}

The test passes.

test const_fold_test ... ok

Note, however, that much of the logic in the function is boilerplate! The pass itself only cares about the Add and Mul nodes; the rest of the code exists only to recursively apply the folding logic to the other IR nodes.

Visitors

How can we improve on this? The visitor pattern, borrowed from the object-oriented world, provides an answer. It involves defining an abstract class (or trait, in Rust parlance) that provides the core tree-walking logic, and can be extended/implemented by ‘visitors’ to add special behaviour for specific nodes of the tree.

Let’s give that a go.

trait Visitor {
    fn visit_expr(expr: &mut Expr);

    fn walk(expr: &mut Expr) {
        match expr {
            // These nodes have no children
            Expr::Int(_) | Expr::Var(_) => {},
            // For all other nodes, recursively call `walk` on children
            Expr::Add(x, y) => {
                Self::walk(x);
                Self::walk(y);
            }
            Expr::Mul(x, y) => {
                Self::walk(x);
                Self::walk(y);
            }
            Expr::Let(v, x, y) => {
                Self::walk(x);
                Self::walk(y);
            }
            Expr::Func(v, x) => Self::walk(x),
            Expr::Call(f, x) => {
                Self::walk(f);
                Self::walk(x);
            }
        }
        
        // Invoke the visitor logic on each IR node
        Self::visit_expr(expr);
    }
}

Here’s our generic Visitor trait. Now, we can reimplement our const-folding pass in terms of it.

struct ConstFold;

impl Visitor for ConstFold {
    fn visit_expr(expr: &mut Expr) {
        match expr {
            Expr::Add(x, y) => match (&**x, &**y) {
                (Expr::Int(x), Expr::Int(y)) => *expr = Expr::Int(*x + *y),
                _ => {}, // Ignore Add nodes without constant operands
            },
            Expr::Mul(x, y) => match (&**x, &**y) {
                (Expr::Int(x), Expr::Int(y)) => *expr = Expr::Int(*x * *y),
                _ => {}, // Ignore Mul nodes without constant operands
            },
            _ => {}, // Ignore all other IR nodes
        }
    }
}

Well, this is a little cleaner. Now we only need to mention the nodes we care about: Add and Mul. So far, so good. You can imagine that this optimisation pass will continue to work, without problems, even when more kinds of IR nodes are added in the future. Only the generic visitor logic needs updating to accommodate the changes!

And, for completeness:

#[test]
fn const_fold_test() {
    let mut expr = Expr::Add(
        Expr::Int(5).into(),
        Expr::Mul(Expr::Int(7).into(), Expr::Int(3).into()).into(),
    );

    ConstFold::walk(&mut expr);

    assert_eq!(expr, Expr::Int(26));
}

Also fine.

test visitor_trait::const_fold_test ... ok

Let’s try implementing another pass.

Substituting variables

Our constant-folding pass is quite brittle. All it takes is code like the following for it to fail to collapse the expression into a constant.

let five = 5;
five + 3

Our next pass will substitute variables with their definition. For completeness, here’s the long version:

fn substitute_vars(expr: Expr, vars: &mut Vec<(String, Expr)>) -> Expr {
    match expr {
        Expr::Int(x) => Expr::Int(x),
        // Search the list of active variables for a matching definition, and replace
        Expr::Var(name) => vars.iter_mut().rev()
            .find(|(n, _)| n == &name)
            .map(|(_, x)| x.clone())
            .unwrap_or(Expr::Var(name)),
        Expr::Add(x, y) => Expr::Add(
            Box::new(substitute_vars(*x, vars)),
            Box::new(substitute_vars(*y, vars)),
        ),
        Expr::Mul(x, y) => Expr::Mul(
            Box::new(substitute_vars(*x, vars)),
            Box::new(substitute_vars(*y, vars)),
        ),
        // Compute the RHS of the let and add it to the active variables list
        Expr::Let(v, x, y) => {
            let x = substitute_vars(*x, vars);
            vars.push((v.clone(), x.clone()));
            let y = substitute_vars(*y, vars);
            vars.pop();
            Expr::Let(v, Box::new(x), Box::new(y))
        },
        Expr::Func(v, x) => {
            vars.push((v.clone(), Expr::Var(v.clone())));
            let expr = Expr::Func(v, Box::new(substitute_vars(*x, vars)));
            vars.pop();
            expr
        },
        Expr::Call(f, x) => Expr::Call(
            Box::new(substitute_vars(*f, vars)),
            Box::new(substitute_vars(*x, vars)),
        ),
    }
}

And a quick test:

#[test]
fn substitute_vars_test() {
    let expr = Expr::Let(
        "five".to_string(),
        Expr::Int(5).into(),
        Expr::Add(Expr::Var("five".to_string()).into(), Expr::Int(3).into()).into(),
    );

    let expr = substitute_vars(expr, &mut Vec::new()); // Output: let five = 5; 5 + 3
    let expr = const_fold(expr); // Output: let five = 5; 8

    assert_matches!(&expr, Expr::Let(_, _, box Expr::Int(8)));
}

As expected:

test substitute_vars_test ... ok

How can we implement this on top of the visitor trait? There are a few things we need to do:

  • Store the active variable list as visitor state
  • Correctly revert the list (with vars.pop()) when returning back up the IR tree

Our current design can’t accommodate those needs. This is going to necessitate some large changes to the Visitor trait:

trait Visitor {
    fn visit_expr_before(&mut self, expr: &mut Expr);
    fn visit_expr_after(&mut self, expr: &mut Expr);

    fn walk(&mut self, expr: &mut Expr) {
        self.visit_expr_before(expr);

        match expr {
            // These nodes have no children
            Expr::Int(_) | Expr::Var(_) => {},
            // For all other nodes, recursively call `walk` on children
            Expr::Add(x, y) => {
                self.walk(x);
                self.walk(y);
            }
            Expr::Mul(x, y) => {
                self.walk(x);
                self.walk(y);
            }
            Expr::Let(v, x, y) => {
                self.walk(x);
                self.walk(y);
            }
            Expr::Func(v, x) => self.walk(x),
            Expr::Call(f, x) => {
                self.walk(f);
                self.walk(x);
            }
        }

        self.visit_expr_after(expr);
    }
}

We’ve added a self argument so that the visitor can persist state. There are also two distinct methods for pre- and post-visitation, which we need in order to implement the list reverting.

Let’s implement variable substitution in terms of this new design:

struct SubstituteVars {
    vars: Vec<(String, Expr)>,
}

impl Visitor for SubstituteVars {
    fn visit_expr_before(&mut self, expr: &mut Expr) {
        match expr {
            Expr::Var(name) => {
                if let Some((_, x)) = self.vars.iter_mut().rev()
                    .find(|(n, _)| n == name)
                {
                    *expr = x.clone();
                }
            },
            Expr::Let(v, x, _) => {
                self.vars.push((v.clone(), (**x).clone()));
            },
            Expr::Func(v, _) => {
                self.vars.push((v.clone(), Expr::Var(v.clone())));
            },
            _ => {},
        }
    }

    fn visit_expr_after(&mut self, expr: &mut Expr) {
        match expr {
            Expr::Let(_, _, _) => {
                self.vars.pop();
            },
            Expr::Func(_, _) => {
                self.vars.pop();
            },
            _ => {},
        }
    }
}

Ok… this isn’t quite the beautiful design we’d first envisaged. But let’s roll with it for the moment.

#[test]
fn substitute_vars_test() {
    let mut expr = Expr::Let(
        "five".to_string(),
        Expr::Int(5).into(),
        Expr::Add(Expr::Var("five".to_string()).into(), Expr::Int(3).into()).into(),
    );

    SubstituteVars { vars: Vec::new() }.walk(&mut expr); // Output: let five = 5; 5 + 3
    let expr = const_fold(expr); // Output: let five = 5; 8

    assert_matches!(&expr, Expr::Let(_, _, box Expr::Int(8)));
}

At least it does the job.

test visitor_trait_v2::substitute_vars_test ... ok

So far, so good. But let’s try one more example to make sure. This time, we’re going to try optimising the following code. Our language supports variable shadowing, so this should be no problem:

let x = {
    let x = 3;
    x
};
x

Here’s the test:

#[test]
fn substitute_vars_test_2() {
    let mut expr = Expr::Let(
        "x".to_string(),
        Expr::Let(
            "x".to_string(),
            Expr::Int(3).into(),
            Expr::Var("x".to_string()).into()
        ).into(),
        Expr::Var("x".to_string()).into()
    );

    SubstituteVars { vars: Vec::new() }.walk(&mut expr);
    // `expr` should be: let x = { let x = 3; x }; { let x = 3; 3 }

    assert_matches!(&expr, Expr::Let(_, _, box Expr::Let(_, _, box Expr::Int(3))));
}

Here goes…

thread 'visitor_trait_v2::substitute_vars_test_2' (51) has overflowed its stack
fatal runtime error: stack overflow, aborting

Rats. What on earth happened?

The problem is that our generic visitor code will blindly recurse through all child nodes as if they were the same - even when it shouldn’t. In our original logic, we walked the RHS of a let expression first, then made sure to only place the result into the active variable list for the duration of the trailing expression:

Expr::Let(v, x, y) => {
    let x = substitute_vars(*x, vars); // `vars` does NOT contain `v`
    vars.push((v.clone(), x.clone())); // Add `v` to `vars`
    let y = substitute_vars(*y, vars);
    vars.pop(); // Remove `v` from `vars`
    Expr::Let(v, Box::new(x), Box::new(y))
},

In the visitor version, the binding is pushed before the right-hand side has been substituted, and walk blindly descends into whatever the pass substitutes in. The outer x therefore gets replaced with a copy of its own definition that still mentions x, which walk expands again, and again, until the stack overflows. Not only is it functionally broken, it’s also logically incorrect.

Is there a way around this? Yes, but we’re going to need to expand the Visitor trait even further; perhaps now our visitation methods need to return a ControlFlow to tell the generic visitor code how and when it’s permitted to recurse into the child nodes.
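
As a rough sketch of that possible next evolution (not a design we’ll actually pursue here), visit_expr_before could return a ControlFlow that walk consults before descending into a node’s children:

use std::ops::ControlFlow;

trait Visitor {
    // Returning `Break` tells `walk` not to descend into this node's
    // children automatically; the pass must then walk them itself
    fn visit_expr_before(&mut self, expr: &mut Expr) -> ControlFlow<()>;
    fn visit_expr_after(&mut self, expr: &mut Expr);

    fn walk(&mut self, expr: &mut Expr) {
        if self.visit_expr_before(expr).is_continue() {
            match expr {
                // These nodes have no children
                Expr::Int(_) | Expr::Var(_) => {},
                // For all other nodes, recursively call `walk` on children
                Expr::Add(x, y) | Expr::Mul(x, y) | Expr::Call(x, y) => {
                    self.walk(x);
                    self.walk(y);
                }
                Expr::Let(_, x, y) => {
                    self.walk(x);
                    self.walk(y);
                }
                Expr::Func(_, x) => self.walk(x),
            }
        }

        self.visit_expr_after(expr);
    }
}

Notice that any pass which opts out of the automatic descent is now responsible for walking the children it still cares about itself - a hint of where this is all heading.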

This is rubbish

What’s going on here? We keep needing to evolve our Visitor trait with every new pass we write. We’ve taken one problem - needing to add some boilerplate to every pass each time we add an IR node - and have turned it into a worse problem, that of needing to update every existing pass each time a new pass has a requirement slightly off the beaten track.

This isn’t a problem that’s limited to the examples in this blog post either: it’s telling that very few of LLVM’s passes actually use the visitor framework that exists in the source code. There seem to be two options:

  1. A visitor API that is utterly underpowered and is useless for most applications

  2. A visitor API that is greebled to hell and back and is extremely difficult to work with

Can we do better?

Yes

A useful observation to make is that our previous design is an example of internal iteration. The traversal over the IR tree is controlled by the generic walk method, and each pass is but a simple client that is being driven by it. This means that if we want to control the traversal, we have to make specific allowances for that in our generic API: its design does not permit special cases. This is the killer flaw of internal iteration: it snatches control flow out of the hands of the user and makes it a responsibility of the API.
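
If the terminology is unfamiliar, Rust’s own iterators make a loose analogy (nothing to do with visitors, but the same trade-off): for_each is internal iteration, while driving next by hand is external iteration.

fn iteration_styles() {
    // Internal iteration: `for_each` drives the loop and our closure is just
    // a passenger, so stopping early or interleaving other control flow is awkward
    (0..10).for_each(|x| {
        let _ = x * 2;
    });

    // External iteration: we drive the loop ourselves, so breaking out,
    // skipping elements, or nesting other logic is trivial
    let mut iter = 0..10;
    while let Some(x) = iter.next() {
        if x > 4 {
            break;
        }
        let _ = x * 2;
    }
}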

What would an external iteration visitor look like?

Let’s add a method to our IR node:

impl Expr {
    fn for_children_mut(&mut self, mut visit: impl FnMut(&mut Expr)) {
        match self {
            // These nodes have no children
            Self::Int(_) | Self::Var(_) => {},
            // For all other nodes, call `visit` on each immediate child
            Self::Add(x, y) => {
                visit(x);
                visit(y);
            }
            Self::Mul(x, y) => {
                visit(x);
                visit(y);
            }
            Self::Let(v, x, y) => {
                visit(x);
                visit(y);
            }
            Self::Func(v, x) => visit(x),
            Self::Call(f, x) => {
                visit(f);
                visit(x);
            }
        }
    }
}

The method takes an arbitrary closure, visit, and calls it with a mutable reference to each immediate child of the current node. There’s no recursion happening here: the IR tree isn’t being walked.

Let’s try implementing our ‘easy mode’ pass, constant folding, with the help of this method.

fn const_fold2(expr: &mut Expr) {
    expr.for_children_mut(|expr| const_fold2(expr));

    match expr {
        // If both operands are constants, compute their sum
        Expr::Add(box Expr::Int(x), box Expr::Int(y)) => *expr = Expr::Int(*x + *y),
        Expr::Mul(box Expr::Int(x), box Expr::Int(y)) => *expr = Expr::Int(*x * *y),
        _ => {},
    }
}

Wow. That’s it? Just a single recursive call - that we, the pass, control - and then a pattern-match for the exact cases we’re looking for?

This seems too good to be true. Let’s try variable substitution.

fn substitute_vars2(expr: &mut Expr, vars: &mut Vec<(String, Expr)>) {
    match expr {
        Expr::Var(name) => {
            if let Some((_, x)) = vars.iter_mut().rev()
                .find(|(n, _)| n == name)
            {
                *expr = x.clone();
            }
        },
        Expr::Func(v, x) => {
            vars.push((v.clone(), Expr::Var(v.clone())));
            substitute_vars2(x, vars);
            vars.pop();
        },
        Expr::Let(v, x, y) => {
            substitute_vars2(x, vars);
            vars.push((v.clone(), *x.clone()));
            substitute_vars2(y, vars);
            vars.pop();
        },
        _ => expr.for_children_mut(|expr| substitute_vars2(expr, vars)),
    }
}

Golly.
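
For good measure, we can throw the shadowing example that defeated the visitor at the new pass. A quick test in the same style as before should now terminate, because the pass never re-walks an expression it has just substituted in:

#[test]
fn substitute_vars2_test() {
    let mut expr = Expr::Let(
        "x".to_string(),
        Expr::Let(
            "x".to_string(),
            Expr::Int(3).into(),
            Expr::Var("x".to_string()).into()
        ).into(),
        Expr::Var("x".to_string()).into()
    );

    substitute_vars2(&mut expr, &mut Vec::new());
    // The outer body's `x` is replaced by the (already substituted) inner
    // let, whose own body is now the constant 3

    assert_matches!(&expr, Expr::Let(_, _, box Expr::Let(_, _, box Expr::Int(3))));
}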

Maybe internal visitation is just bad

In one fell swoop, we’ve created a visitor design that is simpler for the passes that use it, simpler for the code that provides it, and much more flexible. It doesn’t stop here either: this design can accommodate many more pass designs without so much as breaking a sweat.
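
Even passes that don’t rewrite anything fall out naturally. As a throwaway example (not one of the passes above), here’s an analysis that counts the nodes in an expression; no trait, no state struct, and no changes to for_children_mut:

// A hypothetical analysis 'pass': count the IR nodes in an expression.
// Nothing about for_children_mut obliges us to actually mutate anything.
fn count_nodes(expr: &mut Expr) -> usize {
    let mut count = 1; // This node...
    expr.for_children_mut(|child| count += count_nodes(child)); // ...plus its descendants
    count
}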

This raises the question: why do we tend to reach for traits and abstract classes for this? I suspect that it’s largely because most of us started our programming lives in the world of heavily object-oriented languages, few of which have support for higher-order functions. Whatever the reason, I hope you found this post useful.