Java Authors: Liz McMillan, Pat Romanski, Xenia von Wedel, Elizabeth White, Ali Hussain

Related Topics: Java

Java: Article

Deriving the Visitor Pattern: A Review and Discussion

Why is the visitor pattern the way it is?

Like most other self-respecting developers I had also read the GoF book, including the section on the visitor pattern. However, when a colleague came over to me with a question, I could not initially justify the complexity of the example code I saw in the book. What follows is a discussion of why the visitor pattern is the way it is.

Brief Review of the Pattern
The definitive description of the pattern is in the GoF book Design Patterns, Chapter 5 (pp 331-344)(see References section). The Wikipedia has a concise and good description, which formed the basis for my brief review here. The visitor pattern is classified as a Behavioral pattern, so the thing to notice is the way in which the classes and objects interact and distribute responsibility. A typical application of this pattern occurs in the following scenario: we have a number of elements in an object structure (common structures include trees & lists) and we want to perform a bunch of disparate operations (e.g. printing or cloning each element) on the elements of the structure.

The visitor pattern is a way of separating the operation from the object structure and a way of collecting together the different implementations of an operation for different kinds of elements in the object structure. A Visitor class is created which knows how to perform a particular operation on the different kinds of elements in the object structure. Each type of element in the structure defines an accept() method that can accept any kind of Visitor. The visitor is passed to each element in the structure in turn, by calling its accept() method and the Visitor then performs the operation on the visited element. One important consequence of this separation of object structure and operation is that we can later add a new operation (a new kind of Visitor) without having to modify the element classes of the object structure.

Each type of Visitor defines several visit()methods, one for each kind of element. The basic insight is that the precise set of instructions to execute (i.e. the method or function to call) depends on the run-time types of both the Visitor & the visited element. Java only lets us call different methods based on the run-time type of one object (via virtual functions), so the pattern advocates a clever solution: The second dependency on the type of element visited is first resolved by polymorphically calling the accept() method of the visited element. accept() then resolves the first dependency by turning around and polymorphically calling the visit()method for its class.

An Example
Before this description gets too confusing, let us study the pattern in the context of a concrete problem: Let us say we need to traverse a list collecting node-specific information. The list has two kinds of nodes, say, Red and Black, which needed to be processed differently. It seems like an ideal application for the visitor pattern. Listing 1 shows the code. (All code samples in this article use a J2SE 5.0 compatible compiler.)

To me and my colleague, this initially seemed like an overly complex solution for a simple problem. NodeVisitor.doVisit() calls into the Node's accept methods, which simply delegates back into NodeVisitor. Furthermore, the accept() methods of RedNode and BlackNode are almost identical. Finally, notice that if we now add a GreenNode class, we need to add a new visitGreen() method to the NodeVisitor class and re-compile it (not to speak of the almost redundant implementation of accept() in the GreenNode class). Ugh! This does not seem kosher by any OO standard.

The Need for the accept() Methods
Novice armchair Java developers might ask why we can't do something simpler, like Listing 2, for example, without touching the Node interface, or the classes RedNode and BlackNode which implement it.

Listing 2 has two significant differences from the previous. First, there is no redundant method (namely accept()) for each node type to implement. Second, we use function name overloading for the visit() implementations, thus enabling the "clever" foreach loop, which iterates over each node and calls the appropriate overloaded version of visit() depending on the type of the current element. With this, we hope to contain all the visiting logic within NodeVisitor.

Alas, real developers have a more difficult job than arm-chair developers! If you are using a language like Java or C++, an overloaded function name like visit() has to get resolved at compile time. Thus line 6.iii will not compile because none of the visit() methods provided in NodeVisitor know how to accept a generic "Node" as argument.

For line 6.iii to work the way we want it to, the decision on what operation needs to be performed has to be delayed until we can determine at runtime the type of the node n being examined in the current iteration of the for-each loop.

Traditional OO languages (Java, C++ etc) provide us with one standard tool for delaying function resolution until run-time: virtual functions. Thus, in Listing 1, 6.iii is modified to a virtual function call n.accept(nv). So the actual function that gets called is decided at run-time. The version called then delegates work by invoking the right version of NodeVisitor.visit().

So Why Not Just Use Plain Vanilla Inheritance?
The explanation I just gave is good, but not good enough. I can almost hear you ask: why doesn't accept() do the work itself? Why does it have to delegate back to NodeVisitor? There are three reasons:

1.  Accumulating state: If you read the problem I presented closely, you will notice that I specified a need to collect node-specific information. Since the doVisit passes the same NodeVisitor instance to each accept(), the visitor can be used to accumulate state across the different Node objects. For example, say you have an Employee HR application where the Red nodes represent employees, the Black nodes represent managers, visitRed() calculates the pay raises for programmers, and visitBlack the pay raises for managers. The NodeVisitor nv could print a report of the total increase in salary expense at the end of the for loop.

2.  Supporting more than one visitor (the need for double dispatch): Say the next version of your Employee HR application needs to add a new HRPolicyVisitor that checks for compliance with some HR policy and the implementation is different for managers and programmers.

To accommodate both the types of Visitors, we introduce an additional layer of indirection - an abstract EmployeeNodeVisitor interface with virtual visitXXX() functions for each type of element to visit, namely visitProgrammer() & visitManager(). The old PayRaiseVisitor and the new HRPolicyVisitor both implement EmployeeNodeVisitor. The decision on which version of visit() gets called now gets determined by a two-step process. The first step is as before. The node type of the visited element n in the foreach loop determines which version of the virtual function accept() gets called. In the second step, the type of the EmployeeVisitor passed in to accept() determines the (virtual function) version of visitXXX() called. The source files that come with this article show the skeleton of this implementation. Figure 1 illustrates the sequence of calls from both doPayHike(), which uses a PayRaiseVisitor to raise the pay of each employee, and doEnforcePolicy() which uses a HRPolicyVisitor to check HR policy compliance.

More Stories By Nishanth Sastry

Nishanth Sastry is a software developer at IBM working on the WebSphere Portal & Workplace family of products. He has a Masters degree in Computer Science from the University of Texas at Austin. He enjoys well-written code & fall in New England, among other things. He lives in Concord, Mass.

Comments (2) View Comments

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.

Most Recent Comments
Rishi Sharma 12/27/05 06:01:14 AM EST

good article. Well laid with sequence diagram and code.

Bruce Wallace 10/23/05 06:31:14 PM EDT

In your Oct, 2005 JDJ article "Deriving the Visitor Pattern",
the author correctly describes the need for double dispatch
and hence the traditional approach of using an "accept"
method. However, in my article from 2001, I showed how
accept methods can be done away with via a couple of lines of
reflection code. See the article listed below for the technique
that can make the Visitor pattern more widely applicable since
the visited objects need not cooperate by defining an accept
method ahead of time.

[1] http://www.polyglotinc.com/articles.html
[2] http://www.ftponline.com/Archives/premier/mgznarch/javapro/2001/03mar01/...