Visitor Support in JavaCC/JJTree

Expanded by Ken Beesley, Xerox Research Centre Europe

from Metamata/SUN documentation and examples.

Comments and corrections welcome

The Visitor Design Pattern

(Last edited: 20 February 2005)

JJTree provides useful "support" for the Visitor Design Pattern. For general information on the Visitor Design Pattern, and especially the reasons for using it, see

(The alternative to using JJTree is JTB (Java Tree Builder), which also uses the Visitor Design Pattern.)

Mindtuning: JJTree and Visitors

Visitors are not magic, and they can't do all the "semantic" processing you might need when implementing a new language. Visitors are designed to work in a specific scenario, and within that scenario they are very flexible and attractive.

Here's the scenario suitable for Visitors: In the context of JavaCC and JJTree, the parser parses statements or whole programs (your input text) and returns a handle to a node which is the root of an Abstract Syntax Tree (AST); that AST represents the input parsed. After the parsing and tree-building are done, the AST tree then typically needs to be traversed and processed in one or more ways, perhaps including:

  1. Dumping/Display (walk the tree and print it out in some helpful way; principally for use in debugging)
  2. Modification (e.g. reordering, trimming, expansion, optimization, etc. of the AST tree itself)
  3. Interpretation (e.g. walk an AST tree representing an arithmetic expression, and compute the value)
  4. Compilation (walk the AST tree, outputting code for some physical or abstract machine)
  5. etc.

Each of these operations involves walking the (already constructed) AST tree, starting at the root, and executing suitable code for each type of AST node encountered. The Visitor design is an elegant and flexible way to implement such operations on AST trees, allowing you to concentrate all the actual code for the Dumper, Interpreter, Compiler or whatever in a single "Visitor" object.

A Visitor is an object that "visits" an AST tree and does something useful with the information in the AST.
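
The scenario can be made concrete with a small sketch. It assumes a hand-written two-node AST (Num and Add) instead of JJTree-generated classes, and every name in it (MiniVisitorSketch, Visitor, EvalVisitor, DumpVisitor, accept) is hypothetical. It only illustrates how two different Visitors can walk the same, already constructed tree.

```java
// Hypothetical hand-written mini-AST; JJTree would normally generate the node classes.
class MiniVisitorSketch {

    // One visit() overload per node type.
    interface Visitor {
        Object visit(Num node, Object data);
        Object visit(Add node, Object data);
    }

    // Common node supertype with the accept hook.
    static abstract class Node {
        abstract Object accept(Visitor v, Object data);
    }

    static final class Num extends Node {
        final int value;
        Num(int value) { this.value = value; }
        Object accept(Visitor v, Object data) { return v.visit(this, data); }
    }

    static final class Add extends Node {
        final Node left, right;
        Add(Node left, Node right) { this.left = left; this.right = right; }
        Object accept(Visitor v, Object data) { return v.visit(this, data); }
    }

    // An "interpreter" Visitor: all evaluation code lives in this one class.
    static final class EvalVisitor implements Visitor {
        public Object visit(Num node, Object data) { return node.value; }
        public Object visit(Add node, Object data) {
            int l = (Integer) node.left.accept(this, data);
            int r = (Integer) node.right.accept(this, data);
            return l + r;
        }
    }

    // A "dumper" Visitor over the same tree.
    static final class DumpVisitor implements Visitor {
        public Object visit(Num node, Object data) { return Integer.toString(node.value); }
        public Object visit(Add node, Object data) {
            return "(" + node.left.accept(this, data) + " + "
                       + node.right.accept(this, data) + ")";
        }
    }

    // Builds the tree for 2 + (5 + 1), standing in for the parser's output.
    static Node sampleTree() {
        return new Add(new Num(2), new Add(new Num(5), new Num(1)));
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        System.out.println(root.accept(new DumpVisitor(), null));  // dump the tree
        System.out.println(root.accept(new EvalVisitor(), null));  // interpret the tree
    }
}
```

Note that adding a third operation (say, a compiler) would mean writing one more Visitor class, without touching Num or Add.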

Prerequisites

Visitors are an advanced subject that even some good programmers may find difficult to grasp. Before getting into Visitors,
  1. Be sure that you already have a solid understanding of JavaCC parsing.
  2. Be sure that you already have a solid understanding of JJTree tree-building. If you don't understand JJTree, you don't have a prayer of understanding and implementing Visitors.
  3. If you think that you understand JJTree, then you're probably mistaken. E.g. before getting into Visitors you should be comfortable with JJTree building, AST node types, and node methods like childrenAccept(), jjtGetNumChildren(), jjtGetChild() and jjtGetParent(). If this means nothing to you, then forget Visitors for a while and go back and study the "JJTree Documentation" (in doc/jjtintro.html) and the JJTree examples supplied with the JavaCC release.
Go back to the JavaCC and JJTree examples supplied with the JavaCC release and study them thoroughly before jumping into the difficult topic of Visitors.

An Official SUN Example

A vital, very instructive example of the use of the Visitor pattern is shown in the JavaCC release under /examples/JJTreeExamples/eg4.jjt. This eg4.jjt example should be printed out, studied very closely, and run. In this example, note that input is parsed, an AST tree is built, and then a Visitor object is instantiated and used to dump the AST.

Remember the sequence:

  1. Parse the input, producing an AST tree.
  2. Then instantiate a Visitor and use it to visit the AST tree.
  3. In real life, you may have several different kinds of Visitor object (e.g. a DumperVisitor, an InterpreterVisitor, a CompilerVisitor), and the AST tree may be visited multiple times by various Visitor objects.

The eg4.jjt example is so important that commented excerpts will be included below.

A Brief Recap of JJTree

JJTree is too big and complicated a subject to cover here; you should already be a JJTree expert before getting into Visitors. However, reiterating a few of the features of JJTree at this point may help in understanding how Visitor-support works:

  1. I assume here that you indicate the option MULTI = true; in the .jjt source file.
    options {
      MULTI = true ;
      VISITOR = true ;
      // possibly other JJTree options here
    }
    
  2. By default, during compilation, JJTree will automatically generate a Java class definition for every AST node type required by your grammar, if you have not supplied the file by hand. Thus if your grammar contains a production named statement, which will be represented in abstract syntax trees, then JJTree will automatically generate a file named ASTstatement.java containing a definition of the ASTstatement node class--unless, of course, you have already hand-written ASTstatement.java yourself.
  3. By default, when parsing actual input to create an AST tree, JJTree/JavaCC will instantiate an AST node to represent every NON-terminal involved in parsing the input. If you set the option MULTI = true;, then JJTree uses a different "type" of abstract-syntax-tree node for every production in the grammar. E.g. if you have a production named statement, then by default JJTree will represent it in an AST tree as a node of type ASTstatement; a production named or will be represented as a node of type ASTor, a production named assignment as a node of type ASTassignment, etc.
  4. There are ways to override this default behavior and to control manually how nodes are named and how the AST trees are built. You should already understand them well before jumping into Visitors. The ability to write Visitors presupposes that you understand completely the AST trees being produced by your parser.

How to tell JJTree to "provide Visitor support"

Once you understand JJTree itself, and how it builds AST trees, you invoke JJTree's Visitor support by setting VISITOR=true in the options section at the beginning of your .jjt file, e.g.

options {
   MULTI = true ;
   VISITOR = true ;   // <--magic option for Visitor support
}

When this option is set to true, JJTree will automatically do two things:

  1. Insert a jjtAccept() method into each of the AST node class definitions that it generates. E.g. if your grammar has a production named statement, the class defined in the automatically generated file ASTstatement.java will now have a method named jjtAccept().
  2. Generate a Visitor "interface" (a standard kind of Java interface) with an empty method for each type of AST node used in your .jjt grammar. If the name of your language/parser is foo, and so the JJTree source file is foo.jjt, then this automatically generated interface will have the name fooVisitor and will therefore reside in the file named fooVisitor.java. (If you don't understand interfaces, then go back and study some basic Java documentation.)

Defining a Visitor class

We continue to assume that your language is called foo and so is defined in the file foo.jjt. The automatically generated visitor interface will reside in file fooVisitor.java.

When you define (manually) a particular type of Visitor object, giving it a name like fooInterpreterVisitor or fooCompilerVisitor, it must implement the automatically generated fooVisitor interface; e.g. the class definition for fooInterpreterVisitor (written in a file named fooInterpreterVisitor.java) should start like this:

class fooInterpreterVisitor implements fooVisitor {

...
}

and this fooInterpreterVisitor class should contain a handwritten visit() method to deal with each type of AST node that your grammar generates. Because fooVisitor is an interface with an empty method for each type of AST node really used in your parser grammar, failure to specify a method for each type of AST node inside your fooInterpreterVisitor class will cause a (helpful) compile-time error. This helps you keep your various Visitor objects complete and in sync with the AST trees produced by your parser. So the fact that JJTree generates a Visitor interface, which your hand-written Visitor objects must successfully implement, is a Good Thing.

In practice, the best way to start writing a Visitor is to make a copy of the fooVisitor.java file, renaming it as something like fooInterpreterVisitor.java, and then fill in the methods.

Whenever you recompile foo.jjt, with VISITOR=true ; specified as an option, the interface fooVisitor will be updated to include all the AST nodes currently used; if you forget to update your various Visitor classes when you change your grammar and trees, this will again result in (helpful) compile-time error messages.

In the eg4.jjt example, note the following:

In your main() method, you will typically invoke the parser and get back a handle to an AST node representing the root of the AST tree. This is the parsing/tree-building step, and it occurs before any "visiting" is done. You can then instantiate one of your (manually defined) Visitor objects and invoke the jjtAccept() method of the root node, passing in a handle to the Visitor.

In object-oriented parlance, you send a message to the AST tree (or more precisely, to the root node of the AST tree), telling it to "accept" your Visitor object (which might be a Dumper, an Interpreter, a Compiler, etc.).

In the case of example eg4, the Visitor is a "dumper", i.e. it is written to walk the tree and dump it out in a useful way. The class eg4DumpVisitor is defined in the file eg4DumpVisitor.java, which somebody wrote by hand and which is of course included with the example.

// this is eg4.jjt, supplied with the JavaCC release

options {
   MULTI=true ;
   VISITOR=true;
   NODE_DEFAULT_VOID=true;
}

PARSER_BEGIN(eg4)

class eg4 {

public static void main (String args[]) {
   System.out.println("Reading from standard input...") ;
   eg4 t = new eg4(System.in) ;  // instantiate an eg4 parser,
		// to read from System.in (the standard input)

   try {
      // the "root" goal in the eg4 grammar is Start()
      // call t.Start()  (i.e. tell the parser t to parse a Start).
      // If successful, it returns a reference to an ASTStart node,
      // which is the root of the Abstract Syntax Tree

      ASTStart n = t.Start() ;

      // Now that we have an AST tree, with n as the root,
      // we can instantiate a suitably written Visitor object and 
      // "visit" the AST tree

      // This example has defined a particular kind of Visitor
      // called eg4DumpVisitor; it implements the automatically
      // generated interface eg4Visitor, which resides in eg4Visitor.java
      // The purpose of this particular kind of Visitor is to
      // "dump" the tree in a particular format desired by the writer.

      eg4Visitor v = new eg4DumpVisitor() ;

      // note the declaration of v to be eg4Visitor, this is the
      // interface type shared by all Visitor objects in this eg4 language

      n.jjtAccept(v, null) ;

      // and here the jjtAccept method of n is being invoked, passing
      // in v, the handle to the just-instantiated eg4DumpVisitor object, 
      // and a null value which is ignored in this example. (Don't worry
      // about the null for now.  Just accept that it can be useful in more
      // advanced examples.)

      // in other words, n is being sent a message to "accept" the
      // Visitor v

      // n is of type ASTStart, and with VISITOR=true, JJTree will
      // have automatically inserted the jjtAccept() method in the
      // ASTStart class
      
   } catch (Exception e) {
   // ...
   }
}
}

PARSER_END(eg4)

//...

Automatically Generated jjtAccept() methods

If you indicated VISITOR=true in your JJTree options{}, then each AST node-class definition automatically generated by JJTree will contain a jjtAccept() method that looks like the following. The definition of class ASTStart, taken from the eg4 example, resides in a file named ASTStart.java.

public class ASTStart extends SimpleNode {
   // (snip) the usual constructors here

   // this jjtAccept() method is automatically inserted by JJTree

   public Object jjtAccept(eg4Visitor visitor, Object data) {
	return visitor.visit(this, data) ;
   }
}

This jjtAccept() method is a bit more complicated than the typical accept() methods used in most illustrations of the Visitor Design Pattern. Note here that the automatically generated jjtAccept() method expects two arguments, the first being the reference to an appropriate visitor (in this example, it is passed a reference to an eg4DumpVisitor); and the second is an Object that could be any Java object. In the eg4 example, and perhaps in many practical applications, the second argument passed in is just null. (It's just ignored in this example, but in more complicated examples the second argument can be very useful for passing information back and forth.)

Even if you do not need the second argument in your current application, you should not modify the automatically generated methods. Just pass in null as the second argument to jjtAccept() and don't worry about it for now.

Note also that the jjtAccept() method passes back an Object, so there must be a return statement in the body of jjtAccept() that passes back some Object (or a null). In this example, and perhaps in many practical examples, the calling program, ultimately main(), simply ignores the returned value. (Rest assured, however, that the ability to pass back an Object value can be very useful in more complicated examples.)

The presence of the second argument to jjtAccept(), and the fact that jjtAccept() always returns a value, allows for fancier passing around of information than the simpler examples of visitors in the general Java literature. The key point, for understanding Visitors, is that the jjtAccept() method in the AST object is passed a handle to a particular Visitor.
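
As a rough illustration of how the second argument and the return value might be used together, here is a minimal sketch. It does not use JJTree-generated code: Node, ASTGroup, ASTItem and CollectVisitor are hypothetical stand-ins that merely copy the shape of the jjtAccept(visitor, data) and visit(node, data) signatures. The data argument carries a list down the tree, and the return value carries a count back up.

```java
import java.util.ArrayList;
import java.util.List;

class DataArgSketch {

    interface Visitor {
        Object visit(ASTGroup node, Object data);
        Object visit(ASTItem node, Object data);
    }

    // Stand-in for a node base class (not the generated SimpleNode).
    static abstract class Node {
        final List<Node> children = new ArrayList<>();
        Node add(Node child) { children.add(child); return this; }
        abstract Object jjtAccept(Visitor v, Object data);
    }

    static final class ASTGroup extends Node {
        Object jjtAccept(Visitor v, Object data) { return v.visit(this, data); }
    }

    static final class ASTItem extends Node {
        final String name;
        ASTItem(String name) { this.name = name; }
        Object jjtAccept(Visitor v, Object data) { return v.visit(this, data); }
    }

    // data: a List<String> that collects item names on the way down.
    // return value: the number of items found in the subtree.
    static final class CollectVisitor implements Visitor {
        public Object visit(ASTGroup node, Object data) {
            int count = 0;
            for (Node child : node.children) {
                count += (Integer) child.jjtAccept(this, data);
            }
            return count;
        }

        @SuppressWarnings("unchecked")
        public Object visit(ASTItem node, Object data) {
            ((List<String>) data).add(node.name);
            return 1;
        }
    }

    // A small tree: group(a, group(b, c)).
    static Node sampleTree() {
        return new ASTGroup()
            .add(new ASTItem("a"))
            .add(new ASTGroup().add(new ASTItem("b")).add(new ASTItem("c")));
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        Object count = sampleTree().jjtAccept(new CollectVisitor(), names);
        System.out.println(count + " items: " + names);
    }
}
```

Because both channels are typed as Object, casts are needed at each use; that is the price of the generic signature.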

The User-Written Visitor Object

Now, given that a jjtAccept() method is automatically inserted into every AST node class, the real work gets done (or the real code gets found) inside the user-written Visitor class definition, which in this example is in the file eg4DumpVisitor.java. It concentrates all the code necessary for dumping an AST tree in one object. It should be emphasized that this file must be written by hand. It must implement the automatically generated visitor interface, here eg4Visitor, which helps tremendously to keep your visitors complete and in sync with your grammar. Here's what the supplied eg4DumpVisitor.java looks like:

/**
 *
 * Copyright (c) 1996-1997 Sun Microsystems, Inc.
 *
 * Use of this file and the system it is part of is constrained by the
 * file COPYRIGHT in the root directory of this system.
 *
 */

/* This is an example of how the Visitor pattern might be used to
   implement the dumping code that comes with SimpleNode.  It's a bit
   long-winded, but it does illustrate a couple of the main points.

   1) the visitor can maintain state between the nodes that it visits
   (for example the current indentation level).

   2) if you don't implement a jjtAccept() method for a subclass of
   SimpleNode, then SimpleNode's acceptor will get called.  This almost
   always indicates an error, as explained below under "SimpleNode and
   Extended AST Nodes"

   3) the utility method childrenAccept() can be useful when
   implementing preorder or postorder tree walks.

*/

public class eg4DumpVisitor implements eg4Visitor
{
  private int indent = 0;

  // Visitors can include any number of helper fields and methods, like
  // the indent variable above and the following method
  // indentString(), which is used during the dumping

  private String indentString() {
    StringBuffer sb = new StringBuffer();
    for (int i = 0; i < indent; ++i) {
      sb.append(" ");
    }
    return sb.toString();
  }


  // The real "meat" of the eg4DumpVisitor starts here.

  // This Visitor contains a visit() method (below) for each specific
  // type of AST node in the grammar--it also contains a catch-all method
  // for SimpleNode, which is inherited by all the specific
  // AST nodes--if this SimpleNode method gets used, a kind
  // of error message is printed

  // when an AST node "accepts" a visitor, its jjtAccept() method
  // makes a callback to
  // the visit() method of the Visitor itself, passing a reference to
  // itself; the particular visit() method that gets performed will
  // be the one whose first argument matches the type of the calling
  // AST node

  // in Java parlance, the visit() method of the Visitor is
  // overloaded, and the various flavors of visit() are
  // distinguished by the type of the first argument, which is
  // some subtype of AST node

  public Object visit(SimpleNode node, Object data) {
    System.out.println(indentString() + node +
		       ": acceptor not unimplemented in subclass?");

                       // I think UNimplemented above is a typo
    ++indent;
    data = node.childrenAccept(this, data);
    --indent;
    return data;
  }

  // here are the visit() methods for specific AST node classes

  public Object visit(ASTStart node, Object data) {
    System.out.println(indentString() + node);
    ++indent;
    data = node.childrenAccept(this, data);
    --indent;
    return data;
  }

  public Object visit(ASTAdd node, Object data) {
    System.out.println(indentString() + node);
    ++indent;
    data = node.childrenAccept(this, data);
    --indent;
    return data;
  }

  public Object visit(ASTMult node, Object data) {
    System.out.println(indentString() + node);
    ++indent;
    data = node.childrenAccept(this, data);
    --indent;
    return data;
  }

  public Object visit(ASTMyOtherID node, Object data) {
    System.out.println(indentString() + node);
    ++indent;
    data = node.childrenAccept(this, data);
    --indent;
    return data;
  }

  public Object visit(ASTInteger node, Object data) {
    System.out.println(indentString() + node);
    ++indent;
    data = node.childrenAccept(this, data);
    --indent;
    return data;
  }
}

/*end*/

The visit() methods in the example above call a built-in method (built into SimpleNode.java) called childrenAccept(), which calls the jjtAccept() method on each of the children nodes in left-to-right order. Another useful built-in node method is jjtGetChild(n), where n is an integer, which returns a handle to the nth child, with the counting starting at zero. The node method jjtGetNumChildren() returns the number of children nodes. Thus the method above written

  public Object visit(ASTMult node, Object data) {
    System.out.println(indentString() + node);
    ++indent;
    data = node.childrenAccept(this, data);
    --indent;
    return data;
  }

could also be written (roughly) as

  public Object visit(ASTMult node, Object data) {
    int i, k = node.jjtGetNumChildren() ;

    System.out.println(indentString() + node);
    ++indent;

    for (i = 0; i < k; i++) {
      data = node.jjtGetChild(i).jjtAccept(this, data) ;
    }

    --indent;
    return data;
  }

at least if the return value is being ignored.

When walking AST trees, either with or without the Visitor Design Pattern, the following node methods, documented (more or less) in the "JJTree Documentation" (doc/jjtintro.html) from the JavaCC people, may be useful. (The JJTree Documentation is quite inadequate in many ways; the only way to understand what these node methods really do is to read the code in SimpleNode.java. When iterating through children nodes, the Java operator instanceof may also come in handy.) Where n is a handle to an AST node in an AST tree:

  1. n.childrenAccept(this, data) calls jjtAccept(this, data) on each of the child nodes of n.
  2. n.jjtGetNumChildren() returns the number of children of n.
  3. n.jjtGetChild(m) returns a handle to the mth child of n, where 0 <= m < n.jjtGetNumChildren().
  4. n.jjtGetParent() returns a handle to the parent node of n.
  5. n.jjtAddChild(h, i), where h is a handle to an AST node and i is an integer, adds h as child i of n. (Reading the code, it becomes clear that if there is already a child i, "adding" a new child i causes the existing child to be overwritten.)
  6. n.jjtSetParent(q) sets node q as the parent of n.
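
The following sketch shows how these node methods might be combined in a plain tree walk. MiniNode is a hypothetical, simplified stand-in written for this illustration; it imitates the behavior described in the list (including the overwriting behavior of jjtAddChild) but it is not the generated SimpleNode, whose code should be consulted for the real details.

```java
class NodeMethodsSketch {

    // Simplified stand-in that imitates a few of the node methods listed above.
    static class MiniNode {
        private final String label;
        private MiniNode parent;
        private MiniNode[] children = new MiniNode[0];

        MiniNode(String label) { this.label = label; }

        void jjtSetParent(MiniNode p) { parent = p; }
        MiniNode jjtGetParent() { return parent; }

        // Places child at index i, growing the array if needed;
        // an existing child at index i is overwritten.
        void jjtAddChild(MiniNode child, int i) {
            if (i >= children.length) {
                MiniNode[] grown = new MiniNode[i + 1];
                System.arraycopy(children, 0, grown, 0, children.length);
                children = grown;
            }
            children[i] = child;
        }

        MiniNode jjtGetChild(int i) { return children[i]; }
        int jjtGetNumChildren() { return children.length; }

        @Override
        public String toString() { return label; }
    }

    // Links child and parent in both directions.
    static void link(MiniNode parent, MiniNode child, int i) {
        parent.jjtAddChild(child, i);
        child.jjtSetParent(parent);
    }

    // Preorder dump using only jjtGetNumChildren() and jjtGetChild().
    static void dump(MiniNode n, String prefix, StringBuilder out) {
        out.append(prefix).append(n).append('\n');
        for (int i = 0; i < n.jjtGetNumChildren(); i++) {
            dump(n.jjtGetChild(i), prefix + " ", out);
        }
    }

    static String demo() {
        MiniNode root = new MiniNode("Add");
        link(root, new MiniNode("Integer"), 0);
        link(root, new MiniNode("Identifier"), 1);
        link(root, new MiniNode("Mult"), 1);   // overwrites the existing child 1
        StringBuilder out = new StringBuilder();
        dump(root, "", out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```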

Note that this eg4DumpVisitor class implements the interface called eg4Visitor, which is automatically generated by JJTree when you specify VISITOR = true as an option. Inside eg4DumpVisitor there is a user-written visit() method for each kind of AST node; if you forget to provide one, you will get a (helpful) compile-time error because the interface eg4Visitor will cause the compiler to expect a complete set of fleshed-out visit() methods. The various overloaded visit() methods are differentiated by the type of the first argument (i.e. by the subtype of AST node). The automatically generated eg4Visitor.java file (the interface) looks like this:

/* Generated By:JJTree: Do not edit this line. eg4Visitor.java */

public interface eg4Visitor
{
  public Object visit(SimpleNode node, Object data);
  public Object visit(ASTStart node, Object data);
  public Object visit(ASTAdd node, Object data);
  public Object visit(ASTMult node, Object data);
  public Object visit(ASTMyOtherID node, Object data);
  public Object visit(ASTInteger node, Object data);
}

This interface file will be (helpfully) updated automatically whenever you recompile your .jjt file, so that it always reflects all and only the types of AST nodes that need to be fleshed-out by any hand-written Visitor class.

How Do these Visitors Really Work?

What really happens when you tell an AST tree node to "accept" a visitor like eg4DumpVisitor? (This is the part that took me the longest time to grasp.)

As shown above in the main() method, the eg4 parser is called and passes back a handle named n to an ASTStart node (in this example grammar, the ASTStart node is the root of the Abstract Syntax Tree). Then main() instantiates an eg4DumpVisitor object called v and then calls the jjtAccept() method of n

        eg4Visitor v = new eg4DumpVisitor() ;
        n.jjtAccept(v, null) ;

passing in the v handle (the just-instantiated eg4DumpVisitor object) and a null. The null is just a filler in this example; the automatically generated jjtAccept() method needs a second argument, but this example doesn't use it in any way. The call n.jjtAccept(v, null) is telling node n to "accept" the indicated visitor.

As we saw, the jjtAccept() method inside class ASTStart is just

public class ASTStart extends SimpleNode {
   // (snip) the usual constructors here

   public Object jjtAccept(eg4Visitor visitor, Object data) {
	return visitor.visit(this, data) ;
   }
}

so this jjtAccept() method just does a call back to the eg4DumpVisitor itself,

        visitor.visit(this, data)

represented here by the dummy parameter visitor, passing to the visitor's visit() method a handle to the AST node itself (this), and a handle to the Object data, which in this example is just null. So what this jjtAccept() method in ASTStart effectively does is send a message to the indicated Visitor (here an eg4DumpVisitor object), saying "Here's a handle to me (which will be of type ASTStart). Do with me whatever you are supposed to do with an AST node of my type." And it does that for whatever kind of Visitor it may be asked to "accept"; there may be a half dozen different Visitors that potentially apply to the same AST tree, dumping it, modifying it, interpreting it, compiling it into output code, etc.

Each time that any AST node "accepts" a visitor, it simply sends a message back to that Visitor, invoking the Visitor's .visit() method, passing a handle to itself, and saying "deal with me". The call to

        visitor.visit(this, data)

calls the visit method of the visitor, but in fact the visitor contains a whole collection of overloaded visit() methods, one for each AST node type. The particular visit() method in the Visitor that will be invoked is the one whose first argument matches the type of the calling node.

When an AST node "accepts" a Visitor, it calls the Visitor's visit() method, passing in a reference to itself, and effectively saying "Do with me whatever you are supposed to do with an AST node of my type."
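
The matching of visit() overloads can be tried out in isolation. In the sketch below, the nested classes named SimpleNode and ASTStart are simplified stand-ins (with the data argument dropped for brevity), and ASTForgotten and NameVisitor are hypothetical. Java chooses among overloaded methods at compile time, from the declared type of the argument; inside each class's own jjtAccept(), the argument this has that class's type, which is how the specific overload gets selected.

```java
class DispatchSketch {

    interface Visitor {
        String visit(SimpleNode node);
        String visit(ASTStart node);
    }

    // Stand-in base class; inside this method, 'this' has type SimpleNode.
    static class SimpleNode {
        String jjtAccept(Visitor v) { return v.visit(this); }
    }

    // Inside this override, 'this' has type ASTStart.
    static class ASTStart extends SimpleNode {
        @Override
        String jjtAccept(Visitor v) { return v.visit(this); }
    }

    // A subclass without its own jjtAccept(): it inherits SimpleNode's.
    static class ASTForgotten extends SimpleNode { }

    static class NameVisitor implements Visitor {
        public String visit(SimpleNode node) { return "generic SimpleNode"; }
        public String visit(ASTStart node) { return "ASTStart"; }
    }

    // Goes through the node's jjtAccept(), as JJTree-style code does.
    static String viaAccept(SimpleNode n) {
        return n.jjtAccept(new NameVisitor());
    }

    // Calls visit() directly; the declared type of n is SimpleNode,
    // so the SimpleNode overload is chosen whatever n really is.
    static String direct(SimpleNode n) {
        return new NameVisitor().visit(n);
    }

    public static void main(String[] args) {
        System.out.println(viaAccept(new ASTStart()));
        System.out.println(direct(new ASTStart()));
        System.out.println(viaAccept(new ASTForgotten()));
    }
}
```

The third call shows the situation mentioned in the eg4DumpVisitor comments: a node class without its own jjtAccept() falls back to SimpleNode's acceptor, and the catch-all visit() is used.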

Advantages of Visitors

See the URLs at the beginning of this document for more information on Visitors, keeping in mind that the jjtAccept() methods and the visit() methods inside JJTree examples are a bit more complex than the usual examples in the literature.

In general, Visitors and the Visitor support in JJTree allow you to

  1. Keep your AST node classes simple--each has a single, generic jjtAccept() method for accepting any Visitor
  2. Implement a Dumper, TreePruner, Interpreter, Compiler or whatever as a single Visitor object, concentrating all the code in that one object (rather than scattered about as separate methods in the various AST node class definition files)
  3. Keep your Visitors and your AST trees in sync

The Anti-Visitor Approach, for Contrast

You don't absolutely need Visitors to do interesting things with AST trees. Using just JJTree and the node methods listed above, you can implement dumpers, interpreters and compilers without knowing anything about Visitors at all. The argument, which I think valid, is that the Visitor Design Pattern helps you to implement such things more elegantly.

The alternative to the Visitor design pattern, for writing an AST tree Dumper, Optimizer, Interpreter or Compiler, is to edit each AST node class definition and add specialized methods. Thus to implement dumping in an anti-visitor way, one would manually edit the .java definition of each AST node class, adding a dump() method that dumps out appropriate information for that particular kind of node and in turn calls the dump() methods of any children nodes. The "dumper" would then be spread out over all the various class definition files, which makes it rather hard to edit, visualize and keep consistent.

If you later wanted to implement an interpreter, in the anti-visitor way, you would simply have to re-edit all the AST node class definition files and add an interpret() method to each AST node class. The interpreter code would then be spread over dozens of separate files. (This, in fact, is the way that the JavaCC examples/Interpreter example is implemented. Take a close look at it.) Similarly, if you wanted a compiler, then you could add a compile() method to each class definition. Etc., etc., etc.

It doesn't take too much imagination to see that the Anti-Visitor approach is workable but messy, scattering the "dumper", "interpreter" or "compiler" all over the place and making it very hard to visualize any of these applications as a whole.
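
A minimal sketch of the Anti-Visitor style, assuming a hypothetical hand-written two-node AST (Num and Add; none of these names come from JJTree or from the examples/Interpreter code): each operation is a method that must be added to every node class.

```java
class AntiVisitorSketch {

    // Every operation on the tree becomes a method in every node class.
    static abstract class Node {
        abstract String dump();     // "dumper" code, one piece per class
        abstract int interpret();   // "interpreter" code, one piece per class
    }

    static final class Num extends Node {
        final int value;
        Num(int value) { this.value = value; }
        String dump() { return Integer.toString(value); }
        int interpret() { return value; }
    }

    static final class Add extends Node {
        final Node left, right;
        Add(Node left, Node right) { this.left = left; this.right = right; }
        String dump() { return "(" + left.dump() + " + " + right.dump() + ")"; }
        int interpret() { return left.interpret() + right.interpret(); }
    }

    // Tree for 2 + 5, standing in for the parser's output.
    static Node sampleTree() {
        return new Add(new Num(2), new Num(5));
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        System.out.println(root.dump() + " = " + root.interpret());
    }
}
```

With two node classes this looks harmless; with dozens of node classes in separate files, each new operation means editing every one of them.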

In the (virtuous) Visitor alternative, there is just a single generic jjtAccept() method in each node class definition, and it can "accept" any suitably written Visitor that is passed to it. With the Visitor design pattern, all the methods that constitute a dumper are concentrated into one DumpVisitor object. Similarly, all the methods that constitute an interpreter are concentrated into one InterpreterVisitor object, and the same for any other kind of Visitor.

Using the (virtuous) Visitor design, the AST node class definitions automatically generated by JJTree are almost all "finished", seldom having to be hand edited.

Hand-Modified Class Definitions for Non-Terminals

Recall that JJTree, by default, creates an AST node to represent each non-terminal (each production) involved in parsing the input. The classic cases where the node class files automatically generated by JJTree are not completely adequate involve cases where the parse tree needs to store some kind of terminal information. Take the case of a language that allows statements like the following:

foo = 2 + 5 ;

Presumably the tokenizer will contain definitions such as

TOKEN :
{
  < IDENTIFIER: <LETTER> (<LETTER>|<DIGIT>)* >
|
  < INTEGER: (<DIGIT>)+ >
|
  < #LETTER: ["_","a"-"z","A"-"Z"] >
|
  < #DIGIT: ["0"-"9"] >
}

and the parser (simplifying shamelessly) will contain productions such as

ASTAssignment Assignment():
{}
{
   Identifier() "=" Expression() ";"
   { return jjtThis ; }
}

void Identifier():
{}
{
   <IDENTIFIER>
}

void Expression():
{}
{
   Integer() ( "+" Integer() )*
}

void Integer():
{}
{
   <INTEGER>
}

If the input

foo = 2 + 5 ;

is parsed, then the resulting AST tree should look like

ASTAssignment
   ASTIdentifier
   ASTExpression
      ASTInteger
      ASTInteger

i.e. the root would be an ASTAssignment node, with two children: an ASTIdentifier node and an ASTExpression node. The ASTExpression node would have two children, each an ASTInteger node. The parser would tell us that the input is well-formed, but the resulting tree is inadequate to support interpretation. In particular, the tree nodes tell us that the assignment expression contained an Identifier and two Integers, but they don't "remember" which Identifier and Integers were involved.

In such cases, your Interpreter or Compiler (and even some dumpers) need to have the Integer Strings (or values) and the Identifier names copied and stored in the AST nodes. The following production from eg4 shows how to store such information in the AST nodes.

void Identifier() #MyOtherID :
{
  Token t;
}
{
  t=<IDENTIFIER>
  {
    jjtThis.setName(t.image);
  }
}

Note the first line

void Identifier() #MyOtherID :

which explicitly renames the node to be produced by this production as MyOtherID. The file ASTMyOtherID.java is hand-modified and is supplied along with the eg4.jjt file. During the parsing of an Identifier, the line

  t=<IDENTIFIER>

points t (declared as Token) to the token object for the IDENTIFIER. Then the action

  {
    jjtThis.setName(t.image);
  }

refers to the magic (automatically provided) handle jjtThis, which is automatically set to the AST node currently being constructed, and invokes a setName() method, passing to it the .image field of the Token t. (The definition of the Token type, and the String .image field, are provided automatically by JavaCC. They are available for you to use in such actions.) Where is this setName() method defined? In the hand-edited ASTMyOtherID.java file, which looks like this:

/*
 *                 Sun Public License Notice
 * 
 * The contents of this file are subject to the Sun Public License
 * Version 1.0 (the "License"). You may not use this file except in
 * compliance with the License. A copy of the License is available at
 * http://www.sun.com/
 * 
 * The Original Code is JavaCC. The Initial Developer of the Original
 * Code is Sun Microsystems, Inc. Portions Copyright 1996-2002 Sun
 * Microsystems, Inc. All Rights Reserved.
 */

public class ASTMyOtherID extends SimpleNode {
  private String name;   // String field added by hand

  ASTMyOtherID(int id) {
    super(id);
  }

  /** Accept a visitor. **/
  public Object jjtAccept(eg4Visitor visitor, Object data) {
    return visitor.visit(this, data);
  }

  public void setName(String n) {   // setName method added by hand
    name = n;
  }

  public String toString() {
    return "Identifier: " + name;
  }

}

Note that the class defines a private field

  private String name;

and the setName() method

  public void setName(String n) {
    name = n;
  }

called for by the action in the production. So there's no magic. The name field and the setName method were written by hand in the ASTMyOtherID class definition.

For Integer nodes, the action could invoke a method hand-written into a similar ASTInteger node class definition that sets a String field, or a slightly more complicated action could compute the integer value and store it in an int field of the ASTInteger node. Such user actions are well documented in the JJTree documentation and will not be explained to death here.

The thing to note is that the class ASTMyOtherID defined above also contains the usual jjtAccept() method, just as in the automatically generated AST class definition files. In fact, the best procedure is often to let JJTree automatically generate node type definitions for all the productions, including the insertion of the jjtAccept() methods, and then hand-edit the small subset involving Integers, Identifiers, and other similar cases where the terminal Token-image information needs to be stored (remembered) in an AST node. Typically what will be added are simple fields and simple methods to set them, invoked from actions in the grammar.
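
For the Integer case mentioned above, a hand-edited node class might look roughly like the following sketch. It is an assumption, not the ASTInteger supplied with eg4: the value field, setValue() and getValue() are hypothetical additions, the nested SimpleNode is a minimal stand-in for the generated base class, and the jjtAccept() method is left out to keep the sketch self-contained. The corresponding grammar action would be something like jjtThis.setValue(t.image); after t=<INTEGER>.

```java
class IntegerNodeSketch {

    // Minimal stand-in for the generated SimpleNode base class.
    static class SimpleNode {
        protected final int id;
        SimpleNode(int id) { this.id = id; }
    }

    // Hypothetical hand-edited node class for Integer productions.
    static class ASTInteger extends SimpleNode {
        private int value;   // int field added by hand

        ASTInteger(int id) { super(id); }

        // Added by hand: converts the token image once, at tree-building time.
        void setValue(String image) { value = Integer.parseInt(image); }

        // Added by hand: lets an interpreter Visitor read the stored value.
        int getValue() { return value; }

        @Override
        public String toString() { return "Integer: " + value; }
    }

    // Imitates what the parser action would do for one INTEGER token.
    static ASTInteger fromImage(String image) {
        ASTInteger n = new ASTInteger(0);   // 0 is a placeholder node id
        n.setValue(image);
        return n;
    }

    public static void main(String[] args) {
        ASTInteger n = fromImage("42");
        System.out.println(n + " -> " + (n.getValue() + 1));
    }
}
```

Converting the image once in the node keeps Integer.parseInt() out of every Visitor that later needs the numeric value.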

SimpleNode and Extended AST Nodes

This section contains some more detailed information about AST nodes and the jjtAccept() method. I continue to assume that the JJTree option MULTI = true has been specified, so that (by default) a different class of AST node is used to represent each production in the parser.

The AST node classes must all implement an interface called Node. (This is just a rule of JJTree.) In practice, JJTree automatically generates a base class SimpleNode (residing in SimpleNode.java) that implements this Node interface, and all of the automatically generated subtypes of AST node classes "extend" SimpleNode. Thus the various AST node classes implement the Node interface a bit indirectly.

Expert users are free to modify or add to the node methods in SimpleNode.java to suit their special needs; but in the end SimpleNode must still implement all the methods declared in the Node interface. If a file SimpleNode.java is supplied by the developer, then JJTree will not generate a new one. Beginners should probably abstain from modifying SimpleNode.java.

Now the big mystery: I and others have noticed that both the class SimpleNode and all the AST classes that extend SimpleNode have an "identical" method jjtAccept() inserted into them. If all the AST node classes indeed "extend" SimpleNode, and they do, then they should inherit SimpleNode's jjtAccept() method and have no need of their own "copy".

I've finally managed to explain the reason to myself. Here's an attempt to lay it out for others, going back over the main points in more detail.

  1. With the option VISITOR = true, JJTree inserts a jjtAccept() method into every automatically generated AST node definition. E.g. for a language called SPL, you might see

    /* Generated By:JJTree: Do not edit this line. ASTAssignment.java */
    
    public class ASTAssignment extends SimpleNode {
      public ASTAssignment(int id) {
        super(id);
      }
    
      public ASTAssignment(SPL p, int id) {
        super(p, id);
      }
    
    
      /** Accept the visitor. **/
      public Object jjtAccept(SPLVisitor visitor, Object data) {
        return visitor.visit(this, data);
      }
    }
    

    If you write AST node class definitions by hand, or modify the automatically generated ones, then you should ensure that they too contain the jjtAccept() method.

  2. Note that this example, class ASTAssignment, extends SimpleNode, which in turn implements the interface called Node. The class SimpleNode is of course defined in the automatically supplied SimpleNode.java, which looks like this (minus some snips):

    /* Generated By:JJTree: Do not edit this line. SimpleNode.java */
    
    public class SimpleNode implements Node {
      protected Node parent;
      protected Node[] children;
      protected int id;
      protected SPL parser;
    
      public SimpleNode(int i) {
        id = i;
      }
    
      public SimpleNode(SPL p, int i) {
        this(i);
        parser = p;
      }
    
      // snip
    
      public Node jjtGetChild(int i) {
        return children[i];
      }
    
      public int jjtGetNumChildren() {
        return (children == null) ? 0 : children.length;
      }
    
    
      /** Accept the visitor. **/
      public Object jjtAccept(SPLVisitor visitor, Object data) {
        return visitor.visit(this, data);
      }
    
      // snip
    }
    

    The main point to notice is that both the SimpleNode class and the ASTAssignment node class that extends SimpleNode contain apparently identical jjtAccept() methods:

      /** Accept the visitor. **/
      public Object jjtAccept(SPLVisitor visitor, Object data) {
        return visitor.visit(this, data);
      }
    
    The jjtAccept() method in the ASTAssignment node overrides the apparently identical method in SimpleNode.

  3. It might thus seem that the insertion of the jjtAccept() method into each AST node type is (at least initially) superfluous, because such classes will automatically inherit the "identical" jjtAccept() method from SimpleNode. But it's not superfluous.

    Having the jjtAccept() method explicitly copied into each AST node definition would facilitate manual customization--but that's not the issue here.

  4. I found the following when writing an interpreter, implemented as a Visitor. (I have specified the MULTI=true and VISITOR=true options.)

    In the InterpreterVisitor, in the visit() method where I visit a particular node n, I tried to access one of the children of n and send it a message to accept the same visitor:

      n.jjtGetChild(0).jjtAccept(this, data);
    

    But instead of this resulting in a call to the visit() method suitable for the AST node type of the child, it resulted in a call to the visit() method for SimpleNode. Mystery.

  5. I eventually figured out that the AST node type definition of the child was handwritten by me (it needed to store a String during parsing) and it LACKED ITS OWN jjtAccept() METHOD.

    So, here's what was happening: When I invoked jjtAccept() on the child as above, it was using the inherited jjtAccept() method from SimpleNode

      /** Accept the visitor. **/
      public Object jjtAccept(SPLVisitor visitor, Object data) {
        return visitor.visit(this, data);
      }
    

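    and inside SimpleNode, "this" has the type SimpleNode. Java chooses among overloaded visit() methods at compile time, using the declared (static) type of the argument rather than the runtime class of the object. So the call compiled into SimpleNode's jjtAccept() is always bound to the visit() method for SimpleNode, and that is the method that got invoked in the Visitor, even though the child was really an instance of my handwritten node class.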

    As soon as I added a jjtAccept() method to my handwritten AST node definition, to override the jjtAccept method in SimpleNode, everything worked perfectly.

  6. So the insertion of the overt jjtAccept() method into each AST node type definition (extending SimpleNode) appears to be necessary, at least with MULTI=true. In the context of a class definition like the following:

    /* Generated By:JJTree: Do not edit this line. ASTAssignment.java */
    
    public class ASTAssignment extends SimpleNode {
      public ASTAssignment(int id) {
        super(id);
      }
    
      public ASTAssignment(SPL p, int id) {
        super(p, id);
      }
    
    
      /** Accept the visitor. **/
      public Object jjtAccept(SPLVisitor visitor, Object data) {
        return visitor.visit(this, data);
      }
    }
    

    the reference to "this" in the call to visitor.visit(this, data) has the compile-time type ASTAssignment rather than SimpleNode. So the compiler binds that call to the visit() method declared for ASTAssignment rather than the visit() method for SimpleNode.
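The behaviour described in points 4 to 6 can be reproduced without JJTree at all. The following is only a sketch: all the class names are made up for the demonstration, with DemoNode playing the role of SimpleNode, ASTHasAccept the role of a generated node class, and ASTLacksAccept the role of my handwritten class that was missing its jjtAccept() method.

```java
// Hypothetical stand-in for the generated Visitor interface:
// one visit() overload per node class.
interface DemoVisitor {
  Object visit(DemoNode node, Object data);
  Object visit(ASTHasAccept node, Object data);
  Object visit(ASTLacksAccept node, Object data);
}

// Plays the role of SimpleNode.
class DemoNode {
  public Object jjtAccept(DemoVisitor visitor, Object data) {
    // Here "this" has the compile-time type DemoNode, so the compiler
    // binds this call to visit(DemoNode, Object), whatever the
    // runtime class of the object turns out to be.
    return visitor.visit(this, data);
  }
}

// Plays the role of a generated node class with its own jjtAccept().
class ASTHasAccept extends DemoNode {
  @Override
  public Object jjtAccept(DemoVisitor visitor, Object data) {
    // Same source text, but here "this" has the compile-time type
    // ASTHasAccept, so visit(ASTHasAccept, Object) is selected.
    return visitor.visit(this, data);
  }
}

// Plays the role of the handwritten node class that lacked jjtAccept().
class ASTLacksAccept extends DemoNode {
  // No jjtAccept() here: the one in DemoNode is inherited.
}

// A Visitor that just reports which visit() overload was called.
class NamingVisitor implements DemoVisitor {
  public Object visit(DemoNode node, Object data) {
    return "visit(DemoNode)";
  }

  public Object visit(ASTHasAccept node, Object data) {
    return "visit(ASTHasAccept)";
  }

  public Object visit(ASTLacksAccept node, Object data) {
    return "visit(ASTLacksAccept)";
  }
}

class DispatchDemo {
  public static void main(String[] args) {
    DemoVisitor v = new NamingVisitor();
    DemoNode a = new ASTHasAccept();
    DemoNode b = new ASTLacksAccept();
    System.out.println(a.jjtAccept(v, null));   // prints "visit(ASTHasAccept)"
    System.out.println(b.jjtAccept(v, null));   // prints "visit(DemoNode)"
  }
}
```

The call to jjtAccept() itself is dispatched on the runtime class of the node in the usual way; it is only the choice among the overloaded visit() methods that is fixed at compile time. That is why each node class needs its own textual copy of the method.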


    Comments and corrections to this document would be appreciated. ken.beesley@xrce.xerox.com