GEMS Code Notes :: syntax-tree

The syntax-tree package is mostly used internally, by other packages. For example, gpstats calls this package to get statistical information which it then saves to a file on a per-generation basis.

Some functions, such as replace-timeonly-code, are useful when analysing the final selected models from one or more runs of the GP system.

As explained in the next section, two different representations of parse trees exist: where possible, the functions described below will convert between them.

Representations of Parse Trees

A parse tree is a representation of the program that is run when our model is made to run an experiment. Most simply, we represent these parse trees as s-expressions - nested lists of lists:

'(IF (COMPARE-1-2)
     (PROG2 (ACCESS-STM-1) (RESPOND-LEFT))
     (RESPOND-RIGHT))

However, when tracing the performance of our model through an experiment, we need to gather further information from it. For this purpose, we have a node structure, which has a field for the node label, a field for the node children, and, so far, one field called "entries" to record how often this node has been used.

The node label is usually the first item in the s-expression, and the children the rest of the items. Two functions are provided to convert from an s-expression to a node representation, and from a node representation to an s-expression:

(tree-from-sexp tree)

Converts an s-expression tree representation into one using the node structure.

(tree-to-sexp tree)

Converts a tree representation using the node structure into one using s-expressions only.

Example:

> sbcl
* (require 'asdf)
* (require 'gems)
* (setf sexp '(if (compare-1-2) (prog2 (access-stm-1) (respond-left)) (respond-right)))
(IF (COMPARE-1-2)
    (PROG2 (ACCESS-STM-1) (RESPOND-LEFT))
    (RESPOND-RIGHT))
* (gems:tree-from-sexp sexp)
#S(PARSE-TREE::NODE
   :LABEL IF
   :ENTRIES 0
   :CHILDREN (#S(PARSE-TREE::NODE :LABEL COMPARE-1-2 :ENTRIES 0 :CHILDREN NIL)
              #S(PARSE-TREE::NODE
                 :LABEL PROG2
                 :ENTRIES 0
                 :CHILDREN (#S(PARSE-TREE::NODE
                               :LABEL ACCESS-STM-1
                               :ENTRIES 0
                               :CHILDREN NIL)
                            #S(PARSE-TREE::NODE
                               :LABEL RESPOND-LEFT
                               :ENTRIES 0
                               :CHILDREN NIL)))
              #S(PARSE-TREE::NODE
                 :LABEL RESPOND-RIGHT
                 :ENTRIES 0
                 :CHILDREN NIL)))
* (gems:tree-to-sexp (gems:tree-from-sexp sexp))
(IF (COMPARE-1-2)
    (PROG2 (ACCESS-STM-1) (RESPOND-LEFT))
    (RESPOND-RIGHT))

Structures

The following functions are exported to work with the node structure:

node-p

node-label

node-entries

node-children

Functions

(program-depth tree)

tree is a program, automatically converted to an s-expression if necessary

Returns the maximum depth of the given tree, where depth is the number of nodes you would traverse when walking from the root (top) node to a leaf node.

Example:

* (gems:program-depth '(if (compare-1-2) (prog2 (access-stm-1) (respond-left)) (respond-right)))
3

(The depth is 3 because IF → PROG2 → ACCESS-STM-1 is the longest path from the root to a leaf.)

(program-size tree)

tree is a program, automatically converted to an s-expression if necessary

Returns the number of nodes within the given tree.

Example:

* (gems:program-size '(if (compare-1-2) (prog2 (access-stm-1) (respond-left)) (respond-right)))
6

(proportion-dead-code tree run-experiment)

tree is a program, automatically converted to an s-expression if necessary
run-experiment is a function to run an experiment on a given individual

This function runs the given experiment on the program and identifies which parts of the program have not been used. It then returns the proportion of unused parts (dead code) to the total.

(remove-dead-code tree run-experiment)

tree is a program, automatically converted to an s-expression if necessary
run-experiment is a function to run an experiment on a given individual

This function runs the given experiment on the program and identifies which parts of the program are not used during the experiment. It then returns a simplified version of the program, with all unused code removed.

(replace-timeonly-code tree run-experiment equivalents)

tree is a program, automatically converted to an s-expression if necessary
run-experiment is a function to run an experiment on a given individual
equivalents is an association list, matching operators to their temporal equivalents

This function considers every node in the given program tree and determines if its action affects only the model clock or also its performance. If it affects the performance, then it is left in place. If it only affects the model clock, then it is replaced by an operator taking the same amount of time but having no other effect.

These replacement operators are determined based on the group that the operator falls in: wait-cognitive, wait-input, wait-output and wait-stm are used respectively to replace cognitive, input, output, or STM operators.

For example, in the DMTS task, using the extended language:

(gems:replace-timeonly-code
    tree
    #'run-experiment
    '((input-target . wait-input)
      (input-left . wait-input)
      (input-right . wait-input)
      (respond-left . wait-output)
      (respond-right . wait-output)
      (compare-1-2 . wait-cognitive)
      (compare-1-3 . wait-cognitive)
      (compare-2-3 . wait-cognitive)
      (put-stm . wait-cognitive)
      (access-stm-1 . wait-stm)
      (access-stm-2 . wait-stm)
      (access-stm-3 . wait-stm)
     ))

will rewrite the following program:

(IF (DOTIMES-3
       (PROG4 (PROG2 (INPUT-LEFT)
                     (INPUT-TARGET))
              (INPUT-RIGHT)
              (PROG3 (PUT-STM)
                     (COMPARE-1-2)
                     (INPUT-LEFT))
              (COMPARE-1-3)))
      (PROG4 (COMPARE-1-2)
             (PROG2 (WAIT-25)
                    (INPUT-LEFT))
             (RESPOND-RIGHT)
             (PROG2 (COMPARE-1-2)
                    (WAIT-50)))
      (DOTIMES-3 (RESPOND-LEFT)))

into:

(IF (DOTIMES-3
       (PROG4 (PROG2 (WAIT-INPUT)
                     (INPUT-TARGET))
              (INPUT-RIGHT)
              (PROG3 (PUT-STM)
                     (WAIT-COGNITIVE)
                     (WAIT-INPUT))
              (COMPARE-1-3)))
    (PROG4 (WAIT-COGNITIVE)
           (PROG2 (WAIT-25)
                  (WAIT-INPUT))
           (RESPOND-RIGHT)
           (PROG2 (WAIT-COGNITIVE)
                  (WAIT-50)))
    (DOTIMES-3 (RESPOND-LEFT)))

Notice in the second line how INPUT-LEFT has been rewritten as WAIT-INPUT. The specific actions which affect the model’s performance are now clearer.

(save-as-dot tree filename

tree - the tree to save to file.
filename - the file to save the dot representation of the tree to.

Convert a program (written as an s-expression) into a dot format file, as used by Graphviz. Install Graphviz if you wish to use the dot program to create images, or dot files can be pasted to http://www.webgraphviz.com/ to view.

Example:

> sbcl
* (require 'asdf)
* (require 'gems)
* (gems:save-as-dot '(if (compare-1-2) (respond-left) (respond-right)) "sample-tree.dot")
*
> dot -Tpng sample-tree.dot -o sample-tree.png

sample tree

(program-similarity model-1 model-2)

model-1 and model-2 are program trees, converted to s-expressions if necessary

The similarity measure adopted here is fairly simple: it relies on breaking the program down into components and then measuring their similarity using the Jaccard similarity coefficient.

Given two sets, A and B, the Jaccard similarity coefficient is computed as:

|A intersection B| / |A union B|

The components of a program’s parse tree are taken on a per-node basis. Each component includes:

the first symbol (the node label)
the first symbols of each of the child nodes

Hence, for the following program:

(PROG2 (ACCESS-STM-3)
    (PROG2 (ACCESS-STM-1)
        (IF (PROG2
                (IF (ACCESS-STM-3)
                    (COMPARE-1-3)
                    (COMPARE-1-2))
                (NIL))
            (NIL)
            (PUT-STM))))

We get the following components. Notice the first component has the "PROG2" label of the top of the program, and then the "ACCESS-STM-3" and "PROG2" symbols from its two children - there is no information about what is below the child "PROG2" symbol.

(PROG2 ACCESS-STM-3 PROG2)
(PROG2 ACCESS-STM-1 IF)
(ACCESS-STM-1)
(IF PROG2 NIL PUT-STM)
(PROG2 IF NIL)
(IF ACCESS-STM-3 COMPARE-1-3 COMPARE-1-2)
(ACCESS-STM-3)
(COMPARE-1-3)
(COMPARE-1-2)
(NIL)
(PUT-STM)

Given the second program:

(IF (IF INPUT-TARGET
        (COMPARE-1-2)
        (COMPARE-2-3))
    (IF (ACCESS-STM-1)
        INPUT-LEFT
        (ACCESS-STM-2))
    RESPOND-LEFT)

with its components:

(IF IF IF RESPOND-LEFT)
(IF INPUT-TARGET COMPARE-1-2 COMPARE-2-3)
(INPUT-TARGET)
(COMPARE-1-2)
(COMPARE-2-3)
(IF ACCESS-STM-1 INPUT-LEFT ACCESS-STM-2)
(ACCESS-STM-1)
(INPUT-LEFT)
(ACCESS-STM-2)
(RESPOND-LEFT)

There are only 2 components ("COMPARE-1-2" and "ACCESS-STM-1") in common out of 19 in total, so the similarity is 2/19, or 0.105.

GEMS Package: syntax-tree

Representations of Parse Trees

Structures

Functions