This package supports analysis of the GP system, looking at a single generation of individuals or across multiple generations.
This package uses the gems:individual structure from mini-gp to represent an individual.
Functions
best-fitness-in-generation (generation)
best-individuals-in-generation (generation &optional tolerance)
clean-individuals (individuals run-experiment)
find-best-worst-pairs (individuals)
read-log (filename)
Reads a log file and returns a list of lists, where each inner list holds the values from one line of the log file.
This function can also read the output of write-deadcode-generations.
Example:
> sbcl
* (require 'asdf)
* (require 'gems)
* (gems:read-log "logfile.csv")
((1 0.48 0.5 0.48 5050.0 0.99 18.0 0.09 1.0)
 (2 0.48 0.5 0.48 5050.0 0.99 18.0 0.09 1.0)
 (3 0.48 0.5 0.48 3850.0 0.96 6.0 0.03 1.0)
 (4 0.48 0.5 0.48 220.0 0.34 8.0 0.04 1.0)
 (5 0.48 0.5 0.48 240.0 0.33 6.0 0.03 1.0)
 (6 0.48 0.5 0.48 8530.0 1.0 18.0 0.09 1.0)
 (7 0.48 0.5 0.48 1735.0 0.56 21.0 0.1 1.0)
 ... )
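Because read-log returns plain lists, the result can be processed with standard list functions. The following is a minimal sketch, assuming rows shaped like the example output; the helper column-values is hypothetical and not part of gems, and what each column means depends on how the log was written.

```lisp
;; Hypothetical helper, not part of gems: collect the N-th value (0-based)
;; from every row of a list of lists such as the one returned by read-log.
(defun column-values (rows n)
  (mapcar (lambda (row) (nth n row)) rows))

;; Sample rows copied from the example output of read-log.
(defparameter *sample-rows*
  '((1 0.48 0.5 0.48 5050.0 0.99 18.0 0.09 1.0)
    (2 0.48 0.5 0.48 5050.0 0.99 18.0 0.09 1.0)
    (3 0.48 0.5 0.48 3850.0 0.96 6.0 0.03 1.0)))

;; First value of each row.
(column-values *sample-rows* 0)                  ; => (1 2 3)

;; Largest value found in the fifth column.
(reduce #'max (column-values *sample-rows* 4))   ; => 5050.0
```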
read-similarity-file (filename)
Used to read a set of similarities from a dat file (see Output format) into an array.
read-trace (filename)
Used to read in a trace file, as created by a logger of kind :trace.
The function returns a list of generations, where each generation is itself a list comprising the generation number and then the individuals in that generation.
Example: at the REPL, read in a trace file and show, for each generation, the generation number, the number of individuals in that generation, and the fitness of the first individual:
> sbcl
* (require 'asdf)
* (require 'gems)
* (setf results (gems:read-trace "sample-trace.yml"))
* (dolist (g results)
    (format t "~a ~a ~a~%" (first g) (length (rest g)) (gems:individual-fitness (second g))))
5000 1000 0.9356
write-deadcode-generations (generations filebase run-experiment)
- generations is a list of generation values, as read by read-trace
- filebase is the first part of the filename; each generation is saved to a file formed from the filebase and the generation number
- run-experiment is a function to run an experiment on a given individual
For each generation, computes the proportion of dead code in every instance, and then reports the frequency of each proportion under the following groups as a CSV file:
0, "0.0-0.1", 0
1, "0.1-0.2", 5
2, "0.2-0.3", 3
3, "0.3-0.4", 7
4, "0.4-0.5", 10
5, "0.5-0.6", 12
6, "0.6-0.7", 15
7, "0.7-0.8", 20
8, "0.8-0.9", 6
9, "0.9-1.0", 0
In addition, saves a file named "filebase-stats.csv" which contains the generation number, mean and standard-deviation of the dead-code proportion in CSV format.
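The grouping and summary statistics described for write-deadcode-generations can be sketched as follows. This is only an illustration, not the package's implementation: the helper names are hypothetical, a proportion of exactly 1 is assumed to fall in the last group, and the standard deviation is assumed to be the population form (dividing by N). The example uses rational numbers so that group boundaries are not affected by floating-point rounding.

```lisp
;; Hypothetical helpers, not part of gems.

;; Count how many proportions fall into each of ten equal-width groups.
;; Assumption: a proportion of exactly 1 is placed in the last group.
(defun bin-proportions (proportions)
  (let ((counts (make-list 10 :initial-element 0)))
    (dolist (p proportions counts)
      (incf (nth (min 9 (floor (* p 10))) counts)))))

;; Mean of a non-empty list of numbers.
(defun mean (xs)
  (/ (reduce #'+ xs) (length xs)))

;; Population standard deviation (assumption: divides by N, not N-1).
(defun standard-deviation (xs)
  (let ((m (mean xs)))
    (sqrt (mean (mapcar (lambda (x) (expt (- x m) 2)) xs)))))

;; Five made-up dead-code proportions.
(bin-proportions '(0 1/4 1/4 3/4 1))   ; => (1 0 2 0 0 0 0 1 0 1)
(mean '(0 1/4 1/4 3/4 1))              ; => 9/20
```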
write-fitness-generations (generations filebase)
- generations is a list of generation values, as read by read-trace
- filebase is the first part of the filename; each generation is saved to a file formed from the filebase and the generation number
For each generation, computes the fitness similarity of every pair of instances and outputs the result to a file - one file per generation - in the Output format described below.
Fitness similarity is the absolute difference of the fitness of the two models.
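As an illustration of this definition, the sketch below computes the absolute difference for every ordered pair in a list of fitness values. The helper name is hypothetical and not part of gems; note that, under this definition, two individuals with identical fitness score 0.

```lisp
;; Hypothetical helper, not part of gems: given a list of fitness values,
;; return a list of (i j difference) triples for every ordered pair (i, j),
;; where difference is the absolute difference of the two fitness values.
(defun pairwise-fitness-differences (fitnesses)
  (loop for fi in fitnesses
        for i from 0
        append (loop for fj in fitnesses
                     for j from 0
                     collect (list i j (abs (- fi fj))))))

(pairwise-fitness-differences '(1/2 3/4))
;; => ((0 0 0) (0 1 1/4) (1 0 1/4) (1 1 0))
```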
write-similarity-generations (generations filebase)
- generations is a list of generation values, as read by read-trace
- filebase is the first part of the filename; each generation is saved to a file formed from the filebase and the generation number
For each generation, computes the model similarity of every pair of instances and outputs the result to a file - one file per generation - in the Output format described below.
Model similarity is computed using syntax-tree:similarity.
write-similarity-individuals (individuals filename)
- individuals is a list of gems:individual instances
- filename is the name of a file to save the similarity data to
Given a list of individuals, this function computes the similarity of every pair of individuals and outputs the result to a file of the given filename, in the Output format described below.
Output format
The functions write-fitness-generations, write-similarity-generations and write-similarity-individuals each compare every pair of individuals in one or more given generations. They output their findings in the following output format:
0 0 1.00
0 1 0.00
0 2 0.14
0 3 0.38
0 4 0.20
0 5 0.00
0 6 0.00
0 7 0.00
0 8 0.14
0 9 0.14

1 0 0.00
1 1 1.00
1 2 0.33
1 3 0.10
1 4 0.00
1 5 0.33
...
Each line in the file has three numbers: the indices of the two individuals, followed by the (fitness or model) similarity score. A blank line separates each "row" of the output, where a "row" is the group of lines sharing the same first index.
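To show how data in this format can be loaded back into an array, here is a minimal sketch using only standard Common Lisp. It is not the implementation of read-similarity-file: the helper read-triples is hypothetical, and it assumes the number of individuals is known in advance and that every line holds exactly three numbers.

```lisp
;; Hypothetical helper, not part of gems: read "i j score" triples from a
;; stream and store each score at position (i, j) of a SIZE x SIZE array.
;; Blank lines need no special handling because READ skips whitespace.
(defun read-triples (stream size)
  (let ((scores (make-array (list size size) :initial-element 0))
        (*read-eval* nil))            ; do not evaluate #. forms in the input
    (loop for i = (read stream nil :eof)
          until (eq i :eof)
          do (let ((j (read stream))
                   (score (read stream)))
               (setf (aref scores i j) score)))
    scores))

;; Usage with a small in-memory sample for two individuals.
(defparameter *sample-scores*
  (with-input-from-string
      (in (format nil "0 0 1.00~%0 1 0.14~%~%1 0 0.14~%1 1 1.00~%"))
    (read-triples in 2)))

(aref *sample-scores* 0 1)   ; => 0.14
```

Reading from a file works the same way by passing a stream opened with with-open-file in place of the string stream.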