Software Options
Method
- -a / --annealing : use the simulated annealing method.
The goal of the algorithm is to pick one tree for each gene.
In the consensus method (the default), trees are therefore picked
by popularity, since a popular tree can immediately "cover"
many genes. This strategy is useless, however, if there are no
duplicated trees in the data. In that case simulated annealing is the
better choice.
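The idea of picking by popularity can be sketched as a greedy cover. This is only an illustration, not the code used in graph_of_life.py: the function name pick_by_popularity, the dictionary input, and the alphabetical tie-breaking are all assumptions, and each gene is assumed to have at least one candidate tree.

```python
from collections import Counter

def pick_by_popularity(gene_trees):
    """Greedily pick one tree per gene, preferring trees shared by many genes.

    gene_trees (hypothetical input shape) maps a gene name to the set of
    its candidate trees, each given as a Newick string.
    """
    chosen = {}
    uncovered = set(gene_trees)
    while uncovered:
        # For each tree, count how many still-uncovered genes offer it.
        counts = Counter(t for g in uncovered for t in gene_trees[g])
        # Most popular tree first; ties are broken alphabetically.
        best = min(counts, key=lambda t: (-counts[t], t))
        # The chosen tree "covers" every uncovered gene that contains it.
        for g in [g for g in uncovered if best in gene_trees[g]]:
            chosen[g] = best
            uncovered.discard(g)
    return chosen

example = {
    "geneA": {"((a,b),c);", "((a,c),b);"},
    "geneB": {"((a,b),c);"},
    "geneC": {"((b,c),a);", "((a,b),c);"},
}
print(pick_by_popularity(example))
```

When no tree is shared between genes, every count is 1 and the greedy choice carries no information, which is the situation where the text above recommends simulated annealing.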
Input
- -f / --filename= : input file name.
Default = "data".
- -w / --delimiter= : set the delimiter(s) in the Newick tree
format of the input trees. Default = " ".
Output
- -d / --debug= : debug level. 0 for minimum output messages;
1 for default output messages; 2 for verbose output messages.
Default = 1.
Time controlling
- -t / --maximum_selecting_time= : maximum time in seconds allowed
to select the trees (negative values = unlimited).
Default = -1.
- -u / --maximum_combining_time= : maximum time in seconds allowed
to combine the trees (negative values = unlimited).
Default = -1.
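The "negative values = unlimited" convention can be sketched as a small deadline helper. This is a hypothetical illustration of the convention only: the name make_deadline is not part of the program, and treating a limit of 0 as "already expired" is an assumption.

```python
import time

def make_deadline(max_seconds):
    """Return a function that reports whether the time budget is used up.

    A negative max_seconds means "unlimited", mirroring the convention
    described for -t and -u above.
    """
    if max_seconds < 0:
        return lambda: False  # unlimited: the budget never expires
    start = time.monotonic()
    return lambda: time.monotonic() - start >= max_seconds

expired = make_deadline(-1)   # same as the default of -t and -u
print(expired())              # an unlimited budget is never expired
```

A loop using such a helper would check expired() between iterations and stop early once it returns True.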
Tree selecting
If there is only one tree for each gene, then these flags have no effect,
because the tree selecting process will not be executed.
- -n / --restart_iteration= : the number of times the main method
will be executed (either the consensus method or simulated annealing).
Default = 30.
- -s / --hitting_iteration= : the number of times the random
hitting procedure will be executed.
Default = 30.
- -e / --evaluating_iteration= : the number of times the tree
selection evaluating subroutine will be executed.
Default = 30.
All parameters are independent of each other. To improve the result,
increase the --restart_iteration flag first. Setting
--hitting_iteration=30 is usually enough for normal cases. For a really
complicated problem, try --hitting_iteration=100 or a greater value.
Increasing the --evaluating_iteration flag yields a better estimate of
the number of interbreeding events needed to combine the selected trees. It
should be set to a smaller number if there are fewer than 5 interbreeding
events in the final graph. Again, you might want to increase this flag
for complicated problems.
Tree combining
- -g / --graph_iteration= : the number of times the tree combining
procedure will be executed. Its value should never be less than the value of the
--evaluating_iteration flag.
Default = 100.
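The constraint between --graph_iteration and --evaluating_iteration can be checked before a long run. The helper below is a minimal sketch; the name check_iterations is hypothetical, and it is not known whether the program performs such a check itself.

```python
def check_iterations(evaluating_iteration=30, graph_iteration=100):
    """Check that graph_iteration is not below evaluating_iteration.

    The default arguments are the documented defaults of the two flags.
    """
    if graph_iteration < evaluating_iteration:
        raise ValueError(
            "--graph_iteration (%d) should not be less than "
            "--evaluating_iteration (%d)" % (graph_iteration, evaluating_iteration)
        )
    return True

print(check_iterations())        # the documented defaults satisfy the rule
print(check_iterations(30, 30))  # equal values are allowed
```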
Consensus method options
- -q / --frequency_threshold= : all trees appearing fewer times than
this value will be discarded by the consensus method procedure.
Default = 0.
Simulated annealing options (when looking for maximally similar trees)
- -i / --initial_temperature= : the initial temperature.
Default = 1.0.
- -z / --final_temperature= : the final temperature.
Default = 0.01.
- -p / --alpha= : the cooling factor. Should be a real number
between 0 and 1. Default = 0.8.
- -m / --multiplier= : a large integer. Usually, it should be
greater than the largest tree score times the number of genes.
Default = 10000.
- -r / --reverse_score : reverse the meaning of tree scores.
By default, the tree selecting algorithm prefers trees with lower
scores.
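To get a feeling for how the three temperature flags interact, the sketch below lists the temperature levels visited between the initial and the final temperature. It assumes geometric cooling, where each step multiplies the temperature by alpha. That is the usual reading of a "cooling factor", but it is an assumption about graph_of_life.py, and the function name cooling_schedule is hypothetical.

```python
def cooling_schedule(initial_temperature=1.0, final_temperature=0.01, alpha=0.8):
    """Yield temperatures from the initial value down to the final value.

    Assumes geometric cooling: each step multiplies the temperature by alpha.
    The default arguments are the documented defaults of -i, -z and -p.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    if initial_temperature <= 0 or final_temperature <= 0:
        raise ValueError("temperatures must be positive")
    temperature = initial_temperature
    while temperature >= final_temperature:
        yield temperature
        temperature *= alpha

# Number of temperature levels with the defaults, then with a slower cool-down.
print(len(list(cooling_schedule())))
print(len(list(cooling_schedule(alpha=0.9))))
```

Under this assumption the defaults give 21 temperature levels, and raising alpha to 0.9 gives 44, so a value of alpha closer to 1 means a longer but more thorough search.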
Help
- -h / --help : print out all options without processing the input
data.
Sample commands
A sample command setting all flags relevant to the consensus method,
demonstrated in the long format (--xxx=arg):
python graph_of_life.py --consensus --filename="data1" --delimiter="; ,"
--debug=0 --maximum_selecting_time=120 --maximum_combining_time=120
--restart_iteration=100 --hitting_iteration=30
--evaluating_iteration=30 --graph_iteration=30 --frequency_threshold=10
A sample command setting all flags relevant to the simulated annealing method,
demonstrated in the short format (-x arg):
python graph_of_life.py -a -f data1 -w ",; " -d 0 -t 120 -u 120 -n 100 -s 30
-e 30 -g 30 -i 1.0 -z 0.01 -p 0.9 -m 100000 -r
Chien-I Liao