

Java ≥ 5 (aka v1.5) is required.


1) NINJA does not yet compute pairwise distances itself. Any program that writes a PHYLIP-formatted distance matrix file can be used to create the distance matrix NINJA requires. Two such programs are QuickTree and FastTree.

2) To build a tree, the commands (including QuickTree) are:
quicktree -in a -out m  alignment_in_stockholm_format > distance_matrix

java -server -Xmx2G -jar Ninja.jar distance_matrix > tree_file

notes:

a) The "-server" flag improves speed by a few percent for large inputs (more than 2000 sequences). It makes things a little slower when the input has fewer than 1000 or so sequences.

b) The -Xmx2G flag controls how much RAM is allocated to the Java virtual machine. For large inputs, you'll probably want the value to be at least 1G (the tests in my paper used 2G). I've run it with up to 12G, and Morgan Price (of FastTree fame) has had success on a machine with 60G.

c) For inputs with fewer than about 7,000 sequences, the entire job can be done in memory with 2G. You can force this by adding the argument "-m bin", which gives a roughly 3-fold increase in speed. Eventually NINJA will recognize when to use "bin" (an in-memory binary heap) instead of the default external-memory heap, but for now the choice is manual.

java -server -Xmx2G -jar Ninja.jar  -m bin distance_matrix > tree_file
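The manual choice described in note (c) can be scripted. The sketch below is not part of NINJA: it reads the sequence count from the first line of the PHYLIP matrix and adds "-m bin" when the count is below a cutoff. The function name ninja_cmd, the default cutoff of 7000 (taken from note (c), so it only applies with 2G), and the assumption that Ninja.jar sits in the current directory are all illustrative. It only prints the command, so you can check it before running.

```shell
#!/bin/sh
# Hypothetical helper, not part of NINJA: print a NINJA command line,
# adding "-m bin" when the input is small enough for the in-memory heap.
# Assumes line 1 of a PHYLIP distance matrix holds the sequence count.
ninja_cmd() {
    matrix=$1
    threshold=${2:-7000}   # assumed cutoff for -Xmx2G, from note (c)

    # The first whitespace-separated field of line 1 is the sequence count.
    n=$(awk 'NR == 1 { print $1; exit }' "$matrix")

    # Refuse to guess if the count is missing or not a plain integer.
    case $n in
        ''|*[!0-9]*)
            echo "cannot read sequence count from $matrix" >&2
            return 1
            ;;
    esac

    if [ "$n" -lt "$threshold" ]; then
        echo "java -server -Xmx2G -jar Ninja.jar -m bin $matrix"
    else
        echo "java -server -Xmx2G -jar Ninja.jar $matrix"
    fi
}
```

Calling "ninja_cmd distance_matrix" prints the command to run; a second argument overrides the cutoff if you give the JVM more or less memory.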

Note that the distance matrix must be in the standard PHYLIP format, and the tree file is written in Newick format.
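For reference, a tiny input in that format can be written by hand. The sketch below creates a three-sequence square matrix; the sequence names, the distance values, and the file name example_matrix are made up for illustration. The first line gives the number of sequences, and each following line holds a name followed by that sequence's distance to every sequence, in the same order.

```shell
#!/bin/sh
# Write a made-up 3-sequence distance matrix in PHYLIP square format.
# Line 1: number of sequences. Each later line: name, then n distances.
cat > example_matrix <<'EOF'
3
seqA        0.0000  0.1200  0.3400
seqB        0.1200  0.0000  0.2900
seqC        0.3400  0.2900  0.0000
EOF

# Sanity check: the count on line 1 should match the number of rows.
n=$(head -n 1 example_matrix)
rows=$(tail -n +2 example_matrix | grep -c .)
[ "$n" -eq "$rows" ] && echo "matrix looks consistent: $n sequences"
```

A check like this is cheap to run on real input before starting a long job, since a truncated matrix file is easier to catch here than from a failed run.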

3) For big inputs (several thousand sequences), NINJA makes moderate use of the disk. This rarely affects other applications running concurrently, since most of them make little use of the disk, but it can become a problem if several instances of NINJA are hitting the same disk. This is a particular concern on clusters where the compute nodes all share a common disk, e.g. through NFS. To manage this, NINJA lets you specify where it places the temporary folder holding the files it uses to store data structures on disk. If you have a cluster, each of your compute nodes likely has a local disk drive; just tell NINJA to do all its temporary work in a directory that maps to that local disk. The "-t" flag does this:
java -server -Xmx2G -jar Ninja.jar -t /local/disk distance_matrix > tree
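On a shared cluster, one way to apply this is to give each job its own scratch directory on the node's local disk and remove it when the job ends. The wrapper below is a sketch under stated assumptions, not something shipped with NINJA: LOCAL_SCRATCH is a hypothetical variable naming the node-local path (defaulting here to /tmp), and Ninja.jar and distance_matrix are assumed to be in the current directory. If either file is missing it only reports the directory it would have used.

```shell
#!/bin/sh
# Hypothetical wrapper: run NINJA with a private temp directory on local disk.
# LOCAL_SCRATCH is an assumed variable naming a node-local path.
scratch_root=${LOCAL_SCRATCH:-/tmp}

# mktemp -d creates a uniquely named directory, so concurrent jobs
# on the same node do not share temporary files.
job_tmp=$(mktemp -d "$scratch_root/ninja.XXXXXX") || exit 1

# Remove the scratch directory when the script exits, even after a failure.
trap 'rm -rf "$job_tmp"' EXIT

if [ -f Ninja.jar ] && [ -f distance_matrix ]; then
    java -server -Xmx2G -jar Ninja.jar -t "$job_tmp" distance_matrix > tree
else
    echo "dry run: would use -t $job_tmp"
fi
```

Using a fresh directory per job, rather than one fixed path, keeps several NINJA instances on the same node from writing into the same place.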


Details of the algorithms used in NINJA are available in the original paper, or this preprint of the 2009 WABI paper.