
Tutorial 1: Inferring phylogenies using maximum likelihood

These exercises were prepared by Maria Anisimova

1) Basic ML inference: Run PhyML in menu mode. Set the model to HKY+Gamma, and estimate both the transition/transversion ratio and the alpha (shape) parameter of the Gamma distribution by maximum likelihood (ML).

Run the program twice: the first time estimating the nucleotide frequencies by ML, and the second time estimating them empirically from the data.

2) ML tree vs. starting tree: By default, PhyML builds a BioNJ tree and uses it as the starting tree. Run PhyML without the tree search, so that all model parameters are optimized on the BioNJ tree.
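To compare the run on the BioNJ tree with the run that includes the tree search, the log-likelihood of each run can be read from the PhyML statistics output. The following is a minimal Python sketch, assuming that this output contains a line of the form ". Log-likelihood: -1234.56"; the function name read_log_likelihood is hypothetical, and the pattern should be checked against the output of the installed PhyML version.

```python
import re

# Assumed line format in the PhyML statistics output:
#   ". Log-likelihood: <number>"
_LNL_PATTERN = re.compile(r"Log-likelihood:\s*(-?\d+(?:\.\d+)?)")


def read_log_likelihood(stats_text):
    """Return the first log-likelihood found in the text of a stats file."""
    match = _LNL_PATTERN.search(stats_text)
    if match is None:
        raise ValueError("no 'Log-likelihood:' line found")
    return float(match.group(1))
```

The text passed to the function would typically come from reading the statistics file that PhyML writes next to the alignment, for example with open(path).read().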

3) Model comparison: Now repeat the analysis with GTR+Gamma and JC+Gamma, and then with GTR, HKY and JC without Gamma.
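One common way to compare the fitted models is the Akaike information criterion (AIC); the exercise itself does not prescribe a criterion, so this is only one option. The Python sketch below assumes that the log-likelihood of each run has been noted down. The parameter counts are the conventional ones for these models with branch lengths left out, and the names n_params, aic and rank_by_aic are hypothetical.

```python
# Free parameters of each substitution model. Branch lengths are excluded,
# since their number is the same for every model on a given set of taxa.
FREE_PARAMS = {
    "JC": 0,   # equal rates, equal frequencies
    "HKY": 4,  # ts/tv ratio + 3 free nucleotide frequencies
    "GTR": 8,  # 5 free exchangeabilities + 3 free nucleotide frequencies
}


def n_params(model):
    """Count free parameters for names such as 'HKY' or 'HKY+Gamma'."""
    base, _, suffix = model.partition("+")
    if suffix not in ("", "Gamma"):
        raise ValueError("unsupported model suffix: " + suffix)
    # The Gamma distribution adds one parameter (alpha).
    return FREE_PARAMS[base] + (1 if suffix == "Gamma" else 0)


def aic(log_likelihood, k):
    """Akaike information criterion: AIC = 2k - 2 lnL (lower is better)."""
    return 2 * k - 2 * log_likelihood


def rank_by_aic(log_likelihoods):
    """Return (model, AIC) pairs sorted from best to worst, given {model: lnL}."""
    scores = {m: aic(lnl, n_params(m)) for m, lnl in log_likelihoods.items()}
    return sorted(scores.items(), key=lambda item: item[1])
```

Calling rank_by_aic with a dictionary such as {"JC": lnL_JC, "HKY+Gamma": lnL_HKYG}, filled with the values from your own runs, lists the models with the lowest AIC first.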

4) Branch supports: Evaluate branch supports for the inferred ML phylogeny.

5) Command line: List the PhyML command-line options by typing phyml -h. Write down the command lines that run the analyses performed in (1)-(4).
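As a starting point, a command line can also be assembled programmatically. The Python sketch below builds an argument list using PhyML 3 options as commonly documented (-i input, -d data type, -m model, -f frequencies, -t ts/tv ratio, -c rate categories, -a alpha, -o parameters to optimise, -b branch support); verify each option against phyml -h for the installed version. The function name phyml_command and the file name in the usage note are hypothetical.

```python
# Models for which a transition/transversion ratio is defined.
TS_TV_MODELS = {"K80", "HKY85", "F84", "TN93"}


def phyml_command(alignment, model="HKY85", freqs="m", gamma=True,
                  optimise="tlr", support=None):
    """Build an argument list for a PhyML run on a nucleotide alignment.

    freqs:    "m" (ML estimates) or "e" (empirical), or None to omit -f
    optimise: "tlr" (topology, branch lengths, rates) or e.g. "lr"
              (keep the starting topology fixed)
    support:  value passed to -b, e.g. 100 for 100 bootstrap replicates
    """
    cmd = ["phyml", "-i", alignment, "-d", "nt", "-m", model]
    if freqs is not None:
        cmd += ["-f", freqs]
    if model in TS_TV_MODELS:
        cmd += ["-t", "e"]            # estimate the ts/tv ratio
    if gamma:
        cmd += ["-c", "4", "-a", "e"]  # 4 rate categories, estimate alpha
    else:
        cmd += ["-c", "1"]             # single rate category, no Gamma
    cmd += ["-o", optimise]
    if support is not None:
        cmd += ["-b", str(support)]
    return cmd
```

For example, phyml_command("my_alignment.phy", freqs="e") gives the arguments for an HKY+Gamma run with empirical frequencies; the list can be joined with spaces to obtain the command line, or passed to subprocess.run(cmd, check=True) to execute it.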

6)** (Only for the brave) Inferring ML phylogenies with codon models: 
For this task, use CodonPhyML in menu mode to analyse your dataset (the data should be protein-coding DNA). The menu interface is very similar to that of PhyML, except that CodonPhyML also includes codon models and some additional amino acid models (e.g., PCA models by Zoller and Schneider; the antibody model AB by Mirsky et al. 2015; models for ordered/disordered proteins by Szalkowski and Anisimova 2011).

7) Analyse the primate AA dataset.

8) Reanalyse the protein data sets from Lerat et al. (2003) “From gene trees to organismal phylogeny in Prokaryotes: The case of the gamma-Proteobacteria.” PLoS Biology 1:101-109.



Document last updated on 02.02.2016

© 2016 Lorenzo Gatti – Applied Computational Genomic Team (ACGT) @ Institute of Applied Simulations (ZHAW) | Wädenswil | Zürich