.. Anuj Pahwa, Gavin Huttley

Built-in Phylogenetic reconstruction
====================================

Given an alignment, a phylogenetic tree can be generated from the pairwise distance matrix computed from that alignment.
15
Estimating Pairwise Distances
16
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20
>>> from cogent import LoadSeqs, DNA
>>> from cogent.phylo import distance
>>> from cogent.evolve.models import F81
>>> aln = LoadSeqs('data/primate_brca1.fasta')
>>> d = distance.EstimateDistances(aln, submodel=F81())

The example above sets up distance estimation with the F81 nucleotide substitution model; unless told otherwise, it runs with the default options for the optimiser. To configure the optimiser, pass a dictionary of optimisation options to the ``run`` method. The example below configures the ``Powell`` optimiser to run a maximum of 10000 evaluations, with a maximum of 5 restarts (a total of 5 x 10000 = 50000 evaluations).

>>> dist_opt_args = dict(max_restarts=5, max_evaluations=10000)
>>> d.run(dist_opt_args=dist_opt_args)
>>> print d
============================================================================================
Seq1 \ Seq2    Galago    HowlerMon    Rhesus    Orangutan    Gorilla     Human    Chimpanzee
--------------------------------------------------------------------------------------------
     Galago         *       0.2112    0.1930       0.1915     0.1891    0.1934        0.1892
  HowlerMon    0.2112            *    0.0729       0.0713     0.0693    0.0729        0.0697
     Rhesus    0.1930       0.0729         *       0.0410     0.0391    0.0421        0.0395
  Orangutan    0.1915       0.0713    0.0410            *     0.0136    0.0173        0.0140
    Gorilla    0.1891       0.0693    0.0391       0.0136          *    0.0086        0.0054
      Human    0.1934       0.0729    0.0421       0.0173     0.0086         *        0.0089
 Chimpanzee    0.1892       0.0697    0.0395       0.0140     0.0054    0.0089             *
--------------------------------------------------------------------------------------------
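
The same distances can be handled as a plain Python dictionary keyed by pairs of sequence names, which is the form the neighbour joining function accepts. The sketch below is independent of cogent: ``symmetric_distances`` is a hypothetical helper, and the three values are copied from the table above. It builds a lookup that works in either name order and picks out the closest pair.

```python
# Hypothetical helper, not part of cogent: make a dictionary keyed by
# (name1, name2) pairs usable in either name order.
def symmetric_distances(pairwise):
    """Return a nested dict so that result[a][b] == result[b][a]."""
    result = {}
    for (name1, name2), dist in pairwise.items():
        result.setdefault(name1, {})[name2] = dist
        result.setdefault(name2, {})[name1] = dist
    return result


# Three entries copied from the distance table above.
pairwise = {
    ('Human', 'Chimpanzee'): 0.0089,
    ('Human', 'Gorilla'): 0.0086,
    ('Gorilla', 'Chimpanzee'): 0.0054,
}

lookup = symmetric_distances(pairwise)

# The key with the smallest distance identifies the closest pair.
closest = min(pairwise, key=pairwise.get)
```

Here ``closest`` is the Gorilla/Chimpanzee pair, the smallest of the three distances (0.0054).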

Building A Phylogenetic Tree From Pairwise Distances
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Phylogenetic trees can be built with the neighbour joining algorithm by providing a dictionary of pairwise distances. This dictionary can either be obtained from the output of ``distance.EstimateDistances()``:

>>> from cogent.phylo import nj
>>> njtree = nj.nj(d.getPairwiseDistances())
>>> njtree = njtree.balanced()
>>> print njtree.asciiArt()

Or it can be created manually, as shown below.

>>> dists = {('a', 'b'): 2.7, ('c', 'b'): 2.33, ('c', 'a'): 0.73}
>>> njtree2 = nj.nj(dists)
>>> print njtree2.asciiArt()
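
For three taxa there is only one unrooted topology, so the tree is fully determined by its three branch lengths. As a rough illustration (plain Python, independent of cogent, and not a description of how ``nj.nj`` is implemented), the three-point formula gives the branch lengths that reproduce the manual distances exactly. The function name ``three_point_lengths`` is hypothetical.

```python
def three_point_lengths(d_ab, d_ac, d_bc):
    """Branch lengths from the central node to tips a, b and c.

    Solves len_a + len_b = d_ab, len_a + len_c = d_ac, len_b + len_c = d_bc.
    """
    len_a = (d_ab + d_ac - d_bc) / 2.0
    len_b = (d_ab + d_bc - d_ac) / 2.0
    len_c = (d_ac + d_bc - d_ab) / 2.0
    return len_a, len_b, len_c


# Same distances as the manual dictionary above.
dists = {('a', 'b'): 2.7, ('c', 'b'): 2.33, ('c', 'a'): 0.73}

len_a, len_b, len_c = three_point_lengths(
    dists[('a', 'b')], dists[('c', 'a')], dists[('c', 'b')])

# Summing two branch lengths gives the path length between those two tips,
# for example len_a + len_b for the pair ('a', 'b').
path_ab = len_a + len_b
```

With these inputs the branch lengths are approximately 0.55, 2.15 and 0.18 for ``a``, ``b`` and ``c``, so ``b`` sits on a much longer branch than the other two tips.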

We illustrate phylogeny reconstruction by least-squares using the F81 substitution model. We use the advanced-stepwise addition algorithm to search tree space. Here ``a`` is the number of taxa for which all possible phylogenies are exhaustively evaluated. Successive taxa are added to the top ``k`` trees (measured by the least-squares metric), and ``k`` trees are kept at each iteration.

>>> import cPickle
>>> from cogent.phylo.least_squares import WLS
>>> dists = cPickle.load(open('data/dists_for_phylo.pickle'))
>>> ls = WLS(dists)
>>> stat, tree = ls.trex(a=5, k=5, show_progress=False)

Other optional arguments that can be passed to the ``trex`` method are: ``return_all``, whether the ``k`` best trees at the final step are returned as a ``ScoredTreeCollection`` object; ``order``, a series of tip names whose order defines the sequence in which tips will be added during tree building (this allows the user to randomise the input order).
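
To see why exhaustive evaluation is restricted to the first ``a`` taxa, it helps to count topologies. Assuming unrooted, strictly bifurcating trees, the number of distinct topologies for n labelled tips is (2n - 5)!!, which grows very quickly. The helper below is a hypothetical, cogent-independent sketch of that count.

```python
def num_unrooted_trees(n_tips):
    """Number of unrooted bifurcating topologies for n_tips labelled tips."""
    if n_tips < 3:
        return 1
    count = 1
    # Three tips give a single topology with 3 branches. The k-th tip
    # (k >= 4) can attach to any of the 2k - 5 branches already present.
    for k in range(4, n_tips + 1):
        count *= 2 * k - 5
    return count


trees_for_a5 = num_unrooted_trees(5)
trees_for_10 = num_unrooted_trees(10)
```

Under this assumption, ``a = 5`` corresponds to 15 topologies, while 10 taxa would already require 2027025, which is why the remaining taxa are added stepwise to only the best ``k`` trees.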

We illustrate phylogeny reconstruction by maximum likelihood using the F81 substitution model. We use the advanced-stepwise addition algorithm to search tree space.

>>> from cogent import LoadSeqs, DNA
>>> from cogent.phylo.maximum_likelihood import ML
>>> from cogent.evolve.models import F81
>>> aln = LoadSeqs('data/primate_brca1.fasta')
>>> ml = ML(F81(), aln)

The ``ML`` object also has a ``trex`` method, which can be used in the same way as above, i.e. ``ml.trex()``. We don't do that here because it is a very slow method for phylogenetic reconstruction.

Building phylogenies with 3rd-party apps such as FastTree or RAxML
==================================================================

A thorough description is given in :ref:`appcontroller-phylogeny`.