=head2 Tree-Trait statistics

The following methods produce descriptors of trait distribution among
leaf nodes within the trees. They require that a trait has been set
for each leaf node. The tag methods of Bio::Tree::Node are used to
store them as key/value pairs. In this way, one tree can store more
Trees have method add_traits() to set trait values from a file.
 Example    : fitch($tree, $key, $node);
 Description: Calculates Parsimony Score (PS) and internal trait
              values using the Fitch 1971 parsimony algorithm for
              the subtree defined by the (internal) node.
              Node defaults to the root.
 Returns    : true on success
 Exceptions : leaf nodes have to have the trait defined
 Args       : 1. Bio::Tree::TreeI object
              3. Bio::Tree::NodeI object within the tree, optional

Runs first L<fitch_up>, which calculates parsimony scores, and then
L<fitch_down>, which should resolve most of the trait/character state

Fitch, W.M., 1971. Toward defining the course of evolution: minimal
change for a specific tree topology. Syst. Zool. 20, 406-416.

You can access calculated parsimony values using:

  $score = $node->get_tag_values('ps_score');

and the trait value with:

  $traitvalue = $node->get_tag_values('ps_trait'); # only the first
  @traitvalues = $node->get_tag_values('ps_trait');

Note that there can be more than one trait value, especially for the
    my $key = shift || $self->throw("Trait name is needed");
    my $node = shift || $tree->get_root_node;

    $self->fitch_up($tree, $key, $node);
    $self->fitch_down($tree, $node);
 Example    : ps($tree, $key, $node);
              Node defaults to the root.
 Returns    : integer, 1 < PS < n, where n is number of branches
 Exceptions : leaf nodes have to have the trait defined
 Args       : 1. Bio::Tree::TreeI object
              3. Bio::Tree::NodeI object within the tree, optional

This is the first half of the Fitch algorithm that is enough for
calculating the resolved parsimony values. The trait/character
states are commonly left in an ambiguous state. To resolve them, run
sub ps { shift->fitch_up(@_) }
 Example    : fitch_up($tree, $key, $node);
 Description: Calculates Parsimony Score (PS) from the Fitch 1971
              parsimony algorithm for the subtree defined
              by the (internal) node.
              Node defaults to the root.
 Returns    : integer, 1 < PS < n, where n is number of branches
 Exceptions : leaf nodes have to have the trait defined
 Args       : 1. Bio::Tree::TreeI object
              3. Bio::Tree::NodeI object within the tree, optional

This is a more generic name for L<ps> and indicates that it performs
the first bottom-up tree traversal that calculates the parsimony score
but usually leaves trait/character states ambiguous. If you are
interested in internal trait states, running L<fitch_down> should
resolve most of the ambiguities.
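
The bottom-up traversal can be illustrated without BioPerl. What follows is a minimal sketch, assuming a binary tree stored as plain nested hashes. The function name C<fitch_up_sketch> and the keys C<trait>, C<children> and C<states> are hypothetical names chosen for this illustration; they are not part of this module or of the Bio::Tree::NodeI API. At each internal node the sketch keeps the intersection of the children's state sets or, when that is empty, their union at the cost of one change.

```perl
use strict;
use warnings;

# Toy tree (hypothetical layout, not Bio::Tree::NodeI):
#   leaf          => { trait => 'A' }
#   internal node => { children => [ $left, $right ] }
# After the call every node carries a 'states' hash of candidate states.
sub fitch_up_sketch {
    my ($node) = @_;
    my @children = @{ $node->{children} || [] };

    # A leaf has exactly one state and costs nothing.
    unless (@children) {
        $node->{states} = { $node->{trait} => 1 };
        return 0;
    }

    my $score = 0;
    my %seen;    # state => number of children that allow it
    for my $child (@children) {
        $score += fitch_up_sketch($child);
        $seen{$_}++ for keys %{ $child->{states} };
    }

    # States allowed by every child form the intersection.
    my @shared = grep { $seen{$_} == @children } keys %seen;
    if (@shared) {
        $node->{states} = { map { $_ => 1 } @shared };
    }
    else {
        # Empty intersection: keep the union and count one change.
        $node->{states} = { map { $_ => 1 } keys %seen };
        $score++;
    }
    return $score;
}

# Topology ((A,A),(B,(A,B)))
my $tree = {
    children => [
        { children => [ { trait => 'A' }, { trait => 'A' } ] },
        { children => [
            { trait => 'B' },
            { children => [ { trait => 'A' }, { trait => 'B' } ] },
        ] },
    ],
};

my $ps = fitch_up_sketch($tree);
print "PS = $ps, root states = ",
      join(',', sort keys %{ $tree->{states} }), "\n";
```

For this toy tree the score is 2 and the root is left with both A and B as candidate states, which is the kind of ambiguity the second, top-down pass is meant to reduce.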
    my $self = shift;
    my $tree = shift;
    my $key = shift || $self->throw("Trait name is needed");
 Example    : fitch_down($tree, $node);
 Description: Runs the second pass of the Fitch 1971
              parsimony algorithm to resolve ambiguous
              trait states left by the first pass in the
              subtree defined by the (internal) node.
              Node defaults to the root.
 Exceptions : dies unless the trait is defined in all nodes
 Args       : 1. Bio::Tree::TreeI object
              2. Bio::Tree::NodeI object within the tree, optional

Before running this method you should have run L<fitch_up> (also
available as L<ps>). Note that it is not guaranteed that all states are completely
    my $node = shift || $tree->get_root_node;

    my $key = 'ps_trait';
    $self->throw ("ERROR: ". $node->internal_id. " needs a value for $key")
        unless $node->has_tag($key);
    foreach my $trait ($node->get_tag_values($key) ) {
    foreach my $child ($node->each_Descendent) {
        next if $child->is_Leaf; # end of recursion
        foreach my $trait ($child->get_tag_values($key) ) {
            $intersection->{$trait}++ if $nodev->{$trait};
        $self->fitch_down($tree, $child);
        $child->set_tag_value($key, keys %$intersection);
 Example    : persistence($tree, $node);
 Description: Calculates the persistence of the trait value
              of the node in the subtree defined by the (internal)
              node. Node defaults to the root.
 Returns    : int, number of generations the trait value remains the same
 Exceptions : all the nodes need to have the trait defined
 Args       : 1. Bio::Tree::TreeI object
              2. Bio::Tree::NodeI object within the tree, optional

Persistence is a measure of the stability the trait value has in a
tree. It expresses the number of generations the trait value remains
the same. All the descendants of the root in the same generation have
to share the same value.

Depends on Fitch's parsimony score (PS).
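
The recursion behind this measure can be sketched in plain Perl. The example below is only an illustration, assuming a toy tree of nested hashes in which every node, internal ones included, already carries one resolved trait value; C<persistence_sketch> and the C<trait> and C<children> keys are hypothetical names, not part of this module. A node with a different value contributes 0, a matching leaf contributes 1, and an internal node adds 1 to the smallest value found among its children.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Toy node (hypothetical layout): { trait => 'A', children => [ ... ] }.
# Every node, internal ones included, has a single resolved trait.
sub persistence_sketch {
    my ($node, $value) = @_;
    $value = $node->{trait} unless defined $value;

    return 0 if $node->{trait} ne $value;        # wrong value
    my @children = @{ $node->{children} || [] };
    return 1 unless @children;                   # leaf: end of recursion

    # The weakest child limits how long the value persists.
    return 1 + min(map { persistence_sketch($_, $value) } @children);
}

my $persistence_tree = {
    trait    => 'A',
    children => [
        { trait => 'A' },
        { trait => 'A', children => [ { trait => 'A' }, { trait => 'B' } ] },
    ],
};

my $persistence = persistence_sketch($persistence_tree);
print "persistence = $persistence\n";
```

Here the value A holds at the root and in both of its children, but one grandchild switches to B, so the sketch reports a persistence of 2.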
    my $value = shift || $self->throw("Value is needed");
    my $key = 'ps_trait';

    $self->throw("Node is needed") unless $node->isa('Bio::Tree::NodeI');

    return 0 unless $node->get_tag_values($key) eq $value; # wrong value
    return 1 if $node->is_Leaf; # end of recursion

    my $persistence = 10000000; # an arbitrarily large number
    foreach my $child ($node->each_Descendent) {
        my $pers = $self->_persistence($tree, $child, $value);
        $persistence = $pers if $pers < $persistence;
    return $persistence + 1;
    my $node = shift || $tree->get_root_node;
    $self->throw("Node is needed") unless $node->isa('Bio::Tree::NodeI');

    my $key = 'ps_trait';
    my $value = $node->get_tag_values($key);
    my $persistence = $self->_persistence($tree, $node, $value);
    $node->set_tag_value('persistance', $persistence);
=head2 count_subclusters

 Example    : count_subclusters($tree, $node);
 Description: Calculates the number of sub-clusters
              in the subtree defined by the (internal)
              node. Node defaults to the root.
 Exceptions : all the nodes need to have the trait defined
 Args       : 1. Bio::Tree::TreeI object
              2. Bio::Tree::NodeI object within the tree, optional

Depends on Fitch's parsimony score (PS).
sub _count_subclusters {
    my $value = shift || $self->throw("Value is needed");

    my $key = 'ps_trait';

    $self->throw ("ERROR: ". $node->internal_id. " needs a value for trait $key")
        unless $node->has_tag($key);

    if ($node->get_tag_values($key) eq $value) {
        if ($node->get_tag_values('ps_score') == 0) {
    foreach my $child ($node->each_Descendent) {
        $count += $self->_count_subclusters($tree, $child, $value);
sub count_subclusters {
    my $node = shift || $tree->get_root_node;
    $self->throw("Node is needed") unless $node->isa('Bio::Tree::NodeI');

    my $key = 'ps_trait';
    my $value = $node->get_tag_values($key);

    return $self->_count_subclusters($tree, $node, $value);
 Example    : count_leaves($tree, $node);
 Description: Calculates the number of leaves with the same trait
              value as root in the subtree defined by the (internal)
              node. Requires an unbroken line of identical trait values.
              Node defaults to the root.
 Returns    : int, number of leaves with this trait value
 Exceptions : all the nodes need to have the trait defined
 Args       : 1. Bio::Tree::TreeI object
              2. Bio::Tree::NodeI object within the tree, optional

Depends on Fitch's parsimony score (PS).
    my $node = shift || $tree->get_root_node;
    my $key = 'ps_trait';

    $self->throw ("ERROR: ". $node->internal_id. " needs a value for trait $key")
        unless $node->has_tag($key);

    if ($node->get_tag_values($key) eq $value) {
        #print $node->id, ": ", $node->get_tag_values($key), "\n";
        return 1 if $node->is_Leaf; # end of recursion
        foreach my $child ($node->each_Descendent) {
            $count += $self->_count_leaves($tree, $child, $value);
    my $node = shift || $tree->get_root_node;
    $self->throw("Node is needed") unless $node->isa('Bio::Tree::NodeI');

    my $key = 'ps_trait';
    my $value = $node->get_tag_values($key);

    return $self->_count_leaves($tree, $node, $value);
=head2 phylotype_length

 Example    : phylotype_length($tree, $node);
 Description: Sums up the branch lengths within phylotype
              excluding the subclusters where the trait values
 Returns    : float, length
 Exceptions : all the nodes need to have the trait defined
 Args       : 1. Bio::Tree::TreeI object
              2. Bio::Tree::NodeI object within the tree, optional

Depends on Fitch's parsimony score (PS).
sub _phylotype_length {
    my $key = 'ps_trait';

    $self->throw ("ERROR: ". $node->internal_id. " needs a value for trait $key")
        unless $node->has_tag($key);

    return 0 if $node->get_tag_values($key) ne $value;
    return $node->branch_length if $node->is_Leaf; # end of recursion
    foreach my $child ($node->each_Descendent) {
        my $sub_len = $self->_phylotype_length($tree, $child, $value);
        $length += $child->branch_length if not $child->is_Leaf and $sub_len;
sub phylotype_length {
    my $node = shift || $tree->get_root_node;

    my $key = 'ps_trait';
    my $value = $node->get_tag_values($key);

    return $self->_phylotype_length($tree, $node, $value);
=head2 sum_of_leaf_distances

 Example    : sum_of_leaf_distances($tree, $node);
 Description: Sums up the branch lengths from root to leaf
              excluding the subclusters where the trait values
 Returns    : float, length
 Exceptions : all the nodes need to have the trait defined
 Args       : 1. Bio::Tree::TreeI object
              2. Bio::Tree::NodeI object within the tree, optional

Depends on Fitch's parsimony score (PS).
sub _sum_of_leaf_distances {
    my $key = 'ps_trait';

    $self->throw ("ERROR: ". $node->internal_id. " needs a value for trait $key")
        unless $node->has_tag($key);
    return 0 if $node->get_tag_values($key) ne $value;
    #return $node->branch_length if $node->is_Leaf; # end of recursion
    return 0 if $node->is_Leaf; # end of recursion
    foreach my $child ($node->each_Descendent) {
        $length += $self->_count_leaves($tree, $child, $value) * $child->branch_length +
            $self->_sum_of_leaf_distances($tree, $child, $value);
sub sum_of_leaf_distances {
    my $node = shift || $tree->get_root_node;

    my $key = 'ps_trait';
    my $value = $node->get_tag_values($key);

    return $self->_sum_of_leaf_distances($tree, $node, $value);
=head2 genetic_diversity

 Example    : genetic_diversity($tree, $node);
 Description: Diversity is the sum of root to leaf distances
              within the phylotype normalised by number of leaf
 Returns    : float, value of genetic diversity
 Exceptions : all the nodes need to have the trait defined
 Args       : 1. Bio::Tree::TreeI object
              2. Bio::Tree::NodeI object within the tree, optional

Depends on Fitch's parsimony score (PS).
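
As a worked illustration of this ratio, here is a minimal sketch in plain Perl. It assumes a toy tree of nested hashes with a resolved C<trait> on every node and a C<branch_length> on every non-root node; all function names ending in C<_sketch> and the hash keys are hypothetical and are not part of this module. The zero-leaf guard is an addition made for the sketch only.

```perl
use strict;
use warnings;

# Toy node (hypothetical layout):
#   { trait => 'A', branch_length => 1.0, children => [ ... ] }

# Leaves reachable from $node through an unbroken run of $value.
sub count_leaves_sketch {
    my ($node, $value) = @_;
    return 0 if $node->{trait} ne $value;
    my @children = @{ $node->{children} || [] };
    return 1 unless @children;
    my $count = 0;
    $count += count_leaves_sketch($_, $value) for @children;
    return $count;
}

# Sum of the distances from $node to each of those leaves.
sub sum_of_leaf_distances_sketch {
    my ($node, $value) = @_;
    return 0 if $node->{trait} ne $value;
    my $length = 0;
    for my $child (@{ $node->{children} || [] }) {
        # A branch is walked once for every matching leaf below it.
        $length += count_leaves_sketch($child, $value) * $child->{branch_length}
                 + sum_of_leaf_distances_sketch($child, $value);
    }
    return $length;
}

sub genetic_diversity_sketch {
    my ($node) = @_;
    my $value  = $node->{trait};
    my $leaves = count_leaves_sketch($node, $value);
    return 0 unless $leaves;    # guard added for this sketch only
    return sum_of_leaf_distances_sketch($node, $value) / $leaves;
}

my $diversity_tree = {
    trait    => 'A',
    children => [
        { trait => 'A', branch_length => 1 },
        { trait => 'A', branch_length => 2, children => [
            { trait => 'A', branch_length => 1 },
            { trait => 'B', branch_length => 3 },
        ] },
    ],
};

print "diversity = ", genetic_diversity_sketch($diversity_tree), "\n";
```

The two leaves that share the root's value A lie at distances 1 and 2 + 1 = 3 from the root, so the sum is 4 and the diversity is 4 / 2 = 2. The leaf with value B is excluded from both the sum and the count.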
sub genetic_diversity {
    my $node = shift || $tree->get_root_node;

    return $self->sum_of_leaf_distances($tree, $node) /
        $self->count_leaves($tree, $node);
 Example    : statratio($tree, $node);
 Description: Ratio of the stem length to the genetic diversity of the
              phylotype, see L<genetic_diversity>
 Returns    : float, separation score
 Exceptions : all the nodes need to have the trait defined
 Args       : 1. Bio::Tree::TreeI object
              2. Bio::Tree::NodeI object within the tree, optional

Statratio gives a measure of separation and variability within the phylotype.
Larger values identify more rapidly evolving and recent phylotypes.

Depends on Fitch's parsimony score (PS).
    my $node = shift || $tree->get_root_node;

    my $div = $self->genetic_diversity($tree, $node);
    return 0 if $div == 0;
    return $node->branch_length / $div;
              Node defaults to the root;
 Returns    : hashref with trait values as keys
 Exceptions : leaf nodes have to have the trait defined
 Args       : 1. Bio::Tree::TreeI object
              3. Bio::Tree::NodeI object within the tree, optional

Monophyletic Clade (MC) size statistics by Salemi et al 2005. It is
calculated for each trait value. 1 E<lt>= MC E<lt>= nx, where nx is the
number of tips with value x:

   pick the internal node with maximum value for
   number of tips with only trait x

MC was defined by Parker et al 2008.

Salemi, M., Lamers, S.L., Yu, S., de Oliveira, T., Fitch, W.M., McGrath, M.S.,
2005. Phylodynamic analysis of Human Immunodeficiency Virus Type 1 in
distinct brain compartments provides a model for the neuropathogenesis of
AIDS. J. Virol. 79 (17), 11343-11352.

Parker, J., Rambaut A., Pybus O., 2008. Correlating viral phenotypes
with phylogeny: Accounting for phylogenetic uncertainty. Infection,
Genetics and Evolution 8, 239-246.