~ubuntu-branches/ubuntu/natty/python-cogent/natty

Viewing changes to cogent/core/sequence.py

  • Committer: Bazaar Package Importer
  • Author(s): Steffen Moeller
  • Date: 2010-12-04 22:30:35 UTC
  • mfrom: (1.1.1 upstream)
  • Revision ID: james.westby@ubuntu.com-20101204223035-j11kinhcrrdgg2p2
Tags: 1.5-1
* Bumped standard to 3.9.1, no changes required.
* New upstream version.
  - major additions to Cookbook
  - added AlleleFreqs attribute to ensembl Variation objects.
  - added getGeneByStableId method to genome objects.
  - added Introns attribute to Transcript objects and an Intron class.
  - added Mann-Whitney test and a Monte-Carlo version
  - exploratory and confirmatory period estimation techniques (suitable for
    symbolic and continuous data)
  - Information theoretic measures (AIC and BIC) added
  - drawing of trees with collapsed nodes
  - progress display indicator support for terminal and GUI apps
  - added parser for Illumina HiSeq2000 and GAIIx sequence files as
    cogent.parse.illumina_sequence.MinimalIlluminaSequenceParser.
  - added parser for FASTQ files, one of the output options for Illumina's
    workflow; also added cookbook demo.
  - added functionality for parsing of SFF files without the Roche tools in
    cogent.parse.binary_sff
  - thousand-fold performance improvement to nmds
  - >10-fold performance improvements to some Table operations

@@ -27,15 +27,14 @@
 from operator import eq, ne
 from random import shuffle
 import re
-import logging
-LOG = logging.getLogger('cogent.data')
+import warnings
 
 __author__ = "Rob Knight, Gavin Huttley, and Peter Maxwell"
 __copyright__ = "Copyright 2007-2009, The Cogent Project"
 __credits__ = ["Rob Knight", "Peter Maxwell", "Gavin Huttley",
                     "Matthew Wakefield", "Daniel McDonald"]
 __license__ = "GPL"
-__version__ = "1.4.1"
+__version__ = "1.5.0"
 __maintainer__ = "Rob Knight"
 __email__ = "rob@spot.colorado.edu"
 __status__ = "Production"
@@ -685,19 +684,26 @@
     def gettype(self):
         """Return the sequence type."""
         
-        return self.MolType._type
+        return self.MolType.label
     
     def resolveambiguities(self):
         """Returns a list of tuples of strings."""
         ambigs = self.MolType.resolveAmbiguity
         return [ambigs(motif) for motif in self._seq]
     
-    def slidingWindows(self, window, step):
+    def slidingWindows(self, window, step, start=None, end=None):
         """Generator function that yield new sequence objects
         of a given length at a given interval.
         Arguments:
             - window: The length of the returned sequence
             - step: The interval between the start of the returned
-              sequence objects"""
-        for pos in range(0, len(self)-window+1,step):
-            yield self[pos:pos+window]
+              sequence objects
+            - start: first window start position
+            - end: last window start position
+        """
+        start = [start, 0][start is None]
+        end = [end, len(self)-window+1][end is None]
+        end = min(len(self)-window+1, end)
+        if start < end and len(self)-end >= window-1:
+            for pos in xrange(start, end, step):
+                yield self[pos:pos+window]
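
The reworked slidingWindows logic in this hunk can be read in isolation as the minimal sketch below. It is written for Python 3 (so range stands in for xrange) and works on any sliceable sequence. The free-function form and the name sliding_windows are illustrative assumptions, not part of PyCogent. As in the diff, end is clamped to len(seq) - window + 1 and is passed to range as an exclusive bound on window start positions.

```python
def sliding_windows(seq, window, step, start=None, end=None):
    """Yield slices of `seq` of length `window`, one every `step` positions.

    Hypothetical stand-alone version of the method shown in the diff.
    `start` is the first window start position; `end` is an exclusive
    upper bound on window start positions.
    """
    last = len(seq) - window + 1  # one past the final valid start position
    start = 0 if start is None else start
    end = last if end is None else min(last, end)
    # Same guard as in the diff: yield nothing if the range is empty.
    if start < end and len(seq) - end >= window - 1:
        for pos in range(start, end, step):
            yield seq[pos:pos + window]


# Default bounds cover the whole sequence.
print(list(sliding_windows("ACGTACGT", 4, 2)))
# Restricting the start positions to 1 <= pos < 4.
print(list(sliding_windows("ACGTACGT", 4, 2, start=1, end=4)))
```

With the old signature only the first call was possible; the new start and end arguments restrict which window start positions are visited without changing the window size or step.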
@@ -704,3 +710,3 @@
     
     def getInMotifSize(self, motif_length=1, log_warnings=True):
         """returns sequence as list of non-overlapping motifs
@@ -715,7 +721,7 @@
             length = len(seq)
             remainder = length % motif_length
             if remainder and log_warnings:
-                LOG.warning('Dropped remainder "%s" from end of sequence' %
+                warnings.warn('Dropped remainder "%s" from end of sequence' %
                         seq[-remainder:])
             return [seq[i:i+motif_length]
                     for i in range(0, length-remainder, motif_length)]
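
The switch from LOG.warning to warnings.warn means the dropped-remainder message now goes through Python's standard warnings machinery, so callers can filter or capture it. The sketch below reproduces only the lines visible in this hunk as a free function; the name get_in_motif_size is a hypothetical stand-in, and the part of the method that the diff does not show is not reconstructed here.

```python
import warnings


def get_in_motif_size(seq, motif_length=1, log_warnings=True):
    """Split `seq` into non-overlapping motifs, dropping any remainder.

    Hypothetical stand-alone version of the visible part of the method.
    """
    length = len(seq)
    remainder = length % motif_length
    if remainder and log_warnings:
        # Emits a UserWarning instead of writing to a logger.
        warnings.warn('Dropped remainder "%s" from end of sequence'
                      % seq[-remainder:])
    return [seq[i:i + motif_length]
            for i in range(0, length - remainder, motif_length)]


# Capture the warning rather than letting it print to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    motifs = get_in_motif_size("ACGTA", 2)

print(motifs)
print([str(w.message) for w in caught])
```

Passing log_warnings=False suppresses the warning at the call site, while warnings.simplefilter or warnings.filterwarnings can silence or escalate it globally.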