.. cogent.util.transform

Miscellaneous functions
=======================

.. index:: cogent.util.misc

Identity testing
^^^^^^^^^^^^^^^^

A basic ``identity`` function, which avoids having to test explicitly for ``None``.

>>> from cogent.util.misc import identity
>>> if identity(my_var):

One-line if/else statement
^^^^^^^^^^^^^^^^^^^^^^^^^^

Convenience function for performing one-line if/else statements. This is similar to the C-style ternary operator:

>>> from cogent.util.misc import if_
>>> result = if_(4 > 5, "Expression is True", "Expression is False")

However, the selected value is returned as-is and is not called, even if it is callable. For instance:

>>> from cogent.util.misc import if_
>>> if_(4 > 5, foo, bar)

Force a variable to be iterable
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This support method forces a variable to be iterable, guaranteeing that it is safe to use in, say, a ``for`` loop.

>>> from cogent.util.misc import iterable
>>> for i in my_var:
...     print "will not work"
Traceback (most recent call last):
TypeError: 'int' object is not iterable
>>> for i in iterable(my_var):
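
The behaviour described above can be approximated in a few lines. This is a minimal sketch rather than PyCogent's own implementation; the name ``as_iterable`` is hypothetical and was chosen to avoid confusion with ``cogent.util.misc.iterable``.

```python
def as_iterable(item):
    """Return item unchanged if it supports iteration, else wrap it in a list."""
    try:
        iter(item)
    except TypeError:
        # Non-iterable values (e.g. an int) become a one-element list.
        return [item]
    return item


# An int is wrapped, so the loop runs exactly once.
wrapped = [i for i in as_iterable(10)]
# A list is already iterable and passes through untouched.
unchanged = [i for i in as_iterable([1, 2, 3])]
```

Note that strings are iterable, so under this sketch a string is returned unchanged and a loop over it visits each character.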

Obtain the index of the largest item
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To determine the index of the largest item in any iterable container, use ``max_index``:

>>> from cogent.util.misc import max_index
>>> l = [5,4,2,2,6,8,0,10,0,5]
>>> max_index(l)
7

.. note:: If the maximum value occurs more than once, the lowest index is returned.

Obtain the index of the smallest item
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To determine the index of the smallest item in any iterable container, use ``min_index``:

>>> from cogent.util.misc import min_index
>>> l = [5,4,2,2,6,8,0,10,0,5]
>>> min_index(l)
6

.. note:: If the minimum value occurs more than once, the lowest index is returned.
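
The tie-breaking rule in the notes above can be illustrated with plain Python. This is only a sketch of the documented behaviour, not the library code; ``lowest_index_of`` is a hypothetical helper name.

```python
def lowest_index_of(items, pick):
    """Index of pick(items) (e.g. max or min), lowest index winning on ties."""
    items = list(items)
    target = pick(items)
    # list.index returns the first, i.e. lowest, matching position.
    return items.index(target)


values = [5, 4, 2, 2, 6, 8, 0, 10, 0, 5]
largest_at = lowest_index_of(values, max)   # 10 occurs once, at index 7
smallest_at = lowest_index_of(values, min)  # 0 occurs at 6 and 8; 6 is kept
```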

Remove a nesting level
^^^^^^^^^^^^^^^^^^^^^^

To flatten a 2-dimensional list, you can use ``flatten``:

>>> from cogent.util.misc import flatten
>>> l = ['abcd','efgh','ijkl']
>>> flatten(l)
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l']

Convert a nested tuple into a list
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Conversion of a nested ``tuple`` into a ``list`` can be performed using ``deep_list``:

>>> from cogent.util.misc import deep_list
>>> t = ((1,2),(3,4),(5,6))
>>> deep_list(t)
[[1, 2], [3, 4], [5, 6]]

Simply calling ``list`` will not convert the nested items:

>>> list(t)
[(1, 2), (3, 4), (5, 6)]

Convert a nested list into a tuple
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Conversion of a nested ``list`` into a ``tuple`` can be performed using ``deep_tuple``:

>>> from cogent.util.misc import deep_tuple
>>> l = [[1,2],[3,4],[5,6]]
>>> deep_tuple(l)
((1, 2), (3, 4), (5, 6))

Simply calling ``tuple`` will not convert the nested items:

>>> tuple(l)
([1, 2], [3, 4], [5, 6])
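
The difference between a shallow and a deep conversion comes down to recursion. The following sketch shows one way to write it; ``to_nested`` is a hypothetical name, and PyCogent's ``deep_list`` and ``deep_tuple`` may be implemented differently.

```python
def to_nested(obj, container):
    """Recursively rebuild every list/tuple inside obj as `container`."""
    if isinstance(obj, (list, tuple)):
        # Convert the children first, then the enclosing sequence.
        return container(to_nested(item, container) for item in obj)
    return obj  # leaf values are returned untouched


as_lists = to_nested(((1, 2), (3, 4), (5, 6)), list)
as_tuples = to_nested([[1, 2], [3, 4], [5, 6]], tuple)
```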

Testing if an item is between two values
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``between`` is equivalent to ``min <= number <= max``, but reads more clearly in code.

>>> from cogent.util.misc import between

Return combinations of items
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``combinate`` returns all k-combinations of items. For instance:

>>> from cogent.util.misc import combinate
>>> list(combinate([1,2,3],0))
[[]]
>>> list(combinate([1,2,3],1))
[[1], [2], [3]]
>>> list(combinate([1,2,3],2))
[[1, 2], [1, 3], [2, 3]]
>>> list(combinate([1,2,3],3))
[[1, 2, 3]]
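
For comparison, the standard library's ``itertools.combinations`` produces the same k-combinations as tuples. The sketch below wraps it to yield lists; ``k_combinations`` is a hypothetical name and this is not how ``combinate`` itself is written.

```python
from itertools import combinations


def k_combinations(items, k):
    """Yield every k-item combination of items as a list, in input order."""
    for combo in combinations(items, k):
        yield list(combo)


pairs = list(k_combinations([1, 2, 3], 2))
```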

Save and load gzip'd files
^^^^^^^^^^^^^^^^^^^^^^^^^^

These handy methods ``cPickle`` an object and automatically gzip the resulting file. The object can then be reloaded at a later date.

>>> from cogent.util.misc import gzip_dump, gzip_load
>>> class foo(object):
...     pass
>>> bar = foo()
>>> bar.some_var = 10
>>> # gzip_dump(bar, 'test_file')
>>> # new_bar = gzip_load('test_file')
>>> # isinstance(new_bar, foo)

.. note:: The above code does work, but cPickle won't write out within doctest
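
Because the calls above are commented out, here is a self-contained round trip using only the standard library. It is a sketch of the same idea (pickle, then gzip), not PyCogent's implementation; ``dump_gzipped`` and ``load_gzipped`` are hypothetical names.

```python
import gzip
import os
import pickle
import tempfile


def dump_gzipped(obj, path):
    """Pickle obj and write it to a gzip-compressed file."""
    with gzip.open(path, 'wb') as handle:
        pickle.dump(obj, handle, pickle.HIGHEST_PROTOCOL)


def load_gzipped(path):
    """Read back an object written by dump_gzipped."""
    with gzip.open(path, 'rb') as handle:
        return pickle.load(handle)


# Round-trip a small object through a temporary file, then clean up.
original = {'some_var': 10}
path = os.path.join(tempfile.mkdtemp(), 'test_file.gz')
dump_gzipped(original, path)
restored = load_gzipped(path)
os.remove(path)
```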

Curry a function
^^^^^^^^^^^^^^^^

``curry(f,x)(y)`` is equivalent to ``f(x,y)``, or to ``lambda y: f(x,y)``. This was modified from the Python Cookbook. Docstrings are also carried over.

>>> from cogent.util.misc import curry
>>> def foo(x, y):
...     """Some function"""
>>> bar = curry(foo, 5)
>>> print bar.__doc__
== curried from foo ==
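
The idea can be sketched with a closure. The helper below, ``curry_first``, is hypothetical, and the docstring format it produces is only loosely modelled on the output shown above; it should not be read as the library's exact behaviour.

```python
def curry_first(func, first):
    """Return a one-argument function with func's first argument fixed."""
    def curried(second):
        return func(first, second)
    # Carry the original docstring over, prefixed with a note on its origin.
    curried.__doc__ = '== curried from %s ==\n%s' % (func.__name__, func.__doc__)
    return curried


def add(x, y):
    """Add two numbers."""
    return x + y


add_five = curry_first(add, 5)
result = add_five(3)  # same as add(5, 3)
```

The standard library's ``functools.partial`` offers similar argument binding, although it does not rewrite the docstring in this way.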

Test to see if an object is iterable
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Perform a simple test to see if an object supports iteration.

>>> from cogent.util.misc import is_iterable
>>> can_iter = [1,2,3,4]
>>> cannot_iter = 1.234
>>> is_iterable(can_iter)
True
>>> is_iterable(cannot_iter)
False

Test to see if an object is a single char
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Perform a simple test to see if an object is a single character.

>>> from cogent.util.misc import is_char

Flatten a deeply nested iterable
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To flatten a deeply nested iterable, use ``recursive_flatten``. This method supports multiple levels of nesting and multiple iterable types.

>>> from cogent.util.misc import recursive_flatten
>>> l = [[[[1,2], 'abcde'], [5,6]], [7,8], [9,10]]
>>> recursive_flatten(l)
[1, 2, 'a', 'b', 'c', 'd', 'e', 5, 6, 7, 8, 9, 10]
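
One subtlety in flattening mixed iterables is that a one-character string iterates to itself, so naive recursion never terminates. The sketch below handles that case explicitly; ``flatten_deep`` is a hypothetical name and the library may take a different approach.

```python
def flatten_deep(items):
    """Flatten arbitrarily nested iterables into a single flat list."""
    flat = []
    for item in items:
        # A string of length <= 1 is treated as a leaf; recursing into it
        # would loop forever because 'a' iterates to 'a'.
        if isinstance(item, str) and len(item) <= 1:
            flat.append(item)
            continue
        try:
            iter(item)
        except TypeError:
            flat.append(item)  # not iterable: a leaf value
        else:
            flat.extend(flatten_deep(item))
    return flat


nested = [[[[1, 2], 'abcde'], [5, 6]], [7, 8], [9, 10]]
flat = flatten_deep(nested)
```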

Test to determine if ``list`` or ``tuple``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Perform a simple check to see if an object is not a list or a tuple.

>>> from cogent.util.misc import not_list_tuple
>>> not_list_tuple(1)
True
>>> not_list_tuple([1])
False
>>> not_list_tuple('ab')
True

Unflatten items to row-width
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Unflatten an iterable of items to a specified row-width. This does not reverse the effect of ``zip``, as the lists produced are not interleaved.

>>> from cogent.util.misc import unflatten
>>> l = [1,2,3,4,5,6,7,8]
>>> unflatten(l,1)
[[1], [2], [3], [4], [5], [6], [7], [8]]
>>> unflatten(l,2)
[[1, 2], [3, 4], [5, 6], [7, 8]]
>>> unflatten(l,3)
[[1, 2, 3], [4, 5, 6]]
>>> unflatten(l,4)
[[1, 2, 3, 4], [5, 6, 7, 8]]
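
The row-width 3 output above drops the trailing items that do not fill a complete row. A slicing-based sketch of that behaviour follows; ``to_rows`` is a hypothetical name, and dropping the remainder is inferred from the outputs shown rather than from the library source.

```python
def to_rows(items, row_width):
    """Split items into consecutive rows, dropping any incomplete final row."""
    items = list(items)
    # Number of items that fit into complete rows.
    full = len(items) - len(items) % row_width
    return [items[i:i + row_width] for i in range(0, full, row_width)]


rows_of_three = to_rows([1, 2, 3, 4, 5, 6, 7, 8], 3)
```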

Unzip items
^^^^^^^^^^^

Reverse the effects of ``zip``, i.e. produce separate lists from a sequence of tuples.

>>> from cogent.util.misc import unzip
>>> l = ((1,2),(3,4),(5,6))
>>> unzip(l)
[[1, 3, 5], [2, 4, 6]]
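
The same result can be obtained in plain Python with the ``zip(*...)`` idiom, which is shown here only to illustrate what unzipping means and is not taken from the library:

```python
pairs = ((1, 2), (3, 4), (5, 6))

# zip(*pairs) regroups all first items together, all second items together,
# and so on; each group is converted from a tuple to a list.
columns = [list(column) for column in zip(*pairs)]
```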

Select items in order
^^^^^^^^^^^^^^^^^^^^^

Select items from a container in a specified order.

>>> from cogent.util.misc import select
>>> select('ea', {'a':1,'b':5,'c':2,'d':4,'e':6})
[6, 1]
>>> select([0,4,8], 'abcdefghijklm')
['a', 'e', 'i']

Obtain the index sort order
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Obtain the indices for items in sort order. This is similar to ``numpy.argsort``, but will work on any iterable that implements the necessary ``cmp`` methods.

>>> from cogent.util.misc import sort_order
>>> sort_order([4,2,3,5,7,8])
[1, 2, 0, 3, 4, 5]
>>> sort_order('dcba')
[3, 2, 1, 0]
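
An index sort of this kind can be written with ``sorted`` and a key function. The sketch below uses the hypothetical name ``argsort`` and is not the library's implementation.

```python
def argsort(items):
    """Return the indices that would put items into ascending order."""
    items = list(items)
    # Sort the positions, comparing them by the item found at each position.
    # sorted() is stable, so equal items keep their original relative order.
    return sorted(range(len(items)), key=items.__getitem__)


order = argsort([4, 2, 3, 5, 7, 8])
reverse_order = argsort('dcba')
```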

Find overlapping pattern occurrences
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Find all of the overlapping occurrences of a pattern within a text.

>>> from cogent.util.misc import find_all
>>> find_all(text, pattern)
>>> text = 'abababab'
>>> find_all(text, pattern)
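
What makes a search "overlapping" is that it resumes one character after each hit rather than after the end of the match. The sketch below shows this with ``str.find``; ``find_overlapping`` is a hypothetical name and the pattern ``'aba'`` is an assumed example value, not one taken from this document.

```python
def find_overlapping(text, pattern):
    """Return the start index of every occurrence of pattern, overlaps included."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        # Advance by one character (not len(pattern)) so overlaps are found.
        start = text.find(pattern, start + 1)
    return positions


hits = find_overlapping('abababab', 'aba')  # 'aba' is an assumed pattern
```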

Find multiple pattern occurrences
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Find all of the overlapping occurrences of multiple patterns within a text. The returned indices are sorted, and each index is the start position of one of the patterns.

>>> from cogent.util.misc import find_many
>>> text = 'abababcabab'
>>> patterns = ['ab','abc']
>>> find_many(text, patterns)

Safely remove a trailing underscore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

'Unreserve' a name that had a trailing underscore appended to avoid clashing with a Python reserved word. Names without a trailing underscore are returned unchanged.

>>> from cogent.util.misc import unreserve
>>> unreserve('class_')
'class'
>>> unreserve('class')
'class'

Create a case-insensitive iterable
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Create a case-insensitive object, for instance if you want the keys 'a' and 'A' to point to the same item in a dict.

>>> from cogent.util.misc import add_lowercase
>>> d = {'A':5,'B':6,'C':7,'foo':8,42:'life'}
>>> add_lowercase(d)
{'A': 5, 'a': 5, 'C': 7, 'B': 6, 42: 'life', 'c': 7, 'b': 6, 'foo': 8}

Extract data delimited by differing left and right delimiters
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Extract data from a line where it is surrounded by different left and right delimiters.

>>> from cogent.util.misc import extract_delimited
>>> line = "abc[def]ghi"
>>> extract_delimited(line,'[',']')
'def'

Invert a dictionary
^^^^^^^^^^^^^^^^^^^

Get a dictionary with the values set as keys and the keys set as values.

>>> from cogent.util.misc import InverseDict
>>> d = {'some_key':1,'some_key_2':2}
>>> InverseDict(d)
{1: 'some_key', 2: 'some_key_2'}

.. note:: An arbitrary key will be set if there are multiple keys with the same value

Invert a dictionary with multiple keys having the same value
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Get a dictionary with the values set as keys and the keys set as values. This can handle the case where multiple keys point to the same value.

>>> from cogent.util.misc import InverseDictMulti
>>> d = {'some_key':1,'some_key_2':1}
>>> InverseDictMulti(d)
{1: ['some_key_2', 'some_key']}
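
The difference between the two inversions is whether colliding keys overwrite each other or are collected. A minimal sketch of the collecting variant follows; ``invert_multi`` is a hypothetical name, and the order of keys within each list depends on dict iteration order.

```python
def invert_multi(mapping):
    """Map each value to the list of keys that point to it."""
    inverse = {}
    for key, value in mapping.items():
        # setdefault creates the list the first time a value is seen.
        inverse.setdefault(value, []).append(key)
    return inverse


inverted = invert_multi({'some_key': 1, 'some_key_2': 1})
```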

Get mapping from sequence item to all positions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``DictFromPos`` returns the positions of all items seen within a sequence. This is useful for obtaining, for instance, nucleotide counts and positions.

>>> from cogent.util.misc import DictFromPos
>>> seq = 'aattggttggaaggccgccgttagacg'
>>> DictFromPos(seq)
{'a': [0, 1, 10, 11, 22, 24], 'c': [14, 15, 17, 18, 25], 't': [2, 3, 6, 7, 20, 21], 'g': [4, 5, 8, 9, 12, 13, 16, 19, 23, 26]}

Get the first index of occurrence for each item in a sequence
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``DictFromFirst`` will return the first location of each item in a sequence.

>>> from cogent.util.misc import DictFromFirst
>>> seq = 'aattggttggaaggccgccgttagacg'
>>> DictFromFirst(seq)
{'a': 0, 'c': 14, 't': 2, 'g': 4}

Get the last index of occurrence for each item in a sequence
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``DictFromLast`` will return the last location of each item in a sequence.

>>> from cogent.util.misc import DictFromLast
>>> seq = 'aattggttggaaggccgccgttagacg'
>>> DictFromLast(seq)
{'a': 24, 'c': 25, 't': 21, 'g': 26}
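
The three position mappings above are closely related: the first and last occurrences are simply the ends of each item's full position list. The sketch below makes that relationship explicit; ``positions_by_item`` is a hypothetical name and this is not the library's code.

```python
def positions_by_item(seq):
    """Map each distinct item in seq to the sorted list of its indices."""
    result = {}
    for index, item in enumerate(seq):
        result.setdefault(item, []).append(index)
    return result


seq = 'aattggttggaaggccgccgttagacg'
all_positions = positions_by_item(seq)
# First and last occurrences fall out of the full position lists.
first_seen = {item: where[0] for item, where in all_positions.items()}
last_seen = {item: where[-1] for item, where in all_positions.items()}
# Per-item counts are the lengths of the position lists.
counts = {item: len(where) for item, where in all_positions.items()}
```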

Construct a distance matrix lookup function
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Automatically construct a distance matrix lookup function. This is useful for keeping calling code flexible about whether a distance is computed by a function or looked up in a matrix.

>>> from cogent.util.misc import DistanceFromMatrix
>>> from numpy import array
>>> m = array([[1,2,3],[4,5,6],[7,8,9]])
>>> f = DistanceFromMatrix(m)

Get all pairs from groups
^^^^^^^^^^^^^^^^^^^^^^^^^

Get all of the pairs of items present in a list of groups. A key ``(i,j)`` will be created if and only if ``i`` and ``j`` share a group.

>>> from cogent.util.misc import PairsFromGroups
>>> groups = ['ab','xyz']
>>> PairsFromGroups(groups)
{('a', 'a'): None, ('b', 'b'): None, ('b', 'a'): None, ('x', 'y'): None, ('z', 'x'): None, ('y', 'y'): None, ('x', 'x'): None, ('y', 'x'): None, ('z', 'y'): None, ('x', 'z'): None, ('a', 'b'): None, ('y', 'z'): None, ('z', 'z'): None}
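
The output above contains every ordered pair within each group, including each item paired with itself, and no pairs that cross groups. A sketch that reproduces this follows; ``pairs_from_groups`` is a hypothetical lowercase name, not the library function.

```python
def pairs_from_groups(groups):
    """Return a dict keyed by every ordered pair (i, j) drawn from one group."""
    pairs = {}
    for group in groups:
        for i in group:
            for j in group:
                # Values are unused; the dict serves as a membership lookup.
                pairs[(i, j)] = None
    return pairs


pairs = pairs_from_groups(['ab', 'xyz'])
same_group = ('x', 'z') in pairs       # True: both are in 'xyz'
different_group = ('a', 'x') in pairs  # False: no shared group
```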

Check class types
^^^^^^^^^^^^^^^^^

Check an object against base classes or derived classes to see if it is acceptable.

>>> from cogent.util.misc import ClassChecker
>>> class not_okay(object):
...     pass
>>> class okay(object):
...     pass
>>> class my_dict(dict):
...     pass
>>> cc = ClassChecker(str, okay, dict)

Delegate to a separate object
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Delegate object method calls, properties and variables to the appropriate object. This is useful for combining multiple objects while ensuring that calls go to the correct object.

>>> from cogent.util.misc import Delegator
>>> class ListAndString(list, Delegator):
...     def __init__(self, items, string):
...         Delegator.__init__(self, string)
>>> ls = ListAndString([1,2,3], 'ab_cd')

Wrap a function to hide from a class
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Wrap a function to hide it from a class so that it isn't a method.

>>> from cogent.util.misc import FunctionWrapper
>>> f = FunctionWrapper(str)
>>> f
<cogent.util.misc.FunctionWrapper object at ...

Construct a constrained container
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Wrap a container with a constraint. This is useful for enforcing that the data contained is valid within a defined context. PyCogent provides a base ``ConstrainedContainer`` which can be used to construct user-defined constrained objects. PyCogent also provides ``ConstrainedString``, ``ConstrainedList``, and ``ConstrainedDict``. These provided types fully cover the builtin types while staying integrated with the ``ConstrainedContainer``.

Here is a brief example using ``ConstrainedDict``:

>>> from cogent.util.misc import ConstrainedDict
>>> d = ConstrainedDict({'a':1,'b':2,'c':3}, Constraint='abc')
>>> d
{'a': 1, 'c': 3, 'b': 2}
Traceback (most recent call last):
ConstraintError: Item 'd' not in constraint 'abc'

PyCogent also provides mapped constrained containers for each of the default types provided: ``MappedString``, ``MappedList``, and ``MappedDict``. These behave the same, except that they map a mask onto ``__contains__`` and ``__getitem__``.

>>> def mask(x):
...     return str(int(x) + 3)
>>> from cogent.util.misc import MappedString
>>> s = MappedString('12345', Constraint='45678', Mask=mask)
Traceback (most recent call last):
ConstraintError: Sequence '9' doesn't meet constraint

Check the location of an application
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Determine if an application is available on a system.

>>> from cogent.util.misc import app_path
>>> app_path('does_not_exist')