.. sectionauthor:: Kristian Rother, Patrick Yannul

Reading Protein structures
""""""""""""""""""""""""""

Retrieve a structure from PDB
+++++++++++++++++++++++++++++

>>> from cogent.db.pdb import Pdb
>>> p = Pdb()
>>> pdb_file = p['4tsv']
>>> pdb = pdb_file.read()

This example retrieves the structure as a string in PDB file format.

A PDB file on disk can be parsed with ``PDBParser``:

>>> from cogent.parse.pdb import PDBParser
>>> struc = PDBParser(open('data/4TSV.pdb'))

Parse a PDB entry directly from the web
+++++++++++++++++++++++++++++++++++++++

>>> from cogent.parse.pdb import PDBParser
>>> struc = PDBParser(p['4tsv'])

Accessing PDB header information
++++++++++++++++++++++++++++++++

>>> struc.header['id']
>>> struc.header['resolution']
>>> struc.header['r_free']
>>> struc.header['space_group']

Navigating structure objects
""""""""""""""""""""""""""""

What does a structure object contain?
+++++++++++++++++++++++++++++++++++++

A ``cogent.parse.pdb.Structure`` object, as returned by ``PDBParser``, contains a tree-like hierarchy of ``Entity`` objects: a ``Structure`` contains ``Models``, which contain ``Chains``, which contain ``Residues``, which in turn contain ``Atoms``. You can read more about the entity model on [URL of Marcins example page].

How to access a model from a structure
++++++++++++++++++++++++++++++++++++++

To get the first model out of a structure:

>>> model = struc[(0,)]

The key is a tuple containing the model number.

How to access a chain from a model?
+++++++++++++++++++++++++++++++++++

To get a particular chain:

>>> chain = model[('A',)]

How to access a residue from a chain?
+++++++++++++++++++++++++++++++++++++

To get a particular residue:

>>> resi = chain[('ILE', 154, ' '),]
>>> resi
<Residue ILE resseq=154 icode= >

What properties does a residue have?
++++++++++++++++++++++++++++++++++++

Access an atom from a residue
+++++++++++++++++++++++++++++

To get a particular atom:

>>> atom = resi[("N", ' '),]

Properties of an atom
+++++++++++++++++++++

>>> atom.coords
array([ 142.986, 36.523, 6.838])

If a model/chain/residue/atom does not exist
++++++++++++++++++++++++++++++++++++++++++++

Looking up a model, chain, residue or atom that does not exist raises a ``KeyError``.
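
The lookup behaviour can be imitated with nested dictionaries. The sketch below is a stand-in and not the ``cogent`` implementation: plain ``dict`` objects replace the entity classes, and ``lookup`` is a hypothetical helper that returns ``None`` instead of letting the ``KeyError`` propagate.

```python
# Stand-in hierarchy built from plain dicts (an assumption for illustration;
# real cogent entities are Structure/Model/Chain/Residue/Atom objects).
structure = {
    (0,): {                                  # model key
        ('A',): {                            # chain key
            (('ILE', 154, ' '),): {          # residue key
                (('N', ' '),): 'atom N',     # atom key
            },
        },
    },
}


def lookup(entity, *keys):
    """Walk down the hierarchy; return None if any level is missing."""
    try:
        for key in keys:
            entity = entity[key]
    except KeyError:
        return None
    return entity


found = lookup(structure, (0,), ('A',), (('ILE', 154, ' '),), (('N', ' '),))
missing = lookup(structure, (0,), ('Z',))   # there is no chain 'Z'
```

Catching the exception at one place keeps the calling code free of repeated membership checks at every level.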

Is there something special about heteroatoms to consider?
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Yes, residues made of heteroatoms have their ``h_flag`` attribute set.
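
How are Altlocs/insertion codes represented?
++++++++++++++++++++++++++++++++++++++++++++

Both are part of the ID: the insertion code is the last element of the residue ID, e.g. ``('ILE', 154, ' ')``, and the alternate location is the last element of the atom ID, e.g. ``('N', ' ')``.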

Useful methods to access Structure objects
""""""""""""""""""""""""""""""""""""""""""

How to access all atoms, residues etc via a dictionary
++++++++++++++++++++++++++++++++++++++++++++++++++++++

The ``table`` property of a structure returns a two-level dictionary containing all entities. The first key is the entity level (one of ``'A'``, ``'R'``, ``'C'``, ``'M'`` for atoms, residues, chains and models). The second key is a tuple combining the IDs of the ``Structure``, ``Model``, ``Chain``, ``Residue`` and ``Atom``, down to the level in question.

>>> struc.table['A'][('4TSV', 0, 'A', ('HIS', 73, ' '), ('O', ' '))]
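
Because the full IDs are tuples, entities below a given parent can be selected by comparing the leading part of each key. The following is a minimal sketch of that idea using a hand-written dictionary; the strings stand in for entity objects, and the ``N`` atom entry is invented for the example.

```python
# Stand-in for struc.table: level -> {full ID tuple -> entity}.
# Strings replace the real entity objects (an assumption for illustration).
table = {
    'M': {('4TSV', 0): 'model 0'},
    'C': {('4TSV', 0, 'A'): 'chain A'},
    'R': {('4TSV', 0, 'A', ('HIS', 73, ' ')): 'residue HIS 73'},
    'A': {
        ('4TSV', 0, 'A', ('HIS', 73, ' '), ('N', ' ')): 'atom N',
        ('4TSV', 0, 'A', ('HIS', 73, ' '), ('O', ' ')): 'atom O',
    },
}

# Direct lookup of one atom by its full ID.
atom_o = table['A'][('4TSV', 0, 'A', ('HIS', 73, ' '), ('O', ' '))]

# All atoms of one residue: compare the leading part of each full ID.
residue_id = ('4TSV', 0, 'A', ('HIS', 73, ' '))
atoms_in_residue = sorted(
    atom for full_id, atom in table['A'].items()
    if full_id[:len(residue_id)] == residue_id
)
```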

Calculate the center of mass of a model or chain
++++++++++++++++++++++++++++++++++++++++++++++++

.. NEEDS TO BE CHECKED WITH MARCIN

array([ 146.66615752, 35.08673503, -3.60735847])
array([ 146.66615752, 35.08673503, -3.60735847])
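
As a rough illustration of the arithmetic behind a center of mass, the sketch below computes a mass-weighted mean position in plain Python. It assumes mass weighting and uses made-up coordinates and approximate atomic masses; ``center_of_mass`` is a hypothetical helper, and how ``cogent`` weights the atoms is not confirmed here.

```python
def center_of_mass(coords, masses):
    """Return the mass-weighted mean position of a set of 3D points."""
    total = float(sum(masses))
    return tuple(
        sum(m * xyz[axis] for m, xyz in zip(masses, coords)) / total
        for axis in range(3)
    )


# Hypothetical three-atom fragment: N, C and O with approximate masses.
coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
masses = [14.0, 12.0, 16.0]
com = center_of_mass(coords, masses)
```

With equal masses the result reduces to the plain geometric mean of the coordinates.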

How to get a list of all residues in a chain?
+++++++++++++++++++++++++++++++++++++++++++++

>>> chain.values()[0]
<Residue ILE resseq=154 icode= >

How to get a list of all atoms in a chain?
++++++++++++++++++++++++++++++++++++++++++

Constructing structures
"""""""""""""""""""""""

How to create a new entity?
+++++++++++++++++++++++++++

``Structure``/``Model``/``Chain``/``Residue``/``Atom`` objects can be created as follows:

>>> from cogent.core.entity import Structure, Model, Chain, Residue, Atom
>>> from numpy import array
>>> s = Structure('my_struc')
>>> m = Model((0,))
>>> c = Chain(('A'),)
>>> r = Residue(('ALA', 1, ' ',), False, ' ')
>>> a = Atom(('C ', ' ',), 'C', 1, array([0.0, 0.0, 0.0]), 1.0, 0.0, 'C')

How to add entities to each other?
++++++++++++++++++++++++++++++++++

>>> s.setTable(force=True)
>>> s.table
{'A': {('my_struc', 0, 'A', ('ALA', 1, ' '), ('C ', ' ')): <Atom ('C ', ' ')>}, 'C': {('my_struc', 0, 'A'): <Chain id=A>}, 'R': {('my_struc', 0, 'A', ('ALA', 1, ' ')): <Residue ALA resseq=1 icode= >}, 'M': {('my_struc', 0): <Model id=0>}}

How to remove a residue from a chain?
+++++++++++++++++++++++++++++++++++++

{'A': {('my_struc', 0, 'A', ('ALA', 1, ' '), ...

Calculating Euclidean distances between atoms
+++++++++++++++++++++++++++++++++++++++++++++

>>> from cogent.maths.geometry import distance
>>> atom1 = resi[('N', ' '),]
>>> atom2 = resi[('CA', ' '),]
>>> distance(atom1.coords, atom2.coords)

Calculating Euclidean distances between coordinates
+++++++++++++++++++++++++++++++++++++++++++++++++++

>>> from numpy import array
>>> from cogent.maths.geometry import distance
>>> a1 = array([1.0, 2.0, 3.0])
>>> a2 = array([1.0, 4.0, 9.0])
>>> distance(a1, a2)
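
For reference, the Euclidean distance is the square root of the summed squared coordinate differences. The sketch below spells this out in plain Python for the two points used above; ``euclidean_distance`` is a hypothetical helper written for illustration and is not the ``cogent.maths.geometry`` implementation.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Same coordinates as a1 and a2 in the example above.
d = euclidean_distance((1.0, 2.0, 3.0), (1.0, 4.0, 9.0))
# The differences are (0, -2, -6), so d equals sqrt(0 + 4 + 36) = sqrt(40).
```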

Calculating flat angles from atoms
++++++++++++++++++++++++++++++++++

>>> from cogent.struct.dihedral import angle
>>> atom3 = resi[('C', ' '),]
>>> a12 = atom2.coords - atom1.coords
>>> a23 = atom2.coords - atom3.coords
>>> angle(a12, a23)

The angle is returned in radians.

Calculating flat angles from coordinates
++++++++++++++++++++++++++++++++++++++++

>>> from cogent.struct.dihedral import angle
>>> a1 = array([0.0, 0.0, 1.0])
>>> a2 = array([0.0, 0.0, 0.0])
>>> a3 = array([0.0, 1.0, 0.0])
>>> a12 = a2 - a1
>>> a23 = a2 - a3
>>> angle(a12, a23)

The angle is returned in radians.
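
The angle between two vectors follows from their dot product divided by the product of their lengths. The sketch below is a plain-Python illustration using the three coordinates above and the same difference vectors as in the atom example (``a2 - a1`` and ``a2 - a3``); ``flat_angle`` is a hypothetical helper, not the ``cogent`` function.

```python
import math


def flat_angle(u, v):
    """Angle in radians between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


a1, a2, a3 = (0.0, 0.0, 1.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)
a12 = tuple(b - a for a, b in zip(a1, a2))   # a2 - a1
a23 = tuple(b - c for b, c in zip(a2, a3))   # a2 - a3
theta = flat_angle(a12, a23)                 # perpendicular vectors: pi / 2
```

Use ``math.degrees(theta)`` if the value is needed in degrees.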

Calculating dihedral angles from atoms
++++++++++++++++++++++++++++++++++++++

>>> from cogent.struct.dihedral import dihedral
>>> atom4 = resi[('CG1', ' '),]
>>> dihedral(atom1.coords, atom2.coords, atom3.coords, atom4.coords)

The torsion angle is returned in degrees.

Calculating dihedral angles from coordinates
++++++++++++++++++++++++++++++++++++++++++++

>>> from cogent.struct.dihedral import dihedral
>>> a1 = array([0.0, 0.0, 1.0])
>>> a2 = array([0.0, 0.0, 0.0])
>>> a3 = array([0.0, 1.0, 0.0])
>>> a4 = array([1.0, 1.0, 0.0])
>>> dihedral(a1, a2, a3, a4)

The torsion angle is returned in degrees.
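
A dihedral angle is the angle between the plane through the first three points and the plane through the last three. The sketch below computes it in plain Python with the common ``atan2`` formulation, returning a signed value between -180 and 180 degrees. The helper names are hypothetical, and the sign convention and value range are assumptions that may differ from those of ``cogent.struct.dihedral``.

```python
import math


def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def torsion(p1, p2, p3, p4):
    """Signed dihedral angle in degrees for four points."""
    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    n1, n2 = cross(b1, b2), cross(b2, b3)   # normals of the two planes
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Same four coordinates as a1..a4 in the example above.
phi = torsion((0.0, 0.0, 1.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0))
```

Using ``atan2`` rather than ``acos`` keeps the sign of the torsion and stays numerically stable near 0 and 180 degrees.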

How to count the atoms in a structure?
++++++++++++++++++++++++++++++++++++++

>>> len(struc.table['A'].values())

How to iterate over chains in canonical PDB order?
++++++++++++++++++++++++++++++++++++++++++++++++++

In PDB files, the chain whose ID is a space comes last; the other chains are in alphabetical order.

>>> for chain in model.sortedvalues():
...     print chain
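
The ordering rule itself can be expressed as a sort key. The sketch below applies it to a plain list of chain IDs; ``pdb_chain_order`` is a hypothetical helper for illustration and says nothing about how ``sortedvalues`` is implemented.

```python
def pdb_chain_order(chain_ids):
    """Sort chain IDs alphabetically, with the blank ID placed last."""
    # The key is an (is_blank, id) pair, and False sorts before True.
    return sorted(chain_ids, key=lambda cid: (cid == ' ', cid))


ordered = pdb_chain_order(['B', ' ', 'A', 'C'])
# A plain sort puts the blank first, because a space sorts before letters.
plain = sorted(['B', ' ', 'A', 'C'])
```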

How to iterate over chains in alphabetical order?
+++++++++++++++++++++++++++++++++++++++++++++++++

If you want the chains in purely alphabetical order:

.. KR 2 ROB: Is this what you requested or is the above example enough?

>>> keys = model.keys()
>>> keys.sort()
>>> for chain in [model[id] for id in keys]:
...     print chain

How to iterate over all residues in a chain?
++++++++++++++++++++++++++++++++++++++++++++

>>> residues = [resi for resi in chain.values()]

How to remove all water molecules from a structure
++++++++++++++++++++++++++++++++++++++++++++++++++

>>> water = [r for r in struc.table['R'].values() if r.name == 'H_HOH']
>>> for resi in water:
...     resi.parent.delChild(resi.id)
>>> struc.setTable(force=True)
>>> len(struc.table['A'].values())
>>> residues = [resi for resi in chain.values()]
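
The water example works in two steps: collect the residues first, then delete them. The sketch below shows the same pattern on a plain dictionary standing in for a chain; the residue IDs and names are made up for illustration, and real ``cogent`` residues are objects rather than strings.

```python
# Stand-in chain: residue ID -> residue name (plain dict, an assumption;
# real cogent residues are objects with .name, .id and .parent).
chain = {
    ('ALA', 1, ' '): 'ALA',
    ('H_HOH', 201, ' '): 'H_HOH',
    ('GLY', 2, ' '): 'GLY',
    ('H_HOH', 202, ' '): 'H_HOH',
}

# Step 1: collect the IDs to delete without touching the container.
water_ids = [res_id for res_id, name in chain.items() if name == 'H_HOH']

# Step 2: delete them. Changing a dict while iterating over it is an
# error in Python 3, which is why the two steps are kept separate.
for res_id in water_ids:
    del chain[res_id]

remaining = sorted(chain.values())
```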