9
Calculates the isoelectric point of a protein, the pH at which there
10
is no overall charge on the protein. Calculates the charge on a protein
11
at a given pH. Can use built-in sets of pK values or custom pK sets.
15
use Bio::Tools::pICalculator;
18
my $in = Bio::SeqIO->new( -fh => \*STDIN ,-format => 'Fasta' );
20
my $calc = Bio::Tools::pICalculator->new(-places => 2,
23
while ( my $seq = $in->next_seq ) {
26
print sprintf( "%s\t%s\t%.2f\n",
29
$calc->charge_at_pH($iep) );
31
for( my $i = 0; $i <= 14; $i += 0.5 ){
32
print sprintf( "pH = %.2f\tCharge = %.2f\n",
34
$calc->charge_at_pH($i) );
40
http://fields.scripps.edu/DTASelect/20010710-pI-Algorithm.pdf
41
http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Apps/iep.html
42
http://us.expasy.org/tools/pi_tool.html
46
There are various sources for the pK values of the amino acids. The set of
47
pK values chosen will affect the pI reported.
49
The charge state of each residue is assumed to be independent of the others.
50
Protein modifications (such as a phosphate group) that have a charge are
57
User feedback is an integral part of the evolution of this
58
and other Bioperl modules. Send your comments and suggestions preferably
59
to one of the Bioperl mailing lists.
60
Your participation is much appreciated.
62
bioperl-l@bioperl.org - General discussion
63
http://bio.perl.org/MailList.html - About the mailing lists
67
Report bugs to the Bioperl bug tracking system to help us keep track
68
the bugs and their resolution.
69
Bug reports can be submitted via email or the web:
71
bioperl-bugs@bio.perl.org
72
http://bugzilla.bioperl.org/
76
Mark Southern (mark_southern@merck.com). From an algorithm by David Tabb found
77
at http://fields.scripps.edu/DTASelect/20010710-pI-Algorithm.pdf.
78
Modification for Bioperl, additional documentation by Brian Osborne.
82
Copyright (c) 2002, Merck & Co. Inc. All Rights Reserved. This module is
83
free software. It may be used, redistributed and/or modified under the terms
84
of the Perl Artistic License (see http://www.perl.com/perl/misc/Artistic.html)
88
The rest of the documentation details each of the object methods.
89
Private methods are usually preceded by a _.
93
# Let the code begin...
95
package Bio::Tools::pICalculator;
101
@ISA = qw(Bio::Root::Root);
103
# pK values from the DTASelect program from Scripps
104
# http://fields.scripps.edu/DTASelect
105
my $DTASelect_pK = { N_term => 8.0,
116
# pK values from the iep program from EMBOSS
117
# http://www.hgmp.mrc.ac.uk/Software/EMBOSS/
118
my $Emboss_pK = { N_term => 8.6,
132
Usage : Bio::Tools::pICalculator->new
133
Function: Instantiates the Bio::Tools::pICalculator object
134
Example : $calc = Bio::Tools::pICalculator->new( -pKset => \%pKvalues,
140
$calc = Bio::Tools::pICalculator->new( -pKset => 'string',
145
Constructs a new pICalculator. Arguments are a flattened hash.
146
Valid, optional keys are:
148
pKset - A reference to a hash with key value pairs for the
149
pK values of the charged amino acids. Required keys
152
N_term C_term K R H D E C Y
154
pKset - A string ( 'DTASelect' or 'EMBOSS' ) that will
155
specify an internal set of pK values to be used. The
158
seq - A Bio::Seq sequence object to analyze
160
places - The number of decimal places to use in the
161
isoelectric point calculation. The default is 2.
163
Returns : The description
164
Args : The description or none
169
my( $class, %opts ) = @_;
170
my $self = $class->SUPER::new(%opts);
171
$self = bless {}, ref $self || $self;
172
$self->seq( $opts{-seq} ) if exists $opts{-seq};
173
$self->pKset( $opts{-pKset} || 'EMBOSS' );
174
exists $opts{-places} ? $self->places( $opts{-places} ) :
182
Usage : $calc->seq($seqobj)
183
Function: Sets or returns the Bio::Seq used in the calculation
184
Example : $seqobj = Bio::Seq->new(-seq=>"gghhhmmm",-id=>"GHM");
185
$calc = Bio::Tools::pICalculator->new;
187
Returns : Bio::Seq object
188
Args : Bio::Seq object or none
193
my( $this, $seq ) = @_;
194
unless( defined $seq && UNIVERSAL::isa($seq,'Bio::Seq') ){
195
die $seq . " is not a valid Bio::Seq object\n";
197
$this->{-seq} = $seq;
198
$this->{-count} = count_charged_residues( $seq );
199
return $this->{-seq};
205
Usage : $pkSet = $calc->pKSet(\%pKSet)
206
Function: Sets or returns the hash of pK values used in the calculation
207
Example : $calc->pKset('emboss')
208
Returns : reference to pKset hash
209
Args : The reference to a pKset hash, a string, or none. Examples:
211
pKset - A reference to a hash with key value pairs for the
212
pK values of the charged amino acids. Required keys
215
N_term C_term K R H D E C Y
217
pKset - A valid string ( 'DTASelect' or 'EMBOSS' ) that will
218
specify an internal set of pK values to be used. The
224
my ( $this, $pKset ) = @_;
225
if( ref $pKset eq 'HASH' ){ # user defined pK values
226
$this->{-pKset} = $pKset;
227
}elsif( $pKset =~ /^emboss$/i ){ # from EMBOSS's iep program
228
$this->{-pKset} = $Emboss_pK;
229
}elsif( $pKset =~ /^dtaselect$/i ){ # from DTASelect (scripps)
230
$this->{-pKset} = $DTASelect_pK;
231
}else{ # default to EMBOSS
232
$this->{-pKset} = $Emboss_pK;
234
return $this->{-pKset};
239
$this->{-places} = shift if @_;
240
return $this->{-places};
247
Function: Returns the isoelectric point
248
Example : $calc = Bio::Tools::pICalculator->new(-places => 2);
251
Returns : The isoelectric point of the sequence in the Bio::Seq object
258
return _calculate_iep($this->{-pKset},
268
Usage : $charge = $calc->charge_at_pH($pH)
269
Function: Sets or gets the description of the sequence
270
Example : $calc = Bio::Tools::pICalculator->new(-places => 2);
272
$charge = $calc->charge_at_ph("7");
273
Returns : The predicted charge at the given pH
280
return _calculate_charge_at_pH( shift, $this->{-pKset},
284
sub count_charged_residues {
286
my $sequence = $seq->seq;
288
for ( qw( K R H D E C Y ) ){ # charged AA's
289
$count->{$_}++ while $sequence =~ /$_/ig;
295
my( $pK, $places, $seq, $count ) = @_;
298
my $last_charge = 0.0;
299
my $format = "%.${places}f";
301
unless( defined $count ){
302
$count = count_charged_residues($seq);
305
my $charge = _calculate_charge_at_pH( $pH, $pK, $count );
306
last if sprintf($format,$charge) ==
307
sprintf($format,$last_charge);
308
$charge > 0 ? ( $pH += $step ) : ( $pH -= $step );
310
$last_charge = $charge;
312
return sprintf( $format, $pH );
315
# it's the sum of all the partial charges for the
316
# termini and all of the charged aa's!
317
sub _calculate_charge_at_pH {
318
no warnings; # don't complain if a given key doesn't exist
319
my( $pH, $pK, $count ) = @_;
320
my $charge = _partial_charge( $pK->{N_term}, $pH )
321
+ $count->{K} * _partial_charge( $pK->{K}, $pH )
322
+ $count->{R} * _partial_charge( $pK->{R}, $pH )
323
+ $count->{H} * _partial_charge( $pK->{H}, $pH )
324
- $count->{D} * _partial_charge( $pH, $pK->{D} )
325
- $count->{E} * _partial_charge( $pH, $pK->{E} )
326
- $count->{C} * _partial_charge( $pH, $pK->{C} )
327
- $count->{Y} * _partial_charge( $pH, $pK->{Y} )
328
- _partial_charge( $pH, $pK->{C_term} );
332
# Concentration Ratio is 10**(pK - pH) for positive groups
333
# and 10**(pH - pK) for negative groups
334
sub _partial_charge {
335
my $cr = 10 ** ( $_[0] - $_[1] );
336
return $cr / ( $cr + 1 );