#-----------------------------------------------------------------------------
# PACKAGE : Bio::Tools::WWW
# PURPOSE : To encapsulate commonly used URLs for key websites in bioinformatics.
# AUTHOR : Steve Chervitz
# CREATED : 27 Aug 1996
# REVISION: $Id: WWW.pm,v 1.14 2003/06/04 08:36:43 heikki Exp $
#
# For documentation, run this module through pod2html
# (preferably from Perl v5.004 or better).
#
# 0.014, sac --- Mon Aug 31 19:41:44 1998
#   * Updated and added a few URLs.
#   * Added method strip_html().
#   * Documentation changes.
#
#-----------------------------------------------------------------------------

package Bio::Tools::WWW;

use vars qw(@ISA @EXPORT_OK %EXPORT_TAGS $ID $BioWWW $Revision
            $AUTHORITY);
$AUTHORITY = 'nobody@localhost';
@ISA = qw( Bio::Root::Root Exporter);
@EXPORT_OK = qw($BioWWW);
%EXPORT_TAGS = ( obj => [qw($BioWWW)],
                 std => [qw($BioWWW)]);

$ID = 'Bio::Tools::WWW';
$Revision = '$Id: WWW.pm,v 1.14 2003/06/04 08:36:43 heikki Exp $'; #'

$BioWWW->{'_name'} = "Static $ID object";

Bio::Tools::WWW - Bioperl manager for web resources related to biology.

=head2 Object Creation

    use Bio::Tools::WWW qw(:obj);

    $pdb = $BioWWW->home_url('pdb');

There is no need to create a new Bio::Tools::WWW.pm object when the
C<:obj> tag is used. This tag will import the static $BioWWW object
created by Bio::Tools::WWW.pm into your name space. This saves you
from having to call C<new Bio::Tools::WWW>.

You are free to not use the :obj tag and create the object as you
like, but a Bio::Tools::WWW object is not configurable; any given
script only needs a single copy.
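
The export-tag mechanism described above can be sketched without Bioperl installed. The package C<My::URLBroker> and the variable C<$Broker> are hypothetical stand-ins for Bio::Tools::WWW and $BioWWW; only the one URL is copied from this module's table. This is a minimal sketch of the pattern, not the module's actual implementation.

```perl
use strict;
use warnings;

{
    # Hypothetical stand-in for Bio::Tools::WWW (not a real Bioperl package).
    package My::URLBroker;
    use Exporter ();
    our @ISA         = ('Exporter');
    our @EXPORT_OK   = qw($Broker);
    our %EXPORT_TAGS = ( obj => [qw($Broker)] );

    # One entry copied from this module's homepage table.
    my %Home_url = ( pdb => 'http://www.pdb.bnl.gov/' );

    # The single static instance that every importer shares.
    our $Broker = bless { _name => 'Static My::URLBroker object' }, __PACKAGE__;

    sub home_url {
        my ($self, $key) = @_;
        return $Home_url{$key};
    }
}

# Both packages share this file, so import() is called directly at run time.
# With a separate module file this would read: use My::URLBroker qw(:obj);
our $Broker;
My::URLBroker->import(':obj');

my $pdb = $Broker->home_url('pdb');
print "$pdb\n";
```

Because the tag exports the variable itself, every script that imports it sees the same object, which is why a second call to C<new> is unnecessary.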

This module is included with the central Bioperl distribution:

   http://bio.perl.org/Core/Latest
   ftp://bio.perl.org/pub/DIST

You also need to define URLs for the following variables in this package:

  $Not_found_url : Generic page to show in place of a 404 error.
  $Tmp_url       : Web-accessible site that is used for scripts that
                   need to generate temporary, web-accessible files.
                   The files need not necessarily be HTML files, but
                   being on the same disk as the server will permit
                   faster IO from server scripts.

Bio::Tools::WWW is primarily a URL broker for a select set
of sites related to bioinformatics/genome analysis. It
definitely represents a biased, non-exhaustive set.
It might be more accurate to call this module
"Bio::Tools::URL.pm". But this module does handle some non-URL
things and it may do more of this in the future. Having one
module to cover all biologically relevant web utilities
makes it more convenient, especially at this early stage

Maintaining accurate URLs over time can be challenging as
new web sites spring up and old sites are re-organized. Because
of this, the URLs in this module are not guaranteed to be
correct or exhaustive and will require periodic updating.

By keeping URL management within Bio::Tools::WWW.pm, other generic
modules can easily access a variety of different web sites without
having to know about a potential multitude of specific modules
specialized for one database or another. An alternative approach would
be to have addresses defined within modules specialized for different
web sites. This, however, may create maintenance headaches when updating

=head2 Complex Websites

Websites with complex datasets may require special treatment
within this module. As an example,
URLs for the Saccharomyces Genome Database are clustered
separately in this module, due to (1) the different ways to
access information at this database and (2) the familiarity
of the developer with this database. The Bio::SGD::WWW.pm module inherits from
Bio::Tools::WWW.pm to permit access to the URLs provided by Bio::Tools::WWW.pm
and to SGD-specific HTML and images.

The organization of Bio::Tools::WWW.pm is expected to evolve as
websites are born, die, and mutate their APIs.

 http://bio.perl.org/Projects/modules.html  - Online module documentation
 http://bio.perl.org/                       - Bioperl Project Homepage

User feedback is an integral part of the evolution of this and other Bioperl modules.
Send your comments and suggestions, preferably to one of the Bioperl mailing lists.
Your participation is much appreciated.

    bioperl-l@bioperl.org                  - General discussion
    http://www.bioperl.org/MailList.shtml  - About the mailing lists

=head2 Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and
their resolution. Bug reports can be submitted via email or the web:

    bioperl-bugs@bio.perl.org
    http://bugzilla.bioperl.org/

Steve Chervitz, sac@bioperl.org

Bio::Tools::WWW.pm, 0.014

Copyright (c) 1996-98 Steve Chervitz. All Rights Reserved.
This module is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.

#### END of main POD documentation.

############################ DATA ##################################

### Database homepage links.

'bioperl' =>'http://bio.perl.org/',
'bioperl-stanford'=>'http://genome-www.stanford.edu/perlOOP/bioperl/',
'bioperl-schema' =>'http://bio.perl.org/Projects/Schema/',
'biomoo' =>'http://bioinformatics.weizmann.ac.il/BioMOO/',
'blast_ncbi' =>'http://www.ncbi.nlm.nih.gov/BLAST/',
'blast_wu' =>'http://blast.wustl.edu/',
'bsm' =>'http://www.biochem.ucl.ac.uk/bsm/',
'clustal' =>'http://www.csc.fi/molbio/progs/clustalw/clustalw.html',
'ebi' =>'http://www.ebi.ac.uk/',
'emotif' =>'http://motif.Stanford.EDU/emotif',
'entrez' =>'http://www3.ncbi.nlm.nih.gov/Entrez/',
'expasy' =>'http://www.expasy.ch/',
'gdb' =>'http://www.gdb.org/', # R.I.P. (Jan 1998); site still functional
'mips' =>'http://speedy.mips.biochem.mpg.de/',
'mmdb' =>'http://www.ncbi.nlm.nih.gov/Structure/',
'modbase' =>'http://guitar.rockefeller.edu/',
'ncbi' =>'http://www.ncbi.nlm.nih.gov/',
'pedant' =>'http://pedant.mips.biochem.mpg.de',
'phylip' =>'http://evolution.genetics.washington.edu/phylip.html',
'pir' =>'http://www-nbrf.georgetown.edu/pir/',
'pfam' =>'http://pfam.wustl.edu/',
'pfam_uk' =>'http://www.sanger.ac.uk/Software/Pfam/',
'pfam_us' =>'http://pfam.wustl.edu/',
'pdb' =>'http://www.pdb.bnl.gov/',
'presage' =>'http://presage.stanford.edu/',
'geneQuiz' =>'http://www.sander.ebi.ac.uk/genequiz/genomes/sc/',
'molMov' =>'http://bioinfo.mbb.yale.edu/MolMovDB/',
# 'protMot' =>'http://bioinfo.mbb.yale.edu/ProtMotDB/', # old, use molMov instead
'pubmed' =>'http://www.ncbi.nlm.nih.gov/PubMed/',
'sacch3d' =>'http://genome-www.stanford.edu/Sacch3D/',
'sgd' =>'http://genome-www.stanford.edu/Saccharomyces/',
# 'scop' =>'http://www.pdb.bnl.gov/scop/',
'scop' =>'http://scop.stanford.edu/scop/',
'swissProt' =>'http://www.expasy.ch/sprot/sprot-top.html',
'webmol' =>'http://genome-www.stanford.edu/structure/webmol/',
'ypd' =>'http://quest7.proteome.com/YPDhome.html',

### Database access CGI stems. (For some DBs the home URL can be used as the CGI stem)

'emotif' =>'http://dna.Stanford.EDU/cgi-bin/emotif/',
'entrez' =>'http://www3.ncbi.nlm.nih.gov/htbin-post/Entrez/query?',
'pdb' =>'http://www.pdb.bnl.gov/pdb-bin/',
'pfam_uk' =>'http://www.sanger.ac.uk/cgi-bin/Pfam/',
'pfam_us' =>'http://pfam.wustl.edu/cgi-bin/',
'pir' =>'http://www-nbrf.georgetown.edu/cgi-bin/nbrfget?',

### Database access stems/links.

( #'3db' =>'http://pdb.pdb.bnl.gov/cgi-bin/pdbids?3DB_ID=', # Former stem
'3db' =>$Stem_url{'pdb'}.'opdbshort?oPDBid=', # New stem (aug 1997)
'embl' =>$Home_url{'ebi'}.'htbin/emblfetch?',
'expasy' =>$Home_url{'expasy'}.'cgi-bin/', # program name and query string must be supplied.
'cath' =>$Home_url{'bsm'}.'cath/CATHSrch.pl?type=PDB&query=',
'cog_seq' =>$Home_url{'ncbi'}.'cgi-bin/COG/nph-cognitor?seq=', # add sequence
# To cog_orf, append ORF name ('YAL005c'). Case-sensitive! YAL005C won't work!
'cog_orf' =>$Home_url{'ncbi'}.'cgi-bin/COG/cogeseq?',
'ec1' =>$Home_url{'gdb'}.'bin/bio/wais_q-bio?object_class_key=30&jhu_id=',
'ec2' =>$Home_url{'bsm'}.'enzymes/',
'ec3' =>$Home_url{'expasy'}.'cgi-bin/get-enzyme-entry?',
'emotif_id' =>$Stem_url{'emotif'}.'nph-identify?sequence=',
'entrez' =>$Stem_url{'entrez'}."db=p_r?db=1&choseninfo=ORF_NAME%20[Gene%20Name]\@1\@1&form=4&field=Gene%20Name&mode=0&retrievestring=ORF_NAME%20[Gene%20Name]",
'gb_n' =>$Stem_url{'entrez'}."db=n&form=6&dopt=g&uid=",
'gb_p' =>$Stem_url{'entrez'}."db=p&form=6&dopt=g&uid=",
'gb_struct' =>$Stem_url{'entrez'}."db=t&form=6&dopt=s&uid=",
'pdb' =>$Stem_url{'pdb'}.'send-text?filename=', # overridden by the second 'pdb' key below
'medline' =>$Stem_url{'entrez'}.'form=6&db=m&Dopt=r&uid=',
'mmdb' =>$Stem_url{'entrez'}.'db=t&form=6&Dopt=s&uid=',
'modbase_orf' =>$Home_url{'modbase'}.'gm-cgi-bin/orf_page.cgi?pg1=0.5&pg2=1.0&orf=',
# To the modbase_model, append yeast ORF name &pdb=<4-LETTER_CODE>&chain=<UPCASE LETTER, IF ANY>
'modbase_model' =>$Home_url{'modbase'}.'gm-cgi-bin/model_page.cgi?pg1=0.5&pg2=1.0&orf=',
'molMov' =>$Home_url{'molMov'}.'search.cgi?pdb=',
'pdb' =>$Stem_url{'pdb'}.'opdbshort?oPDBid=', # same as 3db
'pdb_coord' =>$Stem_url{'pdb'}.'send-pdb?filename=', # retrieves full coordinate file
'pfam' =>$Home_url{'pfam'}.'cgi-bin/nph-hmm_search?evalue=1.0&protseq=', # default: seq search, US
'pfam_sp_uk' =>$Stem_url{'pfam_uk'}.'swisspfamget.pl?name=',
'pfam_seq_uk' =>$Stem_url{'pfam_uk'}.'nph-search.cgi?evalue=1.0&type=normal&protseq=',
'pfam_sp_us' =>$Stem_url{'pfam_us'}.'getswisspfam?key=',
'pfam_seq_us' =>$Stem_url{'pfam_us'}.'nph-hmm_search?evalue=1.0&protseq=',
'pfam_form' =>$Home_url{'pfam'}.'cgi-bin/hmm_page.cgi', # interactive search form
'pir_id' =>$Stem_url{'pir'}.'fmt=c&xref=0&id=',
'pir_acc' =>$Stem_url{'pir'}.'fmt=c&xref=1&id=',
'pir_uid' =>$Stem_url{'pir'}.'uid=',
'pdbSum' =>$Home_url{'bsm'}.'cath/GetPDBSUMCODE.pl?code=',
# 'protMot' =>$Home_url{'protMot'}.'search.cgi?pdb=', # old, use molMov instead
'presage_sp' =>$Home_url{'presage'}.'search.cgi?spac=',
'swpr' =>$Home_url{'expasy'}.'cgi-bin/get-sprot-entry?',
'swModel' =>$Home_url{'expasy'}.'cgi-bin/sprot-swmodel-sub?',
'swprSearch' =>$Home_url{'expasy'}.'cgi-bin/sprot-search-ful?',

### SCOP tlev options can be appended to the stem after adding a PDB ID.
### tlev options are: 'dm'(domain), 'sf'(superfamily), 'fa'(family), 'cf'(common fold), 'cl'(class)
### E.g., search.cgi?pdb=1ARD;tlev=dm

'scop' =>$Home_url{'scop'}.'search.cgi?pdb=', ### better to use scop_pdb.
'scop_pdb' =>$Home_url{'scop'}.'search.cgi?pdb=',
'scop_data' =>$Home_url{'scop'}.'data/scop.', ### Deprecated: frequent changes.

## Search URLs for SGD/Sacch3D are contained in %SGD_url and %S3d_url (below).

# For wormpep, the query string MUST end with "&keyword=" (after appending a sequence ID)
'wormpep' =>'http://www.sanger.ac.uk/cgi-bin/wormpep_fetch.pl?entry=',
'wormace' =>'http://webace.sanger.ac.uk/cgi-bin/webace?db=wormace&class=Sequence&text=yes&object=',

### YPD: You must use a valid gene name or ORF name (IFF there is no gene name).
### For this reason it is most convenient to use SGD's Protein_Info link
### which can accept either and will provide a proper link to YPD.
'ypd' =>'http://quest7.proteome.com/YPD/',

### CGI stems for SGD and Sacch3D.

('stanford' =>'http://genome-www.stanford.edu/',
'sgd' =>'http://genome-www.stanford.edu/cgi-bin/SGD/',
'sgd2' =>'http://genome-www2.stanford.edu/cgi-bin/SGD/',
's3d' =>'http://genome-www.stanford.edu/cgi-bin/SGD/Sacch3D/',
's3d2' =>'http://genome-www2.stanford.edu/cgi-bin/SGD/Sacch3D/',
's3d3' =>'http://genome-www3.stanford.edu/cgi-bin/SGD/Sacch3D/',
'sacchdb' =>'http://genome-www.stanford.edu/cgi-bin/dbrun/SacchDB?',

### SGD stems and links.

('home' =>$Home_url{'sgd'},
'help' =>$Home_url{'sgd'}.'help/',
'mammal' =>$Home_url{'sgd'}.'mammal/',
'worm' =>$Home_url{'sgd'}.'worm/',
'gene' =>$SGD_stem_url{'sacchdb'}.'find+Locus+',
'locus' =>$SGD_stem_url{'sacchdb'}.'find+Locus+',
'orf' =>$SGD_stem_url{'sacchdb'}.'find+Locus+',
'mipsorf' =>$SGD_stem_url{'sgd'}."mips-orfs?",
'gene_info' =>$SGD_stem_url{'sacchdb'}.'find+Gene_Info+',
'prot_info' =>$SGD_stem_url{'sacchdb'}.'find+Protein_Info+',
'seq' =>$SGD_stem_url{'sgd'}.'seqDisplay?seq=',
'gi' =>$SGD_stem_url{'sacchdb'}.'find+Sequence+Database+=+GenPept+AND+NEXT+=+',
'chr' =>$SGD_stem_url{'sgd2'}.'seqTools?chr=',
'chr_old' =>$SGD_stem_url{'sgd'}.'dnaredir?chr=',
'seq_an' =>$SGD_stem_url{'sgd2'}.'seqTools?seqname=',
'seq_an_old' =>$SGD_stem_url{'sgd'}.'dnaredir?seqname=',
'map_chr' =>$SGD_stem_url{'sgd'}.'ORFMAP/ORFmap?chr=',
'map_orf' =>$SGD_stem_url{'sgd'}.'ORFMAP/ORFmap?seq=',
# 'chr' =>$SGD_stem_url{'sgd2'}.'seqform?chr=',
# 'seg' =>$SGD_stem_url{'sgd2'}.'seqform?seg=',
# 'fea' =>$SGD_stem_url{'sgd2'}.'featureform?seg=',
'feature' =>$SGD_stem_url{'sgd2'}.'featureform?chr=', # complete with "5&beg=100&end=400"
'search' =>$SGD_stem_url{'sgd'}.'search?',
'images' =>$SGD_stem_url{'stanford'}.'images/',
'suggest' =>$SGD_stem_url{'stanford'}.'forms/sgd-suggestion.html',
'tmp' =>$SGD_stem_url{'stanford'}.'tmp/',

### Sacch3D stems and links.

('home' =>$Home_url{'sacch3d'},
'search' =>$Home_url{'sacch3d'}.'search.html',
'help' =>$Home_url{'sacch3d'}.'help/',
'new' =>$Home_url{'sacch3d'}.'new/',
'chrm' =>$Home_url{'sacch3d'}.'data/chr',
'domains' =>$Home_url{'sacch3d'}.'domains/',
'genequiz' =>$Home_url{'sacch3d'}.'genequiz/',
'analysis' =>$Home_url{'sacch3d'}.'analysis/',
'scop' =>$SGD_stem_url{'s3d3'}.'getscop?data=',
'scop_fold' =>$SGD_stem_url{'s3d3'}.'getscop?type=fold&data=',
'scop_class' =>$SGD_stem_url{'s3d3'}.'getscop?type=class&data=',
'scop_gene' =>$SGD_stem_url{'s3d3'}.'getscop?type=gene&data=',
'gene' =>$SGD_stem_url{'s3d'}.'get?class=gene&item=',
'orf' =>$SGD_stem_url{'s3d'}.'get?class=orf&item=',
'text' =>$SGD_stem_url{'s3d'}.'get?class=text&item=',
'pdb' =>$SGD_stem_url{'s3d'}.'get?class=pdb&item=',
'pdb_coord' =>$SGD_stem_url{'s3d'}.'pdbcoord.pl?id=',
'dsc' =>$SGD_stem_url{'s3d'}.'dsc.pl?gene=',
'emotif' =>$SGD_stem_url{'s3d'}.'seq_search.pl?db=emotif&gene=',
'pfam' =>$SGD_stem_url{'s3d'}.'seq_search.pl?db=pfam&gene=',
'pfam_uk' =>$SGD_stem_url{'s3d'}.'seq_search.pl?db=pfam&loc=uk&gene=',
'pfam_us' =>$SGD_stem_url{'s3d'}.'seq_search.pl?db=pfam&loc=us&gene=',
'blast_pdb' =>$SGD_stem_url{'s3d'}.'getblast?db=pdb&name=',
'blast_nr' =>$SGD_stem_url{'s3d'}.'getblast?db=nr&name=',
'blast_est' =>$SGD_stem_url{'s3d'}.'getblast?db=est&name=',
'blast_mammal' =>$SGD_stem_url{'s3d'}.'getblast?db=mammal&name=',
'blast_human' =>$SGD_stem_url{'s3d'}.'getblast?db=human&name=',
'blast_worm' =>$SGD_stem_url{'s3d'}.'getblast?db=worm&name=',
'blast_yeast' =>$SGD_stem_url{'s3d'}.'getblast?db=yeast&name=',
'blast_worm_yeast'=>$SGD_stem_url{'s3d'}.'getblast?db=worm&query=worm&name=',
'patmatch' =>$SGD_stem_url{'s3d2'}.'grepmatch?', ## deprecated
'grepmatch' =>$SGD_stem_url{'s3d2'}.'grepmatch?',
'pdb_neighbors' =>$SGD_stem_url{'s3d'}.'pdb_neighbors?id=CHAIN&gene=ORF_NAME',

# ('java' =>$SGD_stem_url{'sgd'}.'Sacch3D/pdbViewer.pl?pdbCode=PDB&orf=',

'java' =>$SGD_stem_url{'sgd'}.'Sacch3D/pdbViewer.pl?pdbCode=', # Default java viewer
'webmol' =>$SGD_stem_url{'sgd'}.'Sacch3D/pdbViewer.pl?pdbCode=',
'codebase' =>$SGD_stem_url{'stanford'}.'structure/webmol/lib',
'rasmol' =>$Stem_url{'pdb'}.'send-ras?filename=',
'chime' =>$Stem_url{'pdb'}.'ccpeek?id=',
'cn3d' =>$Stem_url{'entrez'}.'db=t&form=6&Dopt=i&Complexity=Cn3D+Subset&uid=',
'kinemage' =>'http://prosci.org/Kinemage',

# The error reporting HTML strings represent some experiments in human psychology:
# how do you induce users to report errors that you should know about yet not
# get flooded with trivial problems caused by novices?

('authority' =>qq|<A HREF="mailto:$AUTHORITY"><b>$AUTHORITY</b></A>|,
'trouble' => <<"QQ_TROUBLE_QQ",
<p>If this problem persists, <A HREF="mailto:$AUTHORITY"><b>please notify us.</b></A>
Include a copy of this error page with your message. Thanks.<p>
QQ_TROUBLE_QQ
'notify' => <<"QQ_NOTIFY_QQ",
<A HREF="mailto:$AUTHORITY"><b>Please notify us.</b></A>
Include a copy of this error page with your message. Thanks.<p>
QQ_NOTIFY_QQ
'ourFault' => <<"QQ_FAULT_QQ",
<p><b>This is our fault!</b> There is apparently a problem with our software
that we may not know about. <A HREF="mailto:$AUTHORITY"><b>Please notify us!</b></A>
Include a copy of this error page with your message. Thanks.<p>
QQ_FAULT_QQ
'techDiff' => <<"QQ_TECH_QQ",
<p><big>We are experiencing technical difficulties now.<br>
We will have the problem fixed soon. Sorry for any inconvenience.</big><p>
QQ_TECH_QQ

### Miscellaneous URLs. Configure as desired for your site.
my $Not_found_url = 'http://genome-www.stanford.edu/Sacch3D/notfound.html';
my $Tmp_url = 'http://genome-www.stanford.edu/tmp/';

Methods beginning with a leading underscore are considered private
and are intended for internal use by this module. They are
B<not> considered part of the public interface and are described here
for documentation purposes only.

#########################################################################

#########################################################################

 Usage     : $BioWWW->home_url(<string>)
 Purpose   : To obtain the homepage URL for a biological database or resource.
 Returns   : String containing the URL (including "http://")

           : Currently acceptable arguments are:
           :   bioperl bioperl-schema biomoo bsm ebi emotif entrez
           :   expasy mips mmdb ncbi pir pfam pdb geneQuiz
           :   molMov pubmed sacch3d sgd scop swissProt webmol ypd
 Throws    : Warns if argument cannot be resolved to a URL.
 Comments  : The URLs listed here do not represent a complete list.
           : Expect this to evolve and grow with time.

 See Also  : L<search_url>()
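
The lookup-and-warn behaviour documented above can be sketched as a stand-alone function. The name C<lookup_home_url> is hypothetical, the fallback page is a placeholder, and falling back to the not-found URL when a key is unknown is an assumption about how the warning branch ends; the two table entries are copied from this module.

```perl
use strict;
use warnings;

# Two entries copied from this module's homepage table.
my %Home_url = (
    ncbi => 'http://www.ncbi.nlm.nih.gov/',
    ebi  => 'http://www.ebi.ac.uk/',
);

# Placeholder fallback page; a real site would configure its own.
my $Not_found_url = 'http://example.org/notfound.html';

# Hypothetical helper: return the URL for $key, or warn and fall back.
sub lookup_home_url {
    my ($key) = @_;
    return $Home_url{$key} if exists $Home_url{$key};
    warn "Can't resolve argument to URL: $key\n";
    return $Not_found_url;    # assumed fallback, see the note above
}

my $known   = lookup_home_url('ncbi');
my $unknown = do {
    local $SIG{__WARN__} = sub { };    # silence the warning for this demo
    lookup_home_url('no_such_db');
};
print "$known\n$unknown\n";
```

Using C<exists> rather than a truth test keeps the lookup correct even if a table value were ever empty.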

$arg eq 'all' and return %Home_url;
(exists $Home_url{$arg}) ? $Home_url{$arg}
    : ($self->warn("Can't resolve argument to URL: $arg"),

 Usage     : $BioWWW->search_url(<string>)
 Purpose   : To provide a URL stem for a search engine at a biological database

 Returns   : String containing the URL (including "http://")

           : Currently acceptable arguments are:
           :   3db embl cath ec1 ec2 ec3 emotif_id entrez gb_n gb_p
           :   gb_struct pdb medline mmdb pdb_coord pfam pir_acc
           :   pdbSum molMov swpr swModel swprSearch scop scop_pdb scop_data

 Throws    : Warns if argument cannot be resolved to a URL.
 Comments  : Unlike the homepage URLs, this method does not return a complete
           : URL but a stem which must be further modified, typically by
           : appending data to it, before it can be used. The data appended
           : depends on the specific URL; typically, it is a database ID or
           : other unique identifier.
           : The requirements for each URL will be described here eventually.

           : The URLs listed here do not represent a complete list.
           : Expect this to evolve and grow with time.

           : Given this complexity, it may be useful to provide special methods
           : for these different URLs. This would however result in an
           : explosion of methods that might make this module less
           : maintainable and harder to use.

 See Also  : L<home_url>()
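
Since a search stem is only useful once an identifier has been appended, the step can be sketched as below. The helpers C<escape_value> and C<full_search_url> are hypothetical and not part of this module; the two stems are the expanded forms of the 'pdb' and 'swpr' entries in the table above. Percent-encoding the appended value is an addition of this sketch, and it assumes ASCII input.

```perl
use strict;
use warnings;

# Expanded forms of two stems from this module's search table.
my %Search_url = (
    pdb  => 'http://www.pdb.bnl.gov/pdb-bin/opdbshort?oPDBid=',
    swpr => 'http://www.expasy.ch/cgi-bin/get-sprot-entry?',
);

# Hypothetical helper: percent-encode everything outside the unreserved set.
# Assumes ASCII input; wide characters would need UTF-8 encoding first.
sub escape_value {
    my ($value) = @_;
    $value =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord $1)/ge;
    return $value;
}

# Hypothetical helper: join a stem and an identifier into a complete URL.
sub full_search_url {
    my ($site, $id) = @_;
    my $stem = $Search_url{$site} or die "Unknown site: $site\n";
    return $stem . escape_value($id);
}

my $pdb_url  = full_search_url('pdb',  '1ARD');
my $swpr_url = full_search_url('swpr', 'P12345');
print "$pdb_url\n$swpr_url\n";
```

Some stems in the table need more than a single identifier (for example wormpep, which requires a trailing "&keyword="), so a helper like this only covers the simple append-an-ID case.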

$arg eq 'all' and return %Search_url;
(exists $Search_url{$arg}) ? $Search_url{$arg}
    : ($self->warn("Can't resolve argument to URL: $arg"),

 Usage     : $BioWWW->stem_url(<string>)
 Purpose   : To obtain the minimal stem URL for searching a biological database or resource.
 Returns   : String containing the URL (including "http://")

           : Currently acceptable arguments are:

 Throws    : Warns if argument cannot be resolved to a URL.
 Comments  : The URL stems returned by this method are much more minimal than
           : those provided by search_url(). Use of these stems requires knowledge
           : of the CGI scripts which they invoke.

 See Also  : L<search_url>()

$arg eq 'all' and return %Stem_url;
(exists $Stem_url{$arg}) ? $Stem_url{$arg}
    : ($self->warn("Can't resolve argument to URL: $arg"),

 Usage     : $BioWWW->viewer_url(<string>)
 Purpose   : To obtain the stem URL for a 3D viewer (RasMol, WebMol, Cn3D)
 Returns   : String containing the URL (including "http://")

           : Currently acceptable arguments are:
           :   rasmol webmol cn3d java (java is an alias for webmol)
 Throws    : Warns if argument cannot be resolved to a URL.
 Comments  : The 4-letter Brookhaven PDB identifier must be appended to the
           : URL provided by this method.
           : The URLs listed here do not represent a complete list.
           : Expect this to evolve and grow with time.

$arg eq 'all' and return %Viewer_url;
(exists $Viewer_url{$arg}) ? $Viewer_url{$arg}
    : ($self->warn("Can't resolve argument to URL: $arg"),

 Usage     : $BioWWW->not_found_url()
 Purpose   : To obtain the URL for a web page to be shown in place of a 404 error.
 Returns   : String containing the URL (including "http://")

 Comments  : This URL should be customized as desired.

sub not_found_url { my $self = shift; $Not_found_url; }

 Usage     : $BioWWW->tmp_url()
 Purpose   : To obtain the URL for a temporary, web-accessible directory.
 Returns   : String containing the URL (including "http://")

 Comments  : This URL should be customized as desired.

sub tmp_url { my $self = shift; $Tmp_url; }

 Usage     : $BioWWW->search_link(<site>, <value>, <text>)
 Purpose   : Wrapper for search_url() that returns the URL within an HTML anchor.
 Returns   : String containing the HTML anchor ( qq|<A HREF="http://...">...</A>| )
 Argument  : <site>  = string to be used as argument for search_url()
           : <value> = string to be appended to the search URL stem.
           : <text>  = string to be shown as the link text (default = <value>).

 Status    : Experimental

 See Also  : L<search_url>()
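
How the three arguments combine into an anchor can be sketched as follows. The functions C<make_link> and C<escape_html> are hypothetical; unlike search_link(), the sketch takes the stem directly instead of a site name, and the HTML escaping is an addition that the method documented above is not known to perform.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Hypothetical stand-alone counterpart of search_link().
sub make_link {
    my ($stem, $value, $text) = @_;
    $text = $value unless defined $text;    # link text defaults to the value
    my $href = escape_html($stem . $value);
    return sprintf '<A HREF="%s">%s</A>', $href, escape_html($text);
}

# Expanded form of the 'swpr' stem from this module's search table.
my $stem    = 'http://www.expasy.ch/cgi-bin/get-sprot-entry?';
my $default = make_link($stem, 'P12345');
my $labeled = make_link($stem, 'P12345', 'SWISS-PROT entry');
print "$default\n$labeled\n";
```

Escaping matters mostly for stems that contain "&" between query parameters, which is common in the tables above.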

my($self,$arg,$value,$text) = @_;
my $url = $self->search_url($arg);

qq|<A HREF="$url$value">$text</A>|;

 Usage     : $BioWWW->viewer_link(<site>, <value>, <text>)
 Purpose   : Wrapper for viewer_url() that returns the complete URL within an HTML anchor.
 Returns   : String containing the HTML anchor ( qq|<A HREF="http://...">...</A>| )
 Argument  : <site>  = string to be used as argument for viewer_url()
           : <value> = string to be appended to the viewer URL stem.
           : <text>  = string to be shown as the link text (default = <value>).

 Status    : Experimental

 See Also  : L<viewer_url>()

my($self,$arg,$value,$text) = @_;
my $url = $self->viewer_url($arg);

qq|<A HREF="$url$value">$text</A>|;

 Usage     : $BioWWW->html(<string>)
 Purpose   : To obtain HTML-formatted text for frequently needed web-page messages.
 Returns   : String containing the HTML text (for most arguments, a mailto: anchor
           : with surrounding message text)

           : Currently acceptable arguments are:
           :   authority (mailto: link for webmaster; shows e-mail address as link)
           :   notify    (wraps mailto:authority link with text for link "please notify us")
           :   ourFault  ("this problem is our fault. If it persists <notify-link>")
           :   trouble   (same as ourFault but doesn't blame us for the problem)
           :   techDiff  ("we are experiencing technical difficulties. Please stand by.")

 Comments  : The authority (webmaster) is imported from the Bio::Root::Global.pm
           : module. The value for $AUTHORITY should be set there, or
           : customize this module so that it doesn't use Bio::Root::Global.pm.

$arg eq 'all' and return %Html;
(exists $Html{$arg}) ? $Html{$arg} : "<pre>(missing HTML for \"$arg\")</pre>";

### Below are accessors specialized for the Saccharomyces Genome Database.
### It is possible that they will be moved to Bio::SGD::WWW.pm in the future.

 Usage     : $BioWWW->sgd_url(<string>)
 Purpose   : To obtain the webpage URL or search stem for SGD.
 Returns   : String containing the URL (including "http://")

           : Currently acceptable arguments (TODO).
 Throws    : Warns if argument cannot be resolved to a URL.
 Comments  : This accessor is specialized for the Saccharomyces Genome Database.
           : It is possible that it will be moved to SGD::WWW.pm in the future.

 See Also  : L<search_url>()

$arg eq 'all' and return %SGD_url;
(exists $SGD_url{$arg}) ? $SGD_url{$arg}
    : ($self->warn("Can't resolve argument to URL: $arg"),

 Usage     : $BioWWW->s3d_url(<string>)
 Purpose   : To obtain the webpage URL or search stem for Sacch3D.
 Returns   : String containing the URL (including "http://")

           : Currently acceptable arguments (TODO).
 Throws    : Warns if argument cannot be resolved to a URL.
 Comments  : This accessor is specialized for the Saccharomyces Genome Database.
           : It is possible that it will be moved to SGD::WWW.pm in the future.

 See Also  : L<search_url>()

$arg eq 'all' and return %S3d_url;
(exists $S3d_url{$arg}) ? $S3d_url{$arg}
    : ($self->warn("Can't resolve argument to URL: $arg"),

 Usage     : $BioWWW->sgd_stem_url(<string>)
 Purpose   : To obtain the minimal stem URL for a SGD/Sacch3D CGI script.
 Returns   : String containing the URL (including "http://")

           : Currently acceptable arguments (TODO).
 Throws    : Warns if argument cannot be resolved to a URL.
 Comments  : This accessor is specialized for the Saccharomyces Genome Database.
           : It is possible that it will be moved to SGD::WWW.pm in the future.

 See Also  : L<search_url>()

$arg eq 'all' and return %SGD_stem_url;
(exists $SGD_stem_url{$arg}) ? $SGD_stem_url{$arg}
    : ($self->warn("Can't resolve argument to URL: $arg"),

 Usage     : $BioWWW->s3d_link(<site>, <value>, <text>)
 Purpose   : Wrapper for s3d_url() that returns the complete URL within an HTML anchor.
 Returns   : String containing the HTML anchor ( qq|<A HREF="http://...">...</A>| )
 Argument  : <site>  = string to be used as argument for s3d_url()
           : <value> = string to be appended to the s3d URL stem.
           : <text>  = string to be shown as the link text (default = <value>).

 Status    : Experimental
 Comments  : This accessor is specialized for the Saccharomyces Genome Database.
           : It is possible that it will be moved to SGD::WWW.pm in the future.

 See Also  : L<s3d_url>(), L<sgd_link>()

my($self,$arg,$value,$text) = @_;
my $url = $self->s3d_url($arg);

qq|<A HREF="$url$value">$text</A>|;

 Usage     : $BioWWW->sgd_link(<site>, <value>, <text>)
 Purpose   : Wrapper for sgd_url() that returns the complete URL within an HTML anchor.
 Returns   : String containing the HTML anchor ( qq|<A HREF="http://...">...</A>| )
 Argument  : <site>  = string to be used as argument for sgd_url()
           : <value> = string to be appended to the sgd URL stem.
           : <text>  = string to be shown as the link text (default = <value>).

 Status    : Experimental
 Comments  : This accessor is specialized for the Saccharomyces Genome Database.
           : It is possible that it will be moved to SGD::WWW.pm in the future.

 See Also  : L<sgd_url>(), L<s3d_link>()

my($self,$arg,$value,$text) = @_;
my $url = $self->sgd_url($arg);

qq|<A HREF="$url$value">$text</A>|;

#########################################################################

#########################################################################

## Note that similar functions to those presented below are also available
## via L. Stein's CGI.pm. These are more experimental versions.

 Usage     : $BioWWW->start_html()
 Purpose   : Prints the "Content-type: text/html\n\n<HTML>\n" header.
 Returns   : n/a; This method prints the Content-type string shown above.

 Status    : Experimental
 Comments  : This method prevents redundant invocations, thus avoiding the
           : accidental printing of the "content-type..." on the page.
           : If using L. Stein's CGI.pm, this is similar to $query->header()
           : (Does CGI.pm prevent redundant invocation?)
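
The guard against redundant invocation can be sketched with a small class. C<My::Page> is hypothetical and not part of Bioperl; the C<_started_html> flag follows the field named in this module, and the sketch returns the header instead of printing it so that the behaviour is easy to check.

```perl
use strict;
use warnings;

{
    package My::Page;    # hypothetical class, for illustration only

    sub new { return bless {}, shift }

    # Return the header on the first call and an empty string afterwards.
    sub start_html {
        my ($self) = @_;
        return '' if $self->{'_started_html'};
        $self->{'_started_html'} = 1;
        return "Content-type: text/html\n\n<HTML>\n";
    }
}

my $page   = My::Page->new;
my $first  = $page->start_html;
my $second = $page->start_html;
print $first, $second;
```

Because the flag lives in the object, sharing one static object across a script is what makes the guard effective for the whole page.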

if(!$self->{'_started_html'}) {
    print "Content-type: text/html\n\n<HTML>\n";
    $self->{'_started_html'} = 1;

 Usage     : $BioWWW->redirect(<string>)
 Purpose   : Prints the header needed to redirect a web browser to a supplied URL.
 Returns   : n/a; Prints the redirection header.
 Argument  : String containing the URL to be redirected to.

 Status    : Experimental
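
A redirection header of the kind described above can be sketched as a function that builds the string. The name C<redirect_header> is hypothetical, and the check for embedded line breaks is an addition of this sketch, intended to stop a supplied URL from injecting extra header lines.

```perl
use strict;
use warnings;

# Hypothetical helper: build the two header lines plus the blank line
# that ends the header section.
sub redirect_header {
    my ($url) = @_;
    die "Refusing URL that contains a line break\n" if $url =~ /[\r\n]/;
    return "Location: $url\n" . "Content-type: text/html\n\n";
}

my $header = redirect_header('http://www.ncbi.nlm.nih.gov/');
print $header;

# A URL with an embedded newline is rejected.
my $rejected = !eval { redirect_header("http://example.org/\nSet-Cookie: x=1"); 1 };
```

The header must be sent before any other output, so it cannot be combined with a page on which start_html() has already printed.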

print "Location: $url\n";
print "Content-type: text/html\n\n";

 Usage     : $BioWWW->pre("text to be pre-formatted");
 Purpose   : To produce HTML for text that is not to be formatted by the browser.
 Returns   : String containing the "<pre>" formatted html.

 Status    : Experimental

"<PRE>\n".shift()."\n</PRE>";

my( $self, @param ) = @_;

my( $linkTo, $linkText, $modified, $mail, $mailText, $top) =
    $self->_rearrange([qw(LINKTO LINKTEXT MODIFIED MAIL MAILTEXT TOP)], @param);

$modified = (scalar $modified)
    ? qq|<center><small><b>Last modified: $modified </b></small></center>|

# $top = (defined $top) ? qq|<a href="top">Top</a><br>| : '';
$top = qq|<a href="#top">Top</a>|; ## Utilizing the HTML bug/feature wherein
                                   ## a bogus name anchor defaults to the

<hr size=3 noshade width=95%>
$top | <a href="$linkTo"> $linkText</a><br>

<small><i><a href="mailto:$mail">$mailText</a></i></small>

 Usage     : $boolean = &strip_html( string_ref, [fast] );
 Purpose   : Removes HTML formatting from a supplied string.
 Returns   : Boolean: true if string was stripped, false if not.
 Argument  : string_ref = reference to a string containing the whole
           :              web page to be stripped.
           : fast = a non-zero value. Optional. If set, a faster
           :        but perhaps less thorough procedure is used for
           :        stripping. Default = not fast.
 Throws    : Exception if the argument is not a scalar reference.
 Comments  : Based on code originally written by Alex Dong Li
           : (ali@genet.sickkids.on.ca).
           : This is a more generic version of the function that appears
           : in Bio::Tools::Blast::HTML.pm
           : This version does not perform any Blast-specific stripping.

           : This employs a simple method for removing tags that
           : will fail under the following conditions:
           :  1) if quoted > appears in a tag (does this ever happen?)
           :  2) if a tag is split over multiple lines and this method is
           :     used to process one line at a time.

           : Without fast mode, large HTML files can take exceedingly long times to
           : strip (e.g., a 1 MB file with many tags can take 10 minutes versus 5 seconds
           : in fast mode; try the swissprot yeast table). If you know the HTML to be
           : well-behaved (i.e., tags are not split across multiple lines), use fast
           : mode for large, dense files.
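
The difference between the two modes, and the second failure condition listed above, can be seen in a small sketch. The function C<strip_tags> is a hypothetical stand-alone counterpart of strip_html(); it assumes that tags and the C<&nbsp;> entity are what gets removed, and it omits the step that re-unites lone '>' characters.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the two modes described above.
sub strip_tags {
    my ($string_ref, $fast) = @_;
    ref $string_ref eq 'SCALAR'
        or die "Argument should be a SCALAR reference\n";
    my $stripped = 0;
    if ($fast) {
        # Line-at-a-time: a tag split across lines is never matched whole.
        my @lines = split /\n/, $$string_ref;
        for (@lines) {
            s/<[^>]+>|&nbsp;//g and $stripped = 1;
        }
        $$string_ref = join "\n", @lines;
    }
    else {
        # Whole string at once: [^>] also matches newlines inside a tag.
        $$string_ref =~ s/<[^>]+>|&nbsp;//g and $stripped = 1;
    }
    return $stripped;
}

# The second tag is deliberately split across two lines.
my $html = "<b>bold</b> text\n<a\nhref=\"x\">link</a>";
my ($fast_copy, $slow_copy) = ($html, $html);
my $fast_result = strip_tags(\$fast_copy, 1);
my $slow_result = strip_tags(\$slow_copy);
print "fast: $fast_copy\n---\nslow: $slow_copy\n";
```

In this input the line-at-a-time pass leaves the split anchor tag behind, while the single-string pass removes it, which is the trade-off the comments above describe.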

my ($self, $string_ref, $fast) = @_;

ref $string_ref eq 'SCALAR' or
    $self->throw("Can't strip HTML: ".
                 "Argument should be a SCALAR reference not a ${\ref $string_ref}");

my $str = $$string_ref;

# MULTI-STRING-MODE: Much faster than single-string mode
# but will miss tags that span multiple lines.
# This is fine if you know the HTML to be "well-behaved".

my @lines = split("\n", $str);

s/<[^>]+>|&nbsp;//gi and $stripped = 1;

# This regexp likely won't work properly in this mode.

s/(\A|\n)>\s+/\n\n>/gi and $stripped = 1;

$$string_ref = join ("\n", @lines);

# SINGLE-STRING-MODE: Can be very slow for long strings with many substitutions.

# Removing all "<>" tags.
$str =~ s/<[^>]+>|&nbsp;//sgi and $stripped = 1;

# Re-uniting any lone '>' characters. Not really necessary for functional HTML
$str =~ s/(\A|\n)>\s+/\n\n>/sgi and $stripped = 1;

########################################################################

########################################################################

=head1 FOR DEVELOPERS ONLY

An instance of Bio::Tools::WWW.pm is a blessed reference to a hash containing
all or some of the following fields:

 --------------------------------------------------------------
 _started_html   Defined on the initial invocation of start_html()
                 to avoid printing the "Content-type..." header more than once.