Bioscripts 1.3 - Bioperl scripts

A list of the scripts in the Bioperl package

These scripts have been contributed by the developers and users of
Bioperl. They are organized into directories roughly mirroring those
in the Bioperl Bio/ directory. There are two directories for these
scripts, scripts/ and examples/. The scripts in scripts/ are
production-quality scripts that have POD documentation and accept
command-line arguments; the scripts in examples/ are useful
examples of Bioperl code.

To install the scripts in the scripts/ directory, follow the
instructions for 'make install'. The installation directory is
specified by the INSTALLSCRIPT variable in the Makefile; the default
directory is /usr/bin. Installation copies the scripts to the
specified directory, changes the 'PLS' suffix to 'pl', and prepends
'bp_' to the script name if it isn't so named already.
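
As a minimal sketch of the renaming rule described above (not the
actual installer code), the following Perl maps a source path to the
name it would be installed under. The helper name installed_name() is
hypothetical and is used only for illustration.

    use strict;
    use warnings;
    use File::Basename qw(basename);

    # Hypothetical helper: apply the documented renaming rule to a path.
    sub installed_name {
        my ($path) = @_;
        my $name = basename($path);          # drop the directory part
        $name =~ s/\.PLS\z/.pl/;             # 'PLS' suffix becomes 'pl'
        $name = "bp_$name"                   # prepend 'bp_' ...
            unless $name =~ /\Abp_/;         # ... unless already present
        return $name;
    }

    print installed_name('scripts/seq/seqconvert.PLS'), "\n";
    print installed_name('scripts/index/bp_fetch.PLS'), "\n";

Under this rule scripts/seq/seqconvert.PLS would be installed as
bp_seqconvert.pl, while scripts/index/bp_fetch.PLS already carries the
prefix and would be installed as bp_fetch.pl.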

Please contact bioperl-l at bioperl.org if you are interested in
contributing your own script.

=head1 PRODUCTION SCRIPTS

=head2 scripts/install_bioperl_scripts.PLS

This script installs scripts from the scripts/ directory on

=head2 scripts/biblio/biblio.PLS

A fully-featured script that uses Bio::Biblio, a module for
accessing and querying bibliographic repositories like Medline.

=head2 scripts/Bio-DB-GFF/bulk_load_gff.PLS

This script loads a MySQL Bio::DB::GFF database with the features
contained in a list of GFF files. It cannot do incremental loads.

=head2 scripts/Bio-DB-GFF/bp_genbank2gff.PLS

This script loads a Bio::DB::GFF database with the features contained
in either a local Genbank file or an accession that is fetched from

=head2 scripts/Bio-DB-GFF/fast_load_gff.PLS

This script does a rapid load of a MySQL Bio::DB::GFF database
using files as source. Probably only works in Unix as it relies

=head2 scripts/Bio-DB-GFF/generate_histogram.PLS

Create a GFF-formatted histogram of the density of the indicated set

=head2 scripts/Bio-DB-GFF/load_gff.PLS

This script loads a MySQL Bio::DB::GFF database with the features
contained in a list of GFF files. This script will work with all
database adaptors supported by Bio::DB::GFF (MySQL, Oracle, Postgres).

=head2 scripts/Bio-DB-GFF/pg_bulk_load_gff.PLS

Bulk-load a PostgreSQL Bio::DB::GFF database from GFF files.

=head2 scripts/Bio-DB-GFF/process_gadfly.PLS

Transforms Gadfly GFF files into the correct format.

=head2 scripts/Bio-DB-GFF/process_ncbi_human.PLS

Transforms NCBI's chromosome annotations into the correct format.

=head2 scripts/Bio-DB-GFF/process_sgd.PLS

Transforms SGD format annotations into GFF format.

=head2 scripts/Bio-DB-GFF/process_wormbase.PLS

Transforms Wormbase's GFF files into the correct format. Requires Ace.pm.

=head2 scripts/DB/bioflat_index.pl

Create or update a biological sequence database indexed with the
Bio::DB::Flat indexing scheme.

=head2 scripts/DB/flanks.PLS

Fetch a sequence and find the sequences flanking a variant or SNP in
the sequence, given its position.

=head2 scripts/DB/biofetch_genbank_proxy.PLS

A CGI script that queries NCBI's eutils to provide database access
according to the BioFetch protocol. Requires Cache::FileCache.

=head2 scripts/DB/biogetseq.PLS

Sequence retrieval using the OBDA registry.

=head2 scripts/graphics/feature_draw.PLS

Script that accepts files in GFF or tab-delimited format and creates
corresponding PNG image files. See L<Bio::Graphics> and
L<Bio::Graphics::FeatureFile> for more information.

=head2 scripts/graphics/frend.PLS

Create a PNG file on the Web using Bio::Graphics - accepts a file
containing sequence and feature coordinates.

=head2 scripts/graphics/search_overview.PLS

Create a simple overview graphic of the hits; color is based on the hit
score, much like the NCBI overview graphic in a BLAST report.

=head2 scripts/index/bp_fetch.PLS

Fetch sequences from a local indexed database or over the network and
reformat using Bio::Index* and Bio::DB*.

=head2 scripts/index/bp_index.PLS

Indexes local databases; partners with bp_fetch.pl.

=head2 scripts/seq/extract_feature_seq.PLS

Extract the sequence for a specified feature type.

=head2 scripts/popgen/composite_LD.PLS

An easy way to calculate composite linkage disequilibrium (LD).

=head2 scripts/popgen/heterogeneity.PLS

A test for distinguishing between selection and population

=head2 scripts/searchio/filter_search.PLS

Simple script to filter by SearchIO criteria and print.

=head2 scripts/seq/seqconvert.PLS

Bioperl sequence format converter.

=head2 scripts/seq/split_seq.PLS

Split a sequence in a file into chunks of equal size.
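
The chunking idea can be sketched in plain Perl as follows. This is
an illustration only, not the code of split_seq.PLS: the helper name
split_into_chunks() is hypothetical, and it is an assumption of this
sketch that the final chunk is simply left shorter when the sequence
length is not a multiple of the chunk size.

    use strict;
    use warnings;

    # Hypothetical helper: cut a sequence string into fixed-size pieces.
    sub split_into_chunks {
        my ($seq, $size) = @_;
        die "chunk size must be a positive integer\n"
            unless defined $size && $size =~ /\A[1-9]\d*\z/;
        # "(A4)*" repeats a 4-character field until the string is used up
        return unpack "(A$size)*", $seq;
    }

    my @chunks = split_into_chunks('ACGTACGTAC', 4);
    print join(' ', @chunks), "\n";    # prints "ACGT ACGT AC"

A real script would read the sequence with Bio::SeqIO and write each
chunk out as its own sequence record rather than printing strings.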

=head2 scripts/seq/translate_seq.PLS

A simple Bioperl translator.

=head2 scripts/seqstats/aacomp.PLS

Prints out the count of amino acids over all protein
sequences in the input file.

=head2 scripts/seqstats/chaos_plot.PLS

Produce a PNG or JPEG chaos plot given a DNA sequence using GD.pm.

=head2 scripts/seqstats/gccalc.PLS

Prints out the GC content for every nucleotide sequence

=head2 scripts/seqstats/oligo_count.PLS

Use this script to determine what primers would be useful for
frequent priming of nucleic acid for random labeling.

=head2 scripts/taxa/local_taxonomydb_query.PLS

Script that accesses a local Taxonomy database and retrieves
species or taxon ids.

=head2 scripts/taxa/taxid4species.PLS

Retrieve the NCBI Taxa ID for a given species.

=head2 scripts/tree/blast2tree.PLS

Builds a phylogenetic tree based on a sequence search (Fasta,

=head2 scripts/utilities/bp_mrtrans.PLS

Perl implementation of Bill Pearson's mrtrans to project protein
alignment back into cDNA coordinates.

=head2 scripts/utilities/bp_nrdb.PLS

Make a non-redundant database based on sequence, not id.
Requires Digest::MD5.

=head2 scripts/utilities/bp_sreformat.PLS

Perl implementation of Sean Eddy's sreformat, a sequence and

=head2 scripts/utilities/dbsplit.PLS

Splits one or more sequence files into subfiles with specified
numbers of sequences, any sequence format.

=head2 scripts/utilities/mask_by_search.PLS

Masks parts of a sequence based on significant matches to that
sequence as contained in a SearchIO-compatible report file.

=head2 scripts/utilities/mutate.PLS

Randomly mutagenize a single protein or DNA sequence. Specify
percentage mutated and number of resulting mutagenized sequences.

=head2 scripts/utilities/pairwise_kaks.PLS

Takes DNA sequences as input, aligns them as proteins,
projects the alignment back into DNA and estimates
the Ka (non-synonymous) and Ks (synonymous) substitutions.

=head2 scripts/utilities/remote_blast.PLS

This script executes a remote Blast search using RemoteBlast. See
L<Bio::Tools::Run::RemoteBlast> for more information.

=head2 scripts/utilities/search2BSML.PLS

Turns SearchIO-compatible reports into a BSML report.

=head2 scripts/utilities/search2alnblocks.PLS

Turns SearchIO-compatible reports into alignments in formats supported

=head2 scripts/utilities/search2tribe.PLS

This script will turn a protein SearchIO-compatible report (BLASTP,
FASTP, SSEARCH) into a Markov Matrix for TribeMCL clustering.

=head2 scripts/utilities/search2gff.PLS

Turn SearchIO-parseable report(s) into a GFF report.

=head2 scripts/utilities/seq_length.PLS

Reports the total number of residues and total number of individual
sequences in a specified sequence database file.

=head1 EXAMPLE SCRIPTS

=head2 examples/align/align_on_codons.pl

Aligns nucleotide sequences based on codons in a specified reading frame.

=head2 examples/align/aligntutorial.pl

Examples using EMBOSS, pSW, Clustalw, TCoffee, and Blast to align sequences.

=head2 examples/align/clustalw.pl

A demonstration of the various uses of Alignment::Clustalw. See
L<Bio::Tools::Run::Alignment::Clustalw> for more.

=head2 examples/align/simplealign.pl

A script that demonstrates some uses of AlignIO. Please see
L<Bio::AlignIO> for more information.

=head2 examples/biblio/biblio.pl

A script that shows how to query bibliographic databases, such as
Medline, using ids, keywords, and other fields. See L<Bio::Biblio> for

=head2 examples/biblio/biblio_soap.pl

Connect to and test a SOAP server using a Bio::Biblio object.

=head2 examples/biographics/all_glyphs.pl

Creates an image showing all possible glyphs.

=head2 examples/biographics/dynamic_glyphs.pl

Creates a complex image of a gene with confirmed and predicted exons,
transcripts, and text labels.

=head2 examples/biographics/lots_of_glyphs.pl

Creates a complex image of a gene with confirmed and predicted exons,
transcripts, and text labels.

=head2 examples/bioperl.pl

=head2 examples/cluster/dbsnp.pl

How to parse a dbsnp XML file. See L<Bio::ClusterIO> for details.

=head2 examples/contributed/nmrpdb_parse.pl

Extracts individual conformers from an NMR-derived PDB file.

=head2 examples/contributed/prosite2perl.pl

Convert Prosite motifs to Perl regular expressions.
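
To illustrate the kind of conversion involved, here is a minimal
sketch in plain Perl. It is not the code of prosite2perl.pl; the
helper name prosite_to_regex() is hypothetical, and the sketch
assumes only the common pattern elements: '-' separators, 'x' for any
residue, [..] and {..} residue sets, (n) or (n,m) repeat counts, and
'<' or '>' anchors at the ends of the pattern.

    use strict;
    use warnings;

    # Hypothetical helper: rewrite a Prosite pattern as a Perl regex string.
    sub prosite_to_regex {
        my ($pattern) = @_;
        $pattern =~ s/\.\z//;                        # drop trailing period
        $pattern =~ s/-//g;                          # drop element separators
        $pattern =~ s/\{([A-Z]+)\}/[^$1]/g;          # {AB} -> [^AB]
        $pattern =~ s/\((\d+(?:,\d+)?)\)/{$1}/g;     # (2,4) -> {2,4}
        $pattern =~ tr/x/./;                         # x -> any residue
        $pattern =~ s/\A</^/;                        # N-terminal anchor
        $pattern =~ s/>\z/\$/;                       # C-terminal anchor
        return $pattern;
    }

    my $motif = 'C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H.';
    my $regex = prosite_to_regex($motif);
    print "$regex\n";    # prints "C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H"

The resulting string can be compiled with qr/$regex/ and matched
against a protein sequence. Patterns that use '>' inside a bracketed
set are not handled by this sketch.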

=head2 examples/contributed/rebase2list.pl

Script to convert rebase file to format compatible with
Bio::Tools::RestrictionEnzyme.

=head2 examples/contributed/expression_analysis*

A set of scripts that accept microarray data as input and perform
statistical analyses, including t test, U test, Mann-Whitney,
and Pearson correlation coefficient.

=head2 examples/db/dbfetch

Creates a Web page to query a local SRS server and fetch sequences.

=head2 examples/db/est_tissue_query.pl

Fetch EST sequences from local files or Genbank filtered by tissue
using Bio::DB* or Bio::Index*.

=head2 examples/db/gb2features.pl

Shows how to extract all the features from a Genbank file. See
L<Bio::Seq> for more information on features.

=head2 examples/db/getGenBank.pl

Retrieving Genbank entries over the Web using DB::GenBank. See
L<Bio::DB::GenBank> for more information.

=head2 examples/db/get_seqs.pl

Fetches and formats sequences from GenBank, EMBL, or SwissProt over
the network using Bio::DB*.

=head2 examples/db/gff/*

Scripts that reformat sequence to GFF and load GFF format files into
an indexed database - see L<Bio::DB::GFF> for more information.

=head2 examples/db/rfetch.pl

A script that uses Bio::DB::Registry to retrieve sequences from EMBL,
reformat them, and print them. See L<Bio::DB::Registry>.

=head2 examples/db/use_registry.pl

Script that shows how to use Bio::DB::Registry, part of Bioperl's
integration with OBDA, the Open Bio Database Access registry
scheme. See L<Bio::DB::Registry> for more information.

=head2 examples/exceptions/*

Scripts that demonstrate how to throw and catch Error.pm objects.

=head2 examples/generate_random_seq.pl

Writes random RNA, DNA, or protein sequence of given length.

=head2 examples/biographics/render_sequence.pl

This script fetches a sequence from a remote database, extracts its
features (CDS, gene, tRNA), and creates a graphic representation of
the sequence in PNG or GIF format. See L<Bio::DB::BioFetch>
and L<Bio::Graphics>.

=head2 examples/liveseq/change_gene.pl

A script showing how to use LiveSeq::Mutator and
LiveSeq::Mutation. Please see L<Bio::LiveSeq::Mutator> and
L<Bio::LiveSeq::Mutation> for more information.

=head2 examples/longorf.pl

A script that finds the longest ORF in one or more nucleotide sequences.

=head2 examples/make_mrna_protein.pl

Translate a cDNA or ORF to protein using Bio::Seq's translate() method.
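
A minimal sketch of this kind of translation is shown below. It
assumes Bioperl is installed; the sequence and its id are made up for
illustration, and this is not the code of make_mrna_protein.pl.

    use strict;
    use warnings;
    use Bio::Seq;

    # Illustrative ORF: ATG GCC AAA TAA (start, Ala, Lys, stop)
    my $cdna = Bio::Seq->new(
        -id       => 'example_orf',     # made-up identifier
        -seq      => 'ATGGCCAAATAA',
        -alphabet => 'dna',
    );

    # translate() returns a new sequence object holding the protein
    my $protein = $cdna->translate;
    print $protein->seq, "\n";

By default the stop codon is reported as '*' at the end of the
translated sequence. See L<Bio::Seq> for the options that translate()
accepts.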

=head2 examples/make_primers.pl

Design PCR primers given a sequence and the positions of the start and
stop codons in the sequence's ORF.

=head2 examples/popgen/parse_calc_stats.pl

Shows how to read data from a Bio::PopGen::IO object.

=head2 examples/rev_and_trans.pl

Examples using Bio::Seq.pm for reversing and translating
sequences. See L<Bio::Seq> for more information.

=head2 examples/revcom_dir.pl

Return reverse complement sequences of all sequences in the current
directory and save them in the same directory.

=head2 examples/sirna/rnai_finder.cgi

CGI script for RNAi reagent design. See L<Bio::Tools::SiRNA> for more

=head2 examples/root_object/*

Scripts that demonstrate uses of the Bio::Root modules.

=head2 examples/searchio/blast_example.pl

Print out all parsed values from a BLAST report.

=head2 examples/searchio/custom_writer.pl

Demonstrates how to extract data from BLAST reports and output as

=head2 examples/searchio/hitwriter.pl

Demonstrates how to extract data from BLAST reports and output as

=head2 examples/searchio/hspwriter.pl

Demonstrates how to extract data from BLAST reports and output as

=head2 examples/searchio/htmlwriter.pl

Demonstrates how to extract data from BLAST reports and output as

=head2 examples/searchio/psiblast_features.pl

Illustrates how to grab a set of SeqFeatures from a Psiblast report.

=head2 examples/searchio/psiblast_iterations.pl

Demonstrates the use of a SearchIO parser for processing the
iterations within a PSI-BLAST report.

=head2 examples/searchio/rawwriter.pl

Shows how to print out raw BLAST alignment data for each HSP.

=head2 examples/searchio/resultwriter.pl

Demonstrates how to extract data from BLAST reports and output as

=head2 examples/searchio/waba2gff.pl

Convert raw WABA output to one type of GFF.

=head2 examples/seq/*

Example code for working with multiple sequence files, including
formatting, printing, and filtering based on length or description or ID.

=head2 examples/seq/extract_cds.pl

Extract the CDS features from a Genbank file.

=head2 examples/seqstats/aacomp.pl

Calculate amino acid composition of a protein using Tools::CodonTable
and Tools::IUPAC. See L<Bio::Tools::IUPAC> and
L<Bio::Tools::CodonTable> for more information.

=head2 examples/structure/struct_example*

Scripts that show how to examine details of the 3D structure of a
protein by parsing a PDB file. See L<Bio::Structure::IO> for more information.

=head2 examples/subsequence.cgi

CGI script to fetch a sequence from Genbank and extract a subsequence
using DB::GenBank. See L<Bio::DB::GenBank>.

=head2 examples/tk/gsequence.pl

Create a Protein Sequence Control Panel GUI with Gtk.

=head2 examples/tk/hitdisplay.pl

Create a GUI for displaying Blast results using Tk::HitDisplay.
Please see L<Bio::Tk::HitDisplay> for more information.

=head2 examples/tools/gb_to_gff.pl

Extracts top-level sequence features from Genbank-formatted sequence
files using Tools::GFF. See L<Bio::Tools::GFF>.

=head2 examples/tools/gff2ps.pl

Takes an input file in GFF format and draws its genes and features as
Postscript using Tools::GFF. See L<Bio::Tools::GFF>.

=head2 examples/tools/parse_codeml.pl

Script that parses output from codeml, one of the PAML programs. See
L<Bio::Tools::Phylo::PAML>.

=head2 examples/tools/psw.pl

Example code for using the XS extensions for comparing proteins
using Smith-Waterman.

=head2 examples/tools/restriction.pl

Example code for using the RestrictionEnzyme module. See
L<Bio::Tools::RestrictionEnzyme> for more information (note that
Bio::Tools::RestrictionEnzyme has been superseded by
Bio::Restriction::*).

=head2 examples/tools/run_genscan.pl

Run GENSCAN on multiple sequences and create output sequence files
using Tools::Genscan. Please see L<Bio::Tools::Genscan> for more information.

=head2 examples/tools/seq_pattern.pl

A script that shows how to use sequences as regular expressions using
Tools::SeqPattern. Please see L<Bio::Tools::SeqPattern> for more information.

=head2 examples/tools/standaloneblast.pl

The many uses of StandAloneBlast, including BLAST and PSIBLAST.

=head2 examples/tools/state-machine.pl

A demonstration of how to create a state machine using
StateMachine::AbstractStateMachine. Please see
L<Bio::Tools::StateMachine::AbstractStateMachine> for more information.

=head2 examples/tools/test-genscan.pl

Script for testing and demonstrating Bio::Tools::Genscan.

=head2 examples/tree/paup2phylip.pl

Convert a PAUP tree block to Phylip format.