1
<?xml version="1.0" encoding="ISO-8859-1"?>
3
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
4
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Bio::SeqIO HOWTO</title><link rel="stylesheet" href="e-novative.css" type="text/css"/><meta name="generator" content="DocBook XSL Stylesheets V1.55.0"/><meta name="description" content="
 This HOWTO tries to teach you about the SeqIO system for reading and 
 writing sequences of various formats
 "/></head><body><div class="article"><div class="titlepage"><div><h1 class="title"><a id="d3e1"/>Bio::SeqIO HOWTO</h1></div><div><div class="author"><h3 class="author">Ewan Birney</h3><div class="affiliation"><span class="orgname">EBI<br/></span><div class="address"><p><tt><<a href="mailto:birney-at-ebi.ac.uk">birney-at-ebi.ac.uk</a>></tt></p></div></div></div></div><div><div class="author"><h3 class="author">Darinn London</h3><div class="affiliation"><span class="orgname">EBI<br/></span><div class="address"><p><tt><<a href="mailto:dlondon-at-ebi.ac.uk">dlondon-at-ebi.ac.uk</a>></tt></p></div></div></div></div><div><div class="author"><h3 class="author">Brian Osborne</h3><div class="affiliation"><span class="orgname">Cognia Corporation<br/></span><div class="address"><p><tt><<a href="mailto:brian-at-cognia.com">brian-at-cognia.com</a>></tt></p></div></div></div></div><div><div class="legalnotice"><p>This document is copyright Ewan Birney, 2002. For
5
reproduction other than personal use please contact me at
7
</p></div></div><div><div class="abstract"><p class="title"><b>Abstract</b></p><p>
8
This HOWTO tries to teach you about the SeqIO system for reading and
9
writing sequences of various formats
10
</p></div></div><hr/></div><div class="toc"><p><b>Table of Contents</b></p><dl><dt>1.�<a href="#overview">10 second overview</a></dt><dt>2.�<a href="#background">Background Information</a></dt><dt>3.�<a href="#formats">Formats</a></dt><dt>4.�<a href="#working">Working Examples</a></dt><dt>5.�<a href="#caveats">Caveats</a></dt><dt>6.�<a href="#errors">Error Handling</a></dt></dl></div><div class="section"><div class="titlepage"><div><h2 class="title" style="clear: both"><a id="overview"/>1.�10 second overview</h2></div></div><p>
11
Lots of bioinformatics involves processing sequence information in different formats - indeed,
12
there often seems to be about as many formats as there are programs for
13
processing them. The Bio::SeqIO systems handles sequences of many different formats
14
and is the way Bioperl pushes sequences in and out of objects. You can
15
think of the Bio::SeqIO system as "a smart filehandle for sequences".
16
</p></div><div class="section"><div class="titlepage"><div><h2 class="title" style="clear: both"><a id="background"/>2.�Background Information</h2></div></div><p>
17
The SeqIO system handles all of the complexity of parsing sequences of many
18
standard formats that scripters have wrestled with over the years. Given some
19
way of accessing some sequences (flat files, STDIN, variable, etc.), and a format description,
20
it provides access to a stream of SeqI objects tailored to the information provided
21
by the format. The format description is, technically, optional. SeqIO can try to
22
guess based on known file extensions, but if your source doesn't have a standard file
23
extension, or isn't even a file at all, it throws up its hands and tries fasta.
24
Unless you are always working with FASTA files, it is a good idea to get into the
25
practice of always specifying the format.
27
Sequences can be fed into the SeqIO system in a variety of ways. The only
28
requirement is that the sequence be contained in some kind of standard perl 'Handle'.
29
Most people will make use of the traditional handles: file handles, and STDIN/STDOUT.
30
However, perl provides ways of turning the contents of a string into a Handle as
31
well (more on this below), so just about anything can be fed into SeqIO to get at
32
the sequence information contained within it. What SeqIO does is create a Handle,
33
or take a given Handle, and parse it into SeqI compliant objects, one for each
34
entry at a time. It knows which SeqI object to use for a particular format, e.g.
35
it uses a PrimarySeq for fasta formats, Seq for most other formats, and RichSeq
36
for Genbank/EMBL. It also knows, for each of the supported formats, things like
37
which record-separator (e.g. "||" for Genbank, ">header" for fasta,
38
etc.) to use, and most importantly, how to parse their key-value based information. SeqIO does all
39
of this for you, so that you can focus on the things you want to do with the
40
information, rather than worrying about how to get the information.
41
</p></div><div class="section"><div class="titlepage"><div><h2 class="title" style="clear: both"><a id="formats"/>3.�Formats</h2></div></div><p>
42
Bioperl's SeqIO system has a lot of formats to interconvert sequences.
43
Here is a current listing of formats, as of version 1.2.
44
<div class="table"><a id="d3e41"/><table summary="Formats" border="1"><colgroup><col/><col/><col/></colgroup><thead><tr><th>Name</th><th>Description</th><th>extensions</th></tr></thead><tbody><tr><td>abi</td><td>ABI tracefile</td><td>�</td></tr><tr><td>ace</td><td>Ace database</td><td>ace</td></tr><tr><td>alf</td><td>ALF tracefile</td><td>�</td></tr><tr><td>bsml</td><td>�</td><td>bsm|bsml</td></tr><tr><td>ctf</td><td>CTF tracefile</td><td>�</td></tr><tr><td>chado</td><td>Chado formats</td><td>�</td></tr><tr><td>embl</td><td>EMBL database</td><td>embl|ebl|emb|dat</td></tr><tr><td>exp</td><td>EXP format</td><td>�</td></tr><tr><td>fasta</td><td>FASTA</td><td>fasta|fast|seq|fa|fsa|nt|aa</td></tr><tr><td>fastq</td><td>�</td><td>fastq</td></tr><tr><td>game</td><td>GAME XML</td><td>�</td></tr><tr><td>gcg</td><td>GCG</td><td>gcg</td></tr><tr><td>genbank</td><td>GenBank</td><td>gb|gbank|genbank</td></tr><tr><td>largefasta</td><td>�</td><td>�</td></tr><tr><td>phd</td><td>Phred</td><td>phd|phred</td></tr><tr><td>pir</td><td>PIR database</td><td>pir</td></tr><tr><td>pln</td><td>PLN tracefile</td><td>�</td></tr><tr><td>qual</td><td>Phred</td><td>�</td></tr><tr><td>raw</td><td>plain text</td><td>txt</td></tr><tr><td>scf</td><td>Standard Chromatogram Format</td><td>scf</td></tr><tr><td>swiss</td><td>SwissProt</td><td>swiss|sp</td></tr><tr><td>ztr</td><td>ZTR tracefile</td><td>�</td></tr></tbody></table><p class="title"><b>Table�1.�Formats</b></p></div>
45
Note: SeqIO needs the bioperl-ext package to read the scf, abi, alf, pln, exp, ctf, and ztr formats.
46
</p></div><div class="section"><div class="titlepage"><div><h2 class="title" style="clear: both"><a id="working"/>4.�Working Examples</h2></div></div><p>
47
The simplest script for parsing sequence files is written out
48
below. It prints out the accession number for each entry in the file.
49
<pre class="programlisting">
50
# first, bring in the SeqIO module
54
# Notice that you do not have to use any Bio:SeqI
55
# objects, because SeqIO does this for you. In fact, it
56
# even knows which SeqI object to use for the provided
59
# Bring in the file and format, or die with a nice
60
# usage statement if one or both arguments are missing.
61
my $usage = "getaccs.pl file format\n";
62
my $file = shift or die $usage;
63
my $format = shift or die $usage;
65
# Now create a new SeqIO object to bring in the input
66
# file. The new method takes arguments in the format
67
# key => value, key => value. The basic keys that it
68
# can accept values for are '-file' which expects some
69
# information on how to access your data, and '-format'
70
# which expects one of the Bioperl-format-labels mentioned
71
# above. Although it is optional, it is good
72
# programming practice to provide > and < in front of any
73
# filenames provided in the -file parameter. This makes the
74
# resulting filehandle created by SeqIO explicitly read (<)
75
# or write(>). It will definitely help others reading your
76
# code understand the function of the SeqIO object.
78
my $inseq = Bio:SeqIO->new('-file' => "<$file",
79
'-format' => $format );
80
# Now that we have a seq stream,
81
# we need to tell it to give us a $seq.
82
# We do this using the 'next_seq' method of SeqIO.
84
while (my $seq = $inseq->next_seq) {
85
print $seq->accession_no."\n";
89
This script takes two arguments on the commandline, and input filename
90
and the format of the input file. This is the basic way to access the data in a Genbank
91
file. It is the same for fasta, swissprot, aceDB, and
92
all the others as well, provided that the correct Bioperl-format-label is provided.
94
Notice that SeqIO naturally works over sets of sequences in files, not
95
just one sequence. Each call to next_seq will return the next sequence in
96
the 'stream', or null if the end of the file/stream has been
97
reached. This allows you to read in the contents of your data one sequence
98
at a time, which conserves memory, in contrast with pulling everything into
99
memory first. The null that is returned at the end of file/stream is
100
important, as it allows you to wrap successive calls to next_seq in a
101
while loop. This code snippet would load up all the sequences in a EMBL
103
<pre class="programlisting">
107
my $input_file = shift;
109
my $seq_in = Bio::SeqIO->new( -format => 'embl',
110
-file => $input_file);
112
# loads the whole file into memory - be careful
113
# if this is a big file, then this script will
114
# use a lot of memory
118
while( $seq = $seq_in->next_seq() ) {
119
push(@seq_array,$seq);
122
# now do something with these. First sort by length,
123
# find the average and median lengths and print them out
125
@seq_array = sort { $a->length <=> $b->length } @seq_array;
129
foreach my $seq ( @seq_array ) {
130
$total += $seq->length;
134
print "Mean length ",$total/$count," Median ",$seq_array[$count/2],"\n";
137
It's just as straightforward to use many files as input, simply
138
use the SeqIO::MultiFile module, like this:
139
<pre class="programlisting">
140
$seq_in = Bio::SeqIO::MultiFile( -format => 'Fasta',
141
-files => ['file1','file2'] );
142
while ( my $seqobj = $seq_in->next_seq ) {
143
# do something with the Seq object
147
Now, what if we want to convert one format to another. When you create a
148
Bio::SeqIO object to read in a flat file, the magic behind the curtains is that
149
each call to 'next_seq' is a complex parsing of the next
150
sequence record into a SeqI object - not a
151
single line, but the entire record!! It knows when to start
152
parsing, and when to stop and wait for the next call to next_seq. It knows how to
153
get at the DIVISION information stored on the LOCUS line, etc. To get that SeqI
154
information back out to a new file, of a different format (or of the same format,
155
but with sequences grouped in a new way), Bio::SeqIO has a second method, called
156
'write_seq' that revearses the process done by next_seq. It knows how to write
157
all of the data contained in the SeqI object into the right places, with the
158
correct labels, for any of the supported formats. Let's make this more concrete
159
by writing a universal format translator:
160
<pre class="programlisting">
162
# get command-line arguments, or die with a usage statement
163
my $usage = "x2y.pl infile infileformat outfile outfileformat\n";
164
my $infile = shift or die $usage;
165
my $infileformat = shift or die $usage;
166
my $outfile = shift or die $usage;
167
my $outfileformat = shift or die $usage;
169
# create one SeqIO object to read in,and another to write out
170
my $seq_in = Bio::SeqIO->new('-file' => "<$infile",
171
'-format' => $infileformat);
172
my $seq_out = Bio::SeqIO->new('-file' => ">$outfile",
173
'-format' => $outfileformat);
175
# write each entry in the input file to the output file
176
while ( my $inseq = $seq_in->next_seq ) {
177
$seq_out->write_seq($inseq);
182
You can think of the two variables, $seq_in and $seq_out as being rather
183
special types of filehandles which "know" about sequences and sequence
184
formats. However, rather than using the <F> operator to read files you
185
use the $seqio->next_seq() method and rather than saying "print F $line"
186
you say $seqio->write_seq($seq_object).
188
An aside: Bio::SeqIO actually allows you to make use of a rather
189
scary/clever part of Perl that can "mimic" filehandles, so that
190
the <F> operator returns sequences and the print F operator writes
191
sequences. However, for most people, including myself, this looks
192
really really weird and leads to probably more confusion.
194
Notice that the universal formatter only required a
195
few more lines of code than the accession number lister and mean sequence
196
length analyzer (mostly to get more command-line args). This is the
197
beauty of using the Bioperl system. It doesn't take a lot of code to do
198
some really complex things.
200
Now, let's play around with the previous code, changing aspects of it to
201
exploit the functionality of the SeqIO system. Let's take a stream from standard
202
in, so that we can use other programs to stream data of a particular format into
203
the program, and write out a file of a particular format. Here we have to make
204
use of two new things: one perl specific, and one SeqIO specific.
205
Perl allows you to 'GLOB' a filehandle by placing a '*' in front of the
206
handle name, making it available for use as a variable, or as in this case,
207
as an argument to a function. In concert, SeqIO allows you to pass a GLOB'ed
208
filehandle to it using the '-fh' parameter in place of the '-file' parameter.
209
Here is a program that takes a stream of sequences in a given format from
210
STDIN, meaning it could be used like this:
211
<pre class="programlisting">
212
>cat myseqs.fa | all2y.pl fasta newseqs.gb genbank
214
The code for all2y.pl is:
215
<pre class="programlisting">
217
# get command-line arguments, or die with a usage statement
218
my $usage = "all2y.pl informat outfile outfileformat\n";
219
my $informat = shift or die $usage;
220
my $outfile = shift or die $usage;
221
my $outformat = shift or die $usage;
223
# create one SeqIO object to read in, and another to write out
224
# *STDIN is a 'globbed' filehandle with the contents of Standard In
225
my $seqin = Bio::SeqIO->new('-fh' => *STDIN,
226
'-format' => $informat);
227
my $seqout = Bio::SeqIO->new('-file' => ">$outfile",
228
'-format' => $outformat);
230
# write each entry in the input file to the output file
231
while (my $inseq = $seqin->next_seq) {
232
$outseq->write_seq($inseq);
237
Why use files at all, we can pipe STDIN to STDOUT, which could
238
allow us to plug this into some other pipeline, something
240
<pre class="programlisting">
241
>cat *.seq | in2out.pl EMBL Genbank | someother program
243
The code for in2out.pl could be:
244
<pre class="programlisting">
246
# get command-line arguments, or die with a usage statement
247
my $usage = "in2out.pl informat outformat\n";
248
my $informat = shift or die $usage;
249
my $outformat = shift or die $usage;
251
# create one SeqIO object to read in, and another to write out
252
my $seqin = Bio::SeqIO->new('-fh' => *STDIN,
253
'-format' => $informat);
254
my $seqout = Bio::SeqIO->new('-fh' => *STDOUT,
255
'-format' => $outformat);
257
# write each entry in the input to the output
258
while (my $inseq = $seqin->next_seq) {
259
$outseq->write_seq($inseq);
264
A popular question many people have asked is "What if I have a string that has
265
a series of sequence records in some format, and I want to make it a Seq object?"
266
You might do this if you allow users to paste in sequence data into a web form,
267
and then do something with that sequence data. This can be accomplished using
268
the -fh parameter, along with perl's IO::String module that allows you to turn
269
the contents of a string into a standard globbed perl handle. This isn't a complete
270
program, but gives you the most relevant bits. Assume that there is some type of
271
CGI form processing, or some such business, that pulls a group of sequences into
272
a variable, and also pulls the format definition into another variable.
273
<pre class="programlisting">
277
# get a string into $string somehow, with its format in
278
# $format, say from a web form
279
my $stringfh = new IO::String($string);
280
my $seqio = new Bio::SeqIO(-fh => $stringfh,
281
-format => $format);
283
while( my $seq = $seqio->next_seq ) {
289
The -file parameter in SeqIO can take more than a filename. It can also take
290
a string that tells it to 'pipe' something else into it. This is of the form
291
'-file' => 'command |'. Notice the vertical bar at the end, just before the
292
single quote. This is especially useful when you are working with large, gzipped
293
files because you just don't have enough disk space to unzip them (e.g. a Genbank
294
full release file), but can make fasta files from them. Here is a program that
295
takes a gzipped file of a given format and writes out a FASTA file,
297
<pre class="programlisting">
298
>gzip2fasta.pl gbpri1.seq.gz Genbank gbpri1.fa
302
<pre class="programlisting">
304
# get command-line arguments, or die with a usage statement
305
my $usage = "gzip2fasta.pl infile informat outfile\n";
306
my $infile = shift or die $usage;
307
my $informat = shift or die $usage;
308
my $outformat = shift or die $usage;
310
# create one SeqIO object to read in, and another to write out
311
my $seqin = Bio::SeqIO->new('-file' => "/usr/local/bin/gunzip $infile |",
312
'-format' => $informat);
314
my $seqout = Bio::SeqIO->new('-file' => ">$outfile",
315
'-format' => 'Fasta');
317
# write each entry in the input to the output file
318
while (my $inseq = $seqin->next_seq) {
319
$outseq->write_seq($inseq);
324
Bioperl also allows a 'pipe - out' to be given as an argument to -file. This
325
is of the form '-file' => "| command". This time the vertical bar is at the
326
beginning, just after the first quote. Let's write a program to take an input file,
327
and write it directly to a WashU Blastable Database, without ever
328
writing out a fasta file, like:
329
<pre class="programlisting">
330
>any2wublastable.pl myfile.gb Genbank mywublastable p
332
And the code for any2wublastable.pl is:
334
<pre class="programlisting">
337
# get command-line arguments, or die with a usage statement
338
my $usage = "any2wublastable.pl infile informat outdbname outdbtype\n";
339
my $infile = shift or die $usage;
340
my $informat = shift or die $usage;
341
my $outdbname = shift or die $usage;
342
my $outdbtype = shift or die $usage;
344
# create one SeqIO object to read in, and another to write out
345
my $seqin = Bio::SeqIO->new('-file' => "<$infile",
346
'-format' => $informat);
347
my $seqout = Bio::SeqIO->new('-file' =>
348
"| /usr/local/bin/xdformat -o $outdbname -${outdbtype} -- -",
349
'-format' => 'Fasta');
351
# write each entry in the input to the output
352
while (my $inseq = $seqin->next_seq) {
353
$outseq->write_seq($inseq);
358
Some of the more seasoned perl hackers may have noticed that the 'new' method
359
returns a reference, which can be placed into any of the data structures used in
360
perl. For instance, let's say you wanted to take a genbank file with
361
multiple sequences, and split
362
the human sequences out into a human.gb file, and all the rest of the sequences
363
into the other.gb file. In this case, I will use a hash to store the two handles
364
where 'human' is the key for the human output, and 'other' is the key
365
to other, so the usage would be:
366
<pre class="programlisting">
367
>splitgb.pl inseq.gb
369
Here's what splitgb.pl looks like:
370
<pre class="programlisting">
373
# get command-line argument, or die with a usage statement
374
my $usage = "splitgb.pl infile\n";
375
my $infile = shift or die $usage;
376
my $inseq = Bio::SeqIO->new('-file' => ">$infile",
377
'-format' => 'Genbank');
380
'human' => Bio::SeqIO->new('-file' => '>human.gb',
381
'-format' => 'Genbank'),
382
'other' => Bio::SeqIO->new('-file' => '>other.gb',
383
'-format' => 'Genbank')
386
while (my $seqin = $inseq->next_seq) {
387
# here we make use of the species attribute, which returns a
388
# species object, which has a binomial attribute that
389
# holds the binomial species name of the source of the sequence
390
if ($seqin->species->binomial =~ m/Homo sapiens/) {
391
$outfiles{'human'}->write_seq($seqin);
393
$outfiles{'other'}->write_seq($seqin);
399
Now, let's use a multidimensional hash to hold a genbank output and a fasta
400
output for both splits.
402
<pre class="programlisting">
404
# get command-line argument, or die with a usage statement
405
my $usage = "splitgb.pl infile\n";
406
my $infile = shift or die $usage;
407
my $inseq = Bio::SeqIO->new('-file' => ">$infile",
408
' -format' => 'Genbank');
410
my %outfiles = ('human' => {
411
'Genbank' => Bio::SeqIO->new('-file' => '>human.gb',
412
'-format' => 'Genbank'),
413
'Fasta' => Bio::SeqIO->new('-file' => '>human.fa',
414
'-format' => 'Fasta')
417
'Genbank' => Bio::SeqIO->new('-file' => '>other.gb',
418
'-format' => 'Genbank'),
419
'Fasta' => Bio::SeqIO->new('-file' => '>other.fa',
420
'-format' => 'Fasta')
423
while (my $seqin = $inseq->next_seq) {
424
if ($seqin->species->binomial =~ m/Homo sapiens/) {
425
$outfiles{'human'}->{'Genbank'}->write_seq($seqin);
426
$outfiles{'human'}->{'Fasta'}->write_seq($seqin);
429
$outfiles{'other'}->{'Genbank'}->write_seq($seqin);
430
$outfiles{'other'}->{'Fasta'}->write_seq($seqin);
436
And finally, you might want to make use of the SeqIO object in a perl
437
one-liner. Perl one-liners are perl programs that make use of flags to the perl
438
binary allowing you to run programs from the command-line without actually
439
needing to write a script into a file. The -e flag takes a quoted string, usually
440
single quoted, and attempts to execute it as code, while the -M flag
441
takes a module name and tells the one-liner to use that module. When using a single
442
quote to enclose the string to -e, you also have to make use of perl's string
443
modifier 'q(string)' to single quote a string without confusing the shell.
444
Let's find out how many GSS sequences are in gbpri1.seq.gz. Note that I
445
have placed new-line characters in this to make it easier to read,
446
but in practice you wouldn't actually hit the return key until you
447
were ready to run the program.
448
<pre class="programlisting">
449
perl -MBio::SeqIO -e 'my $gss = 0; my $in = Bio::SeqIO->new(q(-file) => q(/usr/local/bin/gunzip -c gbpri1.seq.gz |),
450
q(-format) => q(Genbank)); while (my $seq = $in->next_seq) { $gss++ if ($seq->keywords =~ m/GSS/);}
451
print "There are $gss GSS sequences in gbpri1.seq.gz\n";'
453
</p></div><div class="section"><div class="titlepage"><div><h2 class="title" style="clear: both"><a id="caveats"/>5.�Caveats</h2></div></div><p>
454
Because Bioperl uses a single, generalized data structure to hold sequences from
455
all formats, it does impose its own structure on the data. For this reason, a
456
little common sense is necessary when using the system. For example, a person who
457
takes a flat file pulled directly from Genbank, and converts it to another Genbank
458
file with Bioperl, will be surprised to find subtle differences between the two files
459
(try "diff origfile newfile" to see what I am talking about). Just remember when using Bioperl that it was never designed to "round trip" your favorite formats. Rather, it
460
was designed to store sequence data from many widely different formats into a common
461
framework, and make that framework available to other sequence manipulation tasks in a
462
programmatic fashion.
463
</p></div><div class="section"><div class="titlepage"><div><h2 class="title" style="clear: both"><a id="errors"/>6.�Error Handling</h2></div></div><p>
464
If you gave an impossible filename to the first script, it
465
would have in fact died with an informative error message. In
466
object-oriented jargon, this is called "throwing an exception".
467
An example would look like:
469
<pre class="programlisting">
470
[localhost:~/src/Bioperl-live] birney% perl t.pl bollocks silly
472
------------- EXCEPTION -------------
473
MSG: Could not open bollocks for reading: No such file or directory
474
STACK Bio::Root::IO::_initialize_io Bio/Root/IO.pm:259
475
STACK Bio::SeqIO::_initialize Bio/SeqIO.pm:441
476
STACK Bio::SeqIO::genbank::_initialize Bio/SeqIO/genbank.pm:122
477
STACK Bio::SeqIO::new Bio/SeqIO.pm:359
478
STACK Bio::SeqIO::new Bio/SeqIO.pm:372
479
STACK toplevel t.pl:9
481
--------------------------------------
483
These exceptions are very useful when errors occur because you can
484
see the full route, or "stack trace", of where the error occurred and right at the end of this
485
is the line number of the script which
486
caused the error, which in this case I called t.pl.
488
The fact that these sorts of errors are automatically
489
detected and by default cause the script to stop is a good thing, but
490
you might want to handle these yourself. To do this you need to "catch
491
the exception" as follows:
492
<pre class="programlisting">
496
my $input_file = shift;
497
my $output_file = shift;
499
# we have to declare $seq_in and $seq_out before
500
# the eval block as we want to use them afterwards
506
$seq_in = Bio::SeqIO->new( -format => 'genbank',
507
-file => $input_file);
509
$seq_out = Bio::SeqIO->new( -format => 'fasta',
510
-file => ">$output_file");
512
if( $@ ) { # an error occurred
513
print "Was not able to open files, sorry!\n";
514
print "Full error is\n\n$@\n";
518
while( $seq = $seq_in->next_seq() ) {
519
$seq_out->write_seq($seq);
522
The use of eval { ... } accompanied by testing the value of the $@ variable (which
523
is set on an error) is a generic Perl approach, and will work with all errors
524
generated in a Perl program, not just the ones in Bioperl. Notice that we have
525
to declare $seq_in and $seq_out using "my" before the eval block - a common
526
gotcha is to wrap a eval block around some "my" variables inside the block - and
527
now "my" localizes those variables only to that block. If you "use strict" this
528
error will be caught. And, of course, you are going to "use strict" right?
529
</p></div></div></body></html>
b'\\ No newline at end of file'