cd moving_pictures_procrustes_demo
Defining reference filepaths with environment variables
-------------------------------------------------------
Throughout this tutorial we use a reference sequence collection, tree, and taxonomy derived from the Greengenes database. Because these files may be stored in different locations on your system, we'll define them as environment variables, using the paths as they would be if you're running in a QIIME virtual machine (e.g., on AWS or with VirtualBox). We'll then reference these environment variables throughout the tutorial wherever the files are needed. If you're not working on either of these systems, you'll have to modify these paths. Run the following::
export QIIME_DIR=$HOME/qiime_software
export reference_seqs=$QIIME_DIR/gg_otus-4feb2011-release/rep_set/gg_97_otus_4feb2011.fasta
export reference_tree=$QIIME_DIR/gg_otus-4feb2011-release/trees/gg_97_otus_4feb2011.tre
export reference_tax=$QIIME_DIR/gg_otus-4feb2011-release/taxonomies/greengenes_tax.txt
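Before continuing, it can help to confirm that each variable is set and points to a readable file. The following is a minimal sketch, not part of QIIME: ``check_reference_files`` is a hypothetical helper name, and the code assumes a POSIX-compatible shell.

```shell
# Sketch: confirm each named variable is set and points to a readable file.
# check_reference_files is a hypothetical helper, not a QIIME command.
# Usage: check_reference_files reference_seqs reference_tree reference_tax
check_reference_files() {
    status=0
    for name in "$@"; do
        # Look up the value of the variable whose name is stored in $name.
        eval "path=\${$name:-}"
        if [ -z "$path" ]; then
            echo "$name is not set" >&2
            status=1
        elif [ ! -r "$path" ]; then
            echo "$name points to a missing or unreadable file: $path" >&2
            status=1
        fi
    done
    return $status
}
```

The function returns a non-zero status if any variable fails the check, so it can also guard a script that runs the later steps.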
Pick OTUs on Illumina data and generate an OTU table (including taxonomic assignment of samples)::
pick_reference_otus_through_otu_table.py -i ./subsampled_illumina_seqs.fna -o ./illumina_ucrC/ -r /software/gg_otus-4feb2011-release/rep_set/gg_97_otus_4feb2011.fasta -t /software/gg_otus-4feb2011-release/taxonomies/greengenes_tax.txt -aO8 -p ./otu_params.txt
pick_closed_reference_otus.py -i ./subsampled_illumina_seqs.fna -o ./illumina_ucrC/ -r $reference_seqs -t $reference_tax -aO8 -p ./otu_params.txt
Determine the number of sequences per sample and related statistics. You'll want to choose an even sampling depth for the beta diversity analysis from these data. In this tutorial we choose the smallest number of sequences per sample (239)::
per_library_stats.py -i ./illumina_ucrC/uclust_ref_picked_otus/otu_table.biom
print_biom_table_summary.py -i ./illumina_ucrC/uclust_ref_picked_otus/otu_table.biom
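One way to pick the even sampling depth is to take the minimum per-sample count. The sketch below assumes you have copied the per-sample counts into a two-column, tab-separated file (sample ID, then count). That file layout and the ``min_sampling_depth`` helper are assumptions for illustration, not output or functionality provided by QIIME.

```shell
# Sketch: print the smallest per-sample count from a two-column,
# tab-separated file (sample ID, sequence count). The input layout is an
# assumption; it is not the native output of print_biom_table_summary.py.
min_sampling_depth() {
    awk -F '\t' '
        # Consider only rows whose second field is a plain number,
        # which also skips any header line.
        NF >= 2 && $2 ~ /^[0-9]+(\.[0-9]+)?$/ {
            if (!seen || $2 + 0 < min) { min = $2 + 0; seen = 1 }
        }
        # Fail with a non-zero status if no numeric rows were found.
        END { if (seen) print int(min); else exit 1 }
    ' "$1"
}
```

For example, ``min_sampling_depth counts.tsv`` (where ``counts.tsv`` is a hypothetical filename) would print the value to pass to ``-e`` in ``beta_diversity_through_plots.py``. Choosing the minimum keeps every sample; a higher depth would drop the samples that fall below it.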
Compute UniFrac distances between samples, run principal coordinates analysis, and build 3D PCoA plots::
beta_diversity_through_plots.py -i ./illumina_ucrC/uclust_ref_picked_otus/otu_table.biom -e 239 -o ./illumina_ucrC/bdiv_even239/ -t /software/gg_otus-4feb2011-release/trees/gg_97_otus_4feb2011.tre -m ./illumina_map.txt -aO8 -p ./bdiv_params.txt --suppress_2d_plots
beta_diversity_through_plots.py -i ./illumina_ucrC/uclust_ref_picked_otus/otu_table.biom -e 239 -o ./illumina_ucrC/bdiv_even239/ -t $reference_tree -m ./illumina_map.txt -aO8 -p ./bdiv_params.txt --suppress_2d_plots
Repeat the above steps on the 454 data::
pick_reference_otus_through_otu_table.py -i ./subsampled_454_seqs.fna -o ./454_ucrC/ -r /software/gg_otus-4feb2011-release/rep_set/gg_97_otus_4feb2011.fasta -t /software/gg_otus-4feb2011-release/taxonomies/greengenes_tax.txt -aO8 -p ./otu_params.txt
per_library_stats.py -i ./454_ucrC/uclust_ref_picked_otus/otu_table.biom
beta_diversity_through_plots.py -i ./454_ucrC/uclust_ref_picked_otus/otu_table.biom -e 135 -o ./454_ucrC/bdiv_even135/ -t /software/gg_otus-4feb2011-release/trees/gg_97_otus_4feb2011.tre -m ./454_map.txt -aO8 -p ./bdiv_params.txt --suppress_2d_plots
pick_closed_reference_otus.py -i ./subsampled_454_seqs.fna -o ./454_ucrC/ -r $reference_seqs -t $reference_tax -aO8 -p ./otu_params.txt
print_biom_table_summary.py -i ./454_ucrC/uclust_ref_picked_otus/otu_table.biom
beta_diversity_through_plots.py -i ./454_ucrC/uclust_ref_picked_otus/otu_table.biom -e 135 -o ./454_ucrC/bdiv_even135/ -t $reference_tree -m ./454_map.txt -aO8 -p ./bdiv_params.txt --suppress_2d_plots
Perform Procrustes analysis::
There will now be several results of interest. For the Procrustes analysis you can find the statistical results in ``./454_v_illumina/unweighted_unifrac_pc_unweighted_unifrac_pc_procrustes_results.txt`` and you can view the Procrustes plot by opening ``./454_v_illumina/plots/pc1_transformed_3D_PCoA_plots.html`` in a web browser.
Comparing data sets with different sample ids
---------------------------------------------
In the cases described here, we always have the same samples in our two principal coordinate matrices. If that is not the case for your study, you'll need to pass a sample id mapping file (**which is different from a QIIME metadata mapping file**). For a description of this file format, see :ref:`sample_id_map`.