For users with limited computational resources, or who are not familiar with command-line usage under Unix/Linux, web servers provide computational resources and a graphical user interface for convenient use. Furthermore, they allow a visual presentation of results for a quick overview and exploration of data sets. Our server is unique in that it provides the ability to construct and use sample-specific models, besides enabling assignment with generic models. We illustrate taxonomic metagenome assignment with the generic and sample-specific modes of the web server by analyzing metagenome samples of an acidophilic biofilm community from acid mine drainage (AMD) and of a cow rumen microbial community.
We provide a web server for taxonomic assignment of metagenome sequences with PhyloPythiaS. Software updates and custom-made models will be easily accessible to the community through the web server. Our server is unique in that it provides, in addition to generic models, the ability to build and use sample-specific models. The sample-specific mode allows additional sequences to be incorporated as a reference and relevant clades to be defined for a given community, e.g. based on accompanying 16S rRNA sample surveys. By taxonomic assignment of the AMD metagenome sample, we have shown how creating such a sample-specific model allowed us to increase the coverage, resolution and accuracy of taxonomic assignments while using only a small amount of reference data. Due to computational limitations, no cross-validation for estimating the hyperparameters is provided for sample-specific model construction, but our experiments show that the default parameters produce accurate assignments on both simulated and real metagenome samples. Furthermore, the assignments can be visualized and downloaded through an easy-to-use interactive interface.
For the AMD metagenome, we found BLASTN to perform similarly to the generic model, and the sample-specific model to show considerably improved assignment accuracy, in particular for lower taxonomic ranks. The NBC server mis-assigned a considerable fraction of the sequences and had an accuracy of about 45% at the domain level. MEGAN performed well on this data in terms of specificity, but showed lower sensitivity. To demonstrate use of the server and the generic model for exploratory analysis of a large metagenome sample generated with Illumina sequencing technology, we assigned scaffolds from the cow rumen metagenome in the generic mode. This showed high assignment consistency for the majority of the genome bins in comparison to a manually refined reference binning from the original study. With many high-throughput sequencing technologies being developed, it is important to assess how taxonomic assignment methods cope with the different technology-specific errors and read lengths. We have previously shown that PhyloPythiaS works well with assembled contigs from Sanger and Roche/454 sequencing technologies using metagenome samples from the Tammar wallaby gut and from the guts of obese human twins, respectively.
These technologies produce reads of different lengths and with different error characteristics, potentially affecting the performance of taxonomic assignment methods.
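The assignment consistency reported above for the cow rumen genome bins can be illustrated with a minimal sketch. It assumes one simple definition, which is not necessarily the measure used in the original study: for each reference bin, the fraction of its assigned scaffolds that share the bin's most frequent predicted taxon. The function name `bin_consistency`, the scaffold and bin identifiers, and the taxon labels are all hypothetical toy values, not data from the study or output of the web server.

```python
from collections import Counter, defaultdict


def bin_consistency(reference_bin, predicted_taxon):
    """Return {bin_id: (majority_taxon, fraction)} for each reference bin.

    reference_bin   -- dict mapping scaffold id -> reference bin id
    predicted_taxon -- dict mapping scaffold id -> predicted taxon label
    Scaffolds without a prediction are ignored (an assumption of this sketch).
    """
    per_bin = defaultdict(Counter)
    for scaffold, bin_id in reference_bin.items():
        taxon = predicted_taxon.get(scaffold)
        if taxon is not None:
            per_bin[bin_id][taxon] += 1

    result = {}
    for bin_id, counts in per_bin.items():
        # Most frequent predicted taxon in this bin and its share of the bin.
        taxon, n = counts.most_common(1)[0]
        result[bin_id] = (taxon, n / sum(counts.values()))
    return result


# Toy input: hypothetical scaffolds, bins and taxon labels.
reference_bin = {
    "s1": "binA", "s2": "binA", "s3": "binA",
    "s4": "binB", "s5": "binB",
}
predicted_taxon = {
    "s1": "taxon_1", "s2": "taxon_1", "s3": "taxon_2",
    "s4": "taxon_3", "s5": "taxon_3",
}

consistency = bin_consistency(reference_bin, predicted_taxon)
for bin_id, (taxon, fraction) in sorted(consistency.items()):
    print(f"{bin_id}: {taxon} ({fraction:.2f})")
```

A bin whose scaffolds all receive the same label scores 1.0, while a bin split between taxa scores lower. Weighting scaffolds by length instead of counting them equally would be a reasonable variation for real assemblies.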