gcMeta: integrated microbiome research platform

Metagenome assembly and annotation

(microbiome sample - NGS reads - quality control - assembly and validation - binning - genome structural analysis - database annotation): In this workflow, short reads are assembled into contigs. The contigs can then be binned by similarity to reconstruct partial to complete genomes of microorganisms, and the assembled sequences are used for subsequent structural and functional analysis. First, the input NGS reads undergo quality control to produce clean reads (contamination removal with Bowtie, quality inspection with FastQC and trimming with Trimmomatic). Next, the clean reads are assembled into contigs and scaffolds (MEGAHIT). After assembly, the contigs and scaffolds are clustered into bins (MaxBin). The contigs and scaffolds are then used for structural analysis (CRISPR detection with PILER-CR, gene prediction with Prodigal, tRNA identification with tRNAscan), and finally the predicted genes are annotated (Prokka, PfamScan, InterProScan).
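The "assembly and validation" step is not detailed above. As an illustration only, one widely used contiguity statistic for judging an assembly is N50: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembled bases. The sketch below assumes the contig lengths have already been extracted from the assembler output; the function name `n50` is hypothetical and is not part of gcMeta or MEGAHIT.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bases)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig down until half of all bases are covered.
    for length in lengths:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

A higher N50 generally indicates a less fragmented assembly, but it says nothing about correctness, so it is usually read alongside other checks.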


Metagenomic 16S rRNA sequencing taxonomy assignment

(microbiome sample - NGS reads - quality control - taxonomy assignment - downstream analysis): In this workflow, users can analyze 16S rRNA sequencing data through the following steps: sequence quality control, construction of an operational taxonomic unit (OTU) table, alpha/beta diversity analysis, LEfSe analysis and function prediction. First, the input NGS reads are trimmed and used for taxonomic assignment with QIIME2. The resulting taxonomic table is then used for alpha/beta diversity analysis with QIIME2 and for other downstream analyses (function prediction with PICRUSt and biomarker discovery with LEfSe).
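To make the alpha/beta diversity step more concrete, here is a minimal sketch of two common metrics computed from OTU counts: the Shannon index (alpha diversity, within one sample) and Bray-Curtis dissimilarity (beta diversity, between two samples). This is plain Python for illustration, not the QIIME2 implementation; the function names are hypothetical, and the natural logarithm is an assumption (other tools may use base 2).

```python
import math


def shannon_index(counts):
    """Shannon diversity of one sample from its per-OTU counts (natural log)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    # OTUs with zero reads contribute nothing to the sum.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two samples over the same OTUs."""
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must list the same OTUs in the same order")
    denominator = sum(sample_a) + sum(sample_b)
    if denominator == 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(sample_a, sample_b)) / denominator


if __name__ == "__main__":
    print(shannon_index([10, 10]))
    print(bray_curtis([5, 0], [0, 5]))
```

Bray-Curtis ranges from 0 (identical composition) to 1 (no shared OTUs), which is why it is a convenient input for ordination plots.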


Reference-based metagenome taxonomy assignment

(microbiome sample - NGS reads - quality control - taxonomy assignment - downstream analysis): Read-based taxonomic assignment compares unassembled DNA or mRNA sequence reads directly against reference databases to assign a taxonomic name to each read. First, the input NGS reads are cleaned (contamination removal with Bowtie, quality inspection with FastQC and trimming with Trimmomatic). The clean reads are then used for taxonomic assignment (MetaPhlAn2) and functional assignment (HUMAnN2).
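The general idea of comparing reads against a reference database can be sketched with a toy k-mer voting classifier. This is deliberately not MetaPhlAn2's algorithm; it only illustrates reference-based assignment on made-up sequences. All names (`build_index`, `classify_read`, `taxon_A`, `taxon_B`) and the value of `k` are hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k):
    """Map each k-mer to the set of taxa whose reference sequence contains it."""
    index = {}
    for taxon, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(taxon)
    return index


def classify_read(read, index, k):
    """Assign a read to the taxon with the most taxon-specific k-mer hits."""
    votes = Counter()
    for kmer in kmers(read, k):
        taxa = index.get(kmer, ())
        if len(taxa) == 1:  # k-mers shared between taxa are not informative
            votes[next(iter(taxa))] += 1
    if not votes:
        return None  # read stays unclassified
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Made-up reference sequences, for illustration only.
    refs = {"taxon_A": "ACGTACGTTAGC", "taxon_B": "TTTTGGGGCCCC"}
    idx = build_index(refs, k=4)
    for read in ("ACGTTAGC", "GGGGCC", "AAAAAA"):
        print(read, classify_read(read, idx, k=4))
```

Counting the assigned labels across all reads would give a crude relative-abundance profile; real profilers add steps such as marker selection and length normalization.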


Genome assembly and annotation

(isolated sample - NGS/TGS reads - quality control - assembly and validation - genome structural analysis - database annotation): For NGS reads, the short reads are trimmed with Trimmomatic and then assembled into contigs with SPAdes. The assembled sequences are used for further structural and functional analysis, for example CRISPR detection with PILER-CR, gene prediction with Prodigal and tRNA identification with tRNAscan. Finally, the predicted genes are annotated (Prokka, PfamScan, InterProScan).
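As a rough sketch of how the first stages of this workflow could be chained outside the platform, the function below only builds the command lines for Trimmomatic, SPAdes and Prodigal; it does not run them. It assumes paired-end reads, that the `trimmomatic`, `spades.py` and `prodigal` executables are on the PATH (for example from a conda installation), and illustrative trimming thresholds. The function name, file names and directory layout are hypothetical, and the commands gcMeta itself runs are not documented here.

```python
from pathlib import Path


def isolate_commands(reads_1, reads_2, outdir):
    """Build (but do not run) trimming, assembly and gene-prediction commands."""
    out = Path(outdir)
    trimmed = out / "trimmed"
    p1, u1 = trimmed / "R1.paired.fq.gz", trimmed / "R1.unpaired.fq.gz"
    p2, u2 = trimmed / "R2.paired.fq.gz", trimmed / "R2.unpaired.fq.gz"
    assembly = out / "spades"
    return [
        # Paired-end trimming; the thresholds are illustrative, not tuned values.
        ["trimmomatic", "PE", str(reads_1), str(reads_2),
         str(p1), str(u1), str(p2), str(u2),
         "SLIDINGWINDOW:4:20", "MINLEN:36"],
        # Assembly of the surviving read pairs.
        ["spades.py", "-1", str(p1), "-2", str(p2), "-o", str(assembly)],
        # Gene prediction on the contigs written by SPAdes.
        ["prodigal", "-i", str(assembly / "contigs.fasta"),
         "-a", str(out / "genes.faa"), "-o", str(out / "genes.gbk")],
    ]


if __name__ == "__main__":
    for cmd in isolate_commands("sample_R1.fq.gz", "sample_R2.fq.gz", "results"):
        print(" ".join(cmd))
```

Each list could be passed to `subprocess.run(cmd, check=True)` once the output directories exist; keeping command construction separate from execution makes the pipeline easy to inspect before anything runs.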


RNA-seq analysis

(isolated sample - NGS reads - quality control - alignment - assembly and differential expression analysis): This workflow allows users to identify differentially expressed genes and transcripts by comparing RNA-seq data between samples. First, the input NGS reads are trimmed into clean reads after quality control (quality inspection with FastQC and trimming with TrimGalore). Next, the clean reads are aligned to a reference genome with Hisat2. The alignments are then used as input for transcript assembly. After assembly, differential expression analysis of the assembled transcripts and genes is performed with DESeq2.
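Before samples can be compared, raw counts have to be put on a common scale. The sketch below illustrates the median-of-ratios idea that DESeq2 uses to estimate per-sample size factors, written in plain Python for clarity. It is not the DESeq2 code, it covers only normalization (not dispersion estimation or testing), and the function name and input layout are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts with one entry per sample.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for gene in counts:
        if any(c <= 0 for c in gene):
            continue  # the geometric mean is zero, so the gene is skipped
        log_geo_mean = sum(math.log(c) for c in gene) / n_samples
        for j, c in enumerate(gene):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    if not log_ratios[0]:
        raise ValueError("no gene has non-zero counts in every sample")
    # The median ratio to the per-gene geometric mean is robust to a few
    # strongly differentially expressed genes.
    return [math.exp(median(r)) for r in log_ratios]


if __name__ == "__main__":
    # Made-up counts: sample 2 was sequenced about twice as deeply as sample 1.
    table = [[10, 20], [100, 200], [5, 10], [0, 7]]
    print(size_factors(table))
```

Dividing each sample's counts by its size factor gives values that can be compared across samples.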