What this is
- This research compares and methods for analyzing gut microbiota in migratory seagulls.
- Both methods were assessed for their ability to identify bacterial taxa and pathogenic bacteria.
- The study finds that while both methods yield consistent patterns, they differ significantly in taxonomic resolution, especially at the species level.
Essence
- provides a more comprehensive view of gut microbiota and pathogenic bacteria in migratory seagulls compared to , particularly at finer taxonomic levels.
Key takeaways
- Metagenomics identified 1,568 unique taxa compared to 67 for at the species level, highlighting its superior capability in detecting bacterial diversity.
- Correlation coefficients between the two methods decreased as taxonomic resolution increased, with a high correlation (r = 0.8701) at the phylum level and a low correlation (r = 0.0515) at the species level.
- Both methods showed consistent patterns in microbial community analysis, but metagenomics was more effective in discovering pathogenic bacteria, which were poorly detected by .
Caveats
- The study's findings are based on specific sampling locations and may not be generalizable to all migratory seagull populations.
- Potential biases in taxonomic identification due to the inherent limitations of each sequencing method could affect the results.
Definitions
- shotgun metagenomics: A method that sequences all DNA in a sample, providing comprehensive insights into microbial diversity and function.
- 16S rDNA sequencing: A targeted sequencing method that amplifies a specific region of the bacterial genome to identify microbial communities.
AI simplified
Introduction
The microbiomes of different host organisms are currently described by using culture-independent sequencing methods (). Shotgun metagenomic and 16S rDNA sequencing methods are commonly used to identify the taxonomic composition of microbial communities in several studies (;;;). Shotgun metagenomic sequencing of total DNA in a sample increases the resolution of species-level taxonomic assignments and provides direct gene functional annotations. It can also be used to analyse the nonbacterial parts of microbiomes, such as viruses and fungi. The high-resolution assemblies provide in-depth insight into microbial diversity (). 16S rDNA sequencing is a cost-effective and straightforward method to investigate the microbiota of different organisms. However, it has limited utility because of the length of the amplicon of the target gene. This method can only provide the bacterial members of microbial communities and brings about some sequence artifacts or bias in taxonomic profiles ().compared amplicon and metagenomic sequencing methods in animal metaorganisms. They demonstrated that many aspects of bacterial community characterization were consistent across the two sequencing methods. Their results facilitated the selection of appropriate methods across a wide range of host taxa ().compared the whole genomic shotgun sequencing with 16S rDNA method for human fecal microbiome. They found that shotgun metagenomics had multiple advantages compared with the 16S amplicon method, such as enhanced detection of bacterial species, increased diversity and prediction of genes (). Therefore, both sequencing methods have their own advantages and disadvantages and are widely used in current biological studies. [Harvey & Holmes, 2022] [de Melo et al., 2020] [Mackelprang et al., 2011] [Monaco & Kwon, 2017] [Qin et al., 2010] [Warnecke et al., 2007] [Franzosa et al., 2015] [Rausch et al. (2019)] [Rausch et al., 2019] [Ranjan et al. (2016)] [Ranjan et al., 2016]
The black-headed seagull (, hereafter seagull) is a migratory wild bird worldwide at present. The public is in close contact with migratory seagulls when they migrate from Siberia to southern China to overwinter at parks or rivers of the cities (;). Previously, we analysed the features of gut microbial communities and pathogens of this animal by 16S rDNA amplicon sequencing and culture methods (). The results indicated that many pathogenic bacteria, such as enteropathogenicandspp. were isolated, although little cross infection was found between humans and seagulls (). However, amplicon sequencing and culture method only reflect limited findings regarding the microbiota of animals. Currently, advanced methods have been used to perform in-depth analyses on the connection between bird activities and disease dissemination; in particular, metagenomics provides us with a broader characterization of microbe diversity (). In addition, assessing the taxonomic diversity of microbes is a key point to understanding the interaction between microbes and hosts (). To date, there is no direct comparison of the utility of the two methods mentioned above in the gut microbes of migratory seagulls. In this study, we compared the effects of profiling the seagull gut microbiome with metagenomic16S rDNA sequencing in detail to further explore the tradeoffs of each method. Larus ridibundus Escherichia coli Salmonella vs [Frixione et al., 2022] [Zeballos-Gross et al., 2021] [Liao et al., 2019] [Liao et al., 2019] [Ramirez-Martinez et al., 2018] [Wille et al., 2021]
Materials and Methods
Sample collection
A total of 160 seagull fecal samples were collected from November 2021 to January 2022. Samples were collected at two locations (HuanXiQiao, HXQ and HaiGeng, HG), and 80 samples were collected at each sampling location. HXQ (25.04 N, 102.69 E) is a central park in an urbanized area. HG (24.96 N, 102.65 E) is the surrounding area of Dianchi Lake in Kunming city, which is a more rural area. The distance between the two sampling locations was approximately 11 km. Fresh feces of seagulls were collected and stored in sterile containers. Twenty samples of each group were mixed into one for subsequent experimental analysis, and the mixed samples were collected approximately 5–10 m from each other for each sampling location.
Metagenomics
Genomic DNA was extracted using fecal sample total genomic DNA extraction kits (Tiangen, Beijing, China) according to the manufacturer’s instructions. The DNA quality was detected using Qubit 2.0 (Thermo Fisher Scientific, Waltham, MA, USA) and Nanodrop accordingly. Quantified genomic DNA was fragmented by sonication to a size of 350 bp and then end-repaired, A-tailed, and adaptor ligated using the NEB Next DNA Library Prep Kit for Illumina (NEB, Ipswich, MA, USA) according to the instructions (;;). DNA fragments with lengths of 300–400 bp were enriched by PCR (12 cycles) and purified using an AMPure XP system (Beckman Coulter, Brea, CA, USA). The libraries were analysed for size distribution by a 2100 Bioanalyzer (Agilent, Santa Clara, CA, USA) and quantified using real-time PCR. Finally, genome sequencing was performed on an Illumina NovaSeq 6000 by a paired-end 150 (PE150) reagent kit. [Pessoa et al., 2013] [Pham, Palden & DeLong, 2007] [Solonenko & Sullivan, 2013]
16S rDNA sequencing
Genomic DNA was extracted as mentioned above, and the samples were identical to metagenomic sequencing. The V3-V4 variable region of the 16S rDNA gene was amplified using a previously described method (). In general, PCR was performed using the KAPA HiFi Hot Start kit (Kapa Biosystems, Wilmington, MA, USA). The amplification procedure has been previously described (). The AMPure XP beads (Beckman Coulter, Brea, CA, United States) was used to purify and quantify PCR products. The Illumina Nextera barcodes was added by secondary PCR procedure, and then the products were purified again to clear the nontarget fragments. The normalized and pooled amplicons were used for sequencing by using an Illumina NovaSeq 6000 system (Illumina, San Diego, CA, USA). [Gu et al., 2021] [Liao et al., 2019]
Statistical analyses
For metagenomics, raw data containing adapters or low-quality reads were filtered according to the following rules using FASTP () (version 0.18.0): (1) reads containing more than 10% unknown nucleotides (N) were removed; (2) reads containing less than 50% bases with quality (Q-value) >20 were removed; and (3) reads aligned to the barcode adapter were removed. After filtering, the resulting clean reads were used for genome assembly. Clean reads of each sample were assembled individually using MEGAHIT (version 1.1.2) stepping over a k-mer range of 21 to 141 to generate sample (or group)-derived assembly (). Genes were predicted based on the final assembly contigs (>500 bp) using MetaGeneMark (version 3.38) (). The reads were aligned to predict genes using Bowtie (version 2.2.5) to count read numbers (). The final gene catalogue was obtained from nonredundant genes with gene read counts >2. The plots were graphed using the R ggplot2 package (). The unigenes were annotated using DIAMOND (version 0.9.24) () by alignment with the deposited unigenes in different databases, including Nr, KEGG, eggNOG, CAZy, PHI, VFDB, and CARD. Since microorganisms such as fungi and eukaryotes would be detected in the metagenome, to ensure the comparability of the 16S with the metagenomics, we only retained the annotations as bacteria in the metagenome to participate in the analyses and recalculated the relative abundance of each taxon by treating the abundance of bacteria as a total (100%). Specifically, we removed results annotated as fungi and eukaryotes in each metagenomic sample. The remaining results annotated as bacteria were considered as a whole, and the relative abundance of each taxonomic level in each sample was recalculated. [Chen et al., 2018] [Li et al., 2015] [Zhu, Lomsadze & Borodovsky, 2010] [Langmead & Salzberg, 2012] [Gustavsson et al., 2022] [Buchfink, Xie & Huson, 2015]
Chao1, ACE, Shannon, and Simpson indices were calculated using Python language (version 0.5.6). Rarefaction defined at the 97% sequence similarity cut-off value was used to evaluate whether the sequencing was sufficient to cover all taxa. Alpha diversity comparison between groups was calculated by the Wilcoxon rank test. A Bray‒Curtis distance matrix based on gene/taxon/function abundance was generated separately by the R Vegan package (). The multivariate statistical technique of PCoA was calculated and plotted using the R ggplot2 package. The Anosim test was calculated using the R project Vegan package. Heatmap graphs were plotted using the R heatmap package. [Huson et al., 2007]
Bowtie2 and bedtools (version 2.29.0) were used to cluster all contigs longer than 1.5 kb from all samples into bins with default parameters (). Then, contigs of samples were clustered to recover metagenome-assembled genomes using MetaBAT2 with default parameters (). Finally, CheckM (version 1.0.6) was applied to estimate the completeness and contamination for each bin (). Finally, the bins with high completeness (>80%) and fewer contaminants (<5%) were retained as draft genomes. The contamination was estimated from the number of multicopy marker genes identified in each marker set. [Alneberg et al., 2014] [Kang et al., 2019] [Parks et al., 2015]
For 16S rDNA sequencing, the raw data were trimmed and quality controlled according to the rules using FASTP (version 0.18.0) as mentioned above. The paired-end reads were merged, and barcodes were removed using PEAR 0.9.6, cutadapt 1.2.1, and Prinseq 0.20.4. QIIME 2.0, USEARCH 11.0, and R 3.2 were used for bioinformatics analysis (). The chimeric tags were removed, and effective tags were used. Sequences were clustered into operational taxonomic units (OTUs) according to 97% sequence similarity against the Silva 132 database using the UPARSE pipeline (). OTUs were named using SILVA taxonomic nomenclature. LEfSe analysis was used to identify bacteria with statistically significant differences in abundance between the two methods (). PCoA was used to visualize the similarities between the HXQ and HG groups. ANOSIM was used to compare the differences in microbial communities between the HXQ and HG groups. PICRUSt was used to predict the functional potential of the communities (). [Gu et al., 2020] [Edgar, 2013] [Chang, He & Dang, 2022] [Douglas, Beiko & Langille, 2018]
The numbers of common and unique bacterial OTU between the two methods were visualized in a Venn diagram by the R package (). The top 10 average abundances of taxa at different taxonomic levels were compared between the two methods and visualized by a stacked graph (). The Pearson correlation coefficient of the two methods was calculated using the R psych package, and the correlation significance was calculated using Fizh-Z transformation (). Scatter plots were used to show linear correlations between the two methods at different levels of taxa. Beta diversity of the two methods was compared with the Mantel test by the R vegan package (). Procrustes analysis based on the spatial distribution of PCoA of samples was used to determine the degree of correlation between the two methods (). [Chen & Boutros, 2011] [Conway, Lex & Gehlenborg, 2017] [Oliveira et al., 2020] [Segata et al., 2011] [Chernoff & Nielsen, 2009]
Negative control and contamination exclusion experiments have been reported in our previous study (). All python scripts used in the study were submitted to dryad (). DOI 10.5061/dryad.w9ghx3fvw [Liao et al., 2023]
Availability of data
All data generated or analysed during this study are included in this article. The sequence data have been deposited into the National Center for Biotechnology Information (NCBI),with BioProject accession number:for metagenome andfor 16S rDNA sequencing. https://www.ncbi.nlm.nih.gov/↗ PRJNA849401 PRJNA849758
Results
The average effect reads after quality control for the metagenome of total samples was 99.74% ± 0.05% and 90.09% ± 4.15% for the 16S rDNA sequencing method. The composition of the gut microbiome of the metagenome indicated that 99.72% was bacteria, followed by viruses (0.2%), fungi (0.07%), archaea (0.001%) and eukaryotes (0.0002%). Annotations as bacteria in the metagenome were recalculated for the relative abundance of each taxon for comparison with 16S rDNA sequencing. The total abundance of all bacteria was compared at different taxonomic levels for both sequencing methods. Statistics of common and unique OTUs at the phylum level indicated that 13 phylum taxa were shared both with metagenome and 16S rDNA sequencing, 11 and seven unique taxa existed for metagenome and 16S rDNA sequencing, respectively (). At the class level, 15 taxa were shared with both methods, but 31 and 19 unique taxa for each of them (). At the order level, 32 common taxa were shared by the two methods, with 61 and 33 unique taxa for each method (). At the family level, there were 61 common taxa but 118 and 49 unique taxa, respectively (). The largest differences were found at the species level; only 13 species taxa were shared with both methods, while 1,568 and 67 taxa were identified for metagenome and 16S rDNA sequencing, respectively, as shown in. The cladogram showed that commonly shared taxa were Firmicutes, Proteobacteria, Actinobacteria, Bacteroidetes and Cyanobacteria at the phylum level of the two methods. Bacilli and Gammaproteobacteria at the class level; Enterobacterales, Lactobacillales and Pseudomonadales at the order level;,,,,at the family level; and,,,,andat the genus level were the shared taxa of metagenome and 16S rDNA sequencing (and).,,,,,,,andwere unique taxa for the metagenome compared with,,andfor 16S rDNA sequencing (and). Fig. 1A Fig. 1B Fig. 1C Fig. 1D Fig. 1E Fig. 1F Table S1 Fig. 1F Table S1 Enterobacteriaceae Enterococcaceae Lactobacillaceae Moraxellaceae Pseudomonadaceae Aerococcus Catellicoccus Lactobacillus Acinetobacter Psychrobacter Pseudomonas Erwiniaceae Hafniaceae Escherichia Shigella Erwinia Escherichia albertii Escherichia coli Shigella sonnei Erwinia gerundensis Escherichia-Shigella Hafnia-Obesumbacterium Lelliottia Pantoea
Binning analysis results indicated that four high-quality and 87 low-quality bins were generated from all samples (). The features of high-quality bins were shown in. Bin_ALL.85 had the highest contig sequencing depths compared with Bin_ALL.10, which had the lowest. However, Bin_ALL.10 showed the highest GC content compared with the others. The annotation results of bins at the genus level demonstrated thatspp.,spp.,spp. andspp. were the taxonomies corresponding to Bin_ALL.77, Bin_ALL.85, Bin_ALL.10 and Bin_ALL.46, respectively (). Fig. S1A Fig. S1B Fig. S1C Pseudomonas Pantoea Kocuria Lactobacillus
The top 10 relative abundances of different taxonomic levels of metagenome16S rDNA sequencing were shown in. At the phylum level, Proteobacteria, Firmicutes, Actinobacteria and Bacteroidetes accounted for 98.53%, 1.42%, 0.048% and 0.008% of the metagenome for all the samples compared with 88.89%, 10.62%, 0.27% and 0.21% of 16S rDNA sequencing (). At the class level, Gammaproteobacteria and Bacilli accounted for 98.53% and 1.40% of the metagenome compared with 88.92% and 10.46% of 16S rDNA sequencing (). Enterobacterales (92.61%), Pseudomonadales (5.99%) and Lactobacillales (1.34%) were the most abundant taxa at the order level of the metagenome, while 74.41%, 14.56% and 10.53% of the relative abundance of the 16S rDNA sequencing results corresponded to each taxon (). The most relative abundance of metagenome at family level were(90.08%),(5.90%) and(3.30%). However,(75.69%),(7.76%),(7.03%),(4.06%) and(3.52%) were the most distributed taxa in the 16S rDNA amplicon results ().(80.45%),(8.09%),(3.69%),(2.75%),(1.10%), and(0.74%) were the most relatively abundant genera in the metagenome, while(61.91%),(7.11%),(5.70%),(5.33%),(5.12%),(4.08%),(3.57%) and(2.70%) were the most abundant genera in the 16S rDNA sequencing (). Finally, the largest differences were identified at the species level of the two methods. In general,(73.85%),(12.10%),(4.36%),(4.13%),(1.09%) and(0.94%) were the most abundant taxa of the metagenome at the species level, compared with(32.69%),(27.15%),(14.36%) and(12.65%) according to 16S rDNA sequencing (). The total abundance of taxa for all samples was shown in. vs Enterobacteriaceae Moraxellaceae Erwiniaceae Enterobacteriaceae Moraxellaceae Pseudomonadaceae Enterococcaceae Lactobacillaceae Escherichia Psychrobacter Shigella Erwinia Klebsiella Salmonella Escherichia-Shigella Pseudomonas Pantoea Psychrobacter Lelliottia Catellicoccus Lactobacillus Hafnia-Obesumbacterium Escherichia coli Escherichia albertii Shigella sonnei Erwinia gerundensis Salmonella enterica Klebsiella pneumoniae Psychrobacter frigidicola Catellicoccus marimammalium Lactobacillus aviaries Pseudomonas fragi Fig. 2 Fig. 2A Fig. 2B Fig. 2C Fig. 2D Fig. 2E Fig. 2F Table S2
Alpha diversity analysis showed statistical significance between the HG and HXQ groups of the metagenome for the Ace, Chao1, Shannon and Simpson indices (–upwards). Similar results could be identified between the two groups of 16S rDNA sequencing for all the indices (–downwards). Beta diversity revealed that high diversity was found in the HG group compared with the HXQ group by metagenomic and 16S rDNA sequencing methods (and), but no significant differences were found. The details of the diversity statistical results between the HG and HXQ groups were shown in. Figs. 3A 3D Figs. 3A 3D Figs. 4A 4B Table 1
Pearson correlation analysis of the two sequencing methods indicated that statistical significance could be found at the phylum/family/genus/species levels. However, the R value of consistency gradually decreased with the refinement of the taxonomic levels (–). A high correlation coefficient (r = 0.8701) was found at the phylum level compared with a low correlation coefficient (r = 0.0515) at the species level. Beta diversity by the Mantel test revealed that there was no statistically significant correlation at the phylum and species levels, but significant correlations were identified at the family and genus levels (–). Specifically, a high correlation coefficient (r = 0.936) was found at the genus level. Similar results could be found for the procrustes tests of the two sequencing methods (–). Significant statistical correlations were recognized only at the family and genus levels, with the highest correlation coefficient (r = 0.981) at the genus level. Figs. 5A 5D Figs. 5E 5H Figs. 5I 5L

Comparison of the metagenome and 16S rDNA sequencing results for the number of bacterial taxa recovered by each method from fecal samples of the black-headed seagulls at the following levels of classification.

Comparison of the relative abundance of the metagenome and 16S rDNA sequencing results in this study.

Comparison of alpha diversity for metagenome and 16S rDNA amplicon results.

Comparison of beta diversity for metagenome and 16S rDNA amplicon results.

Correlation analysis and beta diversity comparison of metagenome and 16S rDNA sequencing results in this study.
| Index | valueP | ||
|---|---|---|---|
| Metagenome | 16S rDNA sequencing | ||
| α-diversity | Ace | 0.0078 | 0.02 |
| Chao1 | 0.0078 | 0.014 | |
| Shannon | 0.0078 | 0.02 | |
| Simpson | 0.0078 | 0.02 | |
| β-diversity | Anosim | 0.222 | 0.11 |
Discussion
Bird diseases such as avian influenza, avian cholera and West Nile fever have occurred frequently since the 20th century, leading to the death of wild birds, poultry and humans (). Birds have strong environmental adaptability and bring potential risks to the dissemination of diseases. Several studies have characterized the relationship between bird migration and disease transmission (;;). In particular, our previous study revealed that,, andwere the most major families of gut microbial communities of seagulls by using 16S rDNA sequencing. Enteropathogenicandwere the most isolated pathogenic bacteria of this animal, although little cross infection was found between humans and seagulls by PFGE method (). Metagenomic and 16S rDNA sequencing are the most commonly used methods to identify the features of microbial communities. Several studies have demonstrated their usage and applicability in different study areas but have not referred to migratory seagulls of gut microbiomes. Therefore, we revealed the similarities and differences between the two methods in this study to further explore the characteristics of seagull gut microbial communities. [Craft, 2015] [Jarma et al., 2021] [Phan et al., 2013] [Price et al., 2016] [Liao et al., 2019] Enterococcaceae Enterobacteriaceae Mycoplasmataceae E. coli Salmonella
16S rDNA amplicon sequencing is widely used in the field of microorganisms. It has a lower cost and is the preferred omics method for microbial community analysis (). However, the amplified fragments of this method are approximately 400 bp, which limits its application. In this study, we found that the bacterial OTUs composition and alpha/beta diversity of the gut microbes of seagulls were similar by using metagenomic and 16S rDNA sequencing. As the taxonomic hierarchy of bacterial OTUs moved from the phylum to the species level, the differences between the two methods gradually increased. Without considering relative abundance, there were 61 taxa detected by both methods at the family level, only 49 taxa unique to the 16S amplicon, and 118 taxa unique to the metagenome. Furthermore, this difference was even larger at the species level. It was considered that the identification range of bacterial OTUs was greatly limited due to primer amplification and short sequencing lengths, the more refined taxonomic levels, and the greater limitations of 16S rDNA sequencing (). [Klindworth et al., 2013] [Eloe-Fadrosh et al., 2016]
The abundance of taxa also influenced the results of differences. At the phylum level, the correlation coefficient between the two sequencing methods was up to 0.8701. However, at the genus level, the correlation coefficient was only 0.3215. Although increasing the threshold for the abundance filter might improve the correlation, the overall difference still existed. In general, relatively consistent patterns and reliability could be identified by both methods, but the results varied following the refinement of taxonomic levels. Taking both abundance and diversity into consideration, even at the genus level, the two sequencing methods still had a highly significant correlation of 0.936 (Mantel test). This indicated that although there were large differences in the numbers and abundance of bacterial OTUs of the two methods in terms of taxonomic levels, the final results of the samples were consistent. Based on the comparative analysis, we considered the 16S rDNA sequencing method to be the “sparrow type”, with low cost, wide species coverage, and reliable abundance assessment. Metagenomics was recognized as the “peacock type”, with more powerful data analysis ability (). Similar findings have been reported in previous studies (;;). They also demonstrated that many aspects of bacterial community characterization were consistent across methods (). Therefore, both metagenomic and 16S sequencing were effective methods to analyse the microbial community, and the method selection possibly depended on different research purposes. [Schloissnig et al., 2013] [Peterson et al., 2021] [Ravi et al., 2018] [Rubiola et al., 2022] [Rausch et al., 2019]
However, it was worth noting that at the species level, the metagenomic sequencing approach was able to detect many pathogenic bacteria, such as,,,and. In contrast, 16S rDNA sequencing was not as effective in detecting pathogenic bacteria and hardly detected any bacteria that were pathogenic to humans. Choi et al. analyzed theprevalence and pathogen-microbiota relationships in Barn Swallows by using 16S rDNA sequencing and culture methods. They found that 16S rDNA sequencing was better than culture method for detecting, and highlighted the value of 16S rDNA gene sequencing for monitoring pathogens in wild birds (). Therefore, from this point of view, the metagenomic sequencing method was more suitable for the discovery and detection of pathogenic bacteria, although in general, the results of the two sequencing methods were consistent. E. albertii S. sonnei S. enterica K. pneumoniae S. flexneri Salmonella Salmonella [Choi ON et al., 2021]
Conclusions
In this study, we compared the metagenome and 16S rDNA amplicon results to further demonstrate the features of migratory seagulls of gut microbiomes. Relatively consistent patterns and reliability could be identified by both methods to demonstrate the characteristics of gut microbial communities of seagulls, but the results varied following the refinement of taxonomic levels. Metagenomic sequencing was more suitable for the discovery and detection of pathogenic bacteria of gut microbiota in seagulls. Although there were large differences in the numbers and abundances of the two sequencing methods in terms of taxonomic levels, the final results of the samples were consistent.