We next examined the genomic bins using antiSMASH 2.0. The majority of the clusters overall were uncategorized, followed by saccharides and fatty acids. Non-ribosomal peptide synthases, bacteriocins, terpenes, and polyketide synthases were also common. Arylpolyenes, lassopeptides, and lantipeptides were also predicted, as was one instance each of a siderophore and a butyrolactone. MW5 had 229 clusters in 33 bins. MW6 had 371 clusters in 22 bins. DOM had 10 clusters in 158 bins. Notably, the CPR genomes that dominate the water samples have few predicted secondary metabolites on average. Because MW5 was dominated by these genomes, its density of clusters is correspondingly lower. However, some of the individual CPR bins are dense with biosynthetic clusters. Thus, while poor representation of CPR in existing databases may reduce the utility of this approach, some of the genomes certainly have detectable clusters. Grouping the genomes phylogenetically, the most clusters occur in the Planctomycetes OM190. A range of cluster densities was apparent in the rest of the bins. Notably, ladderane biosynthesis, a hallmark of the Planctomycetes, was detected by antiSMASH in all eight of the Planctomycete assemblies, confirming that these are all true Planctomycete genomes. The antiSMASH results show a rich diversity of secondary metabolites in the anammox genomes. Specifically enriched are fatty acids, saccharides, bacteriocins, and terpenes. The OM190 genome was additionally enriched in non-ribosomal peptide synthases, and anatoxin production was predicted. While anatoxin is known to come from cyanobacteria and not from Planctomycetes, its known biosynthetic pathway involves polyketide synthases, of which 18 are predicted by antiSMASH in this genome. Thus, while this cluster does not likely encode a cyanotoxin, the biosynthetic potential of this genome could certainly produce toxic secondary metabolites.
Indeed, a large number of the predicted secondary metabolites are biologically active molecules that may target other cells in the microbial community and could potentially have side effects on mammals. We saw evidence of rich secondary metabolite biosynthetic potential in several other genomes as well, including representatives of OP3, OP11, Acidobacteria, Bacteroidales, Chlorobi, Chloroflexi, Domibacillus, Entotheonella, Leptonema, Nitrospira, Sphingomonas, and Spirochaetes, as well as genomes from DOM. Notably, we assembled an incomplete genome that appears to be related to cyanobacterial toxin producers. Its best RAPSEARCH hit was to a Planktothrix agardhii genome. The 500 kb fragment is rich in non-ribosomal peptide synthases, which are another toxin production system in the cyanobacteria and whose products can poison humans. To test whether this might be a toxin producer, we built a BLAST database of microcystin genes found on NCBI and compared them to the genome fragment using TBLASTX. We found numerous hits longer than 300 bp throughout the fragment, but the percent identity was roughly 40%, indicating that the sequences have diverged. Overall, antiSMASH predicts an enrichment in biosynthetic clusters with antimicrobial activity, including bacteriocins, non-ribosomal peptide synthases, polyketide synthases, and lassopeptides. While many antibiotic compounds may have broad targets or even non-antagonistic effects, bacteriocins usually have very specific antibiotic activity, often against closely related strains. The prevalence of predicted bacteriocins in the genomes suggests direct competition between the organisms these genomes represent.
For example, the Brocadiaceae Planctomycete genomes that co-occur in MW6 are predicted to have on average one bacteriocin per genome, which could be used to compete with the related strains.
Overall, we find that the metagenomic communities present in groundwater reflect the measured chemical conditions: we measured high nitrogen and DOC, and observed a microbial community largely dominated by nitrifier, denitrifier, and anammox bacteria.
Our analysis revealed strain-level variation within key members of this community as well as the potential for rich biosynthetic capacity. We also found evidence for niche specialization based on analysis of the genetic pathways present. Such niche specialization between species in an anammox community was recently reported for a partial nitritation-anammox reactor in a wastewater treatment plant. We find evidence that a similar microbial community is present in shallow, nitrate-rich groundwater, and that there are multiple anammox strains within a single well. The prevalence of the anammox genomes at over 10% abundance suggests that these bacteria are major drivers of the natural geochemistry of this environment. An implicit consequence is conversion of ammonium and nitrate into nitrite and N2 gas. Additionally, nitrite-dependent anaerobic oxidation of methane may be coupled to anammox in this community, reducing potential greenhouse gas emissions.
An important aspect of the present study is that the source of the nitrate is cow manure, which also carries a considerable carbon load that supports microbial metabolism. Nitrates derived from synthetic fertilizers do not carry a carbon source and thus may be associated with a considerably different microbial community. Thus, different sources of nitrate could have different potential for bioremediation. Furthermore, we must consider the source of the microbial community in the environment. The Central Valley of California was once an extensive wetland, and wetland-associated microbial communities perform nitrifier, denitrifier, n-damo, and anammox reactions. If the source of the community were different, we might expect to see a different set of metabolic processes with different implications for water quality and greenhouse gas emissions.
An overlap in anaerobic nitrogen and sulfur redox reactions was shown by Canfield et al. in the oxygen minimum zone of the ocean. Our metagenomic data and chemical data indicate the potential for a similar overlap in nitrogen and sulfur cycles in groundwater, with OP11 Microgenomates specifically involved through assimilatory sulfur reduction. As shown previously, nitrate levels were highest in MW5, and lower in MW6 and DOM. The sulfate levels follow a similar trend: MW5, 68.8 ppm; MW6, 15.3 ppm; DOM, 2.3 ppm. The microbial abundances and corresponding chemical pathway analysis suggest that these pathways overlap in organisms that exist in the appropriate nutrient conditions. Furthermore, the presence of Candidatus Methylomirabilis with the anammox communities in MW6 and DOM supports the findings of Shen et al. that denitrification may be coupled to methane oxidation, reducing potential methane emissions from degrading manure.
The high abundance of anammox and associated nitrifier and denitrifier bacteria in the nitrate-rich samples suggests that excess nitrate and ammonium in groundwater may be naturally remediated [or mineralized] to N2 by the endogenous microbiota. The presence of a natural microbial community that closely resembles the nitritation-anammox activated sludge community used for sewage wastewater denitrification could also be taken as an indication that the shallow groundwater in the Central Valley is recharged from sources similar to sewage wastewater. Based on extensive, controlled studies of this community, it appears possible that simply by decreasing the input of manure into the groundwater, the nitrogen pollutants could decrease below harmful levels. This implication holds true in the shallow groundwater as well as in the deep groundwater, where we still see evidence of the nitritation-anammox community despite lower levels of nitrate.
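The sulfate gradient reported above (MW5, 68.8 ppm; MW6, 15.3 ppm; DOM, 2.3 ppm) can be expressed as fold changes between samples. The following is a minimal sketch of that arithmetic using only the three values given in the text; the function name `fold_changes` and the ordering of the samples are illustrative assumptions, not part of the original analysis.

```python
# Sulfate concentrations (ppm) as reported in the text.
sulfate_ppm = {"MW5": 68.8, "MW6": 15.3, "DOM": 2.3}


def fold_changes(levels, order):
    """Return the ratio of each sample to the next one in `order`.

    `order` is an assumed ranking from highest to lowest concentration;
    it is supplied by the caller, not inferred from the data.
    """
    return {
        f"{a}/{b}": levels[a] / levels[b]
        for a, b in zip(order, order[1:])
    }


ratios = fold_changes(sulfate_ppm, ["MW5", "MW6", "DOM"])
overall = sulfate_ppm["MW5"] / sulfate_ppm["DOM"]

for pair, ratio in ratios.items():
    print(f"{pair}: {ratio:.1f}-fold")
print(f"MW5/DOM: {overall:.1f}-fold")
```

With these figures, sulfate falls roughly 4.5-fold from MW5 to MW6 and roughly 6.7-fold from MW6 to DOM, for an overall difference of about 30-fold between MW5 and DOM.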
The nitrate:DOC ratio is similar between MW5, MW6, and DOM, although the total DOC and nitrate levels differ by an order of magnitude between each of the samples, with MW5 >> MW6 >> DOM, presumably due to different levels of dilution of the manured water with recharge from the adjacent, unmanured fields. The abundance of a similar nitrifier/denitrifier and anammox microbial community in all three samples appears to mirror the total DOC and nitrate, supporting the notion that bioremediation of nitrate and DOC scales with nutrient abundance both through direct nutrition and through community metabolism. With increased sampling, observed differences in microbial communities may aid in forensic “fingerprinting” approaches to detect sources of nitrate in groundwater.
The metagenomes also indicate a potential concern, which is that the same organisms that remediate the nitrogen also produce bioactive secondary metabolites that pose potential health risks and are more difficult and expensive to remove from drinking water. Thus, as groundwater becomes a scarcer and more valuable resource, quantifying the downstream risks of organic manure fertilizer contamination in groundwater becomes a more important priority. There has been speculation about how slow-growing anammox bacteria can maintain a competitive advantage over faster-growing bacteria. The high abundance of secondary metabolite gene clusters in their genomes may give us a clue.
Our analysis annotated a diverse array of these gene clusters as various antimicrobials, which could help the slow-growing anammox cells maintain their dominance in the community. Groundwater microbiomes are unique communities, and their metagenomes have not been extensively mined for new biosynthesis pathways. Using antiSMASH, we computationally identified many biosynthetic gene clusters that could produce pharmacologically interesting compounds, such as butyrolactone and antibiotics. We suggest that the combination of this pharmacological diversity and the unique cell biology of anammox bacteria could make them a fruitful resource for drug discovery.
While short-read metagenome data can potentially provide insights into the taxonomic identities of organisms, we found greatly improved taxonomic inference and functional pathway inference by using partial assembly of the short reads. For instance, while MetaPhlAn analysis gave us a good depiction of the taxonomic similarity between samples, the accuracy of assignments was not sufficient to guide the choice of reference genomes for assembly of the whole-metagenome deep sequencing reads, indicating that our particular samples have a taxonomic distribution that is poorly represented in the available databases that MetaPhlAn uses. Assembly of 16S rDNA from short reads is known to be chimera-prone due to the high homology across the tree of life. Solely using EMIRGE to assemble 16S genes and then aligning to SILVA gave us a much more accurate depiction of the phylogenetic diversity in our samples. However, connecting the 16S taxonomy to the genomic bins was problematic. When we tried to link these genes to contigs in the bins using targeted assembly, we found that multiple 16S genes assembled to a given genomic bin. While we could make good guesses at which 16S gene belonged to which genomic bin, we could not make these links in an unbiased manner. Therefore, we have omitted them here.
While our analysis reveals only a fraction of the inherent long-tailed distribution of taxa that occur in the groundwater, we are interested in the major factors shaping water chemistry, for which the most abundant taxa are the most important to sample. Thus, a sequencing depth of ~50 million PE 101 bp reads per sample is adequate for assessing the functional geochemistry of groundwater. However, as discussed earlier, a high amount of strain-level variation is present that our current methodologies can only address at a superficial level.
We found evidence for strain-level variation in the anammox community both across samples and within bins. While making further distinctions between strains is beyond the scope of this paper, future investigations into the ecological factors that support anammox strain variation with apparently overlapping niches would help define the biology of this globally important denitrifying community. Here we find evidence that at least three related Brocadiaceae strains can coexist.
We find many highly diverse nano-prokaryote genomes, and the abundance of these genomes amounts to over 50% of the community in MW5. Because these organisms have been shown to lack major parts of central metabolism, this observation emphasizes the question posed by Brown et al.: to what extent do nano-prokaryotes exist as separate cellular entities versus spatially localized to and metabolically dependent upon other cells? Of note is the presence in the small genomes of many partial pathways that affect cellular decision-making. In particular, most of the small genomes encode homologs of flagellar chemotaxis components, which we speculate could serve to modify the cellular decision-making behavior of larger cells.
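The sequencing depth quoted above (~50 million PE 101 bp reads per sample) can be converted to an approximate number of sequenced bases. The text does not say whether the 50 million figure counts read pairs or individual reads, so the sketch below computes both cases; the function name `total_bases` and its parameters are illustrative assumptions.

```python
def total_bases(count, read_length, count_is_pairs):
    """Approximate sequenced bases for a paired-end run.

    count          -- reported number of reads (or read pairs)
    read_length    -- length of each read in bp
    count_is_pairs -- True if `count` refers to read pairs (two reads each),
                      False if it already counts individual reads
    """
    reads = count * 2 if count_is_pairs else count
    return reads * read_length


REPORTED_COUNT = 50_000_000  # "~50 million", per the text
READ_LENGTH_BP = 101

as_pairs = total_bases(REPORTED_COUNT, READ_LENGTH_BP, count_is_pairs=True)
as_reads = total_bases(REPORTED_COUNT, READ_LENGTH_BP, count_is_pairs=False)

print(f"If 50 M read pairs: {as_pairs / 1e9:.2f} Gbp per sample")
print(f"If 50 M reads:      {as_reads / 1e9:.2f} Gbp per sample")
```

Under these assumptions the depth corresponds to roughly 5 to 10 Gbp per sample, depending on how the read count is interpreted.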
We note that the greater diversity of Chloroflexi, CPR, and DPANN taxa in MW5 versus MW6 and DOM corresponds to a greater presence of nitrate, sulfate, and DOC, which is contrary to macroecological theory and empirical results that demonstrate loss of diversity with increased nutrients. Future studies could address whether these phylogenetic abundance patterns are directly tied to particular nutrients or an indirect consequence of trophic community metabolism, which could aid in optimizing ecology of wastewater treatment bioreactors.