Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Correlation between coding genetic variation and bacterial abundance. (a) Manhattan plot illustrating the P values (y-axis, −log scale) for the correlation of each tested coding SNP (shown as circles), ordered by genomic location (x-axis), with the abundance of Bifidobacterium in the gut. SNP colors alternate by chromosome; red dots represent SNPs whose P values surpass genome-wide significance after FDR correction. (b) A close-up of the region of correlation within LCT. Genomic positions on chromosome 2 are on the x-axis, and the P values are on the y-axis (−log scale). Each dot represents a SNP tested using our model, and its color represents the linkage disequilibrium (r²) between that SNP and the top SNP, which is colored purple and labeled with its dbSNP rsID (the inset legend shows the spectrum of colors and the matching r² values). Blue lines represent the recombination rate calculated from the European samples in the 1000 Genomes Project. Gene regions are shown underneath, with LCT highlighted. (c) An interaction network generated using IPA showing pathways that are enriched among genes that harbor SNPs correlated with the abundance of bacterial taxa (in orange). Lines represent known interactions between genes, and shapes represent types of proteins (see legend at the bottom left).