Evaluating Bioinformatic Pipeline Performance for Forensic Microbiome Analysis*,†,‡
dc.contributor.author | Kaszubinski, Sierra F. | |
dc.contributor.author | Pechal, Jennifer L. | |
dc.contributor.author | Schmidt, Carl J. | |
dc.contributor.author | Jordan, Heather R. | |
dc.contributor.author | Benbow, Mark E. | |
dc.contributor.author | Meek, Mariah H. | |
dc.date.accessioned | 2020-03-17T18:33:16Z | |
dc.date.available | WITHHELD_13_MONTHS | |
dc.date.available | 2020-03-17T18:33:16Z | |
dc.date.issued | 2020-03 | |
dc.identifier.citation | Kaszubinski, Sierra F.; Pechal, Jennifer L.; Schmidt, Carl J.; Jordan, Heather R.; Benbow, Mark E.; Meek, Mariah H. (2020). "Evaluating Bioinformatic Pipeline Performance for Forensic Microbiome Analysis*,†,‡." Journal of Forensic Sciences 65(2): 513-525. | |
dc.identifier.issn | 0022-1198 | |
dc.identifier.issn | 1556-4029 | |
dc.identifier.uri | https://hdl.handle.net/2027.42/154468 | |
dc.description.abstract | Microbial communities have potential evidential utility for forensic applications. However, bioinformatic analysis of high‐throughput sequencing data varies widely among laboratories, and these differences can affect inferred microbial community composition and downstream analyses. To illustrate the importance of standardizing methodology, we compared analyses of postmortem microbiome samples using several bioinformatic pipelines while varying minimum library size (the minimum number of sequences per sample) and sample size. Using the same input sequence data, we found that three open‐source bioinformatic pipelines, MG‐RAST, mothur, and QIIME2, produced significantly different relative abundance, alpha‐diversity, and beta‐diversity results. Increasing minimum library size and sample size increased the number of low‐abundance and infrequent taxa detected. Our results show that bioinformatic pipeline and parameter choice affect results in important ways. Given the growing potential application of forensic microbiology to the criminal justice system, continued research on standardizing computational methodology will be important for downstream applications. | |
dc.publisher | John Wiley & Sons Ltd | |
dc.subject.other | next‐generation sequencing | |
dc.subject.other | forensic science | |
dc.subject.other | bioinformatic pipelines | |
dc.subject.other | forensic microbiology | |
dc.subject.other | postmortem microbiome | |
dc.subject.other | microbial communities | |
dc.title | Evaluating Bioinformatic Pipeline Performance for Forensic Microbiome Analysis*,†,‡ | |
dc.type | Article | |
dc.rights.robots | IndexNoFollow | |
dc.subject.hlbsecondlevel | Science (General) | |
dc.subject.hlbtoplevel | Science | |
dc.description.peerreviewed | Peer Reviewed | |
dc.description.bitstreamurl | https://deepblue.lib.umich.edu/bitstream/2027.42/154468/1/jfo14213_am.pdf | |
dc.description.bitstreamurl | https://deepblue.lib.umich.edu/bitstream/2027.42/154468/2/jfo14213.pdf | |
dc.identifier.doi | 10.1111/1556-4029.14213 | |
dc.identifier.source | Journal of Forensic Sciences | |
dc.identifier.citedreference | Price MN, Dehal PS, Arkin AP. FastTree 2 – approximately maximum‐likelihood trees for large alignments. PLoS ONE 2010; 5 ( 3 ): e9490. | |
dc.identifier.citedreference | Pechal JL, Schmidt CJ, Jordan HR, Benbow ME. Frozen: thawing and its effect on the postmortem microbiome in two pediatric cases. J Forensic Sci 2017; 62 ( 5 ): 1399 – 405. | |
dc.identifier.citedreference | Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, et al. QIIME allows analysis of high‐throughput community sequencing data. Nat Methods 2010; 7 ( 5 ): 335 – 6. | |
dc.identifier.citedreference | Kozich JJ, Westcott SL, Baxter NT, Highlander SK, Schloss PD. Development of a dual‐index sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq Illumina sequencing platform. Appl Environ Microbiol 2013; 79 ( 17 ): 5112 – 20. | |
dc.identifier.citedreference | Caporaso JG, Knight R, Kelley ST. Host‐associated and free‐living phage communities differ profoundly in phylogenetic composition. PLoS ONE 2011; 6 ( 2 ): e16900. | |
dc.identifier.citedreference | Caporaso JG, Lauber CL, Walters WA, Berg‐Lyons D, Huntley J, Fierer N, et al. Ultra‐high‐throughput microbial community analysis on the Illumina HiSeq and MiSeq platforms. ISME J 2012; 6 ( 8 ): 1621 – 4. | |
dc.identifier.citedreference | Caporaso JG, Lauber CL, Costello EK, Berg‐Lyons D, Gonzalez A, Stombaugh J, et al. Moving pictures of the human microbiome. Genome Biol 2011; 12 ( 5 ): R50. | |
dc.identifier.citedreference | Glass EM, Wilkening J, Wilke A, Antonopoulos D, Meyer F. Using the metagenomics RAST server (MG‐RAST) for analyzing shotgun metagenomes. Cold Spring Harb Protoc 2010; 2010 ( 1 ): pdb.prot5368. | |
dc.identifier.citedreference | Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, et al. The SILVA ribosomal RNA gene database project: improved data processing and web‐based tools. Nucleic Acids Res 2013; 41 ( Database issue ): D590 – D596. | |
dc.identifier.citedreference | Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJA, Holmes SP. DADA2: high resolution sample inference from Illumina amplicon data. Nat Methods 2016; 13 ( 7 ): 581 – 3. | |
dc.identifier.citedreference | Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol 2013; 30 ( 4 ): 772 – 80. | |
dc.identifier.citedreference | McDonald D, Clemente JC, Kuczynski J, Rideout JR, Stombaugh J, Wendel D, et al. The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome‐ome. Gigascience 2012; 1 ( 1 ): 7. | |
dc.identifier.citedreference | Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 2010; 26 ( 19 ): 2460 – 1. | |
dc.identifier.citedreference | Rognes T, Flouri T, Nichols B, Quince C, Mahé F. VSEARCH: a versatile open source tool for metagenomics. PeerJ 2016; 4: e2584. | |
dc.identifier.citedreference | McMurdie PJ, Holmes S. phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data. PLoS ONE 2013; 8 ( 4 ): e61217. | |
dc.identifier.citedreference | R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing, 2018. | |
dc.identifier.citedreference | Mandal S, Van Treuren W, White RA, Eggesbø M, Knight R, Peddada SD. Analysis of composition of microbiomes: a novel method for studying microbial composition. Microb Ecol Health Dis 2015; 26: 27663. | |
dc.identifier.citedreference | Kruskal WH, Wallis WA. Use of ranks in one‐criterion variance analysis. J Am Stat Assoc 1952; 47 ( 260 ): 583 – 621. | |
dc.identifier.citedreference | Nemenyi P. Distribution‐free multiple comparisons [dissertation]. Princeton, NJ: Princeton University, 1963. | |
dc.identifier.citedreference | Pohlert T. The Pairwise Multiple Comparison of Mean Ranks Package (PMCMR), 2014. http://CRAN.R-project.org/package=PMCMR (accessed September 19, 2019). | |
dc.identifier.citedreference | Oksanen J, Guillaume Blanchet F, Friendly M, Kindt R, Legendre P, McGlinn D, et al. vegan: community ecology package. R package version 2.5‐2, 2018. https://CRAN.R-project.org/package=vegan (accessed September 19, 2019). | |
dc.identifier.citedreference | Rosa PLS, Deych E, Carter S, Shands B, Yang D, Shannon WD. HMP: hypothesis testing and power calculations for comparing metagenomic samples from HMP. R package version 2.0, 2019. https://CRAN.R-project.org/package=HMP (accessed September 19, 2019). | |
dc.identifier.citedreference | Liaw A, Wiener M. Classification and regression by randomForest. R News 2002; 2 ( 3 ): 18 – 22. | |
dc.identifier.citedreference | Shade A, Handelsman J. Beyond the Venn diagram: the hunt for a core microbiome. Environ Microbiol 2012; 14 ( 1 ): 4 – 12. | |
dc.identifier.citedreference | Faul F, Erdfelder E, Lang AG, Buchner A. G*Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav Res Methods 2007; 39 ( 2 ): 175 – 91. | |
dc.identifier.citedreference | DeBruyn JM, Hauther KA. Postmortem succession of gut microbial communities in deceased human subjects. PeerJ 2017; 5: e3437. | |
dc.identifier.citedreference | Budowle B, Schutzer SE, Einseln A, Kelley LC, Walsh AC, Smith JAL, et al. Building microbial forensics as a response to bioterrorism. Science 2003; 301 ( 5641 ): 1852 – 3. | |
dc.identifier.citedreference | Pechal JL, Schmidt CJ, Jordan HR, Benbow ME. A large‐scale survey of the postmortem human microbiome, and its potential to provide insight into the living health condition. Sci Rep 2018; 8 ( 1 ): 5724. | |
dc.identifier.citedreference | Metcalf JL, Xu ZZ, Weiss S, Lax S, Treuren WV, Hyde ER, et al. Microbial community assembly and metabolic function during mammalian corpse decomposition. Science 2016; 351 ( 6269 ): 158 – 62. | |
dc.identifier.citedreference | Schmedes SE, Sajantila A, Budowle B. Expansion of microbial forensics. J Clin Microbiol 2016; 54 ( 8 ): 1964 – 74. | |
dc.identifier.citedreference | Dobay A, Haas C, Fucile G, Downey N, Morrison HG, Kratzer A, et al. Microbiome‐based body fluid identification of samples exposed to indoor conditions. Forensic Sci Int Genet 2019; 40: 105 – 13. | |
dc.identifier.citedreference | Schmedes SE, Woerner AE, Budowle B. Forensic human identification using skin microbiomes. Appl Environ Microbiol 2017; 83 ( 22 ): e01672‐17. | |
dc.identifier.citedreference | Benbow ME, Pechal JL, Lang JM, Wallace JR. The potential of high‐throughput metagenomic sequencing of aquatic bacterial communities to estimate the postmortem submersion interval. J Forensic Sci 2015; 60 ( 6 ): 1500 – 10. | |
dc.identifier.citedreference | Pechal JL, Crippen TL, Benbow ME, Tarone AM, Dowd S, Tomberlin JK. The potential use of bacterial community succession in forensics as described by high throughput metagenomic sequencing. Int J Legal Med 2014; 128 ( 1 ): 193 – 205. | |
dc.identifier.citedreference | Johnson HR, Trinidad DD, Guzman S, Khan Z, Parziale JV, DeBruyn JM, et al. A machine learning approach for using the postmortem skin microbiome to estimate the postmortem interval. PLoS ONE 2016; 11 ( 12 ): e0167370. | |
dc.identifier.citedreference | Metcalf JL, Parfrey LW, Gonzalez A, Lauber CL, Knights D, Ackermann G, et al. A microbial clock provides an accurate estimate of the postmortem interval in a mouse model system. Elife 2013; 2: e01104. | |
dc.identifier.citedreference | Carter DO, Tomberlin JK, Benbow ME, Metcalf JL, editors. Forensic microbiology. Hoboken, NJ: John Wiley & Sons Ltd, 2017. | |
dc.identifier.citedreference | Metcalf JL, Xu ZZ, Bouslimani A, Dorrestein P, Carter DO, Knight R. Microbiome tools for forensic science. Trends Biotech 2017; 35 ( 9 ): 814 – 23. | |
dc.identifier.citedreference | Leipzig J. A review of bioinformatic pipeline frameworks. Brief Bioinform 2017; 18 ( 3 ): 530 – 6. | |
dc.identifier.citedreference | Sivarajah U, Kamal MM, Irani Z, Weerakkody V. Critical analysis of big data challenges and analytical methods. J Bus Res 2017; 70: 263 – 86. | |
dc.identifier.citedreference | Golob JL, Margolis A, Hoffman NG, Fredricks DN. Evaluating the accuracy of amplicon‐based microbiome computational pipelines on simulated human gut microbial communities. BMC Bioinformatics 2017; 18 ( 1 ): 283. | |
dc.identifier.citedreference | Bolyen E, Rideout JR, Dillon MR, Bokulich NA, Abnet CC, Al‐Ghalith GA, et al. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nat Biotechnol 2019; 37 ( 8 ): 852 – 7. | |
dc.identifier.citedreference | Bokulich NA, Rideout JR, Mercurio WG, Shiffer A, Wolfe B, Maurice CF, et al. mockrobiota: a public resource for microbiome bioinformatics benchmarking. mSystems 2016; 1 ( 5 ): e00062‐16. | |
dc.identifier.citedreference | Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, Hollister EB, et al. Introducing mothur: open‐source, platform‐independent, community‐supported software for describing and comparing microbial communities. Appl Environ Microbiol 2009; 75 ( 23 ): 7537 – 41. | |
dc.identifier.citedreference | Keegan KP, Glass EM, Meyer F. MG‐RAST, a metagenomics service for analysis of microbial community structure and function. Microbial Env Genet 2016; 1399: 207 – 33. | |
dc.identifier.citedreference | Plummer E, Twin J, Bulach DM, Garland SM, Tabrizi SN. A comparison of three bioinformatics pipelines for the analysis of preterm gut microbiota using 16S rRNA gene sequencing data. J Proteomics Bioinform 2015; 8 ( 12 ): 283 – 91. | |
dc.identifier.citedreference | Siegwald L, Touzet H, Lemoine Y, Hot D, Audebert C, Caboche S. Assessment of common and emerging bioinformatics pipelines for targeted metagenomics. PLoS ONE 2017; 12 ( 1 ): e0169563. | |
dc.identifier.citedreference | Mysara M, Njima M, Leys N, Raes J, Monsieurs P. From reads to operational taxonomic units: an ensemble processing pipeline for MiSeq amplicon sequencing data. Gigascience 2017; 6 ( 2 ): 1 – 10. | |
dc.identifier.citedreference | Nilakanta H, Drews KL, Firrell S, Foulkes MA, Jablonski KA. A review of software for analyzing molecular sequences. BMC Res Notes 2014; 7: 830. | |
dc.identifier.citedreference | McMurdie PJ, Holmes S. Waste not, want not: why rarefying microbiome data is inadmissible. PLoS Comput Biol 2014; 10 ( 4 ): e1003531. | |
dc.identifier.citedreference | Weiss S, Xu ZZ, Peddada S, Amir A, Bittinger K, Gonzalez A, et al. Normalization and microbial differential abundance strategies depend upon data characteristics. Microbiome 2017; 5: 27. | |
dc.identifier.citedreference | D’Argenio V, Casaburi G, Precone V, Salvatore F. Comparative metagenomic analysis of human gut microbiome composition using two different bioinformatic pipelines. Biomed Res Int 2014; 2014: 325340. | |
dc.identifier.citedreference | Hyde ER, Haarmann DP, Lynne AM, Bucheli SR, Petrosino JF. The living dead: bacterial community structure of a cadaver at the onset and end of the bloat stage of decomposition. PLoS ONE 2013; 8 ( 10 ): e77733. | |
dc.identifier.citedreference | Ioannidis JP. Why most published research findings are false. PLoS Medicine 2005; 2 ( 8 ): 696 – 701. | |
dc.identifier.citedreference | Clarke TH, Gomez A, Singh H, Nelson KE, Brinkac LM. Integrating the microbiome as a resource in the forensics toolkit. Forensic Sci Int Genet 2017; 30: 141 – 7. | |
dc.identifier.citedreference | Zhang Y, Pechal JL, Schmidt CJ, Jordan HR, Wang WW, Benbow ME, et al. Machine learning performance in a microbial molecular autopsy context: a cross‐sectional postmortem human population study. PLoS ONE 2019; 14 ( 4 ): e0213829. | |
dc.owningcollname | Interdisciplinary and Peer-Reviewed |