The emerging landscape of health research based on biobanks linked to electronic health records: Existing resources, statistical challenges, and potential opportunities

dc.contributor.authorBeesley, Lauren J.
dc.contributor.authorSalvatore, Maxwell
dc.contributor.authorFritsche, Lars G.
dc.contributor.authorPandit, Anita
dc.contributor.authorRao, Arvind
dc.contributor.authorBrummett, Chad
dc.contributor.authorWiller, Cristen J.
dc.contributor.authorLisabeth, Lynda D.
dc.contributor.authorMukherjee, Bhramar
dc.date.accessioned2020-03-17T18:32:43Z
dc.date.availableWITHHELD_13_MONTHS
dc.date.available2020-03-17T18:32:43Z
dc.date.issued2020-03-15
dc.identifier.citationBeesley, Lauren J.; Salvatore, Maxwell; Fritsche, Lars G.; Pandit, Anita; Rao, Arvind; Brummett, Chad; Willer, Cristen J.; Lisabeth, Lynda D.; Mukherjee, Bhramar (2020). "The emerging landscape of health research based on biobanks linked to electronic health records: Existing resources, statistical challenges, and potential opportunities." Statistics in Medicine 39(6): 773-800.
dc.identifier.issn0277-6715
dc.identifier.issn1097-0258
dc.identifier.urihttps://hdl.handle.net/2027.42/154448
dc.publisherJohn Wiley & Sons, Inc.
dc.subject.otherMichigan Genomics Initiative
dc.subject.otherUK Biobank
dc.subject.otherselection bias
dc.subject.otherelectronic health records
dc.subject.otherbiobanks
dc.titleThe emerging landscape of health research based on biobanks linked to electronic health records: Existing resources, statistical challenges, and potential opportunities
dc.typeArticle
dc.rights.robotsIndexNoFollow
dc.subject.hlbsecondlevelStatistics and Numeric Data
dc.subject.hlbsecondlevelPublic Health
dc.subject.hlbsecondlevelMedicine (General)
dc.subject.hlbtoplevelHealth Sciences
dc.subject.hlbtoplevelScience
dc.subject.hlbtoplevelSocial Sciences
dc.description.peerreviewedPeer Reviewed
dc.description.bitstreamurlhttps://deepblue.lib.umich.edu/bitstream/2027.42/154448/1/sim8445_am.pdf
dc.description.bitstreamurlhttps://deepblue.lib.umich.edu/bitstream/2027.42/154448/2/sim8445.pdf
dc.identifier.doi10.1002/sim.8445
dc.identifier.sourceStatistics in Medicine
dc.identifier.citedreferenceLin DY. An efficient Monte Carlo approach to assessing statistical significance in genomic studies. Bioinformatics. 2005; 21: 781 ‐ 787.
dc.identifier.citedreferenceXie S, Greenblatt R, Levy MZ, Himes BE. Enhancing electronic health record data with geospatial information. AMIA Jt Summits Translation Science Proceedings; 2017: 123 ‐ 132.
dc.identifier.citedreferencePollard TJ et al. The eICU Collaborative Research Database, a freely available multi‐center database for critical care research. Sci. Data. 2018; 5: 180178.
dc.identifier.citedreferenceAl‐Azwani IK, Aziz HA. Integration of wearable technologies into patients’ electronic medical records. Qual. Prim. Care. 2016; 24: 151 ‐ 155.
dc.identifier.citedreferencePolzer N, Gewald H. A structured analysis of smartphone applications to early diagnose Alzheimer’s disease or dementia. Procedia Comput. Sci. 2017; 113: 448 ‐ 453.
dc.identifier.citedreferenceNorén GN, Hopstadius J, Bate A, Star K, Edwards IR. Temporal pattern discovery in longitudinal electronic patient records. Data Min. Knowl. Discov. 2010; 20: 361 ‐ 387. https://doi.org/10.1007/s10618‐009‐0152‐3.
dc.identifier.citedreferenceNorén GN et al. Empirical performance of the calibrated self‐controlled cohort analysis within temporal pattern discovery: Lessons for developing a risk identification and analysis system. Drug Saf. 2013; 36: 107 ‐ 121. https://doi.org/10.1007/s40264‐013‐0095‐x.
dc.identifier.citedreferenceBoland MR, Shahn Z, Madigan D, Hripcsak G, Tatonetti NP. Birth month affects lifetime disease risk: A phenome‐wide method. J. Am. Med. Informatics Assoc. 2015; 22: 1042 ‐ 1053. https://doi.org/10.1093/jamia/ocv046.
dc.identifier.citedreferenceLiu M et al. Comparative analysis of pharmacovigilance methods in the detection of adverse drug reactions using electronic medical records. J Am Med Inf. Assoc. 2013; 20: 420 ‐ 426.
dc.identifier.citedreferenceRamirez AH et al. Predicting warfarin dosage in European‐Americans and African‐Americans using DNA samples linked to an electronic health record. Pharmacogenomics. 2012; 13: 407 ‐ 418. https://doi.org/10.2217/pgs.11.164.
dc.identifier.citedreferencePeterson JF et al. Electronic health record design and implementation for pharmacogenomics: a local perspective. Genet Med. 2013; 15: 833 ‐ 841.
dc.identifier.citedreferenceMadigan D, Shin J. Drospirenone‐containing oral contraceptives and venous thromboembolism: an analysis of the FAERS database. Open Access J. Contracept. 2018; 9: 29 ‐ 32.
dc.identifier.citedreferenceShuldiner AR et al. The pharmacogenomics research network translational pharmacogenetics program: Overcoming challenges of real‐world implementation. Clin. Pharmacol. Ther. 2013; 94: 207 ‐ 210.
dc.identifier.citedreferenceKuang Z et al. Computational drug repositioning using continuous self‐controlled case series. KDD. 2017; 491 ‐ 500. https://doi.org/10.1145/2939672.2939715.
dc.identifier.citedreferencePaige E et al. Landmark models for optimizing the use of repeated measurements of risk factors in electronic health records to predict future disease risk. Am. J. Epidemiol. 2018; 187: 1530 ‐ 1538.
dc.identifier.citedreferenceGoldstein BA, Navar AM, Pencina MJ, Ioannidis JPA. Opportunities and challenges in developing risk prediction models with electronic health records data: a systematic review. J. Am. Med. Informatics Assoc. 2017; 24: 198 ‐ 208.
dc.identifier.citedreferenceCaballero K, Akella R. Dynamic estimation of the probability of patient readmission to the ICU using electronic medical records. AMIA Annu. Symp. Proc. 2015; 2015: 1831 ‐ 1840.
dc.identifier.citedreferenceAczon M et al. Dynamic Mortality Risk Predictions in Pediatric Critical Care Using Recurrent Neural Networks. arXiv. 2017; 1 ‐ 18.
dc.identifier.citedreferenceSteorts RC, Hall R, Fienberg SE. A Bayesian approach to graphical record linkage and deduplication. J. Am. Stat. Assoc. 2016; 111: 1660 ‐ 1672.
dc.identifier.citedreferenceSayers A, Ben‐Shlomo Y, Blom AW, Steele F. Probabilistic record linkage. Int. J. Epidemiol. 2016; 45: 954 ‐ 964.
dc.identifier.citedreferenceVatsalan D, Christen P, Verykios VS. A taxonomy of privacy‐preserving record linkage techniques. Inf. Syst. 2013; 38: 946 ‐ 969.
dc.identifier.citedreferenceAl Mamun A, Aseltine R, Rajasekaran S. Efficient record linkage algorithms using complete linkage clustering. PLoS One. 2016; 11: 1 ‐ 21.
dc.identifier.citedreferenceSchmidlin K, Clough‐Gorr KM, Spoerri A. Privacy preserving probabilistic record linkage (P3RL): a novel method for linking existing health‐related data and maintaining participant confidentiality. BMC Med. Res. Methodol. 2015; 15: 1 ‐ 10.
dc.identifier.citedreferenceLong Q. Statistical methods for handling missing data in distributed health data networks. Joint Statistical Meetings; 2018.
dc.identifier.citedreferenceTang L, Zhou L, Song PXK. Method of divide‐and‐combine in regularised generalised linear models for big data. arXiv. 2016.
dc.identifier.citedreferenceYang J et al. Conditional and joint multiple‐SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat Genet. 2013; 44: 1 ‐ 22.
dc.identifier.citedreferenceHripcsak G et al. Characterizing treatment pathways at scale using the OHDSI network. Proc. Natl. Acad. Sci. USA. 2016; 113: 7329 ‐ 7336. https://doi.org/10.1073/pnas.1510502113.
dc.identifier.citedreferenceSimon GE, Peterson D, Hubbard R. Is treatment adherence consistent across time, across different treatments and across diagnoses? Gen. Hosp. Psychiatry. 2013; 35: 195 ‐ 201. https://doi.org/10.1016/j.genhosppsych.2012.10.001.
dc.identifier.citedreferenceSantillana M et al. Cloud‐based electronic health records for real‐time, region‐specific influenza surveillance. Sci. Rep. 2016; 6: 25732.
dc.identifier.citedreferenceYang S et al. Using electronic health records and Internet search information for accurate influenza forecasting. BMC Infect. Dis. 2017; 17: 1 ‐ 9.
dc.identifier.citedreferenceMoran KR et al. Epidemic forecasting is messier than weather forecasting: the role of human behavior and internet data streams in epidemic forecast. J. Infect. Dis. 2016; 214: 404 ‐ 408.
dc.identifier.citedreferenceMeng XL. Statistical paradises and paradoxes in big data (I): Law of large populations, big data paradox, and the 2016 US presidential election. Ann. Appl. Stat. 2018; 12: 685 ‐ 726.
dc.identifier.citedreferenceDe Souza YG, Greenspan JS. Biobanking past, present and future. AIDS. 2013; 27: 303 ‐ 312.
dc.identifier.citedreferenceGreely HT. The uneasy ethical and legal underpinnings of large‐scale genomic biobanks. Annu. Rev. Genomics Hum. Genet. 2007; 8: 343 ‐ 364.
dc.identifier.citedreferenceHäyrinen K, Saranto K, Nykänen P. Definition, structure, content, use and impacts of electronic health records: a review of the research literature. Int. J. Med. Inform. 2008; 77: 291 ‐ 304.
dc.identifier.citedreferenceDenny JC et al. PheWAS: demonstrating the feasibility of a phenome‐wide scan to discover gene‐disease associations. Bioinformatics. 2010; 26: 1205 ‐ 1210.
dc.identifier.citedreferenceWolford BN, Willer CJ, Surakka I. Electronic health records: The next wave of complex disease genetics. Hum. Mol. Genet. 2018; 27: R14 ‐ R21.
dc.identifier.citedreferenceGlicksberg BS, Johnson KW, Dudley JT. The next generation of precision medicine: Observational studies, electronic health records, biobanks and continuous monitoring. Hum. Mol. Genet. 2018; 27: R56 ‐ R62.
dc.identifier.citedreferenceBush WS, Oetjens MT, Crawford DC. Unravelling the human genome–phenome relationship using phenome‐wide association studies. Nat. Rev. Genet. 2016; 17: 129 ‐ 145.
dc.identifier.citedreferenceOhno‐Machado L, Kim J, Gabriel RA, Kuo GM, Hogarth MA. Genomics and electronic health record systems. Hum. Mol. Genet. 2018; 27: R48 ‐ R55.
dc.identifier.citedreferenceBrieger K et al. Genes for good: engaging the public in genetics research via social media. Am. J. Hum. Genet. 2019; 105: 65 ‐ 77.
dc.identifier.citedreferenceFritsche LG et al. Association of polygenic risk scores for multiple cancers in a phenome‐wide study: results from the Michigan Genomics Initiative. Am. J. Hum. Genet. 2018; 102: 1048 ‐ 1061. https://doi.org/10.1016/j.ajhg.2018.04.001.
dc.identifier.citedreferenceMichigan Genomics Initiative Website. https://www.michigangenomics.org.
dc.identifier.citedreferenceUK Biobank Website. http://www.ukbiobank.ac.uk.
dc.identifier.citedreferenceAllen N et al. UK Biobank: current status and what it means for epidemiology. Heal. Policy Technol. 2012; 1: 123 ‐ 126.
dc.identifier.citedreferenceEstonian Genome Center. Available at: https://www.geenivaramu.ee/en/access‐biobank.
dc.identifier.citedreferenceLeitsalu L et al. Cohort profile: Estonian Biobank of the Estonian Genome Center, University of Tartu. Int. J. Epidemiol. 2015; 44: 1137 ‐ 1147.
dc.identifier.citedreferenceDanish National Biobank. http://www.biobankdenmark.dk.
dc.identifier.citedreferenceBiobank Sweden. http://biobanksverige.se/research/.
dc.identifier.citedreferenceSaudi Biobank. http://kaimrc.med.sa.
dc.identifier.citedreferenceChina National GeneBank. https://www.cngb.org/home.html.
dc.identifier.citedreferenceNational Biobank of Korea. http://www.nih.go.kr/NIH/cms/content/eng/14/65714_view.html.
dc.identifier.citedreferenceCho SY et al. Opening of the National Biobank of Korea as the infrastructure of future biomedical science in Korea. Osong Public Heal. Res. Perspect. 2012; 3: 177 ‐ 184.
dc.identifier.citedreferenceQatar Biobank. https://www.qatarbiobank.org.qa.
dc.identifier.citedreferenceAl Kuwari H et al. The Qatar Biobank: background and methods. BMC Public Health. 2015; 15: 1208.
dc.identifier.citedreferenceLin E et al. Association and interaction effects of Alzheimer’s disease‐associated genes and lifestyle on cognitive aging in older adults in a Taiwanese population. Oncotarget. 2017; 8: 24077 ‐ 24087.
dc.identifier.citedreferenceTaiwan Biobank. https://www.twbiobank.org.tw/new_web_en/index.php.
dc.identifier.citedreferenceNagai A et al. Overview of the BioBank Japan Project: study design and profile. J. Epidemiol. 2017; 27: S2 ‐ S8.
dc.identifier.citedreferenceNational Institutes of Health. The All of Us Research Program: Operational Protocol. 2018.
dc.identifier.citedreferencePcBaSe Sweden Website. http://www.surgsci.umu.se/english/sections/urology‐and‐andrology/research/pcbase/?languageId=1.
dc.identifier.citedreferenceMayo Clinic Biobank for Bipolar Disorder Website. https://www.mayo.edu/research/centers‐programs/bipolar‐disorder‐biobank/overview.
dc.identifier.citedreferencePhelan M, Bhavsar N, Goldstein BA. Illustrating informed presence bias in electronic health records data: how patient interactions with a health system can impact inference. eGEMs. 2017; 5: 22.
dc.identifier.citedreferenceGoldstein BA, Bhavsar NA, Phelan M, Pencina MJ. Controlling for informed presence bias due to the number of health encounters in an electronic health record. Am. J. Epidemiol. 2016; 184: 847 ‐ 855.
dc.identifier.citedreferenceKeiding N, Louis TA. Perils and potentials of self‐selected entry to epidemiological studies and surveys. J. R. Stat. Soc. Ser. A Stat. Soc. 2016; 179: 319 ‐ 376.
dc.identifier.citedreferenceBeesley LJ, Fritsche LG, Mukherjee B. A modeling framework for exploring sampling and observation process biases in genome and phenome‐wide association studies using electronic health records. bioRxiv. 2018; 1: 1 ‐ 19.
dc.identifier.citedreferenceBaker R et al. Report of the AAPOR Task Force on Non‐Probability Sampling. 2013.
dc.identifier.citedreferenceCarroll RJ, Bastarache L, Denny JC. R PheWAS: data analysis and plotting tools for phenome‐wide association studies in the R environment. Bioinformatics. 2014; 30: 2375 ‐ 2376.
dc.identifier.citedreferenceTian Y, Schuemie MJ, Suchard MA. Evaluating large‐scale propensity score performance through real‐world and synthetic data experiments. Int. J. Epidemiol. 2018; 47: 2005 ‐ 2014. https://doi.org/10.1093/ije/dyy120.
dc.identifier.citedreferenceSchneeweiss S et al. High‐dimensional propensity score adjustment in studies of treatment effects using health care claims data. Epidemiology. 2009; 20: 512 ‐ 522.
dc.identifier.citedreferenceHall LS et al. Genome‐wide meta‐analyses of stratified depression in Generation Scotland and UK Biobank. Transl. Psychiatry. 2018; 8: 9.
dc.identifier.citedreferenceMIT Critical Data. Secondary Analysis of Electronic Health Records. Cham, Switzerland: Springer; 2016.
dc.identifier.citedreferenceSalmasi L, Capobianco E. Use of instrumental variables in electronic health record‐driven models. Stat. Methods Med. Res. 2018; 27: 607 ‐ 621.
dc.identifier.citedreferenceLi X et al. MR‐PheWAS: exploring the causal effect of SUA level on multiple disease outcomes by using genetic instruments in UK Biobank. Ann. Rheum. Dis. 2018; 77: 1039 ‐ 1047.
dc.identifier.citedreferenceBurgess S, Timpson NJ, Ebrahim S, Smith GD. Mendelian randomization: Where are we now and where are we going? Int. J. Epidemiol. 2015; 44: 379 ‐ 388.
dc.identifier.citedreferenceRobins JM, Hernán MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology. 2000; 11: 550 ‐ 560.
dc.identifier.citedreferenceSperrin M, Martin GP, Peek N, Buchan I, Pate A. Using marginal structural models to adjust for treatment drop‐in when developing clinical prediction models. Stat. Med. 2018; 37: 4142 ‐ 4154.
dc.identifier.citedreferenceCarnegie NB, Harada M, Hill JL. Assessing sensitivity to unmeasured confounding using a simulated potential confounder. J. Res. Educ. Eff. 2016; 9: 395 ‐ 420.
dc.identifier.citedreferenceUddin MJ et al. Methods to control for unmeasured confounding in pharmacoepidemiology: an overview. Int. J. Clin. Pharm. 2016; 38: 714 ‐ 723.
dc.identifier.citedreferenceZhang X, Faries DE, Li H, Stamey JD, Imbens GW. Addressing unmeasured confounding in comparative observational research. Pharmacoepidemiol. Drug Saf. 2018; 27: 373 ‐ 382.
dc.identifier.citedreferenceVanderWeele TJ, Ding P. Sensitivity analysis in observational research: introducing the E‐value. Ann. Intern. Med. 2017; 167: 268.
dc.identifier.citedreferenceICD Code Informational Website. https://www.cdc.gov/nchs/icd/index.htm.
dc.identifier.citedreferencePendergrass SA, Ritchie MD. Phenome‐wide association studies: leveraging comprehensive phenotypic and genotypic data for discovery. Curr. Genet. Med. Rep. 2016; 42: 407 ‐ 420.
dc.identifier.citedreferenceeMERGE PheKB Website. https://phekb.org.
dc.identifier.citedreferenceLiao KP et al. Methods to develop an electronic medical record phenotype algorithm to compare the risk of coronary artery disease across 3 chronic disease cohorts. PLoS One. 2015; 10. https://doi.org/10.1371/journal.pone.0136651.
dc.identifier.citedreferenceLiao KP et al. Development of phenotype algorithms using electronic medical records and incorporating natural language processing. BMJ. 2015; 350: 1 ‐ 5. https://doi.org/10.1136/bmj.h1885.
dc.identifier.citedreferenceAnanthakrishnan AN et al. Identification of nonresponse to treatment using narrative data in an electronic health record inflammatory bowel disease cohort. Inflamm. Bowel Dis. 2016; 22: 151 ‐ 158. https://doi.org/10.1097/MIB.0000000000000580.
dc.identifier.citedreferenceCastro V et al. Identification of subjects with polycystic ovary syndrome using electronic health records. Reprod. Biol. Endocrinol. 2015; 29: 1 ‐ 8. https://doi.org/10.1186/s12958‐015‐0115‐z.
dc.identifier.citedreferenceMcCoy TH et al. Genome‐wide association study of dimensional psychopathology using electronic health records. Biol. Psychiatry. 2018; 83: 1005 ‐ 1011. https://doi.org/10.1016/j.biopsych.2017.12.004.
dc.identifier.citedreferenceSinnott JA et al. Improving the power of genetic association tests with imperfect phenotype derived from electronic medical records. Hum. Genet. 2014; 133: 1369 ‐ 1382.
dc.identifier.citedreferenceYu S et al. Surrogate‐assisted feature extraction for high‐throughput phenotyping. J. Am. Med. Inform. Assoc. 2017; 24: e143 ‐ e149.
dc.identifier.citedreferenceYu S et al. Enabling phenotypic big data with PheNorm. J. Am. Med. Informatics Assoc. 2018; 25: 54 ‐ 60.
dc.identifier.citedreferenceCastro VM et al. Large‐scale identification of patients with cerebral aneurysms using natural language processing. Neurology. 2017; 88: 164 ‐ 168.
dc.identifier.citedreferenceKermany DS et al. Identifying medical diagnoses and treatable diseases by image‐based deep learning. Cell. 2018; 172: 1122 ‐ 1131.e9.
dc.identifier.citedreferenceTeixeira PL et al. Evaluating electronic health record data sources and algorithmic approaches to identify hypertensive individuals. J. Am. Med. Informatics Assoc. 2017; 24: 162 ‐ 171.
dc.identifier.citedreferenceGan M, Li W, Zeng W, Wang X, Jiang R. Mimvec: a deep learning approach for analyzing the human phenome. BMC Syst. Biol. 2017; 11: 76.
dc.identifier.citedreferenceHubbard RA et al. A Bayesian latent class approach for EHR‐based phenotyping. Stat. Med. 2019; 38: 74 ‐ 87.
dc.identifier.citedreferenceLiu C, Wang F, Hu J, Xiong H. Temporal phenotyping from longitudinal electronic health records: a graph based framework. Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2015: 705 ‐ 714.
dc.identifier.citedreferenceZhao J et al. Learning from longitudinal data in electronic health record and genetic data to improve cardiovascular event prediction. Sci. Rep. 2019; 9: 1 ‐ 10.
dc.identifier.citedreferenceSikorska K et al. GWAS with longitudinal phenotypes: performance of approximate procedures. Eur. J. Hum. Genet. 2015; 23: 1384 ‐ 1391.
dc.identifier.citedreferenceChiu Y, Justice AE, Melton PE. Longitudinal analytical approaches to genetic data. BMC Genet. 2016; 17: 25 ‐ 32.
dc.identifier.citedreferencePivovarov R. Automated methods for the summarization of electronic health records. J. Am. Med. Informatics Assoc. 2015; 22: 938 ‐ 947. https://doi.org/10.1093/jamia/ocv032.
dc.identifier.citedreferenceAlbers DJ et al. Estimating summary statistics for electronic health record laboratory data for use in high‐throughput phenotyping algorithms. J. Biomed. Inform. 2018; 78: 87 ‐ 101.
dc.identifier.citedreferenceWang H et al. From phenotype to genotype: an association study of longitudinal phenotypic markers to Alzheimer’s disease relevant SNPs. Bioinformatics. 2012; 28: 619 ‐ 625.
dc.identifier.citedreferenceXu Z, Shen X, Pan W; Alzheimer’s Disease Neuroimaging Initiative. Longitudinal analysis is more powerful than cross‐sectional analysis in detecting genetic association with neuroimaging phenotypes. PLoS One. 2014; 9: 1 ‐ 13.
dc.identifier.citedreferenceAgniel D, Kohane IS, Weber GM. Biases in electronic health record data due to processes within the healthcare system: retrospective observational study. BMJ Open. 2018; 361: 1 ‐ 9.
dc.identifier.citedreferenceLange JM, Hubbard RA, Inoue LYT, Minin VN. A joint model for multistate disease processes and random informative observation times, with applications to electronic medical records data. Biometrics. 2015; 71: 90 ‐ 101. https://doi.org/10.1111/biom.12252.
dc.identifier.citedreferenceBergeron PJ, Asgharian M, Wolfson DB. Covariate bias induced by length‐biased sampling of failure times. J. Am. Stat. Assoc. 2008; 103: 737 ‐ 742.
dc.identifier.citedreferenceCastro VM et al. Validation of electronic health record phenotyping of bipolar disorder cases and controls. Am. J. Psychiatry. 2015; 172: 363 ‐ 372.
dc.identifier.citedreferenceBaiardini I, Braido F, Bonini M, Compalati E, Canonica GW. Why do doctors and patients not follow guidelines? Curr. Opin. Allergy Clin. Immunol. 2009; 9: 228 ‐ 233.
dc.identifier.citedreferenceRitchie MD et al. Robust replication of genotype‐phenotype associations across multiple diseases in an electronic medical record. Am. J. Hum. Genet. 2010; 86: 560 ‐ 572.
dc.identifier.citedreferenceChen Y, Wang J, Chubak J, Hubbard RA. Inflation of type I error rates due to differential misclassification in EHR‐derived outcomes: empirical illustration using breast cancer recurrence. Pharmacoepidemiol. Drug Saf. 2019; 28: 264 ‐ 268.
dc.identifier.citedreferenceLuan X, Pan W, Gerberich SG, Carlin BP. Does it always help to adjust for misclassification of a binary outcome in logistic regression? Stat. Med. 2005; 24: 2221 ‐ 2234.
dc.identifier.citedreferenceCarroll RJ, Ruppert D, Stefanski LA, Crainiceanu CM. Measurement Error in Nonlinear Models: A Modern Perspective. London, United Kingdom: Chapman and Hall; 2006.
dc.identifier.citedreferenceHuang J et al. PIE: A prior knowledge guided integrated likelihood estimation method for bias reduction in association studies using electronic health records data. J. Am. Med. Informatics Assoc. 2018. https://doi.org/10.1093/jamia/ocx137 [Epub ahead of print].
dc.identifier.citedreferenceHubbard RA et al. Classification accuracy of claims‐based methods for identifying providers failing to meet performance targets. Stat. Med. 2015; 34: 93 ‐ 105. https://doi.org/10.1002/sim.6318.
dc.identifier.citedreferenceLiao KP et al. Phenome‐wide association study of autoantibodies to citrullinated and noncitrullinated epitopes in rheumatoid arthritis. Arthritis Rheumatol. 2017; 69: 742 ‐ 749. https://doi.org/10.1002/art.39974.
dc.identifier.citedreferenceWang L et al. Phenotype validation in electronic health records based genetic association studies. Genet Epidemiol. 2017; 41: 790 ‐ 800.
dc.identifier.citedreferenceDuffy SW et al. A simple model for potential use with a misclassified binary outcome in epidemiology. J. Epidemiol. Community Health. 2004; 58: 712 ‐ 717.
dc.identifier.citedreferenceAbana CO et al. IL‐6 variant is associated with metastasis in breast cancer patients. PLoS One. 2017; 12: e0181725.
dc.identifier.citedreferenceGhosh A, Wright FA, Zou F. Unified analysis of secondary traits in case‐control association studies. J. Am. Stat. Assoc. 2013; 108: 566 ‐ 576.
dc.identifier.citedreferenceJiang Y, Scott AJ, Wild CJ. Secondary analysis of case‐control data. Stat. Med. 2006; 25: 1323 ‐ 1339.
dc.identifier.citedreferenceTchetgen EJT. A general regression framework for a secondary outcome in case‐control studies. Biostatistics. 2014; 15: 117 ‐ 128.
dc.identifier.citedreferenceWang J, Shete S. Estimation of odds ratios of genetic variants for the secondary phenotypes associated with primary disease. Genet. Epidemiol. 2012; 35: 190 ‐ 200.
dc.identifier.citedreferenceRusanov A, Weiskopf NG, Wang S, Weng C. Hidden in plain sight: Bias towards sick patients when sampling patients with sufficient electronic health record data for research. BMC Med. Inform. Decis. Mak. 2014; 14: 1 ‐ 9.
dc.identifier.citedreferenceReynolds KD, West SG. A multiplist strategy for strengthening nonequivalent control group designs. Evaluation Rev. 1987; 11: 691 ‐ 714.
dc.identifier.citedreferenceWest SG et al. Alternatives to the randomized controlled trial. Res. Innov. Recomm. 2008; 98: 1359 ‐ 1366.
dc.identifier.citedreferenceAu Yeung SL et al. Aldehyde dehydrogenase 2—a potential genetic risk factor for lung function among southern Chinese: evidence from the Guangzhou Biobank Cohort Study. Ann. Epidemiol. 2014; 24: 606 ‐ 611.
dc.identifier.citedreferenceKuhnert R et al. A modified self‐controlled case series method to examine association between multidose vaccinations and death. Stat. Med. 2011; 30: 666 ‐ 677.
dc.identifier.citedreferenceZhou X, Douglas IJ, Shen R, Bate A. Signal detection for recently approved products: adapting and evaluating self‐controlled case series method using a US claims and UK electronic medical records database. Drug Saf. 2018; 41: 523 ‐ 536.
dc.identifier.citedreferenceSchuemie MJ, Trifiro G, Coloma PM, Ryan PB, Madigan D. Detecting adverse drug reactions following long‐term exposure in longitudinal observational data: the exposure‐adjusted self‐controlled case series. Stat. Methods Med. Res. 2016; 25: 2577 ‐ 2592.
dc.identifier.citedreferenceMaclure M et al. When should case‐only designs be used for safety monitoring of medical products? Pharmacoepidemiol. Drug Saf. 2012; 21: 50 ‐ 61. https://doi.org/10.1002/pds.2330.
dc.identifier.citedreferenceSimpson SE et al. Multiple self‐controlled case series for large‐scale longitudinal observational databases. Biometrics. 2013; 69: 893 ‐ 902. https://doi.org/10.1111/biom.12078.
dc.identifier.citedreferencePetersen I, Douglas I, Whitaker H. Self controlled case series methods: an alternative to standard epidemiological study designs. BMJ. 2016; 354: i4515.
dc.identifier.citedreferenceSun Z, Mukherjee B, Estes JP, Vokonas PS, Park SK. Exposure enriched outcome dependent designs for longitudinal studies of gene–environment interaction. Stat. Med. 2017; 36: 2947 ‐ 2960.
dc.identifier.citedreferenceSchildcrout JS, Rathouz PJ, Zelnick LR, Garbett SP, Heagerty PJ. Biased sampling designs to improve research efficiency: Factors influencing pulmonary function over time in children with asthma. Ann. Appl. Stat. 2015; 9: 731 ‐ 753.
dc.identifier.citedreferenceSchildcrout JS, Schisterman EF, Mercaldo ND, Rathouz PJ, Heagerty PJ. Extending the case‐control design to longitudinal data: stratified sampling based on repeated binary outcomes. Epidemiology. 2018; 29: 67 ‐ 75.
dc.identifier.citedreferenceLi D, Lewinger JP, Gauderman WJ, Murcray CE, Conti D. Using extreme phenotype sampling to identify the rare causal variants of quantitative traits in association studies. Genet. Epidemiol. 2011; 35: 790 ‐ 799.
dc.identifier.citedreferenceBjørnland T, Bye A, Ryeng E. Improving power of genetic association studies by extreme phenotype sampling: a review and some new results. arXiv. 2017; 1 ‐ 26.
dc.identifier.citedreferenceManichaikul A et al. Robust relationship inference in genome‐wide association studies. Bioinformatics. 2010; 26: 2867 ‐ 2873.
dc.identifier.citedreferenceZhou W et al. Efficiently controlling for case‐control imbalance and sample relatedness in large‐scale genetic association studies. Nat. Genet. 2018; 50: 1335 ‐ 1341.
dc.identifier.citedreferenceWoodward M. Epidemiology: Study Design and Data Analysis. London, United Kingdom: Chapman and Hall; 2013.
dc.identifier.citedreferenceRothman KJ, Lash TL, Greenland S. Modern Epidemiology. London, United Kingdom: Wolters Kluwer; 2012.
dc.identifier.citedreferenceMadigan D, Ryan PB, Schuemie MJ. Does design matter? Systematic evaluation of the impact of analytical choices on effect estimates in observational studies. Ther. Adv. Drug Saf. 2013; 4: 53 ‐ 62.
dc.identifier.citedreferenceHaneuse S, Chan HTH, Daniels M. A general framework for considering selection bias in EHR‐based studies: what data are observed and why? EGEMS (Wash DC). 2016; 4 ( 1 ): 1203. https://doi.org/10.13063/2327‐9214.1203.
dc.identifier.citedreferenceSmith GD, Ebrahim S. ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Int. J. Epidemiol. 2003; 32: 1 ‐ 22.
dc.identifier.citedreferenceAvery CL, Monda KL, North KE. Genetic association studies and the effect of misclassification and selection bias in putative confounders. BMC Proc. 2009; 3: S48.
dc.identifier.citedreferenceZheng K, Gao J, Ngiam KY, Ooi BC, Yip WLJ. Resolving the bias in electronic medical records. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’17); 2017: 2171 ‐ 2180. https://doi.org/10.1145/3097983.3098149.
dc.identifier.citedreferenceSchuemie MJ, Ryan PB, Dumouchel W, Suchard MA, Madigan D. Interpreting observational studies: Why empirical calibration is needed to correct p‐values. Stat. Med. 2014; 33: 209 ‐ 218.
dc.identifier.citedreferenceSchuemie MJ, Hripcsak G, Ryan PB, Madigan D, Suchard MA. Empirical confidence interval calibration for population‐level effect estimation studies in observational healthcare data. Proc. Natl. Acad. Sci. USA. 2018; 115: 2571 ‐ 2577.
dc.identifier.citedreferenceJohnson KW, Glicksberg BS, Hodos RA, Shameer K, Dudley JT. Causal inference on electronic health records to assess blood pressure treatment targets: an application of the parametric g formula. Biocomputing. 2018; 23: 180 ‐ 191.
dc.identifier.citedreferenceKleinberg S, Hripcsak G. A review of causal inference for biomedical informatics. J. Biomed. Inform. 2011; 44: 1102 ‐ 1112.
dc.identifier.citedreferenceStuart EA, DuGof E, Abrams M, Salkever D, Steinwachs D. Estimating causal effects in observational studies using electronic health data: challenges and (some) solutions. eGEMs. 2013; 1: 4.
dc.identifier.citedreferenceBeaumont RN et al. Genome‐wide association study of offspring birth weight in 86 577 women identifies five novel loci and highlights maternal genetic effects that are independent of fetal genetics. Hum. Mol. Genet. 2018; 27: 742 ‐ 756.
dc.identifier.citedreferenceKlarin D et al. Genetic analysis in UK Biobank links insulin resistance and transendothelial migration pathways to coronary artery disease. Nat. Genet. 2017; 49: 1392 ‐ 1397.
dc.identifier.citedreferenceYang J, Zaitlen NA, Goddard ME, Visscher PM, Price A. Advantages and pitfalls in the application of mixed‐model association methods. Nat. Genet. 2014; 46: 100 ‐ 106.
dc.identifier.citedreferenceFritsche LG et al. Exploring various polygenic risk scores for basal cell carcinoma, cutaneous squamous cell carcinoma and melanoma in the phenomes of the Michigan Genomics Initiative and the UK Biobank. bioRxiv. 2018; 1 ‐ 44. https://doi.org/10.1101/384909.
dc.identifier.citedreferenceDey R, Schmidt EM, Abecasis GR, Lee S. A fast and accurate algorithm to test for binary phenotypes and its application to PheWAS. Am. J. Hum. Genet. 2017; 101: 37 ‐ 49.
dc.identifier.citedreferenceBulik‐Sullivan B et al. LD score regression distinguishes confounding from polygenicity in genome‐wide association studies. Nat. Genet. 2015; 47: 291 ‐ 295.
dc.identifier.citedreferencePurcell SM et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature. 2009; 460: 748 ‐ 752.
dc.identifier.citedreferenceHagenaars SP et al. Shared genetic aetiology between cognitive functions and physical and mental health in UK Biobank (N=112 151) and 24 GWAS consortia. Mol. Psychiatry. 2016; 21: 1624 ‐ 1632.
dc.identifier.citedreferenceGe T, Chen C‐Y, Neale BM, Sabuncu MR, Smoller JW. Phenome‐wide heritability analysis of the UK Biobank. PLOS Genet. 2017; 13: e1006711.
dc.identifier.citedreferenceYang J et al. A commentary on ‘common SNPs explain a large proportion of the heritability for human height’ by Yang et al. (2010). Twin Res. Hum. Genet. 2010; 13: 517 ‐ 524.
dc.identifier.citedreferenceTorkamani A, Wineinger NE, Topol EJ. The personal and clinical utility of polygenic risk scores. Nat Rev Genet. 2018; 19: 581 ‐ 590.
dc.identifier.citedreferenceGe T, Chen C, Ni Y, Feng YA, Smoller JW. Polygenic prediction via Bayesian regression and continuous shrinkage priors. bioRxiv. 2018; 1 ‐ 30.
dc.identifier.citedreferenceEuesden J, Lewis CM, O'Reilly PF. PRSice: polygenic risk score software. Bioinformatics. 2015; 31: 1466 ‐ 1468.
dc.identifier.citedreferenceDe Vlaming R, Groenen PJF. The current and future use of ridge regression for prediction in quantitative genetics. Biomed Res Int. 2015; 2015: 1 ‐ 19.
dc.identifier.citedreferenceSo H, Sham PC. Improving polygenic risk prediction from summary statistics by an empirical Bayes approach. Sci. Rep. 2017; 7: 1 ‐ 11.
dc.identifier.citedreferenceParé G, Mao S, Deng WQ. A machine‐learning heuristic to improve gene score prediction of polygenic traits. Sci. Rep. 2017; 7: 12665.
dc.identifier.citedreferenceMak TSH, Kwan JSH, Campbell DD, Sham PC. Local true discovery rate weighted polygenic scores using GWAS summary data. Behav. Genet. 2016; 46: 573 ‐ 582.
dc.identifier.citedreferenceLloyd‐Jones LR et al. Improved polygenic prediction by Bayesian multiple regression on summary statistics. bioRxiv. 2019; 1 ‐ 39.
dc.identifier.citedreferenceWray NR, Lee SH, Mehta D, Vinkhuyzen AAE, Middeldorp CM. Research review: polygenic methods and their application to psychiatric traits. J. Child Psychol. Psychiatry. 2014; 55: 1068 ‐ 1087.
dc.identifier.citedreferenceDudbridge F. Power and predictive accuracy of polygenic risk scores. PLoS Genet. 2013; 9: e1003348.
dc.identifier.citedreferenceNeale B. Neale Lab Website for GWAS Summary Statistics. 2019. http://www.nealelab.is.
dc.identifier.citedreferenceVilhjálmsson BJ et al. Modeling linkage disequilibrium increases accuracy of polygenic risk scores. Am. J. Hum. Genet. 2015; 97: 576 ‐ 592.
dc.identifier.citedreferenceMak TSH, Sham PC, Porsch RM, Choi SW, Zhou X. Polygenic scores via penalized regression on summary statistics. Genet. Epidemiol. 2017; 41: 469 ‐ 480.
dc.identifier.citedreferenceChoi SW, Mak TSH, O'Reilly PF. A guide to performing polygenic risk score analyses. bioRxiv. 2018; 1 ‐ 22.
dc.identifier.citedreferenceWu Y et al. Integrative analysis of omics summary data reveals putative mechanisms underlying complex traits. Nat. Commun. 2018; 9: 1 ‐ 14.
dc.identifier.citedreferenceLloyd‐Jones LR et al. Improved polygenic prediction by Bayesian multiple regression on summary statistics. bioRxiv. 2019; 1 ‐ 41. https://doi.org/10.1101/522961.
dc.identifier.citedreferenceZhu X, Stephens M. Bayesian large‐scale multiple regression with summary statistics from genome‐wide association studies. Ann. Appl. Stat. 2017; 11: 1561 ‐ 1592.
dc.identifier.citedreferenceTurley P et al. Multi‐trait analysis of genome‐wide association summary statistics using MTAG. Nat. Genet. 2018; 50: 229 ‐ 237.
dc.identifier.citedreferenceMaier RM et al. Improving genetic prediction by leveraging genetic correlations among human diseases and traits. Nat. Commun. 2018; 9: 1 ‐ 17.
dc.identifier.citedreferenceShaddox TR, Ryan PB, Schuemie MJ, Madigan D, Suchard MA. Hierarchical models for multiple, rare outcomes using massive observational databases. Stat Anal Data Min. 2016; 2 ( 9 ): 260 ‐ 268.
dc.identifier.citedreferenceXue X, Kim MY, Wang T, Kuniholm MH, Strickler HD. A statistical method for studying correlated rare events and their risk factors. Stat Methods Med Res. 2017; 26: 1416 ‐ 1428.
dc.identifier.citedreferenceBastarache L et al. Phenotype risk scores identify patients with unrecognized Mendelian disease patterns. Science. 2018; 359: 1233 ‐ 1239.
dc.identifier.citedreferenceGronsbell J, Minnier J, Yu S, Liao K, Cai T. Automated feature selection of predictors in electronic medical records data. Biometrics. 2018; 75: 268 ‐ 277.
dc.identifier.citedreferenceScheurwegs E, Cule B, Luyckx K, Luyten L, Daelemans W. Selecting relevant features from the electronic health record for clinical code prediction. J. Biomed. Inform. 2017; 74: 92 ‐ 103.
dc.identifier.citedreferenceSteele AJ, Denaxas SC, Shah AD, Hemingway H. Machine learning models in electronic health records can outperform conventional survival models for predicting patient mortality in coronary artery disease. PLoS One. 2018; 13: 1 ‐ 20.
dc.identifier.citedreferenceWu Y et al. Quantifying predictive capability of electronic health records for the most harmful breast cancer. Proceedings of SPIE—The International Society for Optical Engineering; 2018: 1 ‐ 15. https://doi.org/10.1117/12.2293954.
dc.identifier.citedreferenceWu J et al. Prediction modeling using EHR data: challenges, strategies, and a comparison of machine learning approaches. Med. Care. 2010; 48: S106 ‐ S113.
dc.identifier.citedreferenceShickel B, Tighe PJ, Bihorac A, Rashidi P. Deep EHR: a survey of recent advances in deep learning techniques for electronic health record (EHR) analysis. arXiv. 2018; 1 ‐ 16.
dc.identifier.citedreferenceRajkomar A et al. Scalable and accurate deep learning with electronic health records. npj Digit. Med. 2018; 1: 18.
dc.identifier.citedreferenceAdkins DE. Machine learning and electronic health records: a paradigm shift. Am. J. Psychiatry. 2017; 174: 93 ‐ 94.
dc.identifier.citedreferenceGarg R, Dong S, Shah S, Jonnalagadda SR. A bootstrap machine learning approach to identify rare disease patients from electronic health records. arXiv. 2016; 1 ‐ 8.
dc.identifier.citedreferenceHarang R, Rudd EM. Towards principled uncertainty estimation for deep neural networks. arXiv. 2018; 1 ‐ 11.
dc.identifier.citedreferenceThompson K, Charnigo R. Parallel computing in genome‐wide association studies. J. Biometrics Biostat. 2015; 6: 1 ‐ 3.
dc.identifier.citedreferencePrivé F, Aschard H, Ziyatdinov A, Blum MGB. Efficient analysis of large‐scale genome‐wide data with two R packages: bigstatsr and bigsnpr. Bioinformatics. 2018; 34: 2781 ‐ 2787.
dc.identifier.citedreferenceBerger B, Peng J, Singh M. Computational solutions for omics data. Nat. Rev. Genet. 2013; 14: 333 ‐ 346.
dc.identifier.citedreferenceWells BJ, Chagin KM, Nowacki AS, Kattan MW. Strategies for handling missing data in electronic health record derived data. eGEMs. 2013; 1: 1035.
dc.identifier.citedreferenceHormozdiari F et al. Imputing phenotypes for genome‐wide association studies. Am. J. Hum. Genet. 2016; 99: 89 ‐ 103.
dc.identifier.citedreferenceBeaulieu‐Jones BK, Moore JH. Missing data imputation in the electronic health record using deeply learned autoencoders. Biocomput. 2017; 2017: 207 ‐ 218. https://doi.org/10.1142/9789813207813_0021.
dc.identifier.citedreferenceBeaulieu‐Jones BK et al. Characterizing and managing missing structured data in electronic health records: data analysis. JMIR Med. Informatics. 2018; 6: e11.
dc.identifier.citedreferenceLittle RJA, Rubin DB. Statistical Analysis with Missing Data. Hoboken, NJ: John Wiley and Sons, Inc; 2002. https://doi.org/10.1002/9781119013563.
dc.identifier.citedreferenceMcCulloch CE, Neuhaus JM. Diagnostic methods for uncovering outcome dependent visit processes. Biostatistics. 2018; 1 ‐ 16. https://doi.org/10.1093/biostatistics/kxy068. [Epub ahead of print]
dc.identifier.citedreferenceHaneuse S et al. Learning about missing data mechanisms in electronic health records‐based research: a survey‐based approach. Epidemiology. 2016; 27: 82 ‐ 90.
dc.identifier.citedreferenceBrzyski D et al. Controlling the rate of GWAS false discoveries. Genetics. 2017; 205: 61 ‐ 75.
dc.identifier.citedreferenceLi MX, Yeung JMY, Cherny SS, Sham PC. Evaluating the effective numbers of independent tests and significant p‐value thresholds in commercial genotyping arrays and public imputation reference datasets. Hum. Genet. 2012; 131: 747 ‐ 756.
dc.identifier.citedreferenceGood P. Permutation, Parametric and Bootstrap Tests of Hypotheses. New York, NY: Springer; 2005.
dc.identifier.citedreferenceGao X, Starmer J, Martin ER. A multiple testing correction method for genetic association studies using correlated single nucleotide polymorphisms. Genet. Epidemiol. 2008; 32: 361 ‐ 369.
dc.identifier.citedreferenceAbraham KJ, Diaz C. Identifying large sets of unrelated individuals and unrelated markers. Source Code Biol. Med. 2014; 9: 1 ‐ 8.
dc.identifier.citedreferenceHan B, Kang HM, Eskin E. Rapid and accurate multiple testing correction and power estimation for millions of correlated markers. PLoS Genet. 2009; 5: 1 ‐ 13.
dc.identifier.citedreferenceSeaman SR, Müller‐Myhsok B. Rapid simulation of P values for product methods and multiple‐testing adjustment in association studies. Am. J. Hum. Genet. 2005; 76: 399 ‐ 408.
dc.identifier.citedreferenceDuggal P, Gillanders EM, Holmes TN, Bailey‐Wilson JE. Establishing an adjusted p‐value threshold to control the family‐wide type 1 error in genome wide association studies. BMC Genomics. 2008; 9: 1 ‐ 8.
dc.identifier.citedreferenceJohnson RC, Nelson GW, Troyer JL, Lautenberger JA, Winkler CA. Accounting for multiple comparisons in a genome‐wide association study (GWAS). BMC Genomics. 2010; 11: 724. https://doi.org/10.1186/1471‐2164‐11‐724.
dc.identifier.citedreferenceZhang X, Huang S, Sun W, Wang W. Rapid and robust resampling‐based multiple‐testing correction with application in a genome‐wide expression quantitative trait loci study. Genetics. 2012; 190: 1511 ‐ 1520.
dc.identifier.citedreferenceBastarache L et al. Phenotype risk scores identify patients with unrecognized Mendelian disease patterns. Science. 2018; 359: 1233 ‐ 1239.
dc.identifier.citedreferenceIgl BW, König IR, Ziegler A. What do we mean by ‘replication’ and ‘validation’ in genome‐wide association studies? Hum. Hered. 2009; 67: 66 ‐ 68.
dc.identifier.citedreferenceNHGRI‐EBI GWAS catalog. https://www.ebi.ac.uk/gwas/.
dc.identifier.citedreferenceLong Q, Flanders WD, Fedirko V, Bostick RM. Robust statistical methods for analysis of biomarkers measured with batch/experiment specific errors. Stat. Med. 2010; 29: 361 ‐ 370.
dc.identifier.citedreferenceThompson SG. Systematic Review: Why sources of heterogeneity in meta‐analysis should be investigated. BMJ. 1994; 309: 1351 ‐ 1355.
dc.identifier.citedreferenceFletcher J. What is heterogeneity and is it important? BMJ. 2007; 334: 94 ‐ 96. https://doi.org/10.1136/bmj.39057.406644.68.
dc.identifier.citedreferenceHiggins JPT, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta‐analyses. BMJ. 2003; 327: 557 ‐ 560.
dc.identifier.citedreferenceKriston L. Dealing with clinical heterogeneity in meta‐analysis. Assumptions, methods, interpretation. Int. J. Methods Psychiatr. Res. 2013; 22: 1 ‐ 15.
dc.identifier.citedreferenceLi Y, Ghosh D. Assumption weighting for incorporating heterogeneity into meta‐analysis of genomic data. Bioinformatics. 2012; 28: 807 ‐ 814.
dc.identifier.citedreferenceGrimmer J, Messing S, Westwood SJ. Estimating heterogeneous treatment effects and the effects of heterogeneous treatments with ensemble methods. Polit. Anal. 2017; 25: 413 ‐ 434.
dc.identifier.citedreferenceGagnier JJ, Moher D, Boon H, Bombardier C, Beyene J. An empirical study using permutation‐based resampling in meta‐regression. Syst. Rev. 2012; 1: 1 ‐ 9.
dc.identifier.citedreferenceAltman RB, Ashley EA. Using “big data” to dissect clinical heterogeneity. Circulation. 2015; 131: 232 ‐ 233.
dc.identifier.citedreferenceShi X, Li X, Cai T. Spherical regression under mismatch corruption with application to automated knowledge translation. arXiv. 2018; 1 ‐ 45.
dc.identifier.citedreferenceTang L. Statistical Methods of Data Integration, Model Fusion, and Heterogeneity Detection in Big Biomedical Data Analysis. [PhD Thesis] Ann Arbor, MI: University of Michigan; 2018.
dc.identifier.citedreferenceChubak J, Onega T, Zhu W, Buist DSM, Hubbard RA. An electronic health record‐based algorithm to ascertain the date of second breast cancer events. Med. Care. 2017; 55: 81 ‐ 87. https://doi.org/10.1097/MLR.0000000000000352.
dc.identifier.citedreferenceManrai AK et al. Informatics and data analytics to support exposome‐based discovery for public health. Annu. Rev. Public Health. 2017; 38: 279 ‐ 294.
dc.identifier.citedreferenceFan JW, Li J, Lussier YA. Semantic modeling for exposomics with exploratory evaluation in clinical context. J. Healthc. Eng. 2017; 1 ‐ 10. https://doi.org/10.1155/2017/3818302.
dc.identifier.citedreferenceBaek J et al. Methods to study variation in associations between food store availability and body mass in the multi‐ethnic study of atherosclerosis. Epidemiology. 2017; 28: 403 ‐ 411.
dc.identifier.citedreferenceBazemore AW et al. ‘Community vital signs’: Incorporating geocoded social determinants into electronic records to promote patient and population health. J. Am. Med. Informatics Assoc. 2016; 23: 407 ‐ 412.
dc.identifier.citedreferenceChristine PJ et al. Exposure to neighborhood foreclosures and changes in cardiometabolic health: results from MESA. Am. J. Epidemiol. 2017; 185: 106 ‐ 114.
dc.identifier.citedreferenceFrederickson Comer K, Grannis S, Dixon BE, Bodenhamer DJ, Wiehe SE. Incorporating geospatial capacity within clinical data systems to address social determinants of health. Public Health Rep. 2011; 3: 54 ‐ 61.
dc.identifier.citedreferenceSánchez BN, Sanchez‐Vaznaugh EV, Uscilka A, Baek J, Zhang L. Differential associations between the food environment near schools and childhood overweight across race/ethnicity, gender, and grade. Am. J. Epidemiol. 2012; 175: 1284 ‐ 1293.
dc.owningcollnameInterdisciplinary and Peer-Reviewed

