Feature selection with interactions in logistic regression models using multivariate synergies for a GWAS application

dc.contributor.author: Xu, Easton L
dc.contributor.author: Qian, Xiaoning
dc.contributor.author: Yu, Qilian
dc.contributor.author: Zhang, Han
dc.contributor.author: Cui, Shuguang
dc.date.accessioned: 2018-03-25T06:28:16Z
dc.date.available: 2018-03-25T06:28:16Z
dc.date.issued: 2018-03-21
dc.identifier.citation: BMC Genomics. 2018 Mar 21;19(Suppl 4):170
dc.identifier.uri: http://dx.doi.org/10.1186/s12864-018-4552-x
dc.identifier.uri: https://hdl.handle.net/2027.42/142802
dc.description.abstract: Background: Genotype-phenotype association has been one of the long-standing problems in bioinformatics. Identifying both the marginal and epistatic effects among genetic markers, such as Single Nucleotide Polymorphisms (SNPs), has been extensively integrated into Genome-Wide Association Studies (GWAS) to help derive “causal” genetic risk factors and their interactions, which play critical roles in life and disease systems. Identifying “synergistic” interactions with respect to the outcome of interest can help with accurate phenotypic prediction and with understanding the underlying mechanisms of system behavior. Many statistical measures for estimating synergistic interactions have been proposed in the literature for this purpose. However, beyond empirical performance evaluation, there is still no theoretical analysis of the power and limitations of these synergistic interaction measures. Results: In this paper, it is shown that the existing information-theoretic multivariate synergy depends on a small subset of the interaction parameters in the model, sometimes on only one interaction parameter. In addition, an adjusted version of multivariate synergy is proposed as a new measure to estimate the interactive effects, with experiments conducted over both simulated data sets and a real-world GWAS data set to show its effectiveness. Conclusions: We provide rigorous theoretical analysis and empirical evidence on why the information-theoretic multivariate synergy helps with identifying genetic risk factors via synergistic interactions. We further establish a rigorous sample complexity analysis of detecting interactive effects, confirmed by both simulated and real-world data sets.
dc.title: Feature selection with interactions in logistic regression models using multivariate synergies for a GWAS application
dc.type: Article (en_US)
dc.description.bitstreamurl: https://deepblue.lib.umich.edu/bitstream/2027.42/142802/1/12864_2018_Article_4552.pdf
dc.language.rfc3066: en
dc.rights.holder: The Author(s)
dc.date.updated: 2018-03-25T06:28:22Z
dc.owningcollname: Interdisciplinary and Peer-Reviewed
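
The abstract refers to an information-theoretic synergy between genetic markers with respect to an outcome. As a rough illustration only, the sketch below computes the commonly used two-variable form, Syn(X1, X2; Y) = I(X1, X2; Y) - I(X1; Y) - I(X2; Y), with a plug-in (empirical frequency) estimator. This record does not give the article's exact multivariate definition or its adjusted measure, so the formula, the function names `mutual_information` and `bivariate_synergy`, and the toy data are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # empirical joint counts
    px = Counter(xs)               # empirical marginal counts of X
    py = Counter(ys)               # empirical marginal counts of Y
    # sum over observed (x, y) of p(x, y) * log2( p(x, y) / (p(x) p(y)) )
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def bivariate_synergy(x1, x2, y):
    """Assumed two-variable synergy: I(X1, X2; Y) - I(X1; Y) - I(X2; Y)."""
    pair = list(zip(x1, x2))       # treat (X1, X2) as one joint variable
    return (
        mutual_information(pair, y)
        - mutual_information(x1, y)
        - mutual_information(x2, y)
    )


# Hypothetical toy data: the outcome is the XOR of two binary markers,
# so neither marker is informative alone but the pair determines the outcome.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
y = [0, 1, 1, 0]
print(bivariate_synergy(x1, x2, y))
```

A positive value indicates that the pair carries more information about the outcome than the two markers do separately; a negative value indicates redundancy. Plug-in estimates of this kind are biased on small samples, which is one reason the sample complexity of detecting interactions matters.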

