Sparse estimation of gene-gene interactions in prediction models.

Lee S, Pawitan Y, Ingelsson E, Lee Y

Stat Methods Med Res 26 (5) 2319-2332 [2017-10-00; online 2015-08-11]

Current assessment of gene-gene interactions is typically based on separate parallel analysis, where each interaction term is tested separately, while less attention has been paid to simultaneous estimation of interaction terms in a prediction model. As the number of interaction terms grows quickly, sparse estimation is desirable for both statistical and interpretability reasons. There is a large literature on sparse estimation, but there is a natural hierarchy between an interaction and its corresponding main effects that requires special consideration. We describe random-effect models that impose sparse estimation of interactions under both strong- and weak-hierarchy constraints. We develop an estimation procedure based on the hierarchical-likelihood argument and show that the modelling approach is equivalent to a penalty-based method, with the advantage that the models are more transparent and flexible. We compare the procedure with some standard methods in a simulation study and illustrate its application in an analysis of a gene-gene interaction model to predict body-mass index.
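
For orientation, the hierarchy constraints mentioned in the abstract are commonly formalized as sketched here. This uses generic notation from the interaction-modelling literature; the symbols \beta_j, \gamma_{jk}, x_{ij}, and p are assumptions for illustration and are not taken from the paper, whose random-effect formulation may differ.

```latex
% Linear predictor with main effects and pairwise interactions
% (p predictors; notation assumed, not the paper's)
\eta_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}
       + \sum_{1 \le j < k \le p} \gamma_{jk} \, x_{ij} x_{ik}

% Number of candidate pairwise interaction terms
\binom{p}{2} = \frac{p(p-1)}{2}

% Strong hierarchy: an interaction may be nonzero only if
% both of its main effects are nonzero
\gamma_{jk} \neq 0 \;\Longrightarrow\; \beta_j \neq 0 \ \text{and} \ \beta_k \neq 0

% Weak hierarchy: an interaction may be nonzero only if
% at least one of its main effects is nonzero
\gamma_{jk} \neq 0 \;\Longrightarrow\; \beta_j \neq 0 \ \text{or} \ \beta_k \neq 0
```

As a quick check on how fast the interaction count grows, p = 100 predictors already give 100 × 99 / 2 = 4950 candidate pairwise interactions.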

Affiliated researcher

PubMed 26265764

DOI 10.1177/0962280215597261

Crossref 10.1177/0962280215597261

pii: 0962280215597261

