Abstract

The complexity of phenotype-genotype mapping is characterised by non-linear gene-gene and gene-environment interactions. Studying these interactions provides a better understanding of the underlying biological architecture of complex disease traits. A number of statistical and machine learning approaches have been proposed to identify multi-locus interactions between genetic variants and their association with a disease. However, these approaches are hindered by missing heritability, the curse of dimensionality, and computational limitations. Despite the abundance of computational methods and tools available for discovering interactions, no breakthrough method has yet demonstrated replicable results. In this paper, a deep feedforward neural network is trained to identify two-locus interacting genetic variants responsible for disease risk. The method is evaluated on a number of simulated datasets to assess the performance of the model. The results are encouraging and replicable. The model is therefore further evaluated on a published genome-wide association dataset to confirm its findings. The experimental results demonstrate significant improvements in prediction accuracy over previous approaches. The results rank the top 20 interactions among 35 polymorphisms associated with the disease.
