206
Using Random Forests (RF) To Prescreen Candidate Genes: A New Prospective for GWAS
Tuesday, August 19, 2014: 10:30 AM
Bayshore Grand Ballroom A (The Westin Bayshore)
Abstract Text: High-throughput genomic data present an enormous challenge to researchers because of the “large P small N” problem. Recently, a machine learning method, Random Forests (RF), has gained popularity in addressing this problem. In this study, we examined the utility of RF in two livestock genome-wide association study (GWAS) datasets: a Spanish sheep pigmentation dataset and a tropical cattle pregnancy status dataset. Comparing the 10 top-ranked SNPs identified by RF with those from single-marker GWAS methods showed that: 1) RF confirmed that the most strongly associated SNP (s26449) is the one closest to the sheep pigmentation gene MC1R; 2) five of the top 10 SNPs identified by RF were close to genes previously reported to be linked with reproductive performance in humans or other species. The results indicate that RF can potentially be used in GWAS as an initial screening tool for candidate genes.
Keywords:
Random Forests
GWAS