24
Improved accuracy of genomic prediction combining linkage disequilibrium and co-segregation by fitting haplotypes in addition to SNP genotypes

Tuesday, March 17, 2015: 2:45 PM
302-303 (Community Choice Credit Union Convention Center)
Xiaochen Sun, Iowa State University, Ames, IA
Rohan L. Fernando, Iowa State University, Ames, IA
Dorian J. Garrick, Iowa State University, Ames, IA
Jack C. M. Dekkers, Iowa State University, Ames, IA
Abstract Text: In livestock populations, evidence has been increasing that genomic prediction models that fit single nucleotide polymorphism (SNP) genotypes (SNP model) have high accuracy only when prediction candidates are closely related to the training population. Further, increasing SNP density generally has limited impact on prediction accuracy. Results from field datasets suggest that historical linkage disequilibrium (LD) between quantitative trait loci (QTL) and SNPs may be low because many QTL have low minor allele frequency (MAF), while SNPs used for genotyping typically have moderate to high MAF. In these cases, prediction accuracy comes mainly from co-segregation (CS) between QTL and SNPs that is implicitly captured by SNP genotypes. In this study, fitting 1-cM haplotypes across the genome to explicitly capture CS information, in addition to fitting SNP genotypes to capture historical LD information (SNP-haplotype model), is proposed to improve accuracy when historical LD between QTL and SNPs is low. Datasets were simulated for a pedigree with 13 non-overlapping generations. The first 5 generations, with 2,455 individuals in total, were used for training to predict breeding values for each of the following 8 generations, each with 600 individuals. Results showed that the SNP-haplotype model had significantly higher prediction accuracy across validation generations than the SNP model when historical LD was low, but had accuracy similar to that of the SNP model when historical LD was high. When the SNP density increased from 20 to 200 SNPs per cM, the increase in accuracy was greater for the SNP-haplotype model than for the SNP model (Table 1). In conclusion, when historical LD is low, the accuracy of the SNP model comes mainly from CS information that is implicitly captured by SNP genotypes. Fitting haplotypes increases accuracy under low LD by explicitly capturing CS information. Increasing SNP density substantially improves the CS information between haplotypes and QTL, but has little effect on the LD between QTL and SNPs when they have different MAF.

Table 1. Average prediction accuracies and standard errors (in parentheses) in the 1st (Gen1) and 8th (Gen8) validation generations across 50 replicated datasets, for different MAF of QTL and numbers of SNPs per cM (# SNPs/cM). MAF of SNPs was between 0.06 and 0.5.

                       SNP model                                      SNP-haplotype model
MAF of QTL  # SNPs/cM  Gen1           Gen8           Decrease         Gen1           Gen8           Decrease
0.01–0.5    20         0.920 (0.004)  0.862 (0.010)  0.058 (0.008)    0.918 (0.004)  0.859 (0.010)  0.059 (0.008)
0.01–0.06   20         0.815 (0.016)  0.637 (0.035)  0.178 (0.027)    0.849 (0.013)  0.708 (0.032)  0.141 (0.024)
0.01–0.06   200        0.861 (0.013)  0.663 (0.038)  0.198 (0.028)    0.895 (0.010)  0.753 (0.031)  0.142 (0.023)
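
The effect of SNP density reported in the abstract can be checked against the Table 1 means with a few lines of arithmetic. The sketch below is only an illustration: the dictionary layout and the function name `density_gain` are assumptions made here, the values are transcribed from the low-MAF QTL rows (MAF 0.01–0.06), and standard errors are ignored.

```python
# Mean accuracies transcribed from Table 1 (QTL MAF 0.01-0.06 rows only).
# Keys: (model, SNPs per cM) -> (Gen1 accuracy, Gen8 accuracy).
# The layout and names below are illustrative, not from the study.
ACCURACY = {
    ("SNP", 20): (0.815, 0.637),
    ("SNP", 200): (0.861, 0.663),
    ("SNP-haplotype", 20): (0.849, 0.708),
    ("SNP-haplotype", 200): (0.895, 0.753),
}


def density_gain(model):
    """Change in mean accuracy (Gen1, Gen8) when density goes from 20 to 200 SNPs/cM."""
    low = ACCURACY[(model, 20)]
    high = ACCURACY[(model, 200)]
    return tuple(round(h - l, 3) for l, h in zip(low, high))


for model in ("SNP", "SNP-haplotype"):
    gen1, gen8 = density_gain(model)
    print(f"{model}: Gen1 {gen1:+.3f}, Gen8 {gen8:+.3f}")
```

On these means, the gain from higher density is the same for both models in Gen1 and larger for the SNP-haplotype model in Gen8; whether either difference is statistically meaningful cannot be judged from the means alone.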

Keywords: accuracy, co-segregation, haplotype