Weighted single-step genomic BLUP: an iterative approach for accurate calculation of breeding values and SNP effects
The purpose of this study was to explore options for genome-wide association analysis (GWAS) with single-step GBLUP (ssGBLUP). In GWAS by ssGBLUP, genomic estimated breeding values (GEBV) are converted to marker (SNP) effects. Unequal variances for markers are then derived from the SNP solutions and incorporated into a weighted genomic relationship matrix. The SNP weights can be improved iteratively, either by recomputing the SNP effects only or by also recomputing the GEBV. Four options were used to calculate the weights: 1) proportional to $2p_i(1-p_i)u_i^2$, where $p_i$ and $u_i$ are the allele frequency and the effect of the i-th SNP; 2) proportional to $2p_i(1-p_i)u_i^2$ + constant; 3) weights as in 1, but updating only the top 25 SNP; 4) weights as in 1, but updating only the top 5 SNP. A simulated data set was used that included 15,600 animals in 5 generations, of which 1,540 were genotyped for 50k SNP. The simulation involved phenotypes for a trait with heritability of 0.5 potentially affected by 5 QTL. The accuracy of GEBV relative to true breeding values (TBV) for genotyped animals in generation 5 was used for evaluation. Comparisons also involved BayesC with deregressed proofs and π=0.9999. In single-step, SNP effects were tracked over 10 iterations, and all weights were equal to 1.0 in the first iteration. Option 3, like BayesC, was the best at identifying the simulated QTL, without background noise and with precision in most of the regions; after 2 iterations, the accuracy of GEBV reached a plateau at 0.91, as opposed to 0.88 for BayesC. Testing also included a commercial data set with 200k animals, of which 15k were genotyped for 39k SNP. For one of the traits, Manhattan plots with option 3 and BayesC looked identical, showing 6 large peaks and very little background noise. However, the realized accuracy was 0.16 in the first round and 0.14 in the subsequent rounds, as opposed to 0.19 for BayesC. For the other traits, the accuracy by BayesC was lower and the Manhattan plots did not have clear peaks.
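The iteration described above, in its variant that recomputes only the SNP effects, can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the back-solving formula u = λDZ'G⁻¹a with G = λZDZ' is the commonly used conversion from GEBV to SNP effects and is not given in the text; the function names, the small ridge added to G, the rescaling of the weights to mean 1, and the rule that SNP outside the top set keep their previous weights are all hypothetical choices; and the GEBV are a toy stand-in rather than output from ssGBLUP.

```python
import numpy as np

def weighted_G(Z, d, p):
    """Weighted genomic relationship matrix, G = Z D Z' / (2 * sum p(1-p))."""
    lam = 1.0 / (2.0 * np.sum(p * (1.0 - p)))
    return lam * (Z * d) @ Z.T  # Z * d scales column i of Z by weight d[i]

def backsolve_snp_effects(Z, d, p, gebv, ridge=1e-3):
    """Convert GEBV to SNP effects: u = lam * D Z' G^{-1} a (assumed formula)."""
    lam = 1.0 / (2.0 * np.sum(p * (1.0 - p)))
    # Small ridge on the diagonal keeps the solve stable (an assumption here).
    G = weighted_G(Z, d, p) + ridge * np.eye(Z.shape[0])
    return lam * d * (Z.T @ np.linalg.solve(G, gebv))

def update_weights(u, p, d_prev, top=None, constant=0.0):
    """Weights proportional to 2p(1-p)u^2 (+ constant), rescaled to mean 1.

    If `top` is given, only the `top` SNP with the largest new weights are
    updated; the remaining SNP keep their previous weights (an assumption).
    """
    raw = 2.0 * p * (1.0 - p) * u**2 + constant
    raw = raw * len(raw) / raw.sum()
    if top is None:
        return raw
    d_new = d_prev.copy()
    idx = np.argsort(raw)[-top:]          # indices of the largest weights
    d_new[idx] = raw[idx]
    return d_new * len(d_new) / d_new.sum()

# Toy data: 50 genotyped animals, 200 SNP, 5 SNP acting as QTL.
rng = np.random.default_rng(0)
n, m = 50, 200
M = rng.binomial(2, rng.uniform(0.1, 0.9, size=m), size=(n, m)).astype(float)
p = M.mean(axis=0) / 2.0                  # observed allele frequencies
Z = M - 2.0 * p                           # centered genotype matrix
u_true = np.zeros(m)
u_true[[10, 60, 110, 160, 190]] = 1.0
gebv = Z @ u_true                         # stand-in for GEBV from ssGBLUP

d = np.ones(m)                            # all weights are 1.0 in iteration 1
for _ in range(3):
    u = backsolve_snp_effects(Z, d, p, gebv)
    d = update_weights(u, p, d, top=25)   # option 3: update only the top 25 SNP

print("SNP with the largest weights:", np.argsort(d)[-5:])
```

Passing `top=None` corresponds to option 1, a positive `constant` to option 2, and `top=5` to option 4. The variant that also recomputes the GEBV would rerun the ssGBLUP evaluation with the updated weighted G inside the loop, which is outside the scope of this sketch.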
Computing the SNP weights in ssGBLUP by updating only the top 25 SNP gives good identification of the top segments. However, further work is required to derive weights that maximize accuracy across a variety of cases. In addition, the case for GWAS with the single-step approach rests on its simplicity and on its flexibility with complex models.
Keywords: Weighted SNP, ssGBLUP, BayesC