Derivation of Bayes and Minimax decision rules for allelic frequencies estimation in biallelic loci

Monday, July 21, 2014
Exhibit Hall AB (Kansas City Convention Center)
Carlos A. Martinez, Department of Animal Sciences, University of Florida, Gainesville, FL
Kshitij Khare, Department of Statistics, University of Florida, Gainesville, FL
Mauricio A. Elzo, Department of Animal Sciences, University of Florida, Gainesville, FL
Abstract Text: In population genetics, allelic frequencies are typically estimated by maximum likelihood, and the maximum likelihood estimator (MLE) treats allele frequencies as unknown fixed parameters. However, population genetics theory indicates that allele frequencies vary at random; thus, they should be treated as random variables. The aim of this study was to derive Bayes and Minimax estimators (ME) of allele frequencies for biallelic loci using decision theory. Because an optimal decision rule with uniformly smallest risk rarely exists, one approach is to establish principles that allow decision rules to be ordered according to their risk functions. Two general methods were used to obtain average risk optimality: the Bayes and the Minimax principles. Briefly, given a loss function and a prior distribution, the Bayes principle looks for an estimator that minimizes the posterior risk, while the Minimax principle consists of finding decision rules that minimize the supremum (over the parameter space) of the risk function, that is, the risk in the worst-case scenario. For an arbitrary locus, the sampling model was a trinomial distribution for the numbers of individuals of each genotype, and the prior was a Beta distribution, chosen for its mathematical convenience, its flexibility and the genetic interpretation of its parameters. Three types of loss functions were considered: squared error (SEL), Kullback-Leibler (KLL) and a quadratic error loss (QEL). The SEL and KLL yielded the same estimator, which was a convex combination of the prior mean and the MLE. Using the Bayes estimator from QEL, an ME was derived by applying a theorem which states that a Bayes estimator with constant risk is also Minimax. Constant risk was obtained by choosing appropriate hyperparameter values. The resulting estimator was shown to be equivalent to the MLE, and the prior associated with this ME was the uniform distribution on [0, 1].
One consequence of using this theorem in the derivation of the ME is that the uniform distribution is a least favorable prior, that is, the prior under which the Bayes risk (the average loss of the corresponding Bayes estimator) is greatest. Extension to several loci under linkage equilibrium and independent priors was discussed. The estimators derived here have the appealing property of allowing variation in allelic frequencies, which is more congruent with the reality of finite populations exposed to evolutionary forces. In addition, from a Bayesian perspective, they permit modelling of uncertainty and incorporation of previous genotypic information from the population.
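
Illustrative sketch (not part of the original abstract): the Python code below shows how the estimators described above could be computed for a single biallelic locus. It rests on assumptions the abstract does not state. First, genotype probabilities are assumed to follow Hardy-Weinberg proportions (p^2, 2p(1-p), (1-p)^2), so that the number of copies of allele A among the 2n sampled alleles is binomial and the Beta(alpha, beta) prior is conjugate. Second, QEL is assumed to be the weighted quadratic loss (p - d)^2 / (p(1 - p)), which the abstract does not define. The function names, genotype counts and hyperparameter values are hypothetical.

```python
from math import comb


def allele_count(n_AA, n_Aa):
    """Copies of allele A among the 2n sampled alleles."""
    return 2 * n_AA + n_Aa


def mle(n_AA, n_Aa, n_aa):
    """Maximum likelihood estimate of the frequency of A (assumes HWE)."""
    n = n_AA + n_Aa + n_aa
    return allele_count(n_AA, n_Aa) / (2 * n)


def bayes_posterior_mean(n_AA, n_Aa, n_aa, alpha, beta):
    """Posterior mean under a Beta(alpha, beta) prior (assumes HWE).

    This is the Bayes estimator under squared error loss.
    """
    n = n_AA + n_Aa + n_aa
    x = allele_count(n_AA, n_Aa)
    return (alpha + x) / (alpha + beta + 2 * n)


def weighted_quadratic_risk_of_mle(p, n):
    """Exact risk of x / (2n) under the ASSUMED loss (p - d)^2 / (p (1 - p)).

    The expectation is taken over x ~ Binomial(2n, p), for 0 < p < 1.
    """
    m = 2 * n
    expected_sq_error = 0.0
    for x in range(m + 1):
        prob = comb(m, x) * p**x * (1 - p) ** (m - x)
        expected_sq_error += prob * (x / m - p) ** 2
    return expected_sq_error / (p * (1 - p))


if __name__ == "__main__":
    counts = (30, 50, 20)      # hypothetical counts of AA, Aa, aa
    alpha, beta = 2.0, 2.0     # hypothetical prior hyperparameters
    n = sum(counts)

    # Weight on the MLE in the convex combination with the prior mean.
    w = 2 * n / (alpha + beta + 2 * n)
    prior_mean = alpha / (alpha + beta)

    print("MLE:", mle(*counts))
    print("Posterior mean:", bayes_posterior_mean(*counts, alpha, beta))
    print("Convex combination:", w * mle(*counts) + (1 - w) * prior_mean)

    # Under the assumed loss, the risk of the MLE does not depend on p.
    for p in (0.1, 0.3, 0.5, 0.9):
        print("p =", p, "risk:", weighted_quadratic_risk_of_mle(p, n))
```

Under these assumptions, the posterior mean equals w * MLE + (1 - w) * prior mean with w = 2n / (alpha + beta + 2n), so the data dominate as the sample size grows. The risk of the MLE under the assumed weighted loss equals 1 / (2n) for every p, which is the constant-risk property that the minimax argument relies on.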

Keywords: allele frequencies, average risk optimality, decision theory