Seminar
Kristin L. Ayers
Regularized Estimation of Haplotype Frequencies
Low haplotype diversity and linkage disequilibrium
are the rule in short genomic segments. This fact suggests that parsimony should be enforced in
estimation of haplotype frequencies. The current paper introduces a diversity penalty that
automatically discards potential haplotypes with low explanatory power. The standard EM algorithm
for haplotype frequency estimation can accommodate the penalty if one moves to a more general MM
(minorize-maximize) scheme for estimation. Our new MM algorithm converges in fewer iterations,
eliminates marginal haplotypes from further consideration, and reduces the computational complexity
of each iteration. Imposition of the diversity penalty also improves haplotyping and genotype
imputation compared to naive application of the EM algorithm and can replace the EM algorithm in
many existing methods. Compared to more sophisticated methods, the MM algorithm is slightly less
accurate but at least an order of magnitude faster in performing these tasks.
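
For orientation, the unpenalized EM baseline mentioned above can be sketched as follows. This is a minimal illustration, not the speaker's implementation. It assumes biallelic SNPs coded as minor-allele counts (0, 1, 2), Hardy-Weinberg proportions, and no missing data. The function names compatible_pairs and em_haplotype_frequencies are hypothetical. The diversity penalty and the MM update are not shown, because the abstract does not state their form.

```python
from itertools import product
from collections import defaultdict


def compatible_pairs(genotype):
    """List the unordered haplotype pairs consistent with an unphased genotype.

    genotype: tuple of minor-allele counts (0, 1, or 2), one per SNP (assumed coding).
    """
    het = [i for i, g in enumerate(genotype) if g == 1]
    base = [1 if g == 2 else 0 for g in genotype]
    pairs = set()
    # Each heterozygous site can be phased two ways; enumerate all phasings.
    for bits in product((0, 1), repeat=len(het)):
        h1, h2 = list(base), list(base)
        for pos, b in zip(het, bits):
            h1[pos], h2[pos] = b, 1 - b
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, n_iter=200, tol=1e-10):
    """Plain EM estimate of haplotype frequencies (no penalty)."""
    pair_lists = [compatible_pairs(g) for g in genotypes]
    haps = sorted({h for pairs in pair_lists for pair in pairs for h in pair})
    freq = {h: 1.0 / len(haps) for h in haps}  # uniform starting values
    for _ in range(n_iter):
        counts = defaultdict(float)
        for pairs in pair_lists:
            # E-step: posterior weight of each pair under Hardy-Weinberg;
            # a pair of distinct haplotypes can arise in two orders.
            weights = [(1 if a == b else 2) * freq[a] * freq[b] for a, b in pairs]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: expected haplotype counts divided by the number of chromosomes.
        new_freq = {h: counts[h] / (2 * len(genotypes)) for h in haps}
        change = max(abs(new_freq[h] - freq[h]) for h in haps)
        freq = new_freq
        if change < tol:
            break
    return freq
```

In this sketch every haplotype compatible with some genotype stays in the candidate set for all iterations, even when its estimated frequency becomes negligible. That is the cost the abstract says the penalized MM algorithm avoids by eliminating marginal haplotypes.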