Haplotype inference is an indispensable technique in medical science, especially in genome-wide association studies. The expectation-maximization (EM) algorithm of Excoffier and Slatkin is a standard approach to inference, but it has not been widely applied because its computational cost grows exponentially with the maximum number of heterozygous loci. We propose a method of haplotype inference that can empirically accommodate up to several tens of single nucleotide polymorphism (SNP) loci in a single haplotype block while maintaining criteria that are exactly equivalent to those of the EM algorithm. The idea is to reduce the computational cost of the EM algorithm with a haplotype-grouping preprocessing step that exploits the symmetrical and inclusive relationships among haplotypes under Hardy-Weinberg equilibrium. Tests on real data sets showed that the proposed method has a wider range of application than the EM algorithm.
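To make the cost argument concrete, the following is a minimal sketch of the conventional EM baseline only, not of the proposed haplotype-grouping method. It assumes biallelic loci, genotypes encoded as tuples of 0/1/2 (copies of the alternate allele per locus), and Hardy-Weinberg equilibrium. The function names `compatible_pairs` and `em_haplotype_frequencies` are hypothetical and chosen for illustration.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """List the unordered haplotype pairs consistent with an unphased genotype.

    Assumed encoding: each entry is 0, 1, or 2 copies of the alternate allele.
    """
    het = [i for i, g in enumerate(genotype) if g == 1]
    base = [g // 2 for g in genotype]  # homozygous loci are fixed: 0 -> 0, 2 -> 1
    if not het:
        h = tuple(base)
        return [(h, h)]
    pairs = []
    # Pin the first heterozygous locus to 0 on h1 so each unordered pair
    # is generated once; the remaining k-1 loci are free.
    for bits in product((0, 1), repeat=len(het) - 1):
        h1, h2 = base[:], base[:]
        for i, b in zip(het, (0,) + bits):
            h1[i], h2[i] = b, 1 - b
        pairs.append((tuple(h1), tuple(h2)))
    return pairs


def em_haplotype_frequencies(genotypes, n_iter=100, tol=1e-9):
    """Estimate haplotype frequencies by EM under Hardy-Weinberg equilibrium."""
    expansions = [compatible_pairs(g) for g in genotypes]
    haps = {h for pairs in expansions for pair in pairs for h in pair}
    freq = {h: 1.0 / len(haps) for h in haps}  # uniform starting point
    n_chrom = 2 * len(genotypes)
    for _ in range(n_iter):
        counts = defaultdict(float)
        for pairs in expansions:
            # E-step: posterior probability of each pair given the genotype.
            weights = [(1 if h1 == h2 else 2) * freq[h1] * freq[h2]
                       for h1, h2 in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: expected haplotype counts divided by the number of chromosomes.
        new_freq = {h: counts[h] / n_chrom for h in haps}
        delta = max(abs(new_freq[h] - freq[h]) for h in haps)
        freq = new_freq
        if delta < tol:
            break
    return freq
```

A genotype with k heterozygous loci expands to 2^(k-1) candidate pairs in `compatible_pairs`, which is the exponential growth that limits the conventional algorithm as the number of heterozygous loci increases.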
- Expectation-maximization (EM) algorithm
- Genome-wide association study
- Haplotype inference
- Haplotype phase
- Hardy-Weinberg equilibrium (HWE)
- Single nucleotide polymorphisms (SNPs)