Modeling a large joint probability table is problematic when its variables have many categories; in such cases, a mixture of simpler probability tables can serve as a good model. Moreover, estimation of such a large probability table frequently suffers from data sparseness. When constructing mixture models from sparse data, EM estimators based on the β-likelihood are expected to be more appropriate than those based on the log-likelihood. Experimental results show that a mixture model estimated by the β-likelihood approximates a large joint probability table from sparse data more accurately than one estimated by the ordinary log-likelihood EM algorithm.
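As a hedged illustration of the setting, the sketch below fits a mixture of simple (rank-one) tables to a sparse count table using the ordinary log-likelihood EM baseline that the β-likelihood estimator is compared against. The mixture form p(x, y) ≈ Σ_k π_k a_k(x) b_k(y), the function name, and all parameter choices are assumptions for the example, not the paper's implementation; the β-likelihood variant would change the objective, not this overall E/M structure.

```python
import numpy as np

rng = np.random.default_rng(0)

def em_mixture_table(counts, K, n_iter=200):
    # Fit p(x, y) ~ sum_k pi_k * a_k(x) * b_k(y) to a count table by EM
    # (maximum log-likelihood; an assumed baseline, not the beta-likelihood).
    X, Y = counts.shape
    pi = np.full(K, 1.0 / K)
    a = rng.dirichlet(np.ones(X), size=K)  # a[k, x]: component-wise table over x
    b = rng.dirichlet(np.ones(Y), size=K)  # b[k, y]: component-wise table over y
    for _ in range(n_iter):
        # E-step: responsibilities r[k, x, y] proportional to pi_k a_k(x) b_k(y)
        joint = pi[:, None, None] * a[:, :, None] * b[:, None, :]
        r = joint / (joint.sum(axis=0) + 1e-12)
        # M-step: re-estimate parameters, weighting cells by observed counts
        w = r * counts[None, :, :]
        Nk = w.sum(axis=(1, 2))
        pi = Nk / Nk.sum()
        a = w.sum(axis=2) / Nk[:, None]
        b = w.sum(axis=1) / Nk[:, None]
    return pi, a, b

# Toy sparse table drawn from a 2-block joint distribution.
true = np.zeros((6, 6))
true[:3, :3] = 1.0
true[3:, 3:] = 1.0
true /= true.sum()
counts = rng.multinomial(300, true.ravel()).reshape(6, 6).astype(float)

pi, a, b = em_mixture_table(counts, K=2)
model = (pi[:, None, None] * a[:, :, None] * b[:, None, :]).sum(axis=0)
```

With a block-structured table, two rank-one components suffice, so the fitted mixture closely matches the empirical distribution; with genuinely sparse data, this log-likelihood fit is the estimator the β-likelihood approach is claimed to improve on.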