TY - GEN
T1 - A study of Bayesian clustering of a document set based on GA
AU - Aoki, Keiko
AU - Matsumoto, Kazunori
AU - Hoashi, Keiichiro
AU - Hashimoto, Kazuo
PY - 1999
Y1 - 1999
N2 - In this paper, we propose a new approximate clustering algorithm that improves the precision of top-down clustering. Top-down clustering was proposed by Iwayama et al. to improve clustering speed: the cluster tree is generated by sampling some documents, building a cluster from them, assigning the remaining documents to the nearest node, and, if the number of documents assigned to a node is large, repeating the sampling and clustering from the top down. To improve the precision of the top-down clustering method, we propose selecting documents by applying a genetic algorithm (GA) to decide a quasi-optimum layer and using a minimum description length (MDL) criterion to evaluate the layer structure of the cluster tree.
AB - In this paper, we propose a new approximate clustering algorithm that improves the precision of top-down clustering. Top-down clustering was proposed by Iwayama et al. to improve clustering speed: the cluster tree is generated by sampling some documents, building a cluster from them, assigning the remaining documents to the nearest node, and, if the number of documents assigned to a node is large, repeating the sampling and clustering from the top down. To improve the precision of the top-down clustering method, we propose selecting documents by applying a genetic algorithm (GA) to decide a quasi-optimum layer and using a minimum description length (MDL) criterion to evaluate the layer structure of the cluster tree.
KW - Bayesian clustering
KW - Document retrieval
KW - Genetic algorithm
KW - Minimum description length criteria
UR - http://www.scopus.com/inward/record.url?scp=84956858012&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84956858012&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84956858012
SN - 3540659072
SN - 9783540659075
VL - 1585
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 260
EP - 267
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
PB - Springer Verlag
T2 - 2nd Asia-Pacific Conference on Simulated Evolution and Learning, SEAL 1998
Y2 - 24 November 1998 through 27 November 1998
ER -