TY - GEN
T1 - Extracting key phrases to disambiguate personal names on the web
AU - Bollegala, Danushka
AU - Matsuo, Yutaka
AU - Ishizuka, Mitsuru
PY - 2006
Y1 - 2006
N2 - When you search for information regarding a particular person on the web, a search engine returns many pages. Some of these pages may be for people with the same name. How can we disambiguate these different people with the same name? This paper presents an unsupervised algorithm that produces key phrases for the different people with the same name. These key phrases could be used to further narrow down the search, leading to more person-specific, unambiguous information. The algorithm we propose does not require any biographical or social information regarding the person. Although there is some previous work in personal name disambiguation on the web, to our knowledge, this is the first attempt to extract key phrases to disambiguate the different persons with the same name. To evaluate our algorithm, we collected and hand-labeled a dataset of over 1000 web pages retrieved from Google using personal name queries. Our experimental results show an improvement over the existing methods for namesake disambiguation.
AB - When you search for information regarding a particular person on the web, a search engine returns many pages. Some of these pages may be for people with the same name. How can we disambiguate these different people with the same name? This paper presents an unsupervised algorithm that produces key phrases for the different people with the same name. These key phrases could be used to further narrow down the search, leading to more person-specific, unambiguous information. The algorithm we propose does not require any biographical or social information regarding the person. Although there is some previous work in personal name disambiguation on the web, to our knowledge, this is the first attempt to extract key phrases to disambiguate the different persons with the same name. To evaluate our algorithm, we collected and hand-labeled a dataset of over 1000 web pages retrieved from Google using personal name queries. Our experimental results show an improvement over the existing methods for namesake disambiguation.
UR - http://www.scopus.com/inward/record.url?scp=33745557469&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33745557469&partnerID=8YFLogxK
U2 - 10.1007/11671299_24
DO - 10.1007/11671299_24
M3 - Conference contribution
AN - SCOPUS:33745557469
SN - 3540322051
SN - 9783540322054
VL - 3878 LNCS
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 223
EP - 234
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
T2 - 7th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2006
Y2 - 19 February 2006 through 25 February 2006
ER -