TY - GEN
T1 - Person retrieval on XML documents by coreference analysis utilizing structural features
AU - Yonei, Yumi
AU - Iwaihara, Mizuho
AU - Yoshikawa, Masatoshi
PY - 2008
Y1 - 2008
N2 - Present-day keyword retrieval exploits the frequencies and positions of search keywords in target documents. For retrieval by two or more keywords, the semantic relation between the keywords is important. To retrieve information about a person, it is common to search by a pair of keywords consisting of the person's name and his/her attribute of interest. By using dependency analysis and coreference analysis, correct occurrences of pairs of a person and his/her attributes can be retrieved. However, existing natural language analysis does not consider the fact that the logical structure of a document strongly influences the probabilistic patterns of coreference. In this paper, we propose a new way of person retrieval by computing a maximum entropy model from linguistic features and structural features, where the structural features are learned from the probability distribution of coreference over XML document structures. Our method can utilize the strong correlation between XML document structures and coreference, thus achieving higher accuracy than existing methods.
AB - Present-day keyword retrieval exploits the frequencies and positions of search keywords in target documents. For retrieval by two or more keywords, the semantic relation between the keywords is important. To retrieve information about a person, it is common to search by a pair of keywords consisting of the person's name and his/her attribute of interest. By using dependency analysis and coreference analysis, correct occurrences of pairs of a person and his/her attributes can be retrieved. However, existing natural language analysis does not consider the fact that the logical structure of a document strongly influences the probabilistic patterns of coreference. In this paper, we propose a new way of person retrieval by computing a maximum entropy model from linguistic features and structural features, where the structural features are learned from the probability distribution of coreference over XML document structures. Our method can utilize the strong correlation between XML document structures and coreference, thus achieving higher accuracy than existing methods.
UR - http://www.scopus.com/inward/record.url?scp=53049089029&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=53049089029&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-85654-2_47
DO - 10.1007/978-3-540-85654-2_47
M3 - Conference contribution
AN - SCOPUS:53049089029
SN - 3540856536
SN - 9783540856535
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 552
EP - 565
BT - Database and Expert Systems Applications - 19th International Conference, DEXA 2008, Proceedings
T2 - 19th International Conference on Database and Expert Systems Applications, DEXA 2008
Y2 - 1 September 2008 through 5 September 2008
ER -