Person retrieval on XML documents by coreference analysis utilizing structural features

Yumi Yonei*, Mizuho Iwaihara, Masatoshi Yoshikawa

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Present-day keyword retrieval exploits the frequencies and positions of search keywords in target documents. For retrieval by two or more keywords, the semantic relation between the keywords is also important. To retrieve information about a person, it is common to search with a pair of keywords consisting of the person's name and an attribute of interest. Dependency analysis and coreference analysis make it possible to retrieve the correct occurrences of such person-attribute pairs. However, existing natural language analysis does not take into account that the logical structure of a document strongly influences the probabilistic patterns of coreference. In this paper, we propose a new method of person retrieval that computes a maximum entropy model from linguistic features and structural features, where the structural features are learned from the probability distribution of coreference over XML document structures. Our method exploits the strong correlation between XML document structure and coreference, and thereby achieves higher accuracy than existing methods.
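
As a rough illustration of the kind of model the abstract describes, the following Python sketch fits a binary maximum entropy classifier (equivalent to logistic regression) that scores whether a mention and a candidate antecedent corefer. It combines one linguistic feature with two structural features. The feature names (`name_string_match`, `same_xml_parent`, `xml_path_distance`), the toy training pairs, and the training settings are all assumptions made for illustration; the abstract does not specify the paper's actual features, data, or training procedure.

```python
import math

# Hypothetical feature layout (not taken from the paper):
#   [bias, name_string_match, same_xml_parent, xml_path_distance]
# name_string_match : linguistic feature, 1 if the surface strings agree
# same_xml_parent   : structural feature, 1 if both mentions share a parent element
# xml_path_distance : structural feature, normalized tree distance in [0, 1]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def coreference_probability(weights, x):
    """Probability that the pair described by feature vector x is coreferent."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


def train_maxent(samples, labels, epochs=500, lr=0.5, l2=0.01):
    """Fit the weights by batch gradient ascent on the L2-regularized log-likelihood."""
    weights = [0.0] * len(samples[0])
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(samples, labels):
            p = coreference_probability(weights, x)
            for j, xi in enumerate(x):
                grad[j] += (y - p) * xi
        weights = [
            w + lr * (g / len(samples) - l2 * w)
            for w, g in zip(weights, grad)
        ]
    return weights


# Made-up training pairs: label 1 = coreferent, 0 = not coreferent.
samples = [
    [1, 1, 1, 0.0],
    [1, 1, 0, 0.5],
    [1, 0, 1, 0.0],
    [1, 0, 0, 1.0],
    [1, 0, 0, 0.5],
    [1, 1, 0, 1.0],
]
labels = [1, 1, 1, 0, 0, 0]

weights = train_maxent(samples, labels)
print("near pair:", coreference_probability(weights, [1, 1, 1, 0.0]))
print("far pair: ", coreference_probability(weights, [1, 0, 0, 1.0]))
```

In this toy setting, a pair that shares an XML parent element and sits close together in the tree receives a higher coreference probability than a distant pair. This is only meant to show how structural evidence can enter the same model as linguistic evidence.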

Original language: English
Title of host publication: Database and Expert Systems Applications - 19th International Conference, DEXA 2008, Proceedings
Pages: 552-565
Number of pages: 14
Publication status: Published - 2008
Externally published: Yes
Event: 19th International Conference on Database and Expert Systems Applications, DEXA 2008 - Turin, Italy
Duration: 2008 Sept 1 to 2008 Sept 5

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5181 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 19th International Conference on Database and Expert Systems Applications, DEXA 2008
Country/Territory: Italy
City: Turin
Period: 08/9/1 to 08/9/5

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
