Multi-view bootstrapping for relation extraction by exploring web features and linguistic features

Yulan Yan, Haibo Li, Yutaka Matsuo, Mitsuru Ishizuka

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

Binary semantic relation extraction from Wikipedia is particularly useful for various NLP and Web applications. Currently, frequent pattern mining-based methods and syntactic analysis-based methods are the two leading types of methods for the semantic relation extraction task. Taking a novel view that integrates syntactic analysis of Wikipedia text with redundancy information from the Web, we propose a multi-view learning approach for bootstrapping relationships between entities that exploits the complementarity between the Web view and the linguistic view. On the one hand, from the linguistic view, linguistic features are generated by linguistic parsing of Wikipedia texts, abstracting away from different surface realizations of semantic relations. On the other hand, Web features are extracted from the Web corpus to provide frequency information for relation extraction. Experimental evaluation on a relational dataset demonstrates that linguistic analysis of Wikipedia texts and collective information from the Web reveal different aspects of the nature of entity-related semantic relationships. It also shows that our multi-view learning method considerably boosts performance compared to learning with only one view of features, with the strengths of one view compensating for the weaknesses of the other.
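The abstract does not spell out the training loop, so the following is only a minimal co-training-style sketch of how two views might bootstrap each other from a few seed pairs. Everything in it is an illustrative assumption rather than the authors' method: the function names, the toy entity pairs and features, the vote-share classifier, and the confidence threshold are all hypothetical.

```python
from collections import Counter, defaultdict

def train_view(labeled, view):
    """Count how often each feature of one view co-occurs with each relation label."""
    counts = defaultdict(Counter)
    for pair, label in labeled.items():
        for feat in view[pair]:
            counts[label][feat] += 1
    return counts

def predict(counts, feats):
    """Return (label, confidence); confidence is the winning label's share of feature votes."""
    votes = {label: sum(c[f] for f in feats) for label, c in counts.items()}
    total = sum(votes.values())
    if total == 0:
        return None, 0.0  # no feature of this instance has been seen yet
    best = max(votes, key=votes.get)
    return best, votes[best] / total

def multi_view_bootstrap(seeds, linguistic, web, rounds=5, threshold=0.75):
    """Grow the labeled set: each round, every view labels the pairs it is confident about."""
    labeled = dict(seeds)
    for _ in range(rounds):
        added = {}
        for view in (linguistic, web):
            model = train_view(labeled, view)
            for pair in view:
                if pair in labeled or pair in added:
                    continue
                label, conf = predict(model, view[pair])
                if label is not None and conf >= threshold:
                    added[pair] = label
        if not added:
            break  # neither view can label anything new
        labeled.update(added)
    return labeled

# Hypothetical toy data: dependency-path features (linguistic view)
# and surface patterns assumed to be frequent on the Web (Web view).
linguistic = {
    ("Google", "YouTube"): {"nsubj<-acquire->dobj"},
    ("Oracle", "Sun"): {"nsubj<-acquire->dobj"},
    ("Facebook", "Instagram"): {"nsubj<-buy->dobj"},
    ("Jobs", "Apple"): {"nsubj<-found->dobj"},
    ("Gates", "Microsoft"): {"appos->founder->prep_of"},
}
web = {
    ("Google", "YouTube"): {"X acquired Y", "X bought Y"},
    ("Oracle", "Sun"): {"X takeover of Y"},
    ("Facebook", "Instagram"): {"X bought Y"},
    ("Jobs", "Apple"): {"X founded Y", "Y founder X"},
    ("Gates", "Microsoft"): {"Y founder X"},
}
seeds = {("Google", "YouTube"): "acquisition", ("Jobs", "Apple"): "founder"}

result = multi_view_bootstrap(seeds, linguistic, web)
linguistic_only = multi_view_bootstrap(seeds, linguistic, linguistic)
web_only = multi_view_bootstrap(seeds, web, web)
```

In this toy setup the linguistic view is the only one that can label the Oracle/Sun pair, while the Web view is the only one that can label the Facebook/Instagram and Gates/Microsoft pairs, so each single-view run leaves pairs unlabeled that the combined run covers.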

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 525-536
Number of pages: 12
Volume: 6008 LNCS
DOI: https://doi.org/10.1007/978-3-642-12116-6_45
Publication status: Published - 2010
Externally published: Yes
Event: 11th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2010 - Iasi
Duration: 2010 Mar 21 to 2010 Mar 27

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6008 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 11th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2010
City: Iasi
Period: 2010 Mar 21 to 2010 Mar 27

Fingerprint

Bootstrapping
Linguistics
Wikipedia
Semantics
Syntactics
Frequent Pattern
Parsing
Web Application
Experimental Evaluation
Redundancy
Complement
Binary
Demonstrate
Text
Learning
Syntax
Relationships

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Yan, Y., Li, H., Matsuo, Y., & Ishizuka, M. (2010). Multi-view bootstrapping for relation extraction by exploring web features and linguistic features. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6008 LNCS, pp. 525-536). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6008 LNCS). https://doi.org/10.1007/978-3-642-12116-6_45

@inproceedings{7f1bc6429bc14cf8add415199afbaf49,
title = "Multi-view bootstrapping for relation extraction by exploring web features and linguistic features",
abstract = "Binary semantic relation extraction from Wikipedia is particularly useful for various NLP and Web applications. Currently, frequent pattern mining-based methods and syntactic analysis-based methods are the two leading types of methods for the semantic relation extraction task. Taking a novel view that integrates syntactic analysis of Wikipedia text with redundancy information from the Web, we propose a multi-view learning approach for bootstrapping relationships between entities that exploits the complementarity between the Web view and the linguistic view. On the one hand, from the linguistic view, linguistic features are generated by linguistic parsing of Wikipedia texts, abstracting away from different surface realizations of semantic relations. On the other hand, Web features are extracted from the Web corpus to provide frequency information for relation extraction. Experimental evaluation on a relational dataset demonstrates that linguistic analysis of Wikipedia texts and collective information from the Web reveal different aspects of the nature of entity-related semantic relationships. It also shows that our multi-view learning method considerably boosts performance compared to learning with only one view of features, with the strengths of one view compensating for the weaknesses of the other.",
author = "Yulan Yan and Haibo Li and Yutaka Matsuo and Mitsuru Ishizuka",
year = "2010",
doi = "10.1007/978-3-642-12116-6_45",
language = "English",
isbn = "3642121152",
volume = "6008 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "525--536",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - Multi-view bootstrapping for relation extraction by exploring web features and linguistic features

AU - Yan, Yulan

AU - Li, Haibo

AU - Matsuo, Yutaka

AU - Ishizuka, Mitsuru

PY - 2010

Y1 - 2010

AB - Binary semantic relation extraction from Wikipedia is particularly useful for various NLP and Web applications. Currently, frequent pattern mining-based methods and syntactic analysis-based methods are the two leading types of methods for the semantic relation extraction task. Taking a novel view that integrates syntactic analysis of Wikipedia text with redundancy information from the Web, we propose a multi-view learning approach for bootstrapping relationships between entities that exploits the complementarity between the Web view and the linguistic view. On the one hand, from the linguistic view, linguistic features are generated by linguistic parsing of Wikipedia texts, abstracting away from different surface realizations of semantic relations. On the other hand, Web features are extracted from the Web corpus to provide frequency information for relation extraction. Experimental evaluation on a relational dataset demonstrates that linguistic analysis of Wikipedia texts and collective information from the Web reveal different aspects of the nature of entity-related semantic relationships. It also shows that our multi-view learning method considerably boosts performance compared to learning with only one view of features, with the strengths of one view compensating for the weaknesses of the other.

UR - http://www.scopus.com/inward/record.url?scp=78049292163&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78049292163&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-12116-6_45

DO - 10.1007/978-3-642-12116-6_45

M3 - Conference contribution

AN - SCOPUS:78049292163

SN - 3642121152

SN - 9783642121159

VL - 6008 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 525

EP - 536

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -