Feature distance-based framework for classification of low-frequency semantic relations

André Kenji Horie, Mitsuru Ishizuka

Research output: Conference contribution

1 Citation (Scopus)

Abstract

In the relation extraction of semantic relations, it is not uncommon to face settings in which the training data provides very few instances of some relation classes. This is mostly due to the high cost of producing such data and to the class imbalance problem, which may result in some classes presenting small frequencies even with a large annotated corpus. This work thus presents a semi-supervised bootstrapped method to expand this initial training dataset, using pattern matching to extract new candidate instances from the Web. The core of this process uses a multiview feature distance-based framework, which allows quantitative and qualitative analysis of intermediate steps of the process. Experimental results show that this framework provides better results in the relation classification task than the baseline, and the bootstrapped architecture improves the relation classification task as a whole for these low-frequency semantic relations settings.
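
The abstract does not specify the features, the distance measure, or the rule for accepting new instances, so the following is only a minimal sketch of the general idea (distance-based classification plus bootstrapped expansion of a small training set), not the authors' implementation. It assumes a single feature view, Euclidean distance to class centroids, and a fixed acceptance threshold; the function names, the toy vectors, and the labels "cause" and "part" are all hypothetical.

```python
from math import sqrt


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def classify(vec, training):
    """Return (label, distance) for the class centroid nearest to vec."""
    return min(
        ((label, distance(vec, centroid(vs))) for label, vs in training.items()),
        key=lambda pair: pair[1],
    )


def bootstrap(training, candidates, threshold, rounds=3):
    """Grow a copy of `training` with candidates close enough to a centroid.

    Assumption: a candidate is accepted when its distance to the nearest
    class centroid is at most `threshold`; rejected candidates are retried
    in the next round, after the centroids have moved.
    """
    training = {label: list(vs) for label, vs in training.items()}
    remaining = list(candidates)
    for _ in range(rounds):
        accepted, rejected = [], []
        for vec in remaining:
            label, dist = classify(vec, training)
            (accepted if dist <= threshold else rejected).append((label, vec))
        if not accepted:
            break
        for label, vec in accepted:
            training[label].append(vec)
        remaining = [vec for _, vec in rejected]
    return training


# Toy data: one seed instance per low-frequency class (hypothetical values).
training = {"cause": [[0.0, 0.0]], "part": [[10.0, 10.0]]}
candidates = [[1.0, 0.0], [9.0, 10.0], [5.0, 5.0], [2.0, 0.0]]
grown = bootstrap(training, candidates, threshold=1.5)
```

In this toy run the ambiguous candidate `[5.0, 5.0]` is never accepted, while `[2.0, 0.0]` is accepted only in the second round, once the "cause" centroid has shifted toward it.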

Original language: English
Title of host publication: Proceedings - 5th IEEE International Conference on Semantic Computing, ICSC 2011
Pages: 59-66
Number of pages: 8
DOI: https://doi.org/10.1109/ICSC.2011.9
Publication status: Published - 2011
Externally published: Yes
Event: 5th Annual IEEE International Conference on Semantic Computing, ICSC 2011 - Palo Alto, CA
Duration: 18 Sep 2011 to 21 Sep 2011

Other

Other: 5th Annual IEEE International Conference on Semantic Computing, ICSC 2011
Palo Alto, CA
Period: 18 Sep 2011 to 21 Sep 2011

Fingerprint

Low Frequency
Semantics
Pattern matching
Costs
Qualitative Analysis
Quantitative Analysis
Expand
Framework
Baseline
Experimental Results
Class
Training

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Theoretical Computer Science

Cite this

Horie, A. K., & Ishizuka, M. (2011). Feature distance-based framework for classification of low-frequency semantic relations. In Proceedings - 5th IEEE International Conference on Semantic Computing, ICSC 2011 (pp. 59-66). [6061437] https://doi.org/10.1109/ICSC.2011.9

Feature distance-based framework for classification of low-frequency semantic relations. / Horie, André Kenji; Ishizuka, Mitsuru.

Proceedings - 5th IEEE International Conference on Semantic Computing, ICSC 2011. 2011. p. 59-66 6061437.


Horie, AK & Ishizuka, M 2011, Feature distance-based framework for classification of low-frequency semantic relations. in Proceedings - 5th IEEE International Conference on Semantic Computing, ICSC 2011., 6061437, pp. 59-66, 5th Annual IEEE International Conference on Semantic Computing, ICSC 2011, Palo Alto, CA, 18 Sep 2011. https://doi.org/10.1109/ICSC.2011.9
Horie AK, Ishizuka M. Feature distance-based framework for classification of low-frequency semantic relations. In Proceedings - 5th IEEE International Conference on Semantic Computing, ICSC 2011. 2011. p. 59-66. 6061437 https://doi.org/10.1109/ICSC.2011.9
Horie, André Kenji ; Ishizuka, Mitsuru. / Feature distance-based framework for classification of low-frequency semantic relations. Proceedings - 5th IEEE International Conference on Semantic Computing, ICSC 2011. 2011. pp. 59-66
@inproceedings{8d89a5e0c1da49f8b73b91f20fce445b,
title = "Feature distance-based framework for classification of low-frequency semantic relations",
abstract = "In the relation extraction of semantic relations, it is not uncommon to face settings in which the training data provides very few instances of some relation classes. This is mostly due to the high cost of producing such data and to the class imbalance problem, which may result in some classes presenting small frequencies even with a large annotated corpus. This work thus presents a semi-supervised bootstrapped method to expand this initial training dataset, using pattern matching to extract new candidate instances from the Web. The core of this process uses a multiview feature distance-based framework, which allows quantitative and qualitative analysis of intermediate steps of the process. Experimental results show that this framework provides better results in the relation classification task than the baseline, and the bootstrapped architecture improves the relation classification task as a whole for these low-frequency semantic relations settings.",
keywords = "Concept description, Natural language text, Semantic computing",
author = "Horie, {Andr{\'e} Kenji} and Mitsuru Ishizuka",
year = "2011",
doi = "10.1109/ICSC.2011.9",
language = "English",
isbn = "9780769544922",
pages = "59--66",
booktitle = "Proceedings - 5th IEEE International Conference on Semantic Computing, ICSC 2011",
}

TY - GEN
T1 - Feature distance-based framework for classification of low-frequency semantic relations
AU - Horie, André Kenji
AU - Ishizuka, Mitsuru
PY - 2011
Y1 - 2011
N2 - In the relation extraction of semantic relations, it is not uncommon to face settings in which the training data provides very few instances of some relation classes. This is mostly due to the high cost of producing such data and to the class imbalance problem, which may result in some classes presenting small frequencies even with a large annotated corpus. This work thus presents a semi-supervised bootstrapped method to expand this initial training dataset, using pattern matching to extract new candidate instances from the Web. The core of this process uses a multiview feature distance-based framework, which allows quantitative and qualitative analysis of intermediate steps of the process. Experimental results show that this framework provides better results in the relation classification task than the baseline, and the bootstrapped architecture improves the relation classification task as a whole for these low-frequency semantic relations settings.

KW - Concept description
KW - Natural language text
KW - Semantic computing
UR - http://www.scopus.com/inward/record.url?scp=81255147433&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=81255147433&partnerID=8YFLogxK
U2 - 10.1109/ICSC.2011.9
DO - 10.1109/ICSC.2011.9
M3 - Conference contribution
AN - SCOPUS:81255147433
SN - 9780769544922
SP - 59
EP - 66
BT - Proceedings - 5th IEEE International Conference on Semantic Computing, ICSC 2011
ER -