Measuring the similarity between implicit semantic relations from the Web

Danushka Bollegala, Yutaka Matsuo, Mitsuru Ishizuka

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

49 Citations (Scopus)

Abstract

Measuring the similarity between semantic relations that hold among entities is an important and necessary step in various Web-related tasks such as relation extraction, information retrieval and analogy detection. For example, consider the case in which a person knows a pair of entities (e.g. Google, YouTube), between which a particular relation holds (e.g. acquisition). The person is interested in retrieving other such pairs with similar relations (e.g. Microsoft, Powerset). Existing keyword-based search engines cannot be applied directly in this case because, in keyword-based search, the goal is to retrieve documents that are relevant to the words used in a query - not necessarily to the relations implied by a pair of words. We propose a relational similarity measure, using a Web search engine, to compute the similarity between semantic relations implied by two pairs of words. Our method has three components: representing the various semantic relations that exist between a pair of words using automatically extracted lexical patterns, clustering the extracted lexical patterns to identify the different patterns that express a particular semantic relation, and measuring the similarity between semantic relations using a metric learning approach. We evaluate the proposed method in two tasks: classifying semantic relations between named entities, and solving word-analogy questions. The proposed method outperforms all baselines in a relation classification task with a statistically significant average precision score of 0.74. Moreover, it reduces the time taken by Latent Relational Analysis to process 374 word-analogy questions from 9 days to less than 6 hours, with an SAT score of 51%. Copyright is held by the International World Wide Web Conference Committee (IW3C2).
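
The first component named in the abstract, representing a word pair by the lexical patterns that connect its two words, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the snippets are made-up toy strings rather than search-engine results, entities are assumed to be single tokens, the function names and the `max_gap` parameter are hypothetical, and plain cosine similarity stands in for the pattern clustering and metric learning steps described in the abstract.

```python
import math
import re
from collections import Counter


def extract_patterns(snippet, first, second, max_gap=4):
    """Return 'X ... Y' patterns linking two single-token entities."""
    tokens = re.findall(r"\w+", snippet.lower())
    first, second = first.lower(), second.lower()
    patterns = []
    for i, tok in enumerate(tokens):
        if tok != first:
            continue
        # Look ahead for the second entity, allowing at most
        # max_gap intervening words between the two entities.
        for j in range(i + 1, min(i + 2 + max_gap, len(tokens))):
            if tokens[j] == second:
                patterns.append(" ".join(["X", *tokens[i + 1:j], "Y"]))
                break
    return patterns


def pattern_vector(snippets, first, second):
    """Count how often each pattern occurs for a word pair."""
    counts = Counter()
    for snippet in snippets:
        counts.update(extract_patterns(snippet, first, second))
    return counts


def cosine(u, v):
    """Cosine similarity between two pattern-frequency vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy snippets (invented for illustration, not retrieved from the Web).
vec_a = pattern_vector(
    ["Google acquired YouTube", "Google bought YouTube"],
    "Google", "YouTube")
vec_b = pattern_vector(
    ["Microsoft acquired Powerset", "Microsoft is buying Powerset"],
    "Microsoft", "Powerset")
vec_c = pattern_vector(
    ["An ostrich is a large bird"], "ostrich", "bird")

print(cosine(vec_a, vec_b))  # pairs share the pattern "X acquired Y"
print(cosine(vec_a, vec_c))  # pairs share no pattern
```

In this sketch the two acquisition pairs score higher than the unrelated pair only because they share an identical surface pattern. Patterns such as "X bought Y" and "X is buying Y" are treated as unrelated, which is the kind of gap the clustering and metric learning components in the abstract are meant to address.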

Original language: English
Title of host publication: WWW'09 - Proceedings of the 18th International World Wide Web Conference
Pages: 651-660
Number of pages: 10
DOIs: https://doi.org/10.1145/1526709.1526797
Publication status: Published - 2009
Externally published: Yes
Event: 18th International World Wide Web Conference, WWW 2009 - Madrid
Duration: 2009 Apr 20 to 2009 Apr 24

Other

Other: 18th International World Wide Web Conference, WWW 2009
City: Madrid
Period: 09/4/20 to 09/4/24

Fingerprint

Semantics
Search engines
Information retrieval
World Wide Web

Keywords

  • Natural language processing
  • Relational similarity
  • Web mining

ASJC Scopus subject areas

  • Computer Networks and Communications

Cite this

Bollegala, D., Matsuo, Y., & Ishizuka, M. (2009). Measuring the similarity between implicit semantic relations from the Web. In WWW'09 - Proceedings of the 18th International World Wide Web Conference (pp. 651-660) https://doi.org/10.1145/1526709.1526797

@inproceedings{65d9e0cbe55d4c808ee1927942830d5e,
title = "Measuring the similarity between implicit semantic relations from the Web",
abstract = "Measuring the similarity between semantic relations that hold among entities is an important and necessary step in various Web-related tasks such as relation extraction, information retrieval and analogy detection. For example, consider the case in which a person knows a pair of entities (e.g. Google, YouTube), between which a particular relation holds (e.g. acquisition). The person is interested in retrieving other such pairs with similar relations (e.g. Microsoft, Powerset). Existing keyword-based search engines cannot be applied directly in this case because, in keyword-based search, the goal is to retrieve documents that are relevant to the words used in a query - not necessarily to the relations implied by a pair of words. We propose a relational similarity measure, using a Web search engine, to compute the similarity between semantic relations implied by two pairs of words. Our method has three components: representing the various semantic relations that exist between a pair of words using automatically extracted lexical patterns, clustering the extracted lexical patterns to identify the different patterns that express a particular semantic relation, and measuring the similarity between semantic relations using a metric learning approach. We evaluate the proposed method in two tasks: classifying semantic relations between named entities, and solving word-analogy questions. The proposed method outperforms all baselines in a relation classification task with a statistically significant average precision score of 0.74. Moreover, it reduces the time taken by Latent Relational Analysis to process 374 word-analogy questions from 9 days to less than 6 hours, with an SAT score of 51{\%}. Copyright is held by the International World Wide Web Conference Committee (IW3C2).",
keywords = "Natural language processing, Relational similarity, Web mining",
author = "Danushka Bollegala and Yutaka Matsuo and Mitsuru Ishizuka",
year = "2009",
doi = "10.1145/1526709.1526797",
language = "English",
isbn = "9781605584874",
pages = "651--660",
booktitle = "WWW'09 - Proceedings of the 18th International World Wide Web Conference",

}

TY - GEN

T1 - Measuring the similarity between implicit semantic relations from the Web

AU - Bollegala, Danushka

AU - Matsuo, Yutaka

AU - Ishizuka, Mitsuru

PY - 2009

Y1 - 2009

AB - Measuring the similarity between semantic relations that hold among entities is an important and necessary step in various Web-related tasks such as relation extraction, information retrieval and analogy detection. For example, consider the case in which a person knows a pair of entities (e.g. Google, YouTube), between which a particular relation holds (e.g. acquisition). The person is interested in retrieving other such pairs with similar relations (e.g. Microsoft, Powerset). Existing keyword-based search engines cannot be applied directly in this case because, in keyword-based search, the goal is to retrieve documents that are relevant to the words used in a query - not necessarily to the relations implied by a pair of words. We propose a relational similarity measure, using a Web search engine, to compute the similarity between semantic relations implied by two pairs of words. Our method has three components: representing the various semantic relations that exist between a pair of words using automatically extracted lexical patterns, clustering the extracted lexical patterns to identify the different patterns that express a particular semantic relation, and measuring the similarity between semantic relations using a metric learning approach. We evaluate the proposed method in two tasks: classifying semantic relations between named entities, and solving word-analogy questions. The proposed method outperforms all baselines in a relation classification task with a statistically significant average precision score of 0.74. Moreover, it reduces the time taken by Latent Relational Analysis to process 374 word-analogy questions from 9 days to less than 6 hours, with an SAT score of 51%. Copyright is held by the International World Wide Web Conference Committee (IW3C2).

KW - Natural language processing

KW - Relational similarity

KW - Web mining

UR - http://www.scopus.com/inward/record.url?scp=77954630948&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77954630948&partnerID=8YFLogxK

U2 - 10.1145/1526709.1526797

DO - 10.1145/1526709.1526797

M3 - Conference contribution

AN - SCOPUS:77954630948

SN - 9781605584874

SP - 651

EP - 660

BT - WWW'09 - Proceedings of the 18th International World Wide Web Conference

ER -