Relational duality: Unsupervised extraction of semantic relations between entities on the web

Danushka Tarupathi Bollegala, Yutaka Matsuo, Mitsuru Ishizuka

Research output: Chapter in Book/Report/Conference proceedingConference contribution

69 Citations (Scopus)

Abstract

Extracting semantic relations among entities is an important first step in various tasks in Web mining and natural language processing such as information extraction, relation detection, and social network mining. A relation can be expressed extensionally by stating all the instances of that relation or intensionally by defining all the paraphrases of that relation. For example, consider the ACQUISITION relation between two companies. An extensional definition of ACQUISITION contains all pairs of companies in which one company is acquired by another (e.g. (YouTube, Google) or (Powerset, Microsoft)). On the other hand we can intensionally define ACQUISITION as the relation described by lexical patterns such as X is acquired by Y, or Y purchased X, where X and Y denote two companies. We use this dual representation of semantic relations to propose a novel sequential co-clustering algorithm that can extract numerous relations efficiently from unlabeled data. We provide an efficient heuristic to find the parameters of the proposed coclustering algorithm. Using the clusters produced by the algorithm, we train an L1 regularized logistic regression model to identify the representative patterns that describe the relation expressed by each cluster. We evaluate the proposed method in three different tasks: measuring relational similarity between entity pairs, open information extraction (Open IE), and classifying relations in a social network system. Experiments conducted using a benchmark dataset show that the proposed method improves existing relational similarity measures. Moreover, the proposed method significantly outperforms the current state-of-the-art Open IE systems in terms of both precision and recall. The proposed method correctly classifies 53 relation types in an online social network containing 470; 671 nodes and 35; 652; 475 edges, thereby demonstrating its efficacy in real-world relation detection tasks.

Original languageEnglish
Title of host publicationProceedings of the 19th International Conference on World Wide Web, WWW '10
Pages151-160
Number of pages10
DOIs
Publication statusPublished - 2010
Externally publishedYes
Event19th International World Wide Web Conference, WWW2010 - Raleigh, NC
Duration: 2010 Apr 262010 Apr 30

Other

Other19th International World Wide Web Conference, WWW2010
CityRaleigh, NC
Period10/4/2610/4/30

Keywords

  • relation extraction
  • relational duality
  • relational similarity
  • web mining

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications

Fingerprint

Dive into the research topics of 'Relational duality: Unsupervised extraction of semantic relations between entities on the web'. Together they form a unique fingerprint.

Cite this