Measuring the degree of synonymy between words using relational similarity between word pairs as a proxy

Danushka Bollegala, Yutaka Matsuo, Mitsuru Ishizuka

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

Two types of similarities between words have been studied in the natural language processing community: synonymy and relational similarity. A high degree of similarity exists between synonymous words. On the other hand, a high degree of relational similarity exists between analogous word pairs. We present and empirically test a hypothesis that links these two types of similarities. Specifically, we propose a method to measure the degree of synonymy between two words using relational similarity between word pairs as a proxy. Given two words, we first represent the semantic relations that hold between those words using lexical patterns. We use a sequential pattern clustering algorithm to identify different lexical patterns that represent the same semantic relation. Second, we compute the degree of synonymy between two words using an inter-cluster covariance matrix. We compare the proposed method for measuring the degree of synonymy against previously proposed methods on the Miller-Charles dataset and the WordSimilarity-353 dataset. Our proposed method outperforms all existing Web-based similarity measures, achieving a statistically significant Pearson correlation coefficient of 0.867 on the Miller-Charles dataset.
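The evaluation described in the abstract compares a similarity measure's scores with human ratings using the Pearson correlation coefficient. The following is a minimal sketch of that comparison, not the authors' code: the function name `pearson` and both lists of numbers are hypothetical placeholders chosen for illustration, and none of the values come from the Miller-Charles dataset or from the paper.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder values (NOT taken from any benchmark or from the paper):
# one human rating and one system score per word pair, in the same order.
human_ratings = [3.9, 3.1, 1.2, 0.4]
system_scores = [0.95, 0.70, 0.30, 0.05]

r = pearson(human_ratings, system_scores)
print(f"Pearson r = {r:.3f}")
```

A coefficient close to 1 means the measure ranks and spaces word pairs much as the human raters do, which is the sense in which the abstract reports 0.867.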

Original language: English
Pages (from-to): 2116-2123
Number of pages: 8
Journal: IEICE Transactions on Information and Systems
Volume: E95-D
Issue number: 8
DOI: 10.1587/transinf.E95.D.2116
Publication status: Published - 2012 Aug
Externally published: Yes

Fingerprint

Semantics
Covariance matrix
Clustering algorithms
Processing

Keywords

  • Attributional similarity
  • Miller-Charles dataset
  • Relational similarity
  • Synonymy
  • WordSimilarity-353 dataset

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Software
  • Artificial Intelligence
  • Hardware and Architecture
  • Computer Vision and Pattern Recognition

Cite this

Measuring the degree of synonymy between words using relational similarity between word pairs as a proxy. / Bollegala, Danushka; Matsuo, Yutaka; Ishizuka, Mitsuru.

In: IEICE Transactions on Information and Systems, Vol. E95-D, No. 8, 08.2012, p. 2116-2123.

Bollegala, Danushka ; Matsuo, Yutaka ; Ishizuka, Mitsuru. / Measuring the degree of synonymy between words using relational similarity between word pairs as a proxy. In: IEICE Transactions on Information and Systems. 2012 ; Vol. E95-D, No. 8. pp. 2116-2123.
@article{bfeb2f04fd0d4c618f8d56d11400b317,
title = "Measuring the degree of synonymy between words using relational similarity between word pairs as a proxy",
abstract = "Two types of similarities between words have been studied in the natural language processing community: synonymy and relational similarity. A high degree of similarity exists between synonymous words. On the other hand, a high degree of relational similarity exists between analogous word pairs. We present and empirically test a hypothesis that links these two types of similarities. Specifically, we propose a method to measure the degree of synonymy between two words using relational similarity between word pairs as a proxy. Given two words, we first represent the semantic relations that hold between those words using lexical patterns. We use a sequential pattern clustering algorithm to identify different lexical patterns that represent the same semantic relation. Second, we compute the degree of synonymy between two words using an inter-cluster covariance matrix. We compare the proposed method for measuring the degree of synonymy against previously proposed methods on the Miller-Charles dataset and the WordSimilarity-353 dataset. Our proposed method outperforms all existing Web-based similarity measures, achieving a statistically significant Pearson correlation coefficient of 0.867 on the Miller-Charles dataset.",
keywords = "Attributional similarity, Miller-Charles dataset, Relational similarity, Synonymy, WordSimilarity-353 dataset",
author = "Danushka Bollegala and Yutaka Matsuo and Mitsuru Ishizuka",
year = "2012",
month = "8",
doi = "10.1587/transinf.E95.D.2116",
language = "English",
volume = "E95-D",
pages = "2116--2123",
journal = "IEICE Transactions on Information and Systems",
issn = "0916-8532",
publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",
number = "8",

}

TY - JOUR

T1 - Measuring the degree of synonymy between words using relational similarity between word pairs as a proxy

AU - Bollegala, Danushka

AU - Matsuo, Yutaka

AU - Ishizuka, Mitsuru

PY - 2012/8

Y1 - 2012/8

AB - Two types of similarities between words have been studied in the natural language processing community: synonymy and relational similarity. A high degree of similarity exists between synonymous words. On the other hand, a high degree of relational similarity exists between analogous word pairs. We present and empirically test a hypothesis that links these two types of similarities. Specifically, we propose a method to measure the degree of synonymy between two words using relational similarity between word pairs as a proxy. Given two words, we first represent the semantic relations that hold between those words using lexical patterns. We use a sequential pattern clustering algorithm to identify different lexical patterns that represent the same semantic relation. Second, we compute the degree of synonymy between two words using an inter-cluster covariance matrix. We compare the proposed method for measuring the degree of synonymy against previously proposed methods on the Miller-Charles dataset and the WordSimilarity-353 dataset. Our proposed method outperforms all existing Web-based similarity measures, achieving a statistically significant Pearson correlation coefficient of 0.867 on the Miller-Charles dataset.

KW - Attributional similarity

KW - Miller-Charles dataset

KW - Relational similarity

KW - Synonymy

KW - WordSimilarity-353 dataset

UR - http://www.scopus.com/inward/record.url?scp=84864767319&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84864767319&partnerID=8YFLogxK

U2 - 10.1587/transinf.E95.D.2116

DO - 10.1587/transinf.E95.D.2116

M3 - Article

AN - SCOPUS:84864767319

VL - E95-D

SP - 2116

EP - 2123

JO - IEICE Transactions on Information and Systems

JF - IEICE Transactions on Information and Systems

SN - 0916-8532

IS - 8

ER -