A new minimally supervised learning method for semantic term classification - experimental results on classifying ratable aspects discussed in customer reviews

Thao Pham Thanh Nguyen, Takahiro Hayashi, Rikio Onai, Yuhei Nishioka, Takamasa Takenaka, Masaya Mori

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

We present Bautext, a new minimally supervised approach for automatically extracting ratable aspects from customer reviews and classifying them into predefined categories. Bautext requires only a small number of seed words as supervised data and uses a bootstrapping mechanism to progressively collect new members for each category. Its distinguishing feature is that it learns new category members and the category-specific terms for each category at the same time. Category-specific terms are terms that play an important role in correctly extracting new category members. Furthermore, we propose an additional Trash category to filter out aspects that fall outside the target categories, which leads to a significant improvement in precision, possibly at the cost of lower recall. Experimental results on a Japanese hotel review dataset show that Bautext outperforms the alternative techniques in precision and recall, and significantly in running time. In a further comparison with AdaBoost (taken as the state-of-the-art machine learning technique for the semantic term classification task), we found that AdaBoost requires about 50% training data to deliver performance similar to what Bautext achieves with fewer than ten selected seed words per category.
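The abstract does not specify the algorithm's details. As a rough illustration only, the Python sketch below shows one generic way a seed-based bootstrapping loop with a Trash category could be organized; it is not the published Bautext method. The function name `bootstrap`, the co-occurrence counting used for scoring, the parameters `rounds`, `top_contexts` and `per_round`, and the toy English data are all assumptions made for this example.

```python
from collections import defaultdict

def bootstrap(pairs, seeds, trash="Trash", rounds=5, top_contexts=2, per_round=1):
    """Generic seed-based bootstrapping sketch (assumed design, not Bautext itself).

    pairs: iterable of (candidate_term, context_word) co-occurrences.
    seeds: {category: set of seed terms}; may include a trash category.
    """
    contexts_of = defaultdict(set)  # term -> context words it occurs with
    terms_of = defaultdict(set)     # context word -> terms it occurs with
    for term, ctx in pairs:
        contexts_of[term].add(ctx)
        terms_of[ctx].add(term)

    members = {cat: set(s) for cat, s in seeds.items()}
    specific = {cat: set() for cat in seeds}

    for _ in range(rounds):
        labeled = set().union(*members.values())
        # Step 1: per category, keep the contexts shared by the most members.
        for cat, mem in members.items():
            counts = {c: len(ts & mem) for c, ts in terms_of.items() if ts & mem}
            ranked = sorted(counts, key=lambda c: (-counts[c], c))
            specific[cat] = set(ranked[:top_contexts])
        # Step 2: score unlabeled terms against each category's contexts.
        proposals = []
        for term in sorted(contexts_of.keys() - labeled):
            scores = {cat: len(contexts_of[term] & specific[cat]) for cat in members}
            best = max(sorted(scores), key=scores.get)
            if scores[best] > 0:
                proposals.append((scores[best], term, best))
        if not proposals:
            break
        # Step 3: each category absorbs its best-scoring candidates.
        for cat in members:
            picks = sorted((-s, t) for s, t, c in proposals if c == cat)[:per_round]
            members[cat].update(t for _, t in picks)

    # Terms absorbed by the trash category are filtered out of the result.
    kept = {cat: mem for cat, mem in members.items() if cat != trash}
    return kept, specific

# Toy data (hypothetical): (aspect term, co-occurring word) pairs.
PAIRS = [
    ("room", "clean"), ("room", "spacious"), ("bed", "clean"),
    ("bed", "comfortable"), ("bathroom", "clean"), ("bathroom", "spacious"),
    ("breakfast", "delicious"), ("dinner", "delicious"), ("dinner", "tasty"),
    ("buffet", "tasty"), ("buffet", "delicious"),
    ("weather", "rainy"), ("trip", "rainy"), ("trip", "long"),
]
SEEDS = {"Room": {"room"}, "Food": {"breakfast"}, "Trash": {"weather"}}

result, specific = bootstrap(PAIRS, SEEDS)
```

In this toy run, the off-target term "trip" is drawn into the Trash category through its shared context with the seed "weather" and therefore never appears in the returned categories, which is the kind of precision-oriented filtering the abstract attributes to the Trash category.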

Original language: English
Title of host publication: ICDM Workshops 2009 - IEEE International Conference on Data Mining
Pages: 43-50
Number of pages: 8
Publication status: Published - 2009
Externally published: Yes
Event: 2009 IEEE International Conference on Data Mining Workshops, ICDMW 2009 - Miami, FL, United States
Duration: 2009 Dec 6 - 2009 Dec 6

Other

Other: 2009 IEEE International Conference on Data Mining Workshops, ICDMW 2009
Country: United States
City: Miami, FL
Period: 09/12/6 - 09/12/6

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Vision and Pattern Recognition
  • Software
