A new minimally supervised learning method for semantic term classification-experimental results on classifying ratable aspects discussed in customer reviews

Thao Pham Thanh Nguyen, Takahiro Hayashi, Rikio Onai, Yuhei Nishioka, Takamasa Takenaka, Masaya Mori

Research output: Conference contribution

1 Citation (Scopus)

Abstract

We present Bautext, a new minimally supervised approach for automatically extracting ratable aspects from customer reviews and classifying them into previously defined categories. Bautext requires only a small number of seed words as supervised data and uses a bootstrapping mechanism to progressively collect new members for each category. Its distinguishing feature is that it learns new category members and the category-specific terms for each category at the same time. Category-specific terms are terms that play an important role in correctly extracting new category members. Furthermore, we propose an additional Trash category to filter out non-target aspects, which led to a significant improvement in precision, at the possible cost of lower recall. Experimental results on a Japanese hotel review dataset showed that Bautext outperforms the alternative techniques in precision and recall, and significantly in running time. In a further comparison with Adaboost (the state-of-the-art machine learning technique for the semantic term classification task), we found that Adaboost requires about 50% training data to deliver performance similar to what Bautext achieves with fewer than ten selected seed words per category.
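
The abstract does not give Bautext's scoring functions or stopping rule, so the following is only a minimal sketch of a generic seed-based bootstrapping loop with a Trash category, not the authors' algorithm. The function name `bootstrap`, the co-occurrence input format, the count-based scoring, and the toy English data are all assumptions made for illustration.

```python
from collections import defaultdict


def bootstrap(pairs, seeds, rounds=3, top_terms=2):
    """Generic bootstrapping sketch (hypothetical; not the paper's algorithm).

    pairs: iterable of (candidate_aspect, context_term) co-occurrences.
    seeds: dict mapping category name -> set of seed words.
    Returns a dict mapping category name -> set of learned members.
    """
    contexts = defaultdict(set)  # candidate -> context terms seen with it
    for cand, term in pairs:
        contexts[cand].add(term)

    members = {cat: set(words) for cat, words in seeds.items()}
    for _ in range(rounds):
        # Step 1: choose the context terms most associated with each category
        # (a stand-in for "category-specific terms"; simple counts assumed).
        specific = {}
        for cat, mem in members.items():
            counts = defaultdict(int)
            for m in mem:
                for t in contexts.get(m, ()):
                    counts[t] += 1
            ranked = sorted(counts, key=lambda t: (-counts[t], t))
            specific[cat] = set(ranked[:top_terms])

        # Step 2: assign each unlabeled candidate to its best-matching category.
        labeled = set().union(*members.values())
        added = False
        for cand in sorted(contexts):
            if cand in labeled:
                continue
            scores = {cat: len(contexts[cand] & terms)
                      for cat, terms in specific.items()}
            best = max(sorted(scores), key=scores.get)
            if scores[best] > 0:
                members[best].add(cand)
                added = True
        if not added:  # nothing new was learned, so stop early
            break
    return members


# Toy data (invented for illustration only).
pairs = [
    ("room", "clean"), ("room", "spacious"),
    ("bed", "clean"), ("bed", "spacious"),
    ("breakfast", "delicious"), ("breakfast", "buffet"),
    ("dinner", "delicious"), ("dinner", "buffet"),
    ("yesterday", "arrived"), ("today", "arrived"),
]
seeds = {"Room": {"room"}, "Meal": {"breakfast"}, "Trash": {"yesterday"}}

learned = bootstrap(pairs, seeds)
# The Trash category absorbs non-target words and is dropped from the output.
aspects = {cat: words for cat, words in learned.items() if cat != "Trash"}
```

In this sketch the Trash category competes with the real categories for candidates, so words that resemble the Trash seeds never reach the final output; a real candidate that happens to resemble them would be lost as well, which is the precision and recall trade-off the abstract mentions.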

Original language: English
Title of host publication: ICDM Workshops 2009 - IEEE International Conference on Data Mining
Pages: 43-50
Number of pages: 8
DOI
Publication status: Published - 2009
Externally published: Yes
Event: 2009 IEEE International Conference on Data Mining Workshops, ICDMW 2009 - Miami, FL, United States
Duration: 2009 Dec 6 to 2009 Dec 6

Other

Other: 2009 IEEE International Conference on Data Mining Workshops, ICDMW 2009
Country/Territory: United States
City: Miami, FL
Period: 09/12/6 to 09/12/6

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Vision and Pattern Recognition
  • Software
