Text Mining using PrefixSpan constrained by Item Interval and Item Attribute

Issei Sato, Yu Hirate, Hayato Yamana

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Applying conventional sequential pattern mining methods to text data extracts many uninteresting patterns, which increases the time needed to interpret the extracted patterns. To solve this problem, we propose a new sequential pattern mining algorithm by adopting the following two constraints. One is to select sequences with regard to item intervals (the number of items between any two adjacent items in a sequence), and the other is to select sequences with regard to item attributes. Using Amazon customer reviews in the book category, we have confirmed that our method is able to extract patterns faster than the conventional method, and is better able to exclude uninteresting patterns while retaining the patterns of interest.
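
The two constraints named in the abstract can be illustrated with a small prefix-growth miner. This is only a sketch under stated assumptions, not the authors' implementation: the function name `mine`, the parameters `max_gap` and `keep`, and the toy reviews are hypothetical. It assumes each sequence element is a single word, that the item-interval constraint is an upper bound on the number of items between adjacent pattern items, that the attribute constraint is a per-item predicate (a stopword filter stands in for it here), and that filtered-out items still count toward the interval. Because a gap limit makes the first occurrence of a prefix insufficient, the sketch tracks every end position of a prefix instead of the single projection point used by plain PrefixSpan.

```python
from collections import defaultdict


def mine(sequences, min_support, max_gap, keep=lambda item: True):
    """Return {pattern (tuple of items): support (number of sequences)}.

    max_gap: largest allowed number of items between two adjacent
             pattern items (assumed form of the item-interval constraint).
    keep:    predicate deciding whether an item may appear in a pattern
             (assumed form of the item-attribute constraint).
    """
    results = {}

    def grow(prefix, occurrences):
        # occurrences maps sequence id -> set of positions where an
        # occurrence of `prefix` ends in that sequence.
        extensions = defaultdict(lambda: defaultdict(set))
        for sid, ends in occurrences.items():
            seq = sequences[sid]
            for end in ends:
                # Positions end+1 .. end+1+max_gap leave at most
                # max_gap items between the two pattern items.
                stop = min(len(seq), end + max_gap + 2)
                for pos in range(end + 1, stop):
                    if keep(seq[pos]):
                        extensions[seq[pos]][sid].add(pos)
        for item, occ in extensions.items():
            if len(occ) >= min_support:
                pattern = prefix + (item,)
                results[pattern] = len(occ)
                grow(pattern, occ)

    # Length-1 patterns: every position of every kept item.
    first = defaultdict(lambda: defaultdict(set))
    for sid, seq in enumerate(sequences):
        for pos, item in enumerate(seq):
            if keep(item):
                first[item][sid].add(pos)
    for item, occ in first.items():
        if len(occ) >= min_support:
            results[(item,)] = len(occ)
            grow((item,), occ)
    return results


# Toy data (hypothetical, not taken from the paper's review corpus).
reviews = [
    "this book is very good".split(),
    "a very good book".split(),
    "very long but good story".split(),
]
stopwords = {"this", "is", "a", "but"}


def not_stopword(word):
    return word not in stopwords


adjacent_only = mine(reviews, min_support=2, max_gap=0, keep=not_stopword)
looser = mine(reviews, min_support=2, max_gap=2, keep=not_stopword)
print(adjacent_only)
print(looser)
```

Tightening `max_gap` shrinks the candidate positions examined at each extension step, which is one plausible reason a constrained miner can run faster and return fewer loosely related patterns; the actual constraint definitions and speedups are those reported in the paper.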

Original language: English
Title of host publication: ICDEW 2006 - Proceedings of the 22nd International Conference on Data Engineering Workshops
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 0769525717, 9780769525716
DOIs: https://doi.org/10.1109/ICDEW.2006.142
Publication status: Published - 2006
Externally published: Yes
Event: 22nd International Conference on Data Engineering Workshops, ICDEW 2006 - Atlanta, United States
Duration: 2006 Apr 3 → 2006 Apr 7

Other

Other: 22nd International Conference on Data Engineering Workshops, ICDEW 2006
Country: United States
City: Atlanta
Period: 06/4/3 → 06/4/7

ASJC Scopus subject areas

  • Information Systems
  • Computer Networks and Communications
  • Information Systems and Management

Cite this

Sato, I., Hirate, Y., & Yamana, H. (2006). Text Mining using PrefixSpan constrained by Item Interval and Item Attribute. In ICDEW 2006 - Proceedings of the 22nd International Conference on Data Engineering Workshops [1623913]. Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICDEW.2006.142