Statistical evaluation of measure and distance on document classification problems in text mining

Masayuki Goto, Takashi Ishida, Shigeichi Hirasawa

Research output: Conference contribution

4 Citations (Scopus)

Abstract

This paper discusses document classification problems in text mining from the viewpoint of asymptotic statistical analysis. By formulating a statistical hypothesis test specified as a text mining problem, some interesting properties can be made visible. In text mining, several heuristics are applied in practical analysis because they have proved experimentally effective in many case studies. A theoretical explanation of the performance of text mining techniques is therefore required, and this approach gives a clear picture. Distance measures in word vector space are used to classify documents. In this paper, the performance of such distance measures is also analyzed from the new viewpoint of asymptotic analysis.

Original language: English
Title of host publication: CIT 2007
Subtitle of host publication: 7th IEEE International Conference on Computer and Information Technology
Pages: 674-679
Number of pages: 6
DOI: https://doi.org/10.1109/CIT.2007.4385162
Publication status: Published - 2007-12-01
Externally published: Yes
Event: CIT 2007: 7th IEEE International Conference on Computer and Information Technology - Aizu-Wakamatsu, Fukushima, Japan
Duration: 2007-10-16 to 2007-10-19

Publication series

Name: CIT 2007: 7th IEEE International Conference on Computer and Information Technology

Conference

Conference: CIT 2007: 7th IEEE International Conference on Computer and Information Technology
Japan
Aizu-Wakamatsu, Fukushima
Period: 2007-10-16 to 2007-10-19

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
  • Software
  • Mathematics(all)


  • Cite this

    Goto, M., Ishida, T., & Hirasawa, S. (2007). Statistical evaluation of measure and distance on document classification problems in text mining. In CIT 2007: 7th IEEE International Conference on Computer and Information Technology (pp. 674-679). [4385162] (CIT 2007: 7th IEEE International Conference on Computer and Information Technology). https://doi.org/10.1109/CIT.2007.4385162