Utilizing latent posting style for authorship attribution on short texts

Patamawadee Leepaisomboon, Mizuho Iwaihara

Research output: Conference contribution

Abstract

Character n-grams and word n-grams are the most widely used features for authorship attribution on short texts. In this paper, we propose a new method that exploits latent posting styles estimated from authors' short texts. The new posting style features characterize each user's posting style through sentiment orientation and post length. Concise hidden posting styles are captured by Latent Dirichlet Allocation (LDA), for which we consider two types of LDA models. The vectors of latent posting styles are then concatenated with averaged word embeddings of character n-grams and word n-grams, and the combined vectors are used to train a support vector machine. Our results show that combining latent posting styles with the traditional features can improve the accuracy of authorship attribution by up to 5.2%.
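As a rough illustration of the feature construction described in the abstract, the following sketch turns each post into a discrete posting-style token built from sentiment orientation and post length, then concatenates a per-author style vector with a placeholder n-gram vector. The abstract does not say how the styles are encoded, so the toy lexicon, the length thresholds, and all function names here are assumptions. A plain normalized histogram stands in for the LDA topic proportions, and the support vector machine step is omitted.

```python
from collections import Counter

# Hypothetical toy lexicon; a real system would use a proper sentiment analyzer.
POSITIVE = {"good", "great", "love", "happy"}
NEGATIVE = {"bad", "awful", "hate", "sad"}


def sentiment_orientation(post):
    """Label a post as pos, neg, or neu by counting lexicon hits."""
    words = [w.strip(".,!?").lower() for w in post.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return "neu"


def length_bin(post, short_max=5, medium_max=15):
    """Bucket a post by word count; the thresholds are illustrative assumptions."""
    n = len(post.split())
    if n <= short_max:
        return "short"
    if n <= medium_max:
        return "medium"
    return "long"


def style_token(post):
    """Combine sentiment orientation and length bin into one discrete token."""
    return f"{sentiment_orientation(post)}_{length_bin(post)}"


# Fixed vocabulary of the 3 x 3 possible style tokens.
STYLE_VOCAB = [f"{s}_{b}" for s in ("pos", "neg", "neu")
               for b in ("short", "medium", "long")]


def style_histogram(posts, vocab=STYLE_VOCAB):
    """Normalized style-token counts over one author's posts.

    This histogram is a simple stand-in for LDA topic proportions.
    """
    counts = Counter(style_token(p) for p in posts)
    total = sum(counts.values()) or 1
    return [counts[t] / total for t in vocab]


posts = [
    "I love this!",
    "Awful service, I hate waiting in line for hours every single day.",
    "Meeting at noon.",
]
style_vec = style_histogram(posts)
ngram_vec = [0.1, 0.2, 0.3]  # placeholder for averaged n-gram embeddings
features = ngram_vec + style_vec  # concatenated vector a classifier would receive
print(style_token(posts[0]), len(features))
```

In a fuller pipeline, the style tokens of each author would form the "documents" given to an LDA model, and the inferred topic proportions would replace the histogram before concatenation.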

Original language: English
Title of host publication: Proceedings - IEEE 17th International Conference on Dependable, Autonomic and Secure Computing, IEEE 17th International Conference on Pervasive Intelligence and Computing, IEEE 5th International Conference on Cloud and Big Data Computing, 4th Cyber Science and Technology Congress, DASC-PiCom-CBDCom-CyberSciTech 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1015-1022
Number of pages: 8
ISBN (electronic): 9781728130248
DOI: https://doi.org/10.1109/DASC/PiCom/CBDCom/CyberSciTech.2019.00184
Publication status: Published - Aug 2019
Event: 17th IEEE International Conference on Dependable, Autonomic and Secure Computing, IEEE 17th International Conference on Pervasive Intelligence and Computing, IEEE 5th International Conference on Cloud and Big Data Computing, 4th Cyber Science and Technology Congress, DASC-PiCom-CBDCom-CyberSciTech 2019 - Fukuoka, Japan
Duration: 5 Aug 2019 to 8 Aug 2019

Publication series

Name: Proceedings - IEEE 17th International Conference on Dependable, Autonomic and Secure Computing, IEEE 17th International Conference on Pervasive Intelligence and Computing, IEEE 5th International Conference on Cloud and Big Data Computing, 4th Cyber Science and Technology Congress, DASC-PiCom-CBDCom-CyberSciTech 2019

Conference

Conference: 17th IEEE International Conference on Dependable, Autonomic and Secure Computing, IEEE 17th International Conference on Pervasive Intelligence and Computing, IEEE 5th International Conference on Cloud and Big Data Computing, 4th Cyber Science and Technology Congress, DASC-PiCom-CBDCom-CyberSciTech 2019
Country: Japan
City: Fukuoka
Period: 5 Aug 2019 to 8 Aug 2019

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems and Management
  • Information Systems
  • Safety, Risk, Reliability and Quality
  • Control and Optimization


  • Cite this

    Leepaisomboon, P., & Iwaihara, M. (2019). Utilizing latent posting style for authorship attribution on short texts. In Proceedings - IEEE 17th International Conference on Dependable, Autonomic and Secure Computing, IEEE 17th International Conference on Pervasive Intelligence and Computing, IEEE 5th International Conference on Cloud and Big Data Computing, 4th Cyber Science and Technology Congress, DASC-PiCom-CBDCom-CyberSciTech 2019 (pp. 1015-1022). [8890501] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/DASC/PiCom/CBDCom/CyberSciTech.2019.00184