Utilizing latent posting style for authorship attribution on short texts

Patamawadee Leepaisomboon, Mizuho Iwaihara

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Character n-grams and word n-grams are the most widely used features for authorship attribution on short texts. In this paper, we propose a new method that exploits latent posting styles estimated from authors' short texts. The new posting style features characterize each user's posting style through sentiment orientation and post length. Concise hidden posting styles are captured by Latent Dirichlet Allocation (LDA), for which we consider two types of LDA models. The vectors of latent posting styles are then concatenated with averaged word embeddings of character n-grams and word n-grams, and the combined vectors are used to train a support vector machine. Our results show that combining latent posting styles with the traditional features can improve the accuracy of authorship attribution by up to 5.2%.
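
The feature construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny sentiment lexicon, the length thresholds, the `sentiment_length` token format, and all function names are assumptions made for illustration, and the LDA and support vector machine steps are omitted. It only shows how each post might be reduced to a sentiment-and-length token, and how a posting style vector could then be concatenated with a text feature vector.

```python
from collections import Counter

# Hypothetical sentiment lexicon (assumption); the abstract does not say
# which sentiment tool the paper uses.
POSITIVE = {"good", "great", "love", "happy"}
NEGATIVE = {"bad", "awful", "hate", "sad"}


def sentiment_label(post):
    """Return 'pos', 'neg', or 'neu' from a simple lexicon count."""
    words = post.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return "neu"


def length_label(post, short_max=5, medium_max=12):
    """Bucket a post by word count; the thresholds are illustrative assumptions."""
    n = len(post.split())
    if n <= short_max:
        return "short"
    if n <= medium_max:
        return "medium"
    return "long"


def style_token(post):
    """Combine sentiment orientation and post length into one discrete token."""
    return f"{sentiment_label(post)}_{length_label(post)}"


def style_counts(posts):
    """Count style tokens over one author's posts (a bag of posting styles)."""
    return Counter(style_token(p) for p in posts)


def concatenate(style_vector, text_vector):
    """Join a posting style vector with a text feature vector."""
    return list(style_vector) + list(text_vector)
```

In a fuller pipeline, the per-author token counts would be the input to a topic model such as LDA, and the concatenated vectors would be passed to a classifier; both steps are left out here to keep the sketch self-contained.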

Original language: English
Title of host publication: Proceedings - IEEE 17th International Conference on Dependable, Autonomic and Secure Computing, IEEE 17th International Conference on Pervasive Intelligence and Computing, IEEE 5th International Conference on Cloud and Big Data Computing, 4th Cyber Science and Technology Congress, DASC-PiCom-CBDCom-CyberSciTech 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1015-1022
Number of pages: 8
ISBN (Electronic): 9781728130248
Publication status: Published - 2019 Aug
Event: 17th IEEE International Conference on Dependable, Autonomic and Secure Computing, IEEE 17th International Conference on Pervasive Intelligence and Computing, IEEE 5th International Conference on Cloud and Big Data Computing, 4th Cyber Science and Technology Congress, DASC-PiCom-CBDCom-CyberSciTech 2019 - Fukuoka, Japan
Duration: 2019 Aug 5 to 2019 Aug 8

Publication series

Name: Proceedings - IEEE 17th International Conference on Dependable, Autonomic and Secure Computing, IEEE 17th International Conference on Pervasive Intelligence and Computing, IEEE 5th International Conference on Cloud and Big Data Computing, 4th Cyber Science and Technology Congress, DASC-PiCom-CBDCom-CyberSciTech 2019

Conference

Conference: 17th IEEE International Conference on Dependable, Autonomic and Secure Computing, IEEE 17th International Conference on Pervasive Intelligence and Computing, IEEE 5th International Conference on Cloud and Big Data Computing, 4th Cyber Science and Technology Congress, DASC-PiCom-CBDCom-CyberSciTech 2019
Country: Japan
City: Fukuoka
Period: 19/8/5 to 19/8/8

Keywords

  • Authorship attribution
  • Latent Dirichlet allocation
  • Sentiment
  • Short text
  • Social network
  • Support vector machine
  • Twitter

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems and Management
  • Information Systems
  • Safety, Risk, Reliability and Quality
  • Control and Optimization
