Selecting Article Segment Titles Based on Keyphrase Features and Semantic Relatedness

Yuming Guo, Mizuho Iwaihara

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Nowadays people can find almost any kind of information they want on the Internet. However, in most cases, users are unwilling to spend much time browsing long texts to find the segment they are looking for. Existing work on topic labeling performs well on document categorization but is inadequate at the granularity of detailed contents. We therefore propose a method for selecting titles for segments in long documents. We analyze the characteristics of high-quality titles for article segments in terms of the semantic relatedness between the target segment and related articles, as well as other segments. We then revise three previously proposed features. We improve the phraseness feature so that it gives appropriate scores to long titles, and we combine the features SimPF and Embedding-vector to enhance efficiency and rationality. For experimental evaluation we use Wikipedia articles, in which a large number of article segments are titled manually and a great number of articles lack detailed segment titles. We evaluate scoring functions by the rank at which hidden original segment titles appear, measured by precision@K. Through rigorous evaluations, we show an optimum combination of the features.
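The evaluation described in the abstract, ranking a hidden original title among candidate titles and reporting precision@K, can be sketched as follows. This is a minimal illustration under one common reading of the metric (the fraction of segments whose original title is ranked within the top K), not the authors' implementation; the abstract does not give the exact definition. The function names, the toy candidate titles, and the scores are all hypothetical.

```python
from typing import Callable, List, Sequence


def rank_of_original(
    candidates: Sequence[str],
    original: str,
    score: Callable[[str], float],
) -> int:
    """Return the 1-based rank of the hidden original title.

    Candidates are sorted by descending score; ties keep input order
    because Python's sort is stable. `original` must be in `candidates`.
    """
    ranked = sorted(candidates, key=score, reverse=True)
    return ranked.index(original) + 1


def precision_at_k(ranks: List[int], k: int) -> float:
    """Fraction of segments whose original title is ranked in the top k.

    Assumed reading of precision@K; the paper may define it differently.
    """
    if not ranks:
        return 0.0
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical scores for one segment's candidate titles.
toy_scores = {"History": 0.9, "Early life": 0.7, "Career": 0.4}
toy_rank = rank_of_original(list(toy_scores), "Early life", toy_scores.get)

# Hypothetical ranks of the original titles over four segments.
toy_ranks = [1, 3, 2, 5]
toy_p_at_2 = precision_at_k(toy_ranks, 2)
```

Comparing scoring functions then amounts to computing `precision_at_k` for each function over the same set of segments and several values of K.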

Original language: English
Title of host publication: Proceedings - 2018 7th International Congress on Advanced Applied Informatics, IIAI-AAI 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 129-132
Number of pages: 4
ISBN (Electronic): 9781538674475
DOIs: https://doi.org/10.1109/IIAI-AAI.2018.00034
Publication status: Published - 2019 Apr 16
Event: 7th International Congress on Advanced Applied Informatics, IIAI-AAI 2018 - Yonago, Japan
Duration: 2018 Jul 8 → 2018 Jul 13

Publication series

Name: Proceedings - 2018 7th International Congress on Advanced Applied Informatics, IIAI-AAI 2018

Conference

Conference: 7th International Congress on Advanced Applied Informatics, IIAI-AAI 2018
Country: Japan
City: Yonago
Period: 18/7/8 → 18/7/13

Keywords

  • Document summarization
  • Keyphrase extraction
  • Semantic relatedness
  • Titling documents

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Communication
  • Information Systems
  • Information Systems and Management
  • Education

Cite this

Guo, Y., & Iwaihara, M. (2019). Selecting Article Segment Titles Based on Keyphrase Features and Semantic Relatedness. In Proceedings - 2018 7th International Congress on Advanced Applied Informatics, IIAI-AAI 2018 (pp. 129-132). [8693246] (Proceedings - 2018 7th International Congress on Advanced Applied Informatics, IIAI-AAI 2018). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IIAI-AAI.2018.00034
