Hit count reliability: How much can we trust hit counts?

Koh Satoh, Hayato Yamana

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    6 Citations (Scopus)

    Abstract

    Recently, numerous studies have relied on the number of search results, i.e., the hit count. However, hit counts returned by search engines can vary unnaturally when observed on different days and may contain large errors that affect research that depends on them. Such errors can result in low-precision machine translation, incorrect extraction of synonyms, and other problems. Thus, it is indispensable to evaluate and improve the reliability of hit counts. Several studies have demonstrated this phenomenon; however, none has made clear how much hit counts can be trusted. In this paper, we propose hit count reliability metrics that quantitatively evaluate the reliability of hit counts in order to improve hit count selection. Evaluation results with Google show that our metrics successfully adopt reliable hit counts (99.8% precision) and skip unreliable hit counts (74.3% precision).

    Original language: English
    Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Pages: 751-758
    Number of pages: 8
    Volume: 7235 LNCS
    DOI: https://doi.org/10.1007/978-3-642-29253-8_73
    Publication status: Published - 2012
    Event: 14th Asia Pacific Web Technology Conference, APWeb 2012 - Kunming
    Duration: 2012 Apr 11 to 2012 Apr 13

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 7235 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Other

    Other: 14th Asia Pacific Web Technology Conference, APWeb 2012
    City: Kunming
    Period: 2012 Apr 11 to 2012 Apr 13

    Keywords

    • Hit Count
    • Information Retrieval
    • Reliability
    • Search Engine

    ASJC Scopus subject areas

    • Computer Science (all)
    • Theoretical Computer Science

    Cite this

    Satoh, K., & Yamana, H. (2012). Hit count reliability: How much can we trust hit counts? In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7235 LNCS, pp. 751-758). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7235 LNCS). https://doi.org/10.1007/978-3-642-29253-8_73

    @inproceedings{d95a3d9a01b84b028f868a7a969982bf,
    title = "Hit count reliability: How much can we trust hit counts?",
    keywords = "Hit Count, Information Retrieval, Reliability, Search Engine",
    author = "Koh Satoh and Hayato Yamana",
    year = "2012",
    doi = "10.1007/978-3-642-29253-8_73",
    language = "English",
    isbn = "9783642292521",
    volume = "7235 LNCS",
    series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
    pages = "751--758",
    booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

    }

    TY - GEN

    T1 - Hit count reliability

    T2 - How much can we trust hit counts?

    AU - Satoh, Koh

    AU - Yamana, Hayato

    PY - 2012

    Y1 - 2012

    KW - Hit Count

    KW - Information Retrieval

    KW - Reliability

    KW - Search Engine

    UR - http://www.scopus.com/inward/record.url?scp=84859729401&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84859729401&partnerID=8YFLogxK

    U2 - 10.1007/978-3-642-29253-8_73

    DO - 10.1007/978-3-642-29253-8_73

    M3 - Conference contribution

    AN - SCOPUS:84859729401

    SN - 9783642292521

    VL - 7235 LNCS

    T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

    SP - 751

    EP - 758

    BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

    ER -