Reliability verification of search engines' hit counts: How to select a reliable hit count for a query

Takuya Funahashi, Hayato Yamana

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    8 Citations (Scopus)

    Abstract

    In this paper, we investigate the trustworthiness of search engines' hit counts, the numbers returned as search result counts. Since many studies adopt search engines' hit counts to estimate the popularity of input queries, the reliability of hit counts is indispensable for achieving trustworthy studies. However, hit counts are unreliable because they change when a user clicks the "Search" button more than once or clicks the "Next" button on the search results page, or when a user queries the same term on separate days. We analyze the characteristics of hit count transitions by gathering various types of hit counts for 10,000 queries over two months. The results of our study show that the hit counts with the largest search offset just before search engines adjust their hit counts are the most reliable. Moreover, hit counts are the most reliable when they are consistent over approximately a week.
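
The second finding (a hit count is most trustworthy once it has stayed consistent for about a week) can be illustrated with a minimal sketch. This is not the authors' procedure: the function names, the 5% tolerance, the 7-day window, and the use of a low median are all assumptions chosen for illustration, and the observations are made-up numbers, not data from the paper.

```python
from datetime import date
from statistics import median_low
from typing import List, Optional, Tuple

# One observation = (day the query was issued, hit count the engine reported).
Observation = Tuple[date, int]


def relative_change(a: int, b: int) -> float:
    """Relative difference between two hit counts (0.0 when both are zero)."""
    largest = max(a, b)
    return 0.0 if largest == 0 else abs(a - b) / largest


def stable_hit_count(
    observations: List[Observation],
    window_days: int = 7,      # assumed reading of "approximately a week"
    tolerance: float = 0.05,   # assumed threshold for "consistent"
) -> Optional[int]:
    """Return a representative count from the latest run of observations that
    spans at least `window_days` days while staying within `tolerance` of the
    run's first value, or None if no such run exists."""
    ordered = sorted(observations)
    best: Optional[int] = None
    start = 0  # index where the current consistent run began
    for i, (day, count) in enumerate(ordered):
        if relative_change(ordered[start][1], count) > tolerance:
            start = i  # the count jumped, so a new run starts here
        span = (day - ordered[start][0]).days + 1
        if span >= window_days:
            best = median_low([c for _, c in ordered[start:i + 1]])
    return best


# Hypothetical daily hit counts for one query: a jump on day 2, then a plateau.
counts = [1000, 5000, 5000, 5010, 4990, 5000, 5020, 5005, 4995, 5000]
history = [(date(2010, 7, d), c) for d, c in enumerate(counts, start=1)]
print(stable_hit_count(history))
```

A series that never settles, for example one that doubles every day, yields `None`, which a caller could treat as "no reliable hit count yet".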

    Original language: English
    Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Pages: 114-125
    Number of pages: 12
    Volume: 6385 LNCS
    DOI: https://doi.org/10.1007/978-3-642-16985-4_11
    Publication status: Published - 2010
    Event: 10th International Conference on Web Engineering, ICWE 2010 - Vienna
    Duration: 2010 Jul 5 to 2010 Jul 9

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 6385 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Other

    Other: 10th International Conference on Web Engineering, ICWE 2010
    City: Vienna
    Period: 2010 Jul 5 to 2010 Jul 9

    Keywords

    • hit count
    • information retrieval
    • reliability
    • search engine
    • trustworthiness

    ASJC Scopus subject areas

    • Computer Science(all)
    • Theoretical Computer Science

    Cite this

    Funahashi, T., & Yamana, H. (2010). Reliability verification of search engines' hit counts: How to select a reliable hit count for a query. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6385 LNCS, pp. 114-125). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6385 LNCS). https://doi.org/10.1007/978-3-642-16985-4_11

    @inproceedings{f69effad3bc042778d002f06a915fc2a,
    title = "Reliability verification of search engines' hit counts: How to select a reliable hit count for a query",
    abstract = "In this paper, we investigate the trustworthiness of search engines' hit counts, the numbers returned as search result counts. Since many studies adopt search engines' hit counts to estimate the popularity of input queries, the reliability of hit counts is indispensable for achieving trustworthy studies. However, hit counts are unreliable because they change when a user clicks the {"}Search{"} button more than once or clicks the {"}Next{"} button on the search results page, or when a user queries the same term on separate days. We analyze the characteristics of hit count transitions by gathering various types of hit counts for 10,000 queries over two months. The results of our study show that the hit counts with the largest search offset just before search engines adjust their hit counts are the most reliable. Moreover, hit counts are the most reliable when they are consistent over approximately a week.",
    keywords = "hit count, information retrieval, reliability, search engine, trustworthiness",
    author = "Takuya Funahashi and Hayato Yamana",
    year = "2010",
    doi = "10.1007/978-3-642-16985-4_11",
    language = "English",
    isbn = "3642169848",
    volume = "6385 LNCS",
    series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
    pages = "114--125",
    booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

    }

    TY  - GEN
    T1  - Reliability verification of search engines' hit counts
    T2  - How to select a reliable hit count for a query
    AU  - Funahashi, Takuya
    AU  - Yamana, Hayato
    PY  - 2010
    Y1  - 2010
    N2  - In this paper, we investigate the trustworthiness of search engines' hit counts, the numbers returned as search result counts. Since many studies adopt search engines' hit counts to estimate the popularity of input queries, the reliability of hit counts is indispensable for achieving trustworthy studies. However, hit counts are unreliable because they change when a user clicks the "Search" button more than once or clicks the "Next" button on the search results page, or when a user queries the same term on separate days. We analyze the characteristics of hit count transitions by gathering various types of hit counts for 10,000 queries over two months. The results of our study show that the hit counts with the largest search offset just before search engines adjust their hit counts are the most reliable. Moreover, hit counts are the most reliable when they are consistent over approximately a week.
    KW  - hit count
    KW  - information retrieval
    KW  - reliability
    KW  - search engine
    KW  - trustworthiness
    UR  - http://www.scopus.com/inward/record.url?scp=78649832936&partnerID=8YFLogxK
    UR  - http://www.scopus.com/inward/citedby.url?scp=78649832936&partnerID=8YFLogxK
    U2  - 10.1007/978-3-642-16985-4_11
    DO  - 10.1007/978-3-642-16985-4_11
    M3  - Conference contribution
    AN  - SCOPUS:78649832936
    SN  - 3642169848
    SN  - 9783642169847
    VL  - 6385 LNCS
    T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    SP  - 114
    EP  - 125
    BT  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    ER  -