Retrieved Image Refinement by Bootstrap Outlier Test

Hayato Watanabe, Hideitsu Hino, Shotaro Akaho, Noboru Murata

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Outlier detection is used to identify data points or a small number of subsets of data that are significantly different from most other data in a given dataset. It is challenging to detect outliers using an objective and quantitative approach. Methods that use the framework of statistical hypothesis testing are widely used by assuming a specific parametric distribution as a data generation model, but there is no guarantee that the distribution of data can be adequately approximated by a parametric distribution in practical problems. In this paper, a simple method is proposed to objectively detect outliers by hypothesis testing without assuming a specific distribution of outlier scores. By using an arbitrary outlier score function, hypothesis testing is used to determine whether each given sample is an outlier. The distribution of the test statistics is needed for the hypothesis test, and is estimated based on the given data using the bootstrap method. The effectiveness of the proposed outlier test was verified by applying it to outlier detection for text-based image retrieval, where it improved the quality of image searches by removing irrelevant images.
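The procedure outlined in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the score function, the resampling scheme, or the significance level, so everything named here is an assumption made for illustration. In particular, `median_distance_score` is a hypothetical score (distance to the coordinate-wise median), the null distribution is built by scoring one randomly chosen point of each bootstrap resample, and `n_boot`, `alpha`, and `seed` are arbitrary defaults.

```python
import numpy as np


def median_distance_score(x, reference):
    """Hypothetical outlier score (an assumption, not taken from the paper):
    Euclidean distance from x to the coordinate-wise median of reference."""
    return float(np.linalg.norm(x - np.median(reference, axis=0)))


def bootstrap_outlier_test(data, score_fn=median_distance_score,
                           n_boot=1000, alpha=0.05, seed=0):
    """Return (p_values, flags) for every row of data.

    The null distribution of the score is estimated from the data itself
    by bootstrap resampling, without assuming a parametric form.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n = len(data)

    # Observed score of each sample against the full dataset.
    observed = np.array([score_fn(x, data) for x in data])

    # Bootstrap estimate of the score distribution: resample the data with
    # replacement, then score one randomly chosen point of the resample.
    null_scores = np.empty(n_boot)
    for b in range(n_boot):
        resample = data[rng.integers(0, n, size=n)]
        probe = resample[rng.integers(0, n)]
        null_scores[b] = score_fn(probe, resample)

    # One-sided empirical p-value with the usual +1 correction, so that
    # no p-value is exactly zero.
    exceed = (null_scores[None, :] >= observed[:, None]).sum(axis=1)
    p_values = (1 + exceed) / (n_boot + 1)
    return p_values, p_values < alpha
```

Any other score function with the same `(x, reference)` signature can be passed as `score_fn`. In a retrieval setting, each row of `data` would be a feature vector of one retrieved image, and the flagged rows would be candidates for removal. Because the data may already contain outliers, the bootstrap null distribution is itself contaminated, which makes this simple variant conservative when many outliers are present.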

Original language: English
Title of host publication: Computer Analysis of Images and Patterns - 18th International Conference, CAIP 2019, Proceedings
Editors: Mario Vento, Gennaro Percannella
Publisher: Springer-Verlag
Pages: 505-517
Number of pages: 13
ISBN (Print): 9783030298876
DOIs: https://doi.org/10.1007/978-3-030-29888-3_41
Publication status: Published - 2019 Jan 1
Event: 18th International Conference on Computer Analysis of Images and Patterns, CAIP 2019 - Salerno, Italy
Duration: 2019 Sep 3 → 2019 Sep 5

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11678 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 18th International Conference on Computer Analysis of Images and Patterns, CAIP 2019
Country: Italy
City: Salerno
Period: 19/9/3 → 19/9/5

Fingerprint

Bootstrap
Outlier
Refinement
Testing
Hypothesis Testing
Outlier Detection
Image retrieval
Statistics
Score Function
Bootstrap Method
Hypothesis Test
Test Statistic
Subset
Arbitrary

Keywords

  • Hypothesis testing
  • Image retrieval
  • Outlier removal

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Watanabe, H., Hino, H., Akaho, S., & Murata, N. (2019). Retrieved Image Refinement by Bootstrap Outlier Test. In M. Vento, & G. Percannella (Eds.), Computer Analysis of Images and Patterns - 18th International Conference, CAIP 2019, Proceedings (pp. 505-517). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11678 LNCS). Springer-Verlag. https://doi.org/10.1007/978-3-030-29888-3_41

@inproceedings{ed5178b9026345a38d56460eecfec0d5,
title = "Retrieved Image Refinement by Bootstrap Outlier Test",
abstract = "Outlier detection is used to identify data points or a small number of subsets of data that are significantly different from most other data in a given dataset. It is challenging to detect outliers using an objective and quantitative approach. Methods that use the framework of statistical hypothesis testing are widely used by assuming a specific parametric distribution as a data generation model, but there is no guarantee that the distribution of data can be adequately approximated by a parametric distribution in practical problems. In this paper, a simple method is proposed to objectively detect outliers by hypothesis testing without assuming a specific distribution of outlier scores. By using an arbitrary outlier score function, hypothesis testing is used to determine whether each given sample is an outlier. The distribution of the test statistics is needed for the hypothesis test, and is estimated based on the given data using the bootstrap method. The effectiveness of the proposed outlier test was verified by applying it to outlier detection for text-based image retrieval, where it improved the quality of image searches by removing irrelevant images.",
keywords = "Hypothesis testing, Image retrieval, Outlier removal",
author = "Hayato Watanabe and Hideitsu Hino and Shotaro Akaho and Noboru Murata",
year = "2019",
month = "1",
day = "1",
doi = "10.1007/978-3-030-29888-3_41",
language = "English",
isbn = "9783030298876",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer-Verlag",
pages = "505--517",
editor = "Mario Vento and Gennaro Percannella",
booktitle = "Computer Analysis of Images and Patterns - 18th International Conference, CAIP 2019, Proceedings",
}

TY  - GEN
T1  - Retrieved Image Refinement by Bootstrap Outlier Test
AU  - Watanabe, Hayato
AU  - Hino, Hideitsu
AU  - Akaho, Shotaro
AU  - Murata, Noboru
PY  - 2019/1/1
Y1  - 2019/1/1
N2  - Outlier detection is used to identify data points or a small number of subsets of data that are significantly different from most other data in a given dataset. It is challenging to detect outliers using an objective and quantitative approach. Methods that use the framework of statistical hypothesis testing are widely used by assuming a specific parametric distribution as a data generation model, but there is no guarantee that the distribution of data can be adequately approximated by a parametric distribution in practical problems. In this paper, a simple method is proposed to objectively detect outliers by hypothesis testing without assuming a specific distribution of outlier scores. By using an arbitrary outlier score function, hypothesis testing is used to determine whether each given sample is an outlier. The distribution of the test statistics is needed for the hypothesis test, and is estimated based on the given data using the bootstrap method. The effectiveness of the proposed outlier test was verified by applying it to outlier detection for text-based image retrieval, where it improved the quality of image searches by removing irrelevant images.
KW  - Hypothesis testing
KW  - Image retrieval
KW  - Outlier removal
UR  - http://www.scopus.com/inward/record.url?scp=85072871968&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85072871968&partnerID=8YFLogxK
U2  - 10.1007/978-3-030-29888-3_41
DO  - 10.1007/978-3-030-29888-3_41
M3  - Conference contribution
AN  - SCOPUS:85072871968
SN  - 9783030298876
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 505
EP  - 517
BT  - Computer Analysis of Images and Patterns - 18th International Conference, CAIP 2019, Proceedings
A2  - Vento, Mario
A2  - Percannella, Gennaro
PB  - Springer-Verlag
ER  -