TY - GEN
T1 - ShamFinder
T2 - 19th ACM Internet Measurement Conference, IMC 2019
AU - Suzuki, Hiroaki
AU - Chiba, Daiki
AU - Yoneya, Yoshiro
AU - Mori, Tatsuya
AU - Goto, Shigeki
N1 - Funding Information:
We thank our shepherds, Taejoong Chung and Kimberly Claffy, for their thoughtful suggestions and feedback. We also thank the anonymous reviewers for their fruitful comments. A part of this work was supported by JSPS Grant-in-Aid for Scientific Research B, Grant Number 16H02832.
Publisher Copyright:
© 2019 Association for Computing Machinery. ACM ISBN 978-1-4503-6948-0/19/10...$15.00
PY - 2019/10/21
Y1 - 2019/10/21
AB - The internationalized domain name (IDN) is a mechanism that enables us to use Unicode characters in domain names. The set of Unicode characters contains several pairs of characters that are visually identical with each other; e.g., the Latin character 'a' (U+0061) and Cyrillic character 'а' (U+0430). Visually identical characters such as these are generally known as homoglyphs. IDN homograph attacks, which are widely known, abuse Unicode homoglyphs to create lookalike URLs. Although the threat posed by IDN homograph attacks is not new, the recent rise of IDN adoption in both domain name registries and web browsers has resulted in the threat of these attacks becoming increasingly widespread, leading to large-scale phishing attacks such as those targeting cryptocurrency exchange companies. In this work, we developed a framework named “ShamFinder,” which is an automated scheme to detect IDN homographs. Our key contribution is the automatic construction of a homoglyph database, which can be used for direct countermeasures against the attack and to inform users about the context of an IDN homograph. Using the ShamFinder framework, we perform a large-scale measurement study that aims to understand the IDN homographs that exist in the wild. On the basis of our approach, we provide insights into an effective countermeasure against the threats caused by the IDN homograph attack.
KW - DNS
KW - Homoglyph
KW - IDN homograph
KW - Unicode
UR - http://www.scopus.com/inward/record.url?scp=85074847753&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85074847753&partnerID=8YFLogxK
U2 - 10.1145/3355369.3355587
DO - 10.1145/3355369.3355587
M3 - Conference contribution
AN - SCOPUS:85074847753
T3 - Proceedings of the ACM SIGCOMM Internet Measurement Conference, IMC
SP - 449
EP - 462
BT - IMC 2019 - Proceedings of the 2019 ACM Internet Measurement Conference
PB - Association for Computing Machinery
Y2 - 21 October 2019 through 23 October 2019
ER -