A Hadoop performance model for multi-rack clusters

Jungkyu Han, Masakuni Ishii, Hiroyuki Makino

Research output: Conference contribution

17 Citations (Scopus)

Abstract

Hadoop has become the de facto standard framework for big data analysis because of its scalability. Despite the importance of Hadoop's scalability, few studies have examined it in multi-rack clusters. In real-world multi-rack clusters, network topology becomes a major scalability bottleneck because of limited network switch capacity. Adding servers to a Hadoop cluster in such a situation wastes resources. It is therefore helpful for users to measure the network's influence on Hadoop efficiently before adding new servers to their clusters, so that they can save cost. In this paper, we describe a Hadoop performance model for multi-rack clusters. We modeled the network's influence on Hadoop and achieved about 95% accuracy relative to real measurements. Furthermore, we used our model to predict Hadoop scalability in large clusters and show that Hadoop scales sufficiently even in multi-rack clusters.

Original language: English
Title of host publication: 2013 5th International Conference on Computer Science and Information Technology, CSIT 2013 - Proceedings
Pages: 265-274
Number of pages: 10
DOI
Publication status: Published - 2013
Externally published: Yes
Event: 2013 5th International Conference on Computer Science and Information Technology, CSIT 2013 - Amman, Jordan
Duration: 27 Mar 2013 to 28 Mar 2013

Other

Other: 2013 5th International Conference on Computer Science and Information Technology, CSIT 2013
Country/Territory: Jordan
City: Amman
Period: 27 Mar 2013 to 28 Mar 2013

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Information Systems
