Extracting inter-firm networks from the World Wide Web using a general-purpose search engine

Yingzi Jin, Mitsuru Ishizuka, Yutaka Matsuo

Research output: Contribution to journal › Article

14 Citations (Scopus)

Abstract

Purpose - Social relations play an important role in a real community. Interaction patterns reveal relations among actors (such as persons, groups, and firms), which can be merged to produce valuable information such as a network structure. This paper aims to present a new approach to extracting inter-firm networks from the web for further analysis.

Design/methodology/approach - In this study, relations between pairs of firms are extracted using a search engine and text processing. Because names of firms can co-appear coincidentally on the web, an advanced algorithm is proposed that adds keywords ("relation keywords") to the query. The relation keywords are obtained from the web using a Jaccard coefficient.

Findings - As an application, a network of 60 firms in Japan, including IT, communication, broadcasting, and electronics firms, is extracted from the web, and comprehensive evaluations of the approach are shown. Alliance and lawsuit relations are easily obtainable from the web using the algorithm. By adding relation keywords to named pairs of firms as a query, it is possible to collect target pages from the top of web pages more precisely than by using only the named pairs as a query.

Practical implications - This study proposes a new approach for extracting inter-firm networks from the web. The obtained network is useful in several ways. It is possible to find a cluster of firms and characterise a firm by its cluster. Business experts often make such inferences based on firm relations and firm groups. For that reason, the firm network might enhance inferential abilities in the business domain. The obtained networks might also be used to recommend business partners based on structural advantages. The authors' intuition is that extracting a social network might provide information that is only recognisable from the network point of view. For example, the centrality of each firm is identified only after generating a social network.

Originality/value - This study is a first attempt to extract inter-firm networks from the web using a search engine. The approach is also applicable to other actors, such as famous persons, organisations or other multiple relational entities.
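
The abstract does not spell out how the Jaccard coefficient is computed or how the query is assembled, so the following is only a minimal sketch under stated assumptions: page counts are taken as plain integers supplied by the caller, and the function names (`jaccard`, `rank_keywords`, `build_query`) and the quoted-phrase/`OR` query syntax are hypothetical illustrations, not the authors' implementation or any particular search engine's API.

```python
from typing import Dict, List, Tuple


def jaccard(hits_a: int, hits_b: int, hits_both: int) -> float:
    """Jaccard coefficient from page counts: |A and B| / |A or B|.

    hits_a, hits_b: pages matching each term alone (assumed inputs).
    hits_both: pages matching both terms together.
    """
    union = hits_a + hits_b - hits_both
    return hits_both / union if union > 0 else 0.0


def rank_keywords(counts: Dict[str, Tuple[int, int, int]]) -> List[str]:
    """Order candidate keywords by Jaccard score, highest first.

    counts maps keyword -> (hits_keyword, hits_context, hits_both);
    what the "context" term is remains an assumption of this sketch.
    """
    return sorted(counts, key=lambda k: jaccard(*counts[k]), reverse=True)


def build_query(firm_a: str, firm_b: str, keywords: List[str]) -> str:
    """Combine a pair of firm names with optional relation keywords."""
    pair = f'"{firm_a}" "{firm_b}"'
    if not keywords:
        return pair  # baseline: the named pair only
    alternatives = " OR ".join(f'"{k}"' for k in keywords)
    return f"{pair} ({alternatives})"
```

With hypothetical counts, `jaccard(100, 50, 25)` divides 25 shared pages by a union of 125 pages. Passing an empty keyword list to `build_query` yields the pair-only baseline query that the Findings compare against.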

Original language: English
Pages (from-to): 196-210
Number of pages: 15
Journal: Online Information Review
Volume: 32
Issue number: 2
DOI: 10.1108/14684520810879827
Publication status: Published - 2008
Externally published: Yes

Fingerprint

Search engines
World Wide Web
search engine
firm
Internet
Text processing
Industry
Broadcasting
Websites
Electronic equipment
Communication
social network
text processing
interaction pattern
human being
lawsuit
intuition
broadcasting
Social Relations

Keywords

  • Information retrieval
  • Social networks
  • Worldwide web

ASJC Scopus subject areas

  • Information Systems
  • Library and Information Sciences

Cite this

Extracting inter-firm networks from the World Wide Web using a general-purpose search engine. / Jin, Yingzi; Ishizuka, Mitsuru; Matsuo, Yutaka.

In: Online Information Review, Vol. 32, No. 2, 2008, p. 196-210.

Jin, Yingzi ; Ishizuka, Mitsuru ; Matsuo, Yutaka. / Extracting inter-firm networks from the World Wide Web using a general-purpose search engine. In: Online Information Review. 2008 ; Vol. 32, No. 2. pp. 196-210.
@article{6c9ad1bb499d4f488c353f143ed2c661,
title = "Extracting inter-firm networks from the World Wide Web using a general-purpose search engine",
abstract = "Purpose - Social relations play an important role in a real community. Interaction patterns reveal relations among actors (such as persons, groups, firms), which can be merged to produce valuable information such as a network structure. This paper aims to present a new approach to extract inter-firm networks from the web for further analysis. Design/methodology/approach - In this study extraction of relations between a pair of firms is obtained by using a search engine and text processing. Because names of firms co-appear coincidentally on the web, an advanced algorithm is proposed, which is characterised by the addition of keywords ({"}relation keywords{"}) to a query. The relation keywords are obtained from the web using a Jaccard coefficient. Findings - As an application, a network of 60 firms in Japan is extracted including IT, communication, broadcasting, and electronics firms from the web and comprehensive evaluations of this approach are shown. The alliance and lawsuit relations are easily obtainable from the web using the algorithm. By adding relation keywords to named pairs of firms as a query, it is possible to collect target pages from the top of web pages more precisely than by only using the named pairs as a query. Practical implications - This study proposes a new approach for extracting inter-firm networks from the web. The obtained network is useful in several ways. It is possible to find a cluster of firms and characterise a firm by its cluster. Business experts often make such inferences based on firm relations and firm groups. For that reason the firm network might enhance inferential abilities on the business domain. Also we might use obtained networks to recommend business partners based on structural advantages. The authors' intuition is that extracting a social network might provide information that is only recognisable from the network point of view. For example, the centrality of each firm is identified only after generating a social network. Originality/value - This study is a first attempt to extract inter-firm networks from the web using a search engine. The approach is also applicable to other actors, such as famous persons, organisations or other multiple relational entities.",
keywords = "Information retrieval, Social networks, Worldwide web",
author = "Yingzi Jin and Mitsuru Ishizuka and Yutaka Matsuo",
year = "2008",
doi = "10.1108/14684520810879827",
language = "English",
volume = "32",
pages = "196--210",
journal = "Online Information Review",
issn = "1468-4527",
publisher = "Emerald Group Publishing Ltd.",
number = "2",

}

TY - JOUR

T1 - Extracting inter-firm networks from the World Wide Web using a general-purpose search engine

AU - Jin, Yingzi

AU - Ishizuka, Mitsuru

AU - Matsuo, Yutaka

PY - 2008

Y1 - 2008

AB - Purpose - Social relations play an important role in a real community. Interaction patterns reveal relations among actors (such as persons, groups, firms), which can be merged to produce valuable information such as a network structure. This paper aims to present a new approach to extract inter-firm networks from the web for further analysis. Design/methodology/approach - In this study extraction of relations between a pair of firms is obtained by using a search engine and text processing. Because names of firms co-appear coincidentally on the web, an advanced algorithm is proposed, which is characterised by the addition of keywords ("relation keywords") to a query. The relation keywords are obtained from the web using a Jaccard coefficient. Findings - As an application, a network of 60 firms in Japan is extracted including IT, communication, broadcasting, and electronics firms from the web and comprehensive evaluations of this approach are shown. The alliance and lawsuit relations are easily obtainable from the web using the algorithm. By adding relation keywords to named pairs of firms as a query, it is possible to collect target pages from the top of web pages more precisely than by only using the named pairs as a query. Practical implications - This study proposes a new approach for extracting inter-firm networks from the web. The obtained network is useful in several ways. It is possible to find a cluster of firms and characterise a firm by its cluster. Business experts often make such inferences based on firm relations and firm groups. For that reason the firm network might enhance inferential abilities on the business domain. Also we might use obtained networks to recommend business partners based on structural advantages. The authors' intuition is that extracting a social network might provide information that is only recognisable from the network point of view. For example, the centrality of each firm is identified only after generating a social network. Originality/value - This study is a first attempt to extract inter-firm networks from the web using a search engine. The approach is also applicable to other actors, such as famous persons, organisations or other multiple relational entities.

KW - Information retrieval

KW - Social networks

KW - Worldwide web

UR - http://www.scopus.com/inward/record.url?scp=42549154224&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=42549154224&partnerID=8YFLogxK

U2 - 10.1108/14684520810879827

DO - 10.1108/14684520810879827

M3 - Article

AN - SCOPUS:42549154224

VL - 32

SP - 196

EP - 210

JO - Online Information Review

JF - Online Information Review

SN - 1468-4527

IS - 2

ER -