Association rule mining for continuous attributes using genetic network programming

Karla Taboada, Eloy Gonzales, Kaoru Shimada, Shingo Mabu, Kotaro Hirasawa, Takayuki Furuzuki

Research output: Contribution to journal › Article

12 Citations (Scopus)

Abstract

Most existing association rule mining algorithms extract knowledge from databases whose attributes take binary values. In real-world applications, however, databases are usually composed of continuous values such as height, length or weight. If the attributes are continuous, the algorithms are commonly combined with a discretization method that transforms them into discrete attributes. Discretization is the process of partitioning the range of a continuous attribute into a finite number of intervals and assigning each interval a discrete numerical value. However, the user most often must specify the number of intervals, or provide heuristic rules to be used during discretization, which makes it difficult to obtain the highest attribute interdependency and the lowest number of intervals at the same time. In this paper we present an association rule mining algorithm suited to the continuous-valued attributes commonly found in scientific and statistical databases. We propose a method using a new graph-based evolutionary algorithm named 'genetic network programming (GNP)' that can deal with continuous values directly, that is, without using any discretization method as a preprocessing step. GNP represents its individuals as graph structures and evolves them in order to find a solution; this feature contributes to creating very compact programs and to implicitly memorizing past action sequences. In the proposed method using GNP, the significance of the extracted association rules is measured with the χ² test, and only important association rules are accumulated in a common pool across generations. Results of experiments conducted on a real-life database suggest that the proposed method provides an effective technique for handling continuous attributes.
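The two building blocks named in the abstract, conventional discretization and the χ² significance measure, can be illustrated with a minimal Python sketch. This is not the authors' GNP method, which the abstract does not specify in detail. The function names, the choice of equal-width intervals, and the 2x2 contingency-table form of the χ² statistic for a rule X ⇒ Y are assumptions made for illustration only.

```python
from typing import List, Sequence

# Critical value of the chi-squared distribution, 1 degree of freedom, 5% level.
CHI2_CRITICAL_1DOF_5PCT = 3.841


def equal_width_bins(values: Sequence[float], k: int) -> List[int]:
    """Conventional preprocessing step (assumed variant: equal-width).

    Maps each continuous value to an interval index in 0..k-1.
    The user has to choose k, which is the difficulty the abstract points out.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / k
    # The maximum value falls exactly on the upper edge, so clamp it to k-1.
    return [min(int((v - lo) / width), k - 1) for v in values]


def chi_squared_rule(n: int, n_x: int, n_y: int, n_xy: int) -> float:
    """Pearson chi-squared statistic for a rule X => Y from raw counts.

    n    : number of records
    n_x  : records satisfying the antecedent X
    n_y  : records satisfying the consequent Y
    n_xy : records satisfying both X and Y
    Uses the closed form for a 2x2 contingency table.
    """
    denom = n_x * n_y * (n - n_x) * (n - n_y)
    if denom == 0:
        raise ValueError("X and Y must each hold for some but not all records")
    return n * (n * n_xy - n_x * n_y) ** 2 / denom


def is_significant(chi2: float, critical: float = CHI2_CRITICAL_1DOF_5PCT) -> bool:
    """True when X and Y are unlikely to be independent at the chosen level."""
    return chi2 > critical


if __name__ == "__main__":
    print(equal_width_bins([1.0, 2.0, 3.0, 4.0, 10.0], k=3))
    chi2 = chi_squared_rule(n=100, n_x=50, n_y=50, n_xy=40)
    print(chi2, is_significant(chi2))
```

With 100 records, X and Y each holding for 50 of them and co-occurring in 40, the statistic is well above the 5% critical value, so such a rule would count as significant under this assumed criterion. With only 25 co-occurrences, exactly what independence predicts, the statistic is zero.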

Original language: English
Pages (from-to): 199-211
Number of pages: 13
Journal: IEEJ Transactions on Electrical and Electronic Engineering
Volume: 3
Issue number: 2
DOI: 10.1002/tee.20256
Publication status: Published - 2008

Fingerprint

Association rules
Evolutionary algorithms
Experiments

Keywords

  • Association rules mining
  • Continuous attributes
  • Genetic network programming

ASJC Scopus subject areas

  • Electrical and Electronic Engineering

Cite this

Association rule mining for continuous attributes using genetic network programming. / Taboada, Karla; Gonzales, Eloy; Shimada, Kaoru; Mabu, Shingo; Hirasawa, Kotaro; Furuzuki, Takayuki.

In: IEEJ Transactions on Electrical and Electronic Engineering, Vol. 3, No. 2, 2008, p. 199-211.

@article{f50433d201ed47d6879b5504cbb3ac1b,
title = "Association rule mining for continuous attributes using genetic network programming",
keywords = "Association rules mining, Continuous attributes, Genetic network programming",
author = "Karla Taboada and Eloy Gonzales and Kaoru Shimada and Shingo Mabu and Kotaro Hirasawa and Takayuki Furuzuki",
year = "2008",
doi = "10.1002/tee.20256",
language = "English",
volume = "3",
pages = "199--211",
journal = "IEEJ Transactions on Electrical and Electronic Engineering",
issn = "1931-4973",
publisher = "John Wiley and Sons Inc.",
number = "2",

}

TY - JOUR

T1 - Association rule mining for continuous attributes using genetic network programming

AU - Taboada, Karla

AU - Gonzales, Eloy

AU - Shimada, Kaoru

AU - Mabu, Shingo

AU - Hirasawa, Kotaro

AU - Furuzuki, Takayuki

PY - 2008

Y1 - 2008

KW - Association rules mining

KW - Continuous attributes

KW - Genetic network programming

UR - http://www.scopus.com/inward/record.url?scp=40549135980&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=40549135980&partnerID=8YFLogxK

U2 - 10.1002/tee.20256

DO - 10.1002/tee.20256

M3 - Article

AN - SCOPUS:40549135980

VL - 3

SP - 199

EP - 211

JO - IEEJ Transactions on Electrical and Electronic Engineering

JF - IEEJ Transactions on Electrical and Electronic Engineering

SN - 1931-4973

IS - 2

ER -