FINNIM: Iterative imputation of missing values in dissolved gas analysis dataset

Zahriah Sahri, Rubiyah Yusof, Junzo Watada

    Research output: Contribution to journal › Article

    14 Citations (Scopus)

    Abstract

    Missing values are a common occurrence in a number of real-world databases, and statistical methods have been developed to deal with this problem, referred to as missing data imputation. In the detection and prediction of incipient faults in power transformers using dissolved gas analysis (DGA), the problem of missing values is significant and has resulted in inconclusive decision-making. This study proposes an efficient nonparametric iterative imputation method named FINNIM, which comprises three components: 1) the imputation ordering; 2) the imputation estimator; and 3) the iterative imputation. The relationship between gases and faults, and the percentage of missing values in an instance are used as a basis for the imputation ordering; whereas the plausible values for the missing values are estimated from k-nearest neighbor instances in the imputation estimator, and the iterative imputation allows complete and incomplete instances in a DGA dataset to be utilized iteratively for imputing all the missing values. Experimental results on both artificially inserted and actual missing values found in a few DGA datasets demonstrate that the proposed method outperforms the existing methods in imputation accuracy, classification performance, and convergence criteria at different missing percentages.
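
The three components named in the abstract can be illustrated with a generic iterative kNN imputation loop. The sketch below is not the authors' FINNIM algorithm: the abstract does not give its details, so the ordering here uses only the share of missing values per instance (the gas-fault relationship is left out), the estimator is a plain mean over the k nearest rows, and the names `knn_estimate` and `iterative_knn_impute` as well as the parameters `k`, `max_iter`, and `tol` are hypothetical.

```python
import math


def _distance(a, b, cols):
    """Euclidean distance between rows a and b over the given column indices."""
    return math.sqrt(sum((a[j] - b[j]) ** 2 for j in cols))


def knn_estimate(target, donors, missing_cols, k):
    """Mean of the k nearest donors' values for each missing column of target."""
    observed = [j for j in range(len(target)) if j not in missing_cols]
    nearest = sorted(donors, key=lambda d: _distance(target, d, observed))[:k]
    return {j: sum(d[j] for d in nearest) / len(nearest) for j in missing_cols}


def iterative_knn_impute(rows, k=2, max_iter=20, tol=1e-6):
    """Fill None entries in rows; returns a new list of fully numeric rows."""
    data = [list(r) for r in rows]
    missing = {}
    for i, r in enumerate(data):
        cols = [j for j, v in enumerate(r) if v is None]
        if cols:
            missing[i] = cols
    donors = [i for i in range(len(data)) if i not in missing]
    if not donors:
        raise ValueError("at least one complete row is needed to start")

    # Ordering (assumption): rows with the smallest share of missing values first.
    order = sorted(missing, key=lambda i: len(missing[i]) / len(data[i]))

    # First pass: donors are the complete rows plus rows already imputed.
    for i in order:
        est = knn_estimate(data[i], [data[d] for d in donors], missing[i], k)
        for j, v in est.items():
            data[i][j] = v
        donors.append(i)

    # Later passes: re-estimate each imputed entry from all other rows
    # until the largest change falls below tol.
    for _ in range(max_iter):
        change = 0.0
        for i in order:
            others = [data[d] for d in range(len(data)) if d != i]
            est = knn_estimate(data[i], others, missing[i], k)
            for j, v in est.items():
                change = max(change, abs(v - data[i][j]))
                data[i][j] = v
        if change < tol:
            break
    return data


if __name__ == "__main__":
    # Toy data, not DGA measurements: the last row lacks its second value.
    sample = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [2.0, None]]
    print(iterative_knn_impute(sample, k=3))
```

The input list is copied rather than modified, so the original missing-value pattern stays available for comparing imputed entries against held-out values, which is how imputation accuracy on artificially inserted gaps is typically checked.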

    Original language: English
    Article number: 6882199
    Pages (from-to): 2093-2102
    Number of pages: 10
    Journal: IEEE Transactions on Industrial Informatics
    Volume: 10
    Issue number: 4
    DOI: 10.1109/TII.2014.2350837
    Publication status: Published - 2014 Nov 1

    Fingerprint

    Gas fuel analysis
    Power transformers
    Iterative methods
    Statistical methods
    Decision making
    Gases

    Keywords

    • Dissolved gas analysis (DGA)
    • imputation ordering
    • iterative imputation
    • k-nearest neighbor (kNN)
    • missing data imputation
    • missing values

    ASJC Scopus subject areas

    • Electrical and Electronic Engineering
    • Control and Systems Engineering
    • Computer Science Applications
    • Information Systems

    Cite this

    FINNIM : Iterative imputation of missing values in dissolved gas analysis dataset. / Sahri, Zahriah; Yusof, Rubiyah; Watada, Junzo.

    In: IEEE Transactions on Industrial Informatics, Vol. 10, No. 4, 6882199, 01.11.2014, p. 2093-2102.

    Sahri, Zahriah ; Yusof, Rubiyah ; Watada, Junzo. / FINNIM : Iterative imputation of missing values in dissolved gas analysis dataset. In: IEEE Transactions on Industrial Informatics. 2014 ; Vol. 10, No. 4. pp. 2093-2102.
    @article{8941c226c5084d478db2617f3ebad402,
    title = "FINNIM: Iterative imputation of missing values in dissolved gas analysis dataset",
    abstract = "Missing values are a common occurrence in a number of real-world databases, and statistical methods have been developed to deal with this problem, referred to as missing data imputation. In the detection and prediction of incipient faults in power transformers using dissolved gas analysis (DGA), the problem of missing values is significant and has resulted in inconclusive decision-making. This study proposes an efficient nonparametric iterative imputation method named FINNIM, which comprises three components: 1) the imputation ordering; 2) the imputation estimator; and 3) the iterative imputation. The relationship between gases and faults, and the percentage of missing values in an instance are used as a basis for the imputation ordering; whereas the plausible values for the missing values are estimated from k-nearest neighbor instances in the imputation estimator, and the iterative imputation allows complete and incomplete instances in a DGA dataset to be utilized iteratively for imputing all the missing values. Experimental results on both artificially inserted and actual missing values found in a few DGA datasets demonstrate that the proposed method outperforms the existing methods in imputation accuracy, classification performance, and convergence criteria at different missing percentages.",
    keywords = "Dissolved gas analysis (DGA), imputation ordering, iterative imputation, k-nearest neighbor (kNN), missing data imputation, missing values",
    author = "Zahriah Sahri and Rubiyah Yusof and Junzo Watada",
    year = "2014",
    month = "11",
    day = "1",
    doi = "10.1109/TII.2014.2350837",
    language = "English",
    volume = "10",
    pages = "2093--2102",
    journal = "IEEE Transactions on Industrial Informatics",
    issn = "1551-3203",
    publisher = "IEEE Computer Society",
    number = "4",

    }

    TY - JOUR

    T1 - FINNIM

    T2 - Iterative imputation of missing values in dissolved gas analysis dataset

    AU - Sahri, Zahriah

    AU - Yusof, Rubiyah

    AU - Watada, Junzo

    PY - 2014/11/1

    Y1 - 2014/11/1

    AB - Missing values are a common occurrence in a number of real-world databases, and statistical methods have been developed to deal with this problem, referred to as missing data imputation. In the detection and prediction of incipient faults in power transformers using dissolved gas analysis (DGA), the problem of missing values is significant and has resulted in inconclusive decision-making. This study proposes an efficient nonparametric iterative imputation method named FINNIM, which comprises three components: 1) the imputation ordering; 2) the imputation estimator; and 3) the iterative imputation. The relationship between gases and faults, and the percentage of missing values in an instance are used as a basis for the imputation ordering; whereas the plausible values for the missing values are estimated from k-nearest neighbor instances in the imputation estimator, and the iterative imputation allows complete and incomplete instances in a DGA dataset to be utilized iteratively for imputing all the missing values. Experimental results on both artificially inserted and actual missing values found in a few DGA datasets demonstrate that the proposed method outperforms the existing methods in imputation accuracy, classification performance, and convergence criteria at different missing percentages.

    KW - Dissolved gas analysis (DGA)

    KW - imputation ordering

    KW - iterative imputation

    KW - k-nearest neighbor (kNN)

    KW - missing data imputation

    KW - missing values

    UR - http://www.scopus.com/inward/record.url?scp=84909951993&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84909951993&partnerID=8YFLogxK

    U2 - 10.1109/TII.2014.2350837

    DO - 10.1109/TII.2014.2350837

    M3 - Article

    AN - SCOPUS:84909951993

    VL - 10

    SP - 2093

    EP - 2102

    JO - IEEE Transactions on Industrial Informatics

    JF - IEEE Transactions on Industrial Informatics

    SN - 1551-3203

    IS - 4

    M1 - 6882199

    ER -