A study of learning a sparse metric matrix using l1 regularization based on supervised learning

    Research output: Contribution to journal (Article)

    Abstract

    This paper addresses classification problems based on the vector space model. Distance metric learning, which estimates a metric matrix suited to classification through an iterative optimization procedure, is known to be an effective approach. For high-dimensional data, however, distance metric learning tends to overfit the training data set and to require long computation times. Moreover, the number of parameters to be estimated grows in proportion to the square of the input dimension, so the amount of training data needed to estimate a sufficiently accurate metric matrix becomes enormous as the dimension increases. These problems are especially pronounced for high-dimensional, sparse data such as documents and the purchase histories stored by e-commerce (EC) sites. To avoid them, we propose an l1-regularized distance metric learning method that uses the alternating direction method of multipliers (ADMM) algorithm. The effectiveness of the proposed method is demonstrated through classification experiments on newspaper article data, which is high-dimensional and sparse, and on data from the UCI Machine Learning Repository, which is low-dimensional and dense.
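
    The abstract does not state the objective function, so the following is only a minimal sketch of how an l1 penalty and ADMM can combine to produce a sparse metric matrix; it is not the authors' formulation. It assumes a hypothetical least-squares surrogate in which the squared distance (x_i - x_j)^T M (x_i - x_j) should be near 0 for same-class pairs and near 1 for different-class pairs, which turns the problem into a lasso over the d*d entries of M. The function names, the target values, and the parameters `lam`, `rho`, and `n_iter` are all illustrative assumptions, and no positive semidefinite constraint is enforced.

```python
import numpy as np


def soft_threshold(v, kappa):
    """Proximal operator of kappa * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)


def admm_l1_metric(X, y, lam=0.1, rho=1.0, n_iter=200):
    """Sketch: l1-regularized metric matrix via lasso-style ADMM.

    Assumed surrogate (not from the paper):
        minimize 0.5 * ||A m - t||^2 + lam * ||z||_1  subject to  m = z,
    where m = vec(M), each row of A is vec of the outer product of a
    pairwise difference with itself, and t is 0 for same-class pairs
    and 1 for different-class pairs.
    """
    n, d = X.shape
    i_idx, j_idx = np.triu_indices(n, k=1)        # all unordered pairs
    diffs = X[i_idx] - X[j_idx]                   # shape (pairs, d)
    # Row p holds vec(diff_p diff_p^T), so A[p] @ vec(M) = diff_p^T M diff_p.
    A = np.einsum("pi,pj->pij", diffs, diffs).reshape(len(diffs), d * d)
    t = (y[i_idx] != y[j_idx]).astype(float)

    K = A.T @ A + rho * np.eye(d * d)             # fixed system matrix
    Atb = A.T @ t
    m = np.zeros(d * d)
    z = np.zeros(d * d)
    u = np.zeros(d * d)                           # scaled dual variable
    for _ in range(n_iter):
        m = np.linalg.solve(K, Atb + rho * (z - u))   # quadratic step
        z = soft_threshold(m + u, lam / rho)          # sparsity step
        u = u + m - z                                 # dual update
    M = z.reshape(d, d)
    return 0.5 * (M + M.T)                        # symmetrize for safety


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(12, 3))
    y = np.array([0, 1] * 6)
    print(admm_l1_metric(X, y, lam=0.5))
```

    The d*d-dimensional linear system in the first step makes the quadratic growth in parameters noted in the abstract concrete: doubling the input dimension quadruples the number of unknowns. A larger `lam` drives more entries of the returned matrix to exactly zero.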

    Original language: English
    Pages (from-to): 230-239
    Number of pages: 10
    Journal: Journal of Japan Industrial Management Association
    Volume: 66
    Issue number: 3
    Publication status: Published - 2015

    Keywords

    • ADMM
    • Distance metric learning
    • Document classification
    • L1 regularization
    • Vector space model

    ASJC Scopus subject areas

    • Industrial and Manufacturing Engineering
    • Applied Mathematics
    • Management Science and Operations Research
    • Strategy and Management

    Cite this

    @article{f7c5073d2dba49568e7117dfbe2c938e,
    title = "A study of learning a sparse metric matrix using l1 regularization based on supervised learning",
    keywords = "ADMM, Distance metric learning, Document classification, L1 regularization, Vector space model",
    author = "Kenta Mikawa and Manabu Kobayashi and Masayuki Goto",
    year = "2015",
    language = "English",
    volume = "66",
    pages = "230--239",
    journal = "Journal of Japan Industrial Management Association",
    issn = "0386-4812",
    publisher = "Nihon Keikei Kogakkai",
    number = "3",

    }

    TY - JOUR

    T1 - A study of learning a sparse metric matrix using l1 regularization based on supervised learning

    AU - Mikawa, Kenta

    AU - Kobayashi, Manabu

    AU - Goto, Masayuki

    PY - 2015

    Y1 - 2015

    KW - ADMM

    KW - Distance metric learning

    KW - Document classification

    KW - L1 regularization

    KW - Vector space model

    UR - http://www.scopus.com/inward/record.url?scp=84946057889&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84946057889&partnerID=8YFLogxK

    M3 - Article

    AN - SCOPUS:84946057889

    VL - 66

    SP - 230

    EP - 239

    JO - Journal of Japan Industrial Management Association

    JF - Journal of Japan Industrial Management Association

    SN - 0386-4812

    IS - 3

    ER -