Learning local languages and its application to protein α-chain identification

Takashi Yokomori, Nobuyuki Ishida, Satoshi Kobayashi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

13 Citations (Scopus)

Abstract

This paper concerns an efficient algorithm for learning in the limit of a special type of regular languages called locally testable languages from positive data, and its application to identifying the protein α-chain region in amino acid sequences. First, we present a linear time algorithm that, given a locally testable language, learns (identifies) its deterministic finite state automaton in the limit from only positive data. This provides us with a practical and efficient learning method for a specific domain of symbolic analysis. We then describe several experimental results using the learning algorithm developed above. Following a theoretical observation which strongly suggests that a certain type of amino acid sequences can be expressed by a locally testable language, we apply the learning algorithm to identifying the protein α-chain region in amino acid sequences for hemoglobin. Experimental scores show an overall success rate of 95% correct identification for positive data, and 96% for negative data.
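The idea of learning from positive data only can be illustrated with a minimal sketch for local (strictly 2-testable) languages. This is a standard textbook-style construction, not the authors' algorithm, and the class name `LocalLanguageLearner` and the toy strings are assumptions made for illustration. The hypothesis is just the set of observed first symbols, last symbols, and adjacent symbol pairs; each update is linear in the length of the sample.

```python
from dataclasses import dataclass, field


@dataclass
class LocalLanguageLearner:
    """Hypothetical learner for local (strictly 2-testable) languages."""

    starts: set = field(default_factory=set)  # symbols seen at the start of a sample
    ends: set = field(default_factory=set)    # symbols seen at the end of a sample
    pairs: set = field(default_factory=set)   # adjacent symbol pairs seen in samples
    accepts_empty: bool = False                # whether the empty string was observed

    def update(self, sample: str) -> None:
        """Fold one positive example into the hypothesis (linear in len(sample))."""
        if not sample:
            self.accepts_empty = True
            return
        self.starts.add(sample[0])
        self.ends.add(sample[-1])
        self.pairs.update(zip(sample, sample[1:]))

    def accepts(self, s: str) -> bool:
        """Check s against the current hypothesis."""
        if not s:
            return self.accepts_empty
        return (
            s[0] in self.starts
            and s[-1] in self.ends
            and all(p in self.pairs for p in zip(s, s[1:]))
        )


# Toy positive data (assumed for illustration, not amino acid sequences).
learner = LocalLanguageLearner()
for example in ["ab", "abab", "ababab"]:
    learner.update(example)
```

After these three examples the hypothesis generalizes to longer strings such as `abababab`, while rejecting strings that start with an unseen symbol or contain an unseen pair. The hypothesis only grows as more positive examples arrive, which is what makes learning in the limit from positive data possible for this restricted class.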

Original language: English
Title of host publication: Proceedings of the Hawaii International Conference on System Sciences
Editors: Jay F. Nunamaker, Ralph H. Sprague Jr.
Place of Publication: Los Alamitos, CA, United States
Publisher: IEEE
Pages: 113-122
Number of pages: 10
Volume: 5
ISBN (Print): 0818650907
Publication status: Published - 1995
Externally published: Yes
Event: Proceedings of the 27th Hawaii International Conference on System Sciences (HICSS-27). Part 4 (of 5) - Wailea, HI, USA
Duration: 1994 Jan 4 to 1994 Jan 7

Other

Other: Proceedings of the 27th Hawaii International Conference on System Sciences (HICSS-27). Part 4 (of 5)
City: Wailea, HI, USA
Period: 94/1/4 to 94/1/7

Fingerprint

Amino acids
Proteins
Learning algorithms
Formal languages
Hemoglobin
Finite automata

ASJC Scopus subject areas

  • Software
  • Industrial and Manufacturing Engineering

Cite this

Yokomori, T., Ishida, N., & Kobayashi, S. (1995). Learning local languages and its application to protein α-chain identification. In J. F. Nunamaker & R. H. Sprague Jr. (Eds.), Proceedings of the Hawaii International Conference on System Sciences (Vol. 5, pp. 113-122). Los Alamitos, CA, United States: IEEE.

@inproceedings{70f7d862d68747b6906dd97d7ecd7450,
title = "Learning local languages and its application to protein α-chain identification",
abstract = "This paper concerns an efficient algorithm for learning in the limit of a special type of regular languages called locally testable languages from positive data, and its application to identifying the protein α-chain region in amino acid sequences. First, we present a linear time algorithm that, given a locally testable language, learns (identifies) its deterministic finite state automaton in the limit from only positive data. This provides us with a practical and efficient learning method for a specific domain of symbolic analysis. We then describe several experimental results using the learning algorithm developed above. Following a theoretical observation which strongly suggests that a certain type of amino acid sequences can be expressed by a locally testable language, we apply the learning algorithm to identifying the protein α-chain region in amino acid sequences for hemoglobin. Experimental scores show an overall success rate of 95{\%} correct identification for positive data, and 96{\%} for negative data.",
author = "Takashi Yokomori and Nobuyuki Ishida and Satoshi Kobayashi",
year = "1995",
language = "English",
isbn = "0818650907",
volume = "5",
pages = "113--122",
editor = "Nunamaker, {Jay F.} and Sprague, Jr., {Ralph H.}",
booktitle = "Proceedings of the Hawaii International Conference on System Sciences",
publisher = "IEEE",
address = "Los Alamitos, CA, United States",
}

TY - GEN
T1 - Learning local languages and its application to protein α-chain identification
AU - Yokomori, Takashi
AU - Ishida, Nobuyuki
AU - Kobayashi, Satoshi
PY - 1995
Y1 - 1995

AB - This paper concerns an efficient algorithm for learning in the limit of a special type of regular languages called locally testable languages from positive data, and its application to identifying the protein α-chain region in amino acid sequences. First, we present a linear time algorithm that, given a locally testable language, learns (identifies) its deterministic finite state automaton in the limit from only positive data. This provides us with a practical and efficient learning method for a specific domain of symbolic analysis. We then describe several experimental results using the learning algorithm developed above. Following a theoretical observation which strongly suggests that a certain type of amino acid sequences can be expressed by a locally testable language, we apply the learning algorithm to identifying the protein α-chain region in amino acid sequences for hemoglobin. Experimental scores show an overall success rate of 95% correct identification for positive data, and 96% for negative data.

UR - http://www.scopus.com/inward/record.url?scp=0028932794&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0028932794&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:0028932794
SN - 0818650907
VL - 5
SP - 113
EP - 122
BT - Proceedings of the Hawaii International Conference on System Sciences
A2 - Nunamaker, Jay F.
A2 - Sprague, Ralph H., Jr.
PB - IEEE
CY - Los Alamitos, CA, United States
ER -