SAHG, a comprehensive database of predicted structures of all human proteins

Chie Motono, Junichi Nakata, Ryotaro Koike, Kana Shimizu, Matsuyuki Shirota, Takayuki Amemiya, Kentaro Tomii, Nozomi Nagano, Naofumi Sakaya, Kiyotaka Misoo, Miwa Sato, Akinori Kidera, Hidekazu Hiroaki, Tsuyoshi Shirai, Kengo Kinoshita, Tamotsu Noguchi, Motonori Ota

Research output: Contribution to journal › Article

9 Citations (Scopus)

Abstract

Most proteins from higher organisms are known to be multi-domain proteins and contain substantial numbers of intrinsically disordered (ID) regions. To analyse such protein sequences, for instance those from human, we developed a special protein-structure-prediction pipeline and accumulated the products in the Structure Atlas of Human Genome (SAHG) database at http://bird.cbrc.jp/sahg. With the pipeline, human proteins were examined by local alignment methods (BLAST, PSI-BLAST and Smith-Waterman profile-profile alignment), global-local alignment methods (FORTE) and prediction tools for ID regions (POODLE-S) and homology modeling (MODELLER). Conformational changes of protein models upon ligand-binding were predicted by simultaneous modeling using templates of apo and holo forms. When there were no suitable templates for holo forms and the apo models were accurate, we prepared holo models using prediction methods for ligand-binding (eF-seek) and conformational change (the elastic network model and the linear response theory). Models are displayed as animated images. As of July 2010, SAHG contains 42 581 protein-domain models in approximately 24 900 unique human protein sequences from the RefSeq database. Annotation of models with functional information and links to other databases such as EzCatDB, InterPro or HPRD are also provided to facilitate understanding of protein structure-function relationships.
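The choice between the two routes to a holo model described in the abstract can be sketched as a small decision function. This is only an illustration of the stated logic, not the SAHG implementation: the class name, field names and returned labels are hypothetical, and the abstract does not say how "suitable" or "accurate" were judged, so both are treated here as precomputed flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainEvidence:
    """Hypothetical summary of what is known about one protein domain."""
    has_apo_template: bool    # a suitable ligand-free template exists
    has_holo_template: bool   # a suitable ligand-bound template exists
    apo_model_accurate: bool  # the apo model passed some accuracy check


def holo_strategy(ev: DomainEvidence) -> str:
    """Return a label for the route to a holo model, following the abstract."""
    # Route 1: templates of both forms exist, so both are modeled together.
    if ev.has_apo_template and ev.has_holo_template:
        return "simultaneous apo/holo template modeling"
    # Route 2: no holo template, but the apo model is trusted, so the
    # binding site is predicted (eF-seek) and the apo model is deformed
    # (elastic network model with linear response theory).
    if not ev.has_holo_template and ev.apo_model_accurate:
        return "ligand-site prediction plus predicted conformational change"
    # Anything else: the abstract describes no holo model for this case.
    return "no holo model"


if __name__ == "__main__":
    print(holo_strategy(DomainEvidence(True, True, True)))
    print(holo_strategy(DomainEvidence(True, False, True)))
    print(holo_strategy(DomainEvidence(True, False, False)))
```

The fallback branch is an assumption: the abstract only states when holo models were prepared, so every other combination is mapped here to "no holo model".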

Original language: English
Journal: Nucleic Acids Research
Volume: 39
Issue number: SUPPL. 1
DOI: https://doi.org/10.1093/nar/gkq1057
Publication status: Published - Jan 2011
Externally published: Yes

Fingerprint

Atlases
Human Genome
Databases
Proteins
Ligands
Protein Databases
Protein Sequence Analysis
Birds
Linear Models
Protein Domains

ASJC Scopus subject areas

  • Genetics

Cite this

Motono, C., Nakata, J., Koike, R., Shimizu, K., Shirota, M., Amemiya, T., ... Ota, M. (2011). SAHG, a comprehensive database of predicted structures of all human proteins. Nucleic Acids Research, 39(SUPPL. 1). https://doi.org/10.1093/nar/gkq1057

@article{1245dbebada74290b6cb78797f3c8db5,
title = "SAHG, a comprehensive database of predicted structures of all human proteins",
abstract = "Most proteins from higher organisms are known to be multi-domain proteins and contain substantial numbers of intrinsically disordered (ID) regions. To analyse such protein sequences, for instance those from human, we developed a special protein-structure-prediction pipeline and accumulated the products in the Structure Atlas of Human Genome (SAHG) database at http://bird.cbrc.jp/sahg. With the pipeline, human proteins were examined by local alignment methods (BLAST, PSI-BLAST and Smith-Waterman profile-profile alignment), global-local alignment methods (FORTE) and prediction tools for ID regions (POODLE-S) and homology modeling (MODELLER). Conformational changes of protein models upon ligand-binding were predicted by simultaneous modeling using templates of apo and holo forms. When there were no suitable templates for holo forms and the apo models were accurate, we prepared holo models using prediction methods for ligand-binding (eF-seek) and conformational change (the elastic network model and the linear response theory). Models are displayed as animated images. As of July 2010, SAHG contains 42 581 protein-domain models in approximately 24 900 unique human protein sequences from the RefSeq database. Annotation of models with functional information and links to other databases such as EzCatDB, InterPro or HPRD are also provided to facilitate understanding of protein structure-function relationships.",
author = "Chie Motono and Junichi Nakata and Ryotaro Koike and Kana Shimizu and Matsuyuki Shirota and Takayuki Amemiya and Kentaro Tomii and Nozomi Nagano and Naofumi Sakaya and Kiyotaka Misoo and Miwa Sato and Akinori Kidera and Hidekazu Hiroaki and Tsuyoshi Shirai and Kengo Kinoshita and Tamotsu Noguchi and Motonori Ota",
year = "2011",
month = "1",
doi = "10.1093/nar/gkq1057",
language = "English",
volume = "39",
journal = "Nucleic Acids Research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "SUPPL. 1",

}

TY - JOUR

T1 - SAHG, a comprehensive database of predicted structures of all human proteins

AU - Motono, Chie

AU - Nakata, Junichi

AU - Koike, Ryotaro

AU - Shimizu, Kana

AU - Shirota, Matsuyuki

AU - Amemiya, Takayuki

AU - Tomii, Kentaro

AU - Nagano, Nozomi

AU - Sakaya, Naofumi

AU - Misoo, Kiyotaka

AU - Sato, Miwa

AU - Kidera, Akinori

AU - Hiroaki, Hidekazu

AU - Shirai, Tsuyoshi

AU - Kinoshita, Kengo

AU - Noguchi, Tamotsu

AU - Ota, Motonori

PY - 2011/1

Y1 - 2011/1

N2 - Most proteins from higher organisms are known to be multi-domain proteins and contain substantial numbers of intrinsically disordered (ID) regions. To analyse such protein sequences, for instance those from human, we developed a special protein-structure-prediction pipeline and accumulated the products in the Structure Atlas of Human Genome (SAHG) database at http://bird.cbrc.jp/sahg. With the pipeline, human proteins were examined by local alignment methods (BLAST, PSI-BLAST and Smith-Waterman profile-profile alignment), global-local alignment methods (FORTE) and prediction tools for ID regions (POODLE-S) and homology modeling (MODELLER). Conformational changes of protein models upon ligand-binding were predicted by simultaneous modeling using templates of apo and holo forms. When there were no suitable templates for holo forms and the apo models were accurate, we prepared holo models using prediction methods for ligand-binding (eF-seek) and conformational change (the elastic network model and the linear response theory). Models are displayed as animated images. As of July 2010, SAHG contains 42 581 protein-domain models in approximately 24 900 unique human protein sequences from the RefSeq database. Annotation of models with functional information and links to other databases such as EzCatDB, InterPro or HPRD are also provided to facilitate understanding of protein structure-function relationships.

UR - http://www.scopus.com/inward/record.url?scp=78651314041&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78651314041&partnerID=8YFLogxK

U2 - 10.1093/nar/gkq1057

DO - 10.1093/nar/gkq1057

M3 - Article

C2 - 21051360

AN - SCOPUS:78651314041

VL - 39

JO - Nucleic Acids Research

JF - Nucleic Acids Research

SN - 0305-1048

IS - SUPPL. 1

ER -