Criteria for gene identification and features of genome organization

Analysis of 6.5 Mb of DNA sequence from human chromosome 21

Dobromir Slavov, Masahira Hattori, Yoshiyuki Sakaki, André Rosenthal, Nobuyoshi Shimizu, Shinsei Minoshima, Jun Kudoh, Marie Laure Yaspo, Juliane Ramser, Richard Reinhardt, Candy Reimer, Kevin Clancy, Alla Rynditch, Katheleen Gardiner

Research output: Contribution to journal › Article

17 Citations (Scopus)

Abstract

To establish criteria for and the limitations of novel gene identification, to identify novel genes of potential relevance to Down syndrome and to investigate features of genome organization, 6.5 Mb of DNA sequence, dispersed throughout the long arm of human chromosome 21, have been annotated computationally and experimentally. Exon prediction with four programs, protein and EST database searches, two-sequence BLAST searches and CpG island characterization identified 41 genes with known or new protein homologies. Features of these genes suggested criteria for prediction of novel genes (those lacking any protein homology) with the following characteristics: (1) exon+EST genes: genes with excellent patterns of predicted exons and one or more matches in dbEST; (2) exon-EST genes: genes with good patterns of predicted exons and no matches in dbEST; (3) EST-exon genes: genes without any patterns of reliable exon prediction but with matches in dbEST; and (4) isolated CpG island genes: genes consisting of strong CpG islands that are apparently unique sequences and found in regions lacking any consistent exon predictions within >50 kb. In total, 41 novel gene models were predicted, and for a subset of these, RT-PCR experiments helped to verify and refine the models, and were used to assess expression in early development and in adult brain regions of potential relevance to Down syndrome. Results suggest generally low and/or restricted patterns of expression, and also reveal examples of complex alternative processing, especially in brain, that may have important implications for regulation of protein function. Analysis of complete gene structures of the known genes identified a number of very large introns, a number of very short intergenic distances, and at least one potentially bi-directional promoter. At least 3/4 of known genes and 1/2 of predicted genes are associated with CpG islands. For novel genes, three cases of overlapping genes are predicted. Results of these analyses illustrate some of the complexities inherent in mammalian genome organization and some of the limitations of current sequence analysis technologies. They also doubled the number of potential genes within the region. © 2000 Elsevier Science B.V. All rights reserved.
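
The four novel-gene categories in the abstract can be read as a small decision rule over the available evidence. The following is a minimal sketch of that reading, not the authors' actual pipeline: the `Evidence` fields, the three-level grading of exon patterns ("excellent", "good", "none") and the `UNCLASSIFIED` fallback are assumptions introduced here for illustration. The abstract does not say how combinations outside the four listed categories were handled, so the sketch leaves them unclassified.

```python
from dataclasses import dataclass
from enum import Enum


class NovelGeneClass(Enum):
    EXON_PLUS_EST = "exon+EST"
    EXON_MINUS_EST = "exon-EST"
    EST_MINUS_EXON = "EST-exon"
    ISOLATED_CPG_ISLAND = "isolated CpG island"
    UNCLASSIFIED = "unclassified"  # assumption: not a category in the abstract


@dataclass
class Evidence:
    """Hypothetical evidence summary for one candidate gene model."""
    exon_pattern: str                   # assumed grading: "excellent", "good" or "none"
    est_matches: int                    # number of matches in dbEST
    strong_unique_cpg_island: bool      # strong CpG island, apparently unique sequence
    kb_without_exon_predictions: float  # size of the region lacking consistent exon predictions


def classify(e: Evidence) -> NovelGeneClass:
    # (1) excellent exon pattern plus at least one dbEST match
    if e.exon_pattern == "excellent" and e.est_matches >= 1:
        return NovelGeneClass.EXON_PLUS_EST
    # (2) good exon pattern, no dbEST match
    if e.exon_pattern == "good" and e.est_matches == 0:
        return NovelGeneClass.EXON_MINUS_EST
    # (3) no reliable exon pattern, but dbEST matches
    if e.exon_pattern == "none" and e.est_matches >= 1:
        return NovelGeneClass.EST_MINUS_EXON
    # (4) strong, unique CpG island in a region of >50 kb without exon predictions
    if (e.exon_pattern == "none" and e.strong_unique_cpg_island
            and e.kb_without_exon_predictions > 50):
        return NovelGeneClass.ISOLATED_CPG_ISLAND
    # Any other combination is not covered by the four categories.
    return NovelGeneClass.UNCLASSIFIED


if __name__ == "__main__":
    candidate = Evidence("none", 3, False, 0.0)
    print(classify(candidate).value)
```

Checking the EST-exon rule before the CpG island rule is a design choice of this sketch: a candidate with dbEST support is assigned to the EST-based category even if it also sits on a CpG island.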

Original language: English
Pages (from-to): 215-232
Number of pages: 18
Journal: Gene
Volume: 247
Issue number: 1-2
DOI: https://doi.org/10.1016/S0378-1119(00)00089-5
Publication status: Published - 2000 Apr 18
Externally published: Yes

Fingerprint

Chromosomes, Human, Pair 21
Human Chromosomes
Genome
Genes
Exons
CpG Islands
Expressed Sequence Tags
Down Syndrome
Overlapping Genes

Keywords

  • Down syndrome
  • Gene identification
  • Genome organization
  • Human chromosome 21
  • Sequence analysis

ASJC Scopus subject areas

  • Genetics

Cite this

Slavov, D, Hattori, M, Sakaki, Y, Rosenthal, A, Shimizu, N, Minoshima, S, Kudoh, J, Yaspo, ML, Ramser, J, Reinhardt, R, Reimer, C, Clancy, K, Rynditch, A & Gardiner, K 2000, 'Criteria for gene identification and features of genome organization: Analysis of 6.5 Mb of DNA sequence from human chromosome 21', Gene, vol. 247, no. 1-2, pp. 215-232. https://doi.org/10.1016/S0378-1119(00)00089-5
@article{68dac3d794624633a85e9efcca5997eb,
title = "Criteria for gene identification and features of genome organization: Analysis of 6.5 Mb of DNA sequence from human chromosome 21",
keywords = "Down syndrome, Gene identification, Genome organization, Human chromosome 21, Sequence analysis",
author = "Dobromir Slavov and Masahira Hattori and Yoshiyuki Sakaki and Andr{\'e} Rosenthal and Nobuyoshi Shimizu and Shinsei Minoshima and Jun Kudoh and Yaspo, {Marie Laure} and Juliane Ramser and Richard Reinhardt and Candy Reimer and Kevin Clancy and Alla Rynditch and Katheleen Gardiner",
year = "2000",
month = "4",
day = "18",
doi = "10.1016/S0378-1119(00)00089-5",
language = "English",
volume = "247",
pages = "215--232",
journal = "Gene",
issn = "0378-1119",
publisher = "Elsevier",
number = "1-2",

}

TY - JOUR

T1 - Criteria for gene identification and features of genome organization

T2 - Analysis of 6.5 Mb of DNA sequence from human chromosome 21

AU - Slavov, Dobromir

AU - Hattori, Masahira

AU - Sakaki, Yoshiyuki

AU - Rosenthal, André

AU - Shimizu, Nobuyoshi

AU - Minoshima, Shinsei

AU - Kudoh, Jun

AU - Yaspo, Marie Laure

AU - Ramser, Juliane

AU - Reinhardt, Richard

AU - Reimer, Candy

AU - Clancy, Kevin

AU - Rynditch, Alla

AU - Gardiner, Katheleen

PY - 2000/4/18

Y1 - 2000/4/18

KW - Down syndrome

KW - Gene identification

KW - Genome organization

KW - Human chromosome 21

KW - Sequence analysis

UR - http://www.scopus.com/inward/record.url?scp=0034681998&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0034681998&partnerID=8YFLogxK

U2 - 10.1016/S0378-1119(00)00089-5

DO - 10.1016/S0378-1119(00)00089-5

M3 - Article

VL - 247

SP - 215

EP - 232

JO - Gene

JF - Gene

SN - 0378-1119

IS - 1-2

ER -