An updated functional annotation of protein-coding genes in the cucumber genome

Hongtao Song, Kui Lin, Takayuki Furuzuki, Erli Pang

Research output: Contribution to journal (Article)

Abstract

Background: Although the cucumber reference genome and its annotation were published several years ago, the functional annotation of predicted genes, particularly protein-coding genes, still requires further improvement. In general, accurately determining orthologous relationships between genes allows for better and more robust functional assignments of predicted genes. As one of the most reliable strategies, the determination of collinearity information may facilitate reliable orthology inferences among genes from multiple related genomes. Currently, the identification of collinear segments has mainly been based on conservation of gene order and orientation. Over the course of plant genome evolution, various evolutionary events have disrupted or distorted the order of genes along chromosomes, making it difficult to use those genes as genome-wide markers for plant genome comparisons.
Results: Using the localized LASTZ/MULTIZ analysis pipeline, we aligned 15 genomes, including cucumber and other related angiosperm plants, and identified a set of genomic segments that are short in length, stable in structure, uniform in distribution and highly conserved across all 15 plants. Compared with protein-coding genes, these conserved segments were more suitable for use as genomic markers for detecting collinear segments among distantly divergent plants. Guided by this set of identified collinear genomic segments, we inferred 94,486 orthologous protein-coding gene pairs (OPPs) between cucumber and 14 other angiosperm species, which were used as proxies for transferring functional terms to cucumber genes from the annotations of the other 14 genomes. In total, 10,885 protein-coding genes were assigned Gene Ontology (GO) terms, which was nearly 1,300 more than the results collected in the UniProt proteomic database. Our results showed that annotation accuracy would be improved compared with other existing approaches.
Conclusions: In this study, we provided an alternative resource for the functional annotation of predicted cucumber protein-coding genes, accessible from http://cmb.bnu.edu.cn/functional_annotation, which we expect will be beneficial for biological studies of cucumber. Meanwhile, using the cucumber reference genome as a case study, we presented an efficient strategy for transferring gene functional information from previously well-characterized protein-coding genes in model species to newly sequenced or “non-model” plant species.
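
The transfer step described in the Results can be illustrated with a minimal sketch. It is not the authors' pipeline: the gene identifiers, the toy annotations and the rule of taking the union of GO terms over all orthologs are assumptions made only for illustration, since the abstract does not state how terms from several orthologs are combined.

```python
from collections import defaultdict


def transfer_go_terms(ortholog_pairs, source_annotations):
    """Give each cucumber gene the union of GO terms found on its orthologs.

    ortholog_pairs: iterable of (cucumber_gene, ortholog_gene) tuples.
    source_annotations: dict mapping ortholog_gene -> set of GO term IDs.
    The union rule is an assumption, not the published method.
    """
    transferred = defaultdict(set)
    for cucumber_gene, ortholog in ortholog_pairs:
        # Orthologs with no known annotation contribute nothing.
        transferred[cucumber_gene] |= source_annotations.get(ortholog, set())
    # Keep only cucumber genes that actually received at least one term.
    return {gene: terms for gene, terms in transferred.items() if terms}


# Hypothetical gene identifiers; the GO IDs are used purely as example labels.
ortholog_pairs = [
    ("CsGene1", "AtGeneA"),
    ("CsGene1", "OsGeneB"),
    ("CsGene2", "AtGeneC"),
    ("CsGene3", "OsGeneD"),  # ortholog without any annotation
]
source_annotations = {
    "AtGeneA": {"GO:0003677"},
    "OsGeneB": {"GO:0003677", "GO:0006355"},
    "AtGeneC": {"GO:0005515"},
}

annotated = transfer_go_terms(ortholog_pairs, source_annotations)
for gene in sorted(annotated):
    print(gene, sorted(annotated[gene]))
```

In this toy input, a cucumber gene whose only ortholog is unannotated receives no terms and is left out of the result, which mirrors why only a subset of genes ends up with GO assignments.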

Original language: English
Article number: 325
Journal: Frontiers in Plant Science
Volume: 9
DOI: 10.3389/fpls.2018.00325
Publication status: Published - 2018 Mar 15

Fingerprint

cucumbers
genome
genes
proteins
genomics
proteomics
Angiospermae

Keywords

  • Collinear segments
  • Cucumber
  • Gene functional annotation
  • Orthology
  • Protein-coding gene

ASJC Scopus subject areas

  • Plant Science

Cite this

An updated functional annotation of protein-coding genes in the cucumber genome. / Song, Hongtao; Lin, Kui; Furuzuki, Takayuki; Pang, Erli.

In: Frontiers in Plant Science, Vol. 9, 325, 15.03.2018.


@article{d01aa5df38e84d1abd1c4f09766cf131,
title = "An updated functional annotation of protein-coding genes in the cucumber genome",
keywords = "Collinear segments, Cucumber, Gene functional annotation, Orthology, Protein-coding gene",
author = "Hongtao Song and Kui Lin and Takayuki Furuzuki and Erli Pang",
year = "2018",
month = "3",
day = "15",
doi = "10.3389/fpls.2018.00325",
language = "English",
volume = "9",
journal = "Frontiers in Plant Science",
issn = "1664-462X",
publisher = "Frontiers Media S. A.",

}

TY - JOUR

T1 - An updated functional annotation of protein-coding genes in the cucumber genome

AU - Song, Hongtao

AU - Lin, Kui

AU - Furuzuki, Takayuki

AU - Pang, Erli

PY - 2018/3/15

Y1 - 2018/3/15

KW - Collinear segments

KW - Cucumber

KW - Gene functional annotation

KW - Orthology

KW - Protein-coding gene

UR - http://www.scopus.com/inward/record.url?scp=85045440026&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85045440026&partnerID=8YFLogxK

U2 - 10.3389/fpls.2018.00325

DO - 10.3389/fpls.2018.00325

M3 - Article

VL - 9

JO - Frontiers in Plant Science

JF - Frontiers in Plant Science

SN - 1664-462X

M1 - 325

ER -