TY - JOUR
T1 - Diversification of CpG-Island Promoters Revealed by Comparative Analysis Between Human and Rhesus Monkey Genomes
AU - Aoto, Saki
AU - Fushimi, Mayu
AU - Yura, Kei
AU - Okamura, Kohji
N1 - Funding Information:
MF was supported by Project Encouraging Science Undergraduates by MEXT and Ochanomizu University when she was an undergraduate student. The computing resources were mainly provided by a cluster of Hitachi HA8000/RS210 at the Center for Regenerative Medicine, National Research Institute for Child Health and Development.
Funding Information:
This work was supported by NCCHD (Grant Number 2019C-15 to SA) and KAKENHI (Grant Number 23770273 to KO).
Publisher Copyright:
© 2020, The Author(s).
PY - 2020/8/1
Y1 - 2020/8/1
N2 - While CpG dinucleotides are significantly reduced compared to other dinucleotides in mammalian genomes, they can congregate and form CpG islands, which localize around the 5ʹ regions of genes, where they function as promoters. CpG-island promoters are generally unmethylated and are often found in housekeeping genes. However, their nucleotide sequences and existence per se are not conserved between humans and mice, which may be due to evolutionary gain and loss of the regulatory regions. In this study, human and rhesus monkey genomes, with moderately conserved sequences, were compared at base resolution. Using transcription start site data, we first validated our methods’ ability to identify orthologous promoters and indicated a limitation of using the 5ʹ end of curated gene models, such as NCBI RefSeq, as their transcription start sites. We found that, in addition to deamination mutations, insertions and deletions of bases, repeats, and long fragments contributed to the mutations of CpG dinucleotides. We also observed that the G + C contents tended to change in CpG-poor environments, while CpG content was altered in G + C-rich environments. While loss of CpG islands can be caused by gradual decreases in CpG sites, gain of these islands appears to require two distinct nucleotide-altering steps. Taken together, our findings provide novel insights into the process of acquisition and diversification of CpG-island promoters in vertebrates.
AB - While CpG dinucleotides are significantly reduced compared to other dinucleotides in mammalian genomes, they can congregate and form CpG islands, which localize around the 5ʹ regions of genes, where they function as promoters. CpG-island promoters are generally unmethylated and are often found in housekeeping genes. However, their nucleotide sequences and existence per se are not conserved between humans and mice, which may be due to evolutionary gain and loss of the regulatory regions. In this study, human and rhesus monkey genomes, with moderately conserved sequences, were compared at base resolution. Using transcription start site data, we first validated our methods’ ability to identify orthologous promoters and indicated a limitation of using the 5ʹ end of curated gene models, such as NCBI RefSeq, as their transcription start sites. We found that, in addition to deamination mutations, insertions and deletions of bases, repeats, and long fragments contributed to the mutations of CpG dinucleotides. We also observed that the G + C contents tended to change in CpG-poor environments, while CpG content was altered in G + C-rich environments. While loss of CpG islands can be caused by gradual decreases in CpG sites, gain of these islands appears to require two distinct nucleotide-altering steps. Taken together, our findings provide novel insights into the process of acquisition and diversification of CpG-island promoters in vertebrates.
UR - http://www.scopus.com/inward/record.url?scp=85087722549&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85087722549&partnerID=8YFLogxK
U2 - 10.1007/s00335-020-09844-2
DO - 10.1007/s00335-020-09844-2
M3 - Article
C2 - 32647942
AN - SCOPUS:85087722549
SN - 0938-8990
VL - 31
SP - 240
EP - 251
JO - Mammalian Genome
JF - Mammalian Genome
IS - 7-8
ER -