TY - JOUR
T1 - Shape-based alignment of genomic landscapes in multi-scale resolution
AU - Ashida, Hiroki
AU - Asai, Kiyoshi
AU - Hamada, Michiaki
PY - 2012/8
Y1 - 2012/8
N2 - Due to dramatic advances in DNA technology, quantitative measures of annotation data can now be obtained in continuous coordinates across the entire genome, allowing various heterogeneous 'genomic landscapes' to emerge. Although much effort has been devoted to comparing DNA sequences, not much attention has been given to comparing these large quantities of data comprehensively. In this article, we introduce a method for rapidly detecting local regions that show high correlations between genomic landscapes. We overcame the size problem for genome-wide data by converting the data into series of symbols and then carrying out sequence alignment. We also decomposed the oscillation of the landscape data into different frequency bands before analysis, since the real genomic landscape is a mixture of embedded and confounded biological processes working at different scales in the cell nucleus. To verify the usefulness and generality of our method, we applied our approach to well-investigated landscapes from the human genome, including several histone modifications. Furthermore, by applying our method to over 20 genomic landscapes in human and 12 in mouse, we found that DNA replication timing and the density of Alu insertions are highly correlated genome-wide in both species, even though the Alu elements have amplified independently in the two genomes. To our knowledge, this is the first method to align genomic landscapes at multiple scales according to their shape.
AB - Due to dramatic advances in DNA technology, quantitative measures of annotation data can now be obtained in continuous coordinates across the entire genome, allowing various heterogeneous 'genomic landscapes' to emerge. Although much effort has been devoted to comparing DNA sequences, not much attention has been given to comparing these large quantities of data comprehensively. In this article, we introduce a method for rapidly detecting local regions that show high correlations between genomic landscapes. We overcame the size problem for genome-wide data by converting the data into series of symbols and then carrying out sequence alignment. We also decomposed the oscillation of the landscape data into different frequency bands before analysis, since the real genomic landscape is a mixture of embedded and confounded biological processes working at different scales in the cell nucleus. To verify the usefulness and generality of our method, we applied our approach to well-investigated landscapes from the human genome, including several histone modifications. Furthermore, by applying our method to over 20 genomic landscapes in human and 12 in mouse, we found that DNA replication timing and the density of Alu insertions are highly correlated genome-wide in both species, even though the Alu elements have amplified independently in the two genomes. To our knowledge, this is the first method to align genomic landscapes at multiple scales according to their shape.
UR - http://www.scopus.com/inward/record.url?scp=84864921303&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84864921303&partnerID=8YFLogxK
U2 - 10.1093/nar/gks354
DO - 10.1093/nar/gks354
M3 - Article
C2 - 22561376
AN - SCOPUS:84864921303
SN - 0305-1048
VL - 40
SP - 6435
EP - 6448
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - 14
ER -