Abstract
The presence of similar patterns in regulatory sequences may help identify co-regulated genes or infer regulatory modules. By modelling pattern occurrences in regulatory regions with Poisson statistics, this paper presents a distance measure based on log likelihood ratio statistics for calculating pair-wise similarities between regulatory sequences. We employed it within three clustering algorithms: hierarchical clustering, the Self-Organising Map, and a self-adaptive neural network. The results indicate that, in comparison to traditional clustering algorithms, incorporating the log likelihood ratio statistics-based distance into the learning process may offer considerable improvements in the regulatory sequence-based classification of genes.
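The abstract does not state the paper's formula, so the following is only a minimal sketch of one standard way to build a distance from Poisson log likelihood ratio statistics. It assumes each sequence is summarised by a vector of pattern counts and a length, tests per pattern whether the two sequences share a single occurrence rate, and sums the resulting statistics. The function name `poisson_llr_distance`, the length arguments, and the summation over patterns are illustrative assumptions, not the authors' method.

```python
import math


def poisson_llr_distance(counts_a, counts_b, len_a, len_b):
    """Hypothetical Poisson log likelihood ratio distance between two sequences.

    counts_a, counts_b: occurrence counts of the same patterns in each sequence.
    len_a, len_b: sequence lengths, used as Poisson exposures (assumption).
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("count vectors must cover the same patterns")
    if len_a <= 0 or len_b <= 0:
        raise ValueError("sequence lengths must be positive")

    total = 0.0
    for x_a, x_b in zip(counts_a, counts_b):
        # Null hypothesis: both sequences share one pooled occurrence rate.
        pooled_rate = (x_a + x_b) / (len_a + len_b)
        if pooled_rate == 0:
            continue  # pattern absent from both sequences: contributes nothing
        for x, n in ((x_a, len_a), (x_b, len_b)):
            if x > 0:  # the term x * log(x / expected) tends to 0 as x -> 0
                total += x * math.log(x / (n * pooled_rate))
    # Twice the log likelihood ratio; zero when the observed rates are equal.
    return 2.0 * total


if __name__ == "__main__":
    # Two toy sequences of length 100 with counts for three patterns.
    print(poisson_llr_distance([4, 0, 7], [4, 1, 2], 100, 100))
```

A pair-wise matrix of such values could then be passed to any clustering routine that accepts precomputed distances. Note that this statistic is symmetric and non-negative but is not guaranteed to satisfy the triangle inequality.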
Original language | English |
---|---|
Pages (from-to) | 141-157 |
Number of pages | 17 |
Journal | International Journal of Computational Biology and Drug Design |
Volume | 1 |
Issue number | 2 |
Publication status | Published - 2008 Jan 1 |
Keywords
- Poisson distribution
- hierarchical clustering
- log likelihood ratio statistics
- neural networks
- regulatory sequence
ASJC Scopus subject areas
- Drug Discovery
- Computer Science Applications