In this work we introduce a new method for identifying protein coding regions
through their three-base periodicity (TBP). It is based on a new transform,
called the Modified Morlet Transform (MMT), which does not need to be trained
on sequence databases. We use a fixed binary mapping rule to create four binary
sequences, each of which marks the positions of one nitrogenous base in the DNA
sequence. Next, the MMT is applied at different scales to all four binary
sequences. The modulus of each normalized coefficient is projected onto the
position axis. Projection onto the scale axis reveals which scales carry more
signal energy across the positions. The projection onto the position axis
serves as the protein coding region identifier: high values of this projection
correspond to regions with TBP. We therefore threshold the projection
coefficients to exclude positions where the associated energy is low.
The performance of the proposed transform was examined on synthetic and real
DNA sequences. Preliminary results show that the MMT outperforms traditional
methods, with greater sensitivity to TBP and a better ability to discriminate
between protein coding regions.
This work forms a central part of my master's thesis.
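
The processing chain described above (binary mapping, multi-scale transform,
projection, thresholding) can be illustrated with a minimal sketch. It is not
the MMT itself: the definition of the MMT is not given here, so a standard
complex Morlet wavelet is used as a stand-in. All function names, the wavelet
parameter w0 = 6, the choice to sum the moduli over the four bases, and the
threshold fraction of 0.7 are assumptions made only for illustration.

```python
import cmath
import math

def indicator_sequences(dna):
    """Fixed binary mapping: one 0/1 sequence per base (A, C, G, T)."""
    dna = dna.upper()
    return {b: [1.0 if ch == b else 0.0 for ch in dna] for b in "ACGT"}

def morlet(t, scale, w0=6.0):
    """Standard complex Morlet wavelet (a stand-in, NOT the MMT)."""
    x = t / scale
    return cmath.exp(1j * w0 * x) * math.exp(-0.5 * x * x) / math.sqrt(scale)

def cwt_modulus(signal, scales, w0=6.0):
    """Modulus of the wavelet coefficients, indexed as [scale][position]."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [v - mean for v in signal]  # remove the DC component
    rows = []
    for s in scales:
        half = int(4 * s)  # truncate the Gaussian envelope at 4 widths
        offsets = range(-half, half + 1)
        kernel = [morlet(k, s, w0).conjugate() for k in offsets]
        row = []
        for pos in range(n):
            acc = 0j
            for k, w in zip(offsets, kernel):
                j = pos + k
                if 0 <= j < n:  # samples outside the sequence count as zero
                    acc += centered[j] * w
            row.append(abs(acc))
        rows.append(row)
    return rows

def scalogram(dna, scales):
    """Sum the moduli over the four bases (assumed) and normalize to max 1."""
    n = len(dna)
    total = [[0.0] * n for _ in scales]
    for signal in indicator_sequences(dna).values():
        mod = cwt_modulus(signal, scales)
        for i in range(len(scales)):
            for p in range(n):
                total[i][p] += mod[i][p]
    peak = max(max(row) for row in total) or 1.0
    return [[v / peak for v in row] for row in total]

def project_onto_position(sg):
    """Sum over scales: one energy value per sequence position."""
    return [sum(col) for col in zip(*sg)]

def project_onto_scale(sg):
    """Sum over positions: one energy value per scale."""
    return [sum(row) for row in sg]

def threshold(profile, frac=0.7):
    """Keep positions whose value is at least frac of the profile maximum."""
    cutoff = frac * max(profile)
    return [v >= cutoff for v in profile]
```

For this stand-in wavelet, the scale whose centre frequency matches a period of
three bases is s = 3 * w0 / (2 * pi), about 2.86 for w0 = 6, so a small set of
scales around that value is a reasonable starting point. The threshold fraction
is arbitrary and would need tuning on real sequences.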
