Transcription factor (TF) binding site discovery is an important step in understanding eukaryotic regulatory systems. Many computational tools have been developed for this task, but their success in detecting TF motifs is still limited. We believe one of the main reasons for the low accuracy of current methods is that they do not take into account the structural aspects of TF-DNA interaction. We have previously shown that knowledge about the structural class of the TF and information about nucleosome occupancy can be used to improve motif discovery. Here we demonstrate the benefits of using information about DNA double-helical stability for motif discovery. We observe that, in general, the energy needed to separate the DNA strands is higher at TF binding sites than at random DNA sites. We then use this information to derive informative positional priors, which we incorporate into a motif discovery algorithm. When applied to yeast ChIP-chip data, the new informative priors improve the performance of the motif finder by up to 52% compared to the widely used uniform prior.
Top-scoring motifs learned by PRIORITY-U, PRIORITY-E, PRIORITY-D, PRIORITY-DE, AlignACE, MEME, MDscan, MEME_c, CONVERGE, and Kellis.
Comparison of the above motifs with the literature consensus. When looking for matches, we consider both the literature consensus and its reverse complement.
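
To illustrate what considering both strands involves, the following is a minimal Python sketch of comparing a learned motif with a literature consensus and with the reverse complement of that consensus. It is not the comparison procedure used to produce the results above: the function names (`reverse_complement`, `best_overlap`, `matches_either_strand`), the ungapped-overlap scoring, and the `min_hits` threshold are all assumptions made for illustration. Consensus strings are assumed to use IUPAC nucleotide codes.

```python
# Illustrative sketch only: the scoring rule and threshold are assumptions,
# not the criterion used in the comparison described above.

# IUPAC nucleotide codes mapped to the bases each code stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of every IUPAC code (e.g. R = A/G pairs with Y = C/T).
_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(consensus):
    """Return the reverse complement of an IUPAC consensus string."""
    return consensus.upper().translate(_COMPLEMENT)[::-1]


def compatible(a, b):
    """Two IUPAC codes are compatible if they share at least one base."""
    return bool(set(IUPAC[a]) & set(IUPAC[b]))


def best_overlap(motif, consensus):
    """Largest number of compatible positions over all ungapped offsets."""
    motif, consensus = motif.upper(), consensus.upper()
    best = 0
    for offset in range(-(len(consensus) - 1), len(motif)):
        hits = 0
        for j, code in enumerate(consensus):
            i = offset + j
            if 0 <= i < len(motif) and compatible(motif[i], code):
                hits += 1
        best = max(best, hits)
    return best


def matches_either_strand(motif, consensus, min_hits):
    """True if the motif matches the consensus or its reverse complement.

    `min_hits` is a hypothetical threshold on compatible positions.
    """
    forward = best_overlap(motif, consensus)
    reverse = best_overlap(motif, reverse_complement(consensus))
    return max(forward, reverse) >= min_hits
```

For example, a learned motif `TGAGTCA` agrees with the consensus `TGACTCA` at only six of seven positions on the forward strand, but it is identical to the reverse complement of that consensus, so `matches_either_strand("TGAGTCA", "TGACTCA", min_hits=7)` returns `True`. Checking only the forward strand would miss motifs reported on the opposite strand.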