Using DNA duplex stability information to discover transcription factor binding sites
Raluca Gordân, Alexander J. Hartemink

Abstract

Transcription factor (TF) binding site discovery is an important step in understanding eukaryotic regulatory systems. Many computational tools have already been developed, but their success in detecting TF motifs is still limited. We believe one of the main reasons for the low accuracy of current methods is that they do not take into account the structural aspects of TF-DNA interaction. We have previously shown that knowledge about the structural class of the TF and information about nucleosome occupancy can be used to improve motif discovery. Here we demonstrate the benefits of using information about DNA double-helical stability for motif discovery. We observe that, in general, the energy needed to separate the DNA strands is higher at TF binding sites than at random DNA sites. We then use this observation to derive informative positional priors, which we incorporate into a motif discovery algorithm. When applied to yeast ChIP-chip data, the new informative priors improve the performance of the motif finder by up to 52% compared to the widely used uniform prior.


Supplementary files

Top scoring motifs learned by PRIORITY-U, PRIORITY-E, PRIORITY-D, PRIORITY-DE, AlignACE, MEME, MDscan, MEME_c, CONVERGE, and Kellis.

Comparison of the above motifs with the literature consensus. Note that when looking for matches, we consider both the literature consensus and its reverse complement.
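Checking a motif against both a consensus and its reverse complement can be sketched as follows. This is only an illustration: the function names are hypothetical, and exact string equality stands in for whatever similarity criterion was actually used in the comparison, which is not described here. The complement table covers the standard IUPAC nucleotide codes, since literature consensus sequences often contain degenerate symbols.

```python
# Complement table for the standard IUPAC nucleotide codes
# (e.g. R = A/G pairs with Y = C/T, K = G/T pairs with M = A/C).
_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(consensus):
    """Return the reverse complement of an IUPAC consensus string."""
    return consensus.upper().translate(_COMPLEMENT)[::-1]


def matches_either_strand(motif, consensus):
    """Illustrative check only: True if the motif equals the consensus
    or its reverse complement. Exact equality is an assumption made for
    this sketch, not the matching rule used for the comparison above."""
    motif = motif.upper()
    return motif in (consensus.upper(), reverse_complement(consensus))


if __name__ == "__main__":
    print(reverse_complement("TGACTCA"))
    print(matches_either_strand("TGAGTCA", "TGACTCA"))
```

Because a TF binding site can lie on either DNA strand, a learned motif that reads as the reverse complement of the literature consensus describes the same site, which is why both orientations are checked.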


Supplementary figures