Using Sequence-Specific Chemical and Structural Properties of DNA to Predict Transcription Factor Binding Sites.
Bauer AL, Hlavacek WS, Unkefer PJ, Mu F (2010)
PLoS Comput Biol 6(11): e1001007. doi:10.1371/journal.pcbi.1001007
An important step in characterizing the genetic regulatory network of a cell is to identify the DNA binding sites recognized by each transcription factor (TF) protein encoded in the genome. Current computational approaches to TF binding site prediction rely exclusively on DNA sequence analysis. In this manuscript, we present a novel method called SiteSleuth, in which classifiers are trained to discriminate between true and false binding sites based on the sequence-specific chemical and structural features of DNA. According to cross-validation analysis and a comparison of computational predictions against ChIP-chip data available for the TF Fis, SiteSleuth yields fewer estimated false positives than any of the four other methods considered. A better understanding of gene regulation, which plays a central role in cellular responses to environmental changes, is key to manipulating cellular behavior for a variety of useful purposes, such as metabolic engineering applications.
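The idea of training a classifier on properties of the DNA rather than on the raw letters can be illustrated with a minimal sketch. This is not the SiteSleuth implementation: the abstract does not name the classifier or list the features, so a simple nearest-centroid classifier is swapped in here, and each base is described by just two well-known properties (the number of Watson-Crick hydrogen bonds in its base pair, and whether it is a purine). The function names and the toy training sequences are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only; NOT the authors' method or feature set.
# Watson-Crick hydrogen bonds per base pair: A-T has 2, G-C has 3.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}
# Purines (A, G) carry a two-ring base; pyrimidines (C, T) a single ring.
IS_PURINE = {"A": 1, "G": 1, "C": 0, "T": 0}


def featurize(site):
    """Map a DNA site to a flat vector of per-position properties."""
    features = []
    for base in site.upper():
        features.append(float(H_BONDS[base]))
        features.append(float(IS_PURINE[base]))
    return features


def centroid(vectors):
    """Component-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def train(true_sites, false_sites):
    """Return the (positive, negative) class centroids."""
    positive = centroid([featurize(s) for s in true_sites])
    negative = centroid([featurize(s) for s in false_sites])
    return positive, negative


def predict(model, site):
    """True if the site lies closer to the positive centroid."""
    positive, negative = model
    x = featurize(site)
    return squared_distance(x, positive) < squared_distance(x, negative)


# Hypothetical toy examples, not real binding sites for any TF.
TRUE_SITES = ["GCGCAT", "GCGCAA", "GGGCAT"]
FALSE_SITES = ["ATATTA", "TTATAT", "AATTTA"]

model = train(TRUE_SITES, FALSE_SITES)
print(predict(model, "GCGGAT"))  # candidate site not seen during training
```

Because the classifier operates on property vectors, two sites that differ in sequence but share similar chemical profiles end up close together in feature space, which is the intuition behind moving beyond pure sequence analysis.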

