Motivation: Transcription factors (TFs) are gene regulatory proteins that are essential for effective regulation of the transcriptional machinery. It is still a challenging bioinformatic task to identify, in a set of genes, those functionally important TFs that explain observed co-expression patterns. Despite the rich literature on the identification of over-represented TF binding sites (TFBSs) in the promoter regions of a gene set, the performance of conventional methods strongly depends on the chosen random background set and on other parameters that have to be specified for each gene set. In this study, we present a new method, TF-Spiker, that exploits the distributions of individual TFBSs in the promoter regions of co-expressed genes to identify functionally important TFs for those genes.
Results: We tested the effectiveness of TF-Spiker using two sets of co-expressed genes, a breast cancer gene set and an NF-kappaB gene set. Our results suggest that the TFs that were found by TF-Spiker play functional roles in the regulatory syntax of these gene sets. To further analyze the results of TF-Spiker, we made a cross comparison between TF-Spiker and two existing conventional methods. TF-Spiker identified new TFs in both datasets, which were not found by either of the conventional methods although they are functionally important in the context of breast cancer or the NF-kappaB pathway.
Please use the following link to get the supplementary data used in the paper: [pdf]
Global Count Matrix
- Global Count Matrix: [.zip archive]
Breast Cancer data set
NF-kappaB data set
- Executable jar archive: [jar]
- This program takes a gene list and a TFBS-based count matrix as input and outputs a list of significant TRANSFAC PWMs along with their connectivity degrees.
- Usage: java -jar TFSpiker.jar [gene-list-filename] [TFBS-based-count-matrix] [output-filename]
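As an illustration of the usage line above, the following minimal Java sketch assembles the three positional arguments in the documented order and, if the jar is present in the working directory, launches it. The class name `TFSpikerCommand` and the file names `genes.txt`, `count_matrix.txt` and `results.txt` are hypothetical placeholders, not names shipped with the tool; the expected input file formats are not described here and are not assumed.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class TFSpikerCommand {

    // Builds the command in the order given by the usage line:
    // java -jar TFSpiker.jar [gene-list-filename] [TFBS-based-count-matrix] [output-filename]
    public static List<String> build(String geneList, String countMatrix, String output) {
        return List.of("java", "-jar", "TFSpiker.jar", geneList, countMatrix, output);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical file names, used only for illustration.
        List<String> cmd = build("genes.txt", "count_matrix.txt", "results.txt");
        System.out.println(String.join(" ", cmd));

        // Only launch the tool if the jar is actually in the working directory.
        if (Files.exists(Paths.get("TFSpiker.jar"))) {
            int exitCode = new ProcessBuilder(cmd)
                    .inheritIO()   // forward the tool's stdout/stderr to this console
                    .start()
                    .waitFor();
            System.out.println("TFSpiker exited with code " + exitCode);
        }
    }
}
```

Running the sketch without the jar only prints the assembled command line, which is equivalent to typing the same command in a terminal.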