Get 20M+ Full-Text Papers For Less Than $1.50/day. Start a 14-Day Trial for You or Your Team.

Learn More →

Bayesian variable selection for gene expression modeling with regulatory motif binding sites in neuroinflammatory events

Bayesian variable selection for gene expression modeling with regulatory motif binding sites in... Multiple transcription factors (TFs) coordinately control transcriptional regulation of genes in eukaryotes. Although numerous computational methods focus on the identification of individual TF-binding sites (TFBSs), very few consider the interdependence among these sites. In this article, we studied the relationship between TFBSs and microarray gene expression levels using both family-wise and member-specific motifs, under various combination of regression models with Bayesian variable selection, as well as motif scoring and sharing conditions, in order to account for the coordination complexity of transcription regulation. We proposed a three-step approach to model the relationship. In the first step, we preprocessed microarray data and usedp-values and expression ratios to preselect upregulated and down-regulated genes. The second step aimed to identify and score individual TFBSs within DNA sequence of each gene. A method based on the degree of similarity and the number of TFBSs was employed to calculate the score of each TFBS in each gene sequence. In the last step, linear regression and probit regression were used to build a predictive model of gene expression outcomes using these TFBSs as predictors. Given a certain number of predictors to be used, a full search of all possible predictor sets is usually combinatorially prohibitive. Therefore, this article considered the Bayesian variable selection for prediction using either of the regression models. The Bayesian variable selection applied in the context of gene selection, missing value estimation, and regulatory motif identification. In our modeling, the regressor was approximated as a linear combination of the TFBSs and a Gibbs sampler was employed to find the strongest TFBSs. We applied these regression models with the Bayesian variable selection on spinal cord injury gene expression data set. These TFs demonstrated intricate regulatory roles either as a family or as individual members in neuroinflammatory events. Our an alysis can be applied to create plausible hypotheses for combinatorial regulation by TFBSs and avoiding false-positive candidates in the modeling process at the same time. Such a systematic approach provides the possibility to dissect transcription regulation, from a more comprehensive perspective, through which phenotypical events at cellular and tissue levels are moved forward by molecular events at gene transcription and translation levels. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Neuroinformatics Springer Journals

Bayesian variable selection for gene expression modeling with regulatory motif binding sites in neuroinflammatory events

Loading next page...
 
/lp/springer-journals/bayesian-variable-selection-for-gene-expression-modeling-with-MRqDF0IshJ
Publisher
Springer Journals
Copyright
Copyright © 2006 by Humana Press Inc
Subject
Chemistry; Biotechnology; Engineering, general; Neurology
ISSN
1539-2791
eISSN
1559-0089
DOI
10.1385/NI:4:1:95
pmid
16595861
Publisher site
See Article on Publisher Site

Abstract

Multiple transcription factors (TFs) coordinately control transcriptional regulation of genes in eukaryotes. Although numerous computational methods focus on the identification of individual TF-binding sites (TFBSs), very few consider the interdependence among these sites. In this article, we studied the relationship between TFBSs and microarray gene expression levels using both family-wise and member-specific motifs, under various combination of regression models with Bayesian variable selection, as well as motif scoring and sharing conditions, in order to account for the coordination complexity of transcription regulation. We proposed a three-step approach to model the relationship. In the first step, we preprocessed microarray data and usedp-values and expression ratios to preselect upregulated and down-regulated genes. The second step aimed to identify and score individual TFBSs within DNA sequence of each gene. A method based on the degree of similarity and the number of TFBSs was employed to calculate the score of each TFBS in each gene sequence. In the last step, linear regression and probit regression were used to build a predictive model of gene expression outcomes using these TFBSs as predictors. Given a certain number of predictors to be used, a full search of all possible predictor sets is usually combinatorially prohibitive. Therefore, this article considered the Bayesian variable selection for prediction using either of the regression models. The Bayesian variable selection applied in the context of gene selection, missing value estimation, and regulatory motif identification. In our modeling, the regressor was approximated as a linear combination of the TFBSs and a Gibbs sampler was employed to find the strongest TFBSs. We applied these regression models with the Bayesian variable selection on spinal cord injury gene expression data set. These TFs demonstrated intricate regulatory roles either as a family or as individual members in neuroinflammatory events. Our an alysis can be applied to create plausible hypotheses for combinatorial regulation by TFBSs and avoiding false-positive candidates in the modeling process at the same time. Such a systematic approach provides the possibility to dissect transcription regulation, from a more comprehensive perspective, through which phenotypical events at cellular and tissue levels are moved forward by molecular events at gene transcription and translation levels.

Journal

NeuroinformaticsSpringer Journals

Published: Apr 11, 2007

References