Effect of high variation in transcript expression on identifying differentially expressed genes in RNA‐seq analysis



Publisher: Wiley
Copyright: © 2021 John Wiley & Sons Ltd/University College London
ISSN: 0003-4800
eISSN: 1469-1809
DOI: 10.1111/ahg.12441

Abstract

Great efforts have been made to improve the algorithms that handle RNA‐seq data and so enhance the accuracy and efficiency of differential expression (DE) analysis. However, no consensus has been reached on the proper threshold values of fold change and adjusted p‐value for filtering differentially expressed genes (DEGs). It is generally believed that the more stringent the filtering threshold, the more reliable the result of a DE analysis. Nevertheless, by analyzing the impact of both adjusted p‐value and fold change thresholds on DE analyses, with RNA‐seq data obtained for three different cancer types from The Cancer Genome Atlas (TCGA) database, we found that, for a given sample size, the reproducibility of DE results became poorer when more stringent thresholds were applied. No matter which threshold level was applied, the overlap rates of DEGs were generally lower for small sample sizes than for large sample sizes. The raw read count analysis demonstrated that the transcript expression of the same gene varied widely between samples, whether in tumor groups or in normal groups, which resulted in drastic fluctuations in fold change values and adjusted p‐values when different sets of samples were used. Overall, because of this high variation in transcript expression, more stringent thresholds did not yield more reliable DEGs, and the reliability of DEGs obtained with small sample sizes was more susceptible to the variation. Therefore, less stringent thresholds are recommended for screening DEGs. Moreover, large sample sizes should be considered in RNA‐seq experimental designs to reduce the interfering effect of variations in transcript expression on DEG identification.
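
The threshold comparison described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the result format (gene mapped to log2 fold change and adjusted p‐value), the helper names `filter_degs` and `overlap_rate`, and the use of the Jaccard index as the overlap rate, which may differ from the definition used in the article. The numbers are invented toy values, not data from the study; they are chosen only to show how a gene whose fold change fluctuates around a strict cutoff can drop out of one run and lower the overlap.

```python
from typing import Dict, Set, Tuple

# Hypothetical result format: gene -> (log2 fold change, adjusted p-value).
DEResult = Dict[str, Tuple[float, float]]


def filter_degs(results: DEResult, min_abs_log2fc: float, max_padj: float) -> Set[str]:
    """Return the genes that pass both the fold change and adjusted p-value cutoffs."""
    return {
        gene
        for gene, (log2fc, padj) in results.items()
        if abs(log2fc) >= min_abs_log2fc and padj <= max_padj
    }


def overlap_rate(a: Set[str], b: Set[str]) -> float:
    """Jaccard index of two DEG sets (an assumed definition of 'overlap rate')."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Invented toy results from two different sample subsets of the same cohort.
run_a: DEResult = {"G1": (3.1, 0.0005), "G2": (1.4, 0.02), "G3": (2.2, 0.008), "G4": (0.3, 0.6)}
run_b: DEResult = {"G1": (2.9, 0.002), "G2": (1.2, 0.03), "G3": (1.8, 0.02), "G4": (0.5, 0.4)}

# Lenient cutoffs: |log2FC| >= 1 and adjusted p <= 0.05.
lenient = overlap_rate(filter_degs(run_a, 1.0, 0.05), filter_degs(run_b, 1.0, 0.05))

# Strict cutoffs: |log2FC| >= 2 and adjusted p <= 0.01.
strict = overlap_rate(filter_degs(run_a, 2.0, 0.01), filter_degs(run_b, 2.0, 0.01))

print(f"lenient overlap: {lenient:.2f}, strict overlap: {strict:.2f}")
```

In this toy case G3 sits just above the strict cutoffs in one run and just below them in the other, so the strict DEG lists agree less than the lenient ones. A real analysis would repeat this over many random sample subsets and sample sizes.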

Journal: Annals of Human Genetics (Wiley)

Published: Nov 1, 2021

Keywords: Differential expression; false discovery rate; fold change; RNA‐seq; sample size; threshold
