Studies have shown that interactions among single nucleotide polymorphisms (SNPs) may play an important role in understanding the causes of complex diseases. We propose an integrated approach that combines two machine-learning methods, Random Forests (RF) and Multivariate Adaptive Regression Splines (MARS), to identify a subset of important SNPs and detect interaction patterns more effectively and efficiently. In this two-stage RF-MARS (TRM) approach, RF is first applied to select a predictive subset of SNPs, and MARS is then used to identify the interaction patterns. We evaluated the performance of TRM under four models. RF variable selection was based on either the out-of-bag (OOB) classification error rate or the variable importance spectrum (IS). Our results indicate that RFOOB performed better than MARS and RFIS in detecting important variables. TRMOOB, which is RFOOB followed by MARS, combines the strengths of RF and MARS in identifying SNP-SNP interactions in a scenario of 100 candidate SNPs. TRMOOB had a higher true positive rate and a lower false positive rate than MARS, particularly when searching for interactions strongly associated with the outcome. The use of TRMOOB is therefore favored for exploring SNP-SNP interactions in large-scale genetic variation studies.
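The comparison above rests on true positive and false positive rates for the SNPs a method selects. The abstract does not give the exact definitions used, so the following is only a minimal sketch under common assumptions: the true positive rate is the share of truly associated SNPs that were selected, and the false positive rate is the share of unassociated candidate SNPs that were selected. The function name `selection_rates` and the SNP labels are hypothetical and chosen for illustration.

```python
def selection_rates(selected, causal, n_candidates):
    """Return (TPR, FPR) for a selected SNP subset.

    Assumed definitions (not taken from the paper):
      TPR = selected causal SNPs / all causal SNPs
      FPR = selected non-causal SNPs / all non-causal candidate SNPs
    """
    selected, causal = set(selected), set(causal)
    n_null = n_candidates - len(causal)  # candidates with no true effect
    if not causal or n_null <= 0:
        raise ValueError("need at least one causal and one non-causal SNP")
    tpr = len(selected & causal) / len(causal)
    fpr = len(selected - causal) / n_null
    return tpr, fpr


# Hypothetical scenario: 100 candidate SNPs, two of which truly interact.
causal = {"snp_7", "snp_42"}
# Hypothetical output of a selection method: both causal SNPs plus one extra.
selected = {"snp_7", "snp_42", "snp_13"}

tpr, fpr = selection_rates(selected, causal, n_candidates=100)
print(f"TPR={tpr:.2f}, FPR={fpr:.4f}")
```

In a simulation study, rates like these would typically be averaged over many replicate data sets for each method before comparing them.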
Annals of Human Genetics – Wiley
Published: Jan 1, 2012