Simple Linear Cancer Risk Prediction Models With Novel Features Outperform Complex Approaches

Publisher: Wolters Kluwer Health
Copyright: © 2022 by American Society of Clinical Oncology
eISSN: 2473-4276
DOI: 10.1200/cci.21.00166

Abstract

PURPOSE: The ability to accurately predict an individual's risk for cancer is critical to the implementation of precision prevention measures. Current cancer risk predictions are frequently made with simple models that use a few proven risk factors, such as the Gail model for breast cancer. These models are easy to interpret, but may theoretically be less accurate than advanced machine learning (ML) models.

METHODS: With the UK Biobank, a large prospective study, we developed models that predicted 13 cancer diagnoses within a 10-year time span. We assessed ML and linear models fit with all features, linear models fit with 10 features, and externally developed QCancer models, which are available to more than 4,000 general practices.

RESULTS: When all 931 features were used, the average area under the receiver operating characteristic curve (AUC) of the linear models (0.722, SE = 0.015) was greater than the average AUC of the ML models (0.720, SE = 0.016). Linear models with only 10 features generated an average AUC of 0.706 (SE = 0.015), which was comparable to the complex models using all features and greater than the average AUC of the QCancer models (0.684, SE = 0.021). The high performance of the 10-feature linear models may be caused by the consideration of often omitted feature types, including census records and genetic information.

CONCLUSION: The high performance of the 10-feature linear models indicates that unbiased selection of diverse features, not ML models, may lead to impressively accurate predictions, possibly enabling personalized screening schedules that increase cancer survival.
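
The results above are reported as an average AUC with a standard error (SE) for each model family. As a rough illustration of how such a summary could be produced, the Python sketch below computes an AUC from predicted risk scores and then averages a set of AUCs. The abstract does not say how the SE was calculated; this sketch assumes it is the standard error of the mean across the per-cancer AUCs. The function names, the toy labels and scores, and the AUC values in `hypothetical_aucs` are all invented for illustration and are not taken from the study.

```python
from math import sqrt
from statistics import mean, stdev


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    Returns the fraction of (case, control) pairs in which the case
    receives the higher score; tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_and_se(values):
    """Mean and standard error of the mean (sample SD / sqrt(n)).

    Assumption: the SE in the abstract is computed this way across
    cancers; the source does not confirm the method.
    """
    return mean(values), stdev(values) / sqrt(len(values))


# Toy example (invented data): 1 = diagnosed within the window, 0 = not.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)  # 3 of 4 case-control pairs ranked correctly -> 0.75

# Hypothetical per-cancer AUCs for one model family (not study results).
hypothetical_aucs = [0.70, 0.75, 0.68, 0.73]
avg, se = mean_and_se(hypothetical_aucs)
print(f"toy AUC = {toy_auc:.2f}, average AUC = {avg:.3f}, SE = {se:.3f}")
```

The pairwise loop is quadratic in the number of participants, which is fine for a toy example; on a cohort the size of the UK Biobank a rank-based implementation from an established library would be the more practical choice.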

Journal

JCO: Clinical Cancer Informatics, Wolters Kluwer Health

Published: Mar 3, 2022
