How to Handle Health-Related Small Imbalanced Data in Machine Learning?

i-com, Volume 19 (3): 12 – Jan 26, 2021

Publisher: de Gruyter
Copyright: © 2020 Walter de Gruyter GmbH, Berlin/Boston
ISSN: 2196-6826
eISSN: 2196-6826
DOI: 10.1515/icom-2020-0018

Abstract

When discussing interpretable machine learning results, researchers need to compare them and check for reliability, especially for health-related data. The reason is the negative impact of wrong results on a person, such as in wrong prediction of cancer, incorrect assessment of the COVID-19 pandemic situation, or missing early screening of dyslexia. Often only small data exists for these complex interdisciplinary research projects. Hence, it is essential that this type of research understands different methodologies and mindsets such as the Design Science Methodology, Human-Centered Design or Data Science approaches to ensure interpretable and reliable results. Therefore, we present various recommendations and design considerations for experiments that help to avoid over-fitting and biased interpretation of results when having small imbalanced data related to health. We also present two very different use cases: early screening of dyslexia and event prediction in multiple sclerosis.

Journal: i-com, de Gruyter

Published: Jan 26, 2021

Keywords: Machine Learning; Human-Centered Design; HCD; interactive systems; health; small data; imbalanced data; over-fitting; variances; interpretable results; guidelines
