An effective integrated machine learning approach for detecting diabetic retinopathy

1 Introduction

Several machine learning (ML) classification techniques have been discussed in the literature, and they have greatly helped stakeholders in the medical field to predict heart disease. An artificial neural network (ANN) model proposed by Dangare and Apte was reported to achieve 100% accuracy [1]. To detect anomalies in hyperglycemia classification, techniques such as the feedforward ANN, deep belief network, genetic algorithm (GA), support vector machine (SVM), and Bayesian neural network were proposed and implemented [2]. Some ML techniques were rarely implemented or not implemented at all. Besides, the accuracy of some ML techniques was lower than that obtained by deep learning (DL) techniques. Hence, combined ML and DL models were discussed to enhance the accuracy of diabetes prediction [3]. Principal component analysis (PCA) was discussed for dealing with large datasets: reducing their dimensions with PCA makes the correlation between attributes easier to observe and improves interpretability [4]. Diabetic retinopathy affects the eyes; the current trends of the disease, its mechanisms, and approaches to treating it were discussed in ref. [5]. A system using ML algorithms, namely k-nearest neighbor (KNN), variants of SVM, and Naïve Bayes (NB), was discussed for automatically detecting exudates in retinal images. The proposed system detected exudates with an accuracy of 98.58%, which was higher than that of other existing techniques [6]. Classification techniques such as C4.5 and NB, together with the k-means clustering technique, were used to detect the risk factors of diabetes complications. The proposed system achieved an average accuracy of 68% [7]. The moth-flame optimization algorithm, which can improve classification accuracy, was also discussed [8].
To detect and classify characteristics such as micro-aneurysms (MA) and hemorrhages in retinal images, a model based on convolutional neural networks (CNN) was proposed. The model achieved 95% accuracy for two-class classification on a dataset of 30,000 images and 85% for five-class classification on a dataset of 3,000 images [9]. Diseases related to the heart, breast cancer, and diabetes were analyzed using ML techniques; this study revealed the significance of predicting the risk factors of diseases [10]. The works discussed above show the role of ML in predicting the symptoms and risk factors of different kinds of chronic diseases. The saying “prevention is better than cure” is worth recalling here: if the symptoms of a disease are identified before it sets in, people can take the necessary precautions. Hence, ML algorithms play a major part in medical diagnosis and have a great impact on it.

1.1 State-of-the-art literature review

The works above present various ML algorithms and their corresponding accuracies in medical diagnosis. In addition to those, this section presents further state-of-the-art related works. PCA-based techniques were discussed in refs [11,12,13,14]: PCA and k-means were integrated with logistic regression for predicting diabetes [11]; PCA and linear discriminant analysis (LDA) were discussed for reducing the dimensions of a large cardiotocography dataset [12]; a deep neural network based on the PCA-firefly method was proposed to detect the signs of diabetic retinopathy at an early stage [13]; and a PCA-firefly-based classification model with the XGBoost classification method was discussed [14].
SVM-based techniques were discussed in refs [15,16]: SVM and simulated annealing (SA) were proposed for diagnosing hepatitis [15], and SVM with a fruit fly optimization algorithm was proposed to classify medical data effectively [16]. Neural network-based approaches were discussed in refs [17,18]: a multilayer perceptron neural network (NN) with backpropagation was selected to develop a system that predicts the risk factors of heart disease [17], and a deep CNN model was proposed to detect and classify diabetic retinopathy in retinal images [18]. ML algorithms were discussed in refs [19,20,21]: ANN, k-means clustering, and random forest (RF) algorithms were proposed and implemented for the early prediction of diabetes, and among these the ANN performed best with an accuracy of 75.7% [19]; techniques such as decision tree (DT), SVM, LDA, and NB were implemented, and LDA performed well with an accuracy of 79%, including hypertension and prehypertension [20]; and a classification model using SVM, NB, KNN, and DT was proposed for predicting diabetes [21]. DL techniques were discussed in refs [22,23,24]: a customized deep CNN was used to automate the classification of fundus images for detecting diabetic retinopathy [22]; ensemble models of deep CNNs such as Dense121, Resnet50, Dense169, Xception, and Inceptionv3 were implemented for detecting diabetic retinopathy [23]; and a deep CNN model was proposed to classify fundus images and to grade macular edema [24]. GA-based approaches were discussed in refs [25,26,27]: an SVM classifier was used for dual classification, and its results were then combined and fed into a GA to detect diabetic retinopathy [25]; GA- and SVM-based approaches were proposed to diagnose heart disease [26]; and a hybrid GA and fuzzy logic classifier was proposed for diagnosing heart disease [27].
An ensemble-based approach was proposed for the automated diagnosis and screening of diabetic retinopathy, and it provided higher accuracy [28]. A moth-flame optimization algorithm was proposed, and its performance was compared with that of other nature-inspired algorithms [29]. A hybrid firefly-bat optimized fuzzy ANN classifier was proposed for predicting diabetes, and it performed better than other convolutional methods [30]. A hybrid metaheuristic algorithm built from techniques such as the whale optimization algorithm and SA was proposed [31]. The combination of elemental analysis of diabetic toenails with ML approaches was proposed to classify type-2 diabetes [32]. Cox proportional hazard, a regression-based method, was implemented to detect cardiovascular disease at an early stage [33]. A model was proposed for predicting the risk of gestational diabetes [34]. An RF classifier was used for predictions in the proposed algorithm DMP-MI to classify diabetes mellitus [35]. These works show how widely ML algorithms are applied in medical diagnosis. Most of the previous works focused primarily on a single performance measure, accuracy, to evaluate a classifier. This article proposes an integration of SVM, PCA, and moth-flame optimization techniques for predicting the class labels of diabetic retinopathy. The proposed integrated approach evaluates a classifier using accuracy, as in the previous works, and additionally uses sensitivity, recall, specificity, precision, and F1-score. The proposed model contributes a comprehensive analysis of the performance of ML algorithms on these measures and of the classification of class labels.

2 Proposed methodology and implementation

To implement the proposed methodology, the diabetic retinopathy dataset is retrieved from the UCI ML repository.
The techniques used, namely normalization, PCA, and SVM, are briefly described as follows. The dimensions of a dataset may be measured on different scales. If the dataset is used directly for computation, the dimensions with larger values may dominate those with smaller values, and the result obtained is then of little use for decision-making. It is therefore necessary to scale the data so that all the dimensions are at the same level. The normalization technique scales the data and removes anomalies existing in the data. It is a preprocessing technique generally applied to the dataset before the analysis begins. The normalized data usually lie between 0 and 1.

The main function of PCA is to reduce the dimensions of a dataset, which is also referred to as dimensionality reduction. When a dataset contains many dimensions, several of them may be highly correlated. This problem is referred to as multicollinearity, and it affects the quality of data analysis. PCA is well suited to dealing with multicollinearity. The main elements of PCA are the principal components (PCs). As many PCs are generated as there are input dimensions, and they are ordered by their variance: the PC with the highest variance comes first, followed by the next highest, and so on.

SVM is one of the most widely used supervised ML techniques. It is largely used for classification in high-dimensional data and can be applied to both linearly and non-linearly separable problems. The key elements of SVM are hyperplanes: data are classified by identifying the separating hyperplanes, and the support vectors are the vectors that define them.

Section 2.1 describes the dataset, and Sections 2.2 and 2.3 describe the algorithm and the objective function of the moth-flame optimization technique.
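To make the scaling and variance-ordering ideas above concrete, the following is a minimal Python sketch, not the implementation used in this article. It assumes min-max scaling to the range 0 to 1 and, to stay self-contained, restricts PCA to two features so that the variances of the two PCs can be obtained in closed form from the 2×2 covariance matrix. The data values and all function names are hypothetical.

```python
import math

def min_max_scale(rows):
    """Rescale each column of `rows` to the range [0, 1]."""
    scaled_columns = []
    for column in zip(*rows):
        low, high = min(column), max(column)
        span = high - low
        # A constant column has no spread; map it to 0.0 to avoid dividing by zero.
        scaled_columns.append([(v - low) / span if span else 0.0 for v in column])
    return [list(row) for row in zip(*scaled_columns)]

def covariance_2d(rows):
    """Return (var_x, cov_xy, var_y) for a two-column dataset (sample covariance)."""
    n = len(rows)
    mean_x = sum(r[0] for r in rows) / n
    mean_y = sum(r[1] for r in rows) / n
    var_x = sum((r[0] - mean_x) ** 2 for r in rows) / (n - 1)
    var_y = sum((r[1] - mean_y) ** 2 for r in rows) / (n - 1)
    cov_xy = sum((r[0] - mean_x) * (r[1] - mean_y) for r in rows) / (n - 1)
    return var_x, cov_xy, var_y

def principal_variances_2d(var_x, cov_xy, var_y):
    """Eigenvalues of the 2x2 covariance matrix, largest first.

    Each eigenvalue is the variance captured by one principal component.
    """
    centre = (var_x + var_y) / 2.0
    radius = math.hypot((var_x - var_y) / 2.0, cov_xy)
    return [centre + radius, centre - radius]

# Hypothetical data: two strongly correlated features on very different scales.
raw = [[1.0, 210.0], [2.0, 390.0], [3.0, 620.0], [4.0, 790.0], [5.0, 1010.0]]

scaled = min_max_scale(raw)
variances = principal_variances_2d(*covariance_2d(scaled))
explained = [v / sum(variances) for v in variances]

print("scaled rows:", scaled)
print("share of variance per PC:", [round(e, 4) for e in explained])
```

Because the two hypothetical features are almost perfectly correlated, nearly all of the variance falls on the first PC, which is the situation in which dropping the remaining PCs loses little information.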
The flow of activities in the proposed model is shown in Figure 1.

Figure 1: Flowchart for the proposed integrated ML approach.

As an initial step, the collected diabetic retinopathy dataset is input to the proposed model. The details of the retinal images in this dataset are used to classify the images and to decide whether any symptoms of diabetic retinopathy exist. Before the ML algorithms are applied, it is important to normalize the data. This normalization can be done by using a standard scaler. After normalization, the ML algorithms DT, NB, RF, and SVM are applied individually to the dataset. Next, the PCA technique is implemented to reduce the features in the dataset. This feature reduction brings all the features to the same level, which means that the domination of higher-order features is avoided. The reduced features are then input to the same ML algorithms, and the performance of the ML algorithms before and after the implementation of PCA is observed. If high performance is observed, the most recently applied technique is retained. Otherwise, to improve the performance of the ML algorithms, the results of the previously applied techniques are fed into the proposed integrated approach, i.e., SVM + PCA + moth-flame optimization. The performance of the ML algorithms is then evaluated and compared on the performance measures listed in Section 1.1. From the comparative analysis, the ML approach that outperforms all the others is the one that classifies the class labels most accurately.

2.1 Dataset description

The diabetic retinopathy dataset comprises 20 features that represent the Messidor image set. The features extracted from an image reveal the presence or absence of diabetic retinopathy. They provide information on any detected lesion or a description of the image.
All 20 features are numbered from 0 through 19, as presented in Table 1. Feature 0 holds the quality of the image. If lesions can be identified effectively in the captured image, the image is said to be of good quality; otherwise it is of bad quality. This quality is represented with the binary values 1 (good quality) and 0 (bad quality). Feature 1 holds the result of pre-screening and is also binary: 1 indicates a severe abnormality in the retina, and 0 indicates no abnormality. Features 2 to 7 give the number of microaneurysms (MA) detected. Microaneurysms cause blood to leak into the tissues of the retina. These MA values are detected at confidence intervals from 0.5 through 1. Features 8 to 15 hold the corresponding values for exudates, represented in the same way as features 2 to 7 and normalized so that all the features are at the same level. Feature 16 gives the Euclidean distance between the centers of the macula and the optic disc. Feature 17 gives the diameter of the optic disc, which is the starting point of the retinal blood vessels. Feature 18 holds the binary result of the classification based on amplitude modulation (AM) and frequency modulation (FM). Finally, feature 19 holds the binary class label: 1 indicates that symptoms of diabetic retinopathy exist, and 0 indicates no symptoms.

Table 1: Description of dataset features

| Feature number | Description of feature |
|---|---|
| 0 | Image quality, represented as binary values: 1 = good quality, 0 = bad quality |
| 1 | Pre-screening information, represented as binary values: 1 = severe abnormality in the retina, 0 = no abnormality |
| 2–7 | Number of MA detected. MA causes retinal blood leakage. These features show the results at confidence intervals 0.5 through 1, respectively |
| 8–15 | Same as 2–7 for exudates. These are normalized to bring all the features to the same level |
| 16 | Euclidean distance between the centers of the macula and the optic disc |
| 17 | Optic disc diameter |
| 18 | Binary values of the classification based on amplitude modulation (AM) and frequency modulation (FM) |
| 19 | Class labels, represented as binary values: 1 = symptoms of diabetic retinopathy, 0 = no symptoms |

2.2 Proposed algorithm

The general idea of the moth-flame optimization technique is as follows. Moth-flame optimization is a population-based technique in which both moths and flames are solutions. The moths are the search agents that move through the search space, and the flames are the best positions found so far. The two differ in how they are updated at each iteration. Because the best positions are updated at every iteration, a moth never loses the best position it has found.
The steps of the proposed technique are given in Algorithm 1.

Algorithm 1: Moth-flame optimization technique
1: Initialize the parameters.
2: Generate the initial moths randomly.
3: Evaluate the fitness function and mark the best positions as flames.
4: Update the number of flames.
5: Calculate the distance between each moth and its flame.
6: Update the positions of the moths.
7: Repeat steps 2–6 until the stopping criterion is met.
8: When the criterion is met, report the best positions of the moths.

2.3 Objective function

The objective function of the moth-flame technique is given in equations (1)–(11). The moths are initialized in a matrix as shown in equation (1):

$$M=\left[\begin{array}{ccccc}{b}_{1,1}& {b}_{1,2}& \ldots & \ldots & {b}_{1,q}\\ {b}_{2,1}& {b}_{2,2}& \ldots & \ldots & {b}_{2,q}\\ \vdots & \vdots & \vdots & \vdots & \vdots \\ {b}_{p,1}& {b}_{p,2}& \ldots & \ldots & {b}_{p,q}\end{array}\right], \tag{1}$$

where $p$ is the total number of moths and $q$ is the total number of variables. The fitness function for the moths, with one value per moth, is given in equation (2):

$${\rm{FM}}=\left[\begin{array}{c}{{\rm{FM}}}_{1}\\ {{\rm{FM}}}_{2}\\ \vdots \\ {{\rm{FM}}}_{p}\end{array}\right]. \tag{2}$$

The flames are initialized in a matrix as shown in equation (3):

$$N=\left[\begin{array}{ccccc}{c}_{1,1}& {c}_{1,2}& \ldots & \ldots & {c}_{1,q}\\ {c}_{2,1}& {c}_{2,2}& \ldots & \ldots & {c}_{2,q}\\ \vdots & \vdots & \vdots & \vdots & \vdots \\ {c}_{p,1}& {c}_{p,2}& \ldots & \ldots & {c}_{p,q}\end{array}\right], \tag{3}$$

where $p$ is the total number of flames and $q$ is the total number of variables.
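As a rough illustration of Algorithm 1 and of the moth and flame matrices defined above, the following Python sketch runs a moth-flame search on a toy problem. It is a minimal sketch under stated assumptions rather than the implementation used in this article: it follows the commonly published form of moth-flame optimization (a logarithmic-spiral move around a flame, with the number of flames shrinking linearly to one), and the `sphere` objective, the bounds, the population size, and all function and parameter names are hypothetical. The names `w`, `c`, and `z` mirror the symbols used for the spiral function in this section.

```python
import math
import random

def sphere(position):
    """Toy objective (an assumption for illustration): sum of squares, minimum 0 at the origin."""
    return sum(x * x for x in position)

def moth_flame_optimize(objective, dim, n_moths=20, max_iter=200,
                        lower=-10.0, upper=10.0, w=1.0, seed=0):
    rng = random.Random(seed)
    # Steps 1-2: initialise the p x q moth matrix M with random positions.
    moths = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_moths)]
    flames = []  # list of (fitness, position) pairs, best first

    for iteration in range(1, max_iter + 1):
        # Step 3: evaluate fitness and keep the best positions seen so far as flames.
        scored = [(objective(m), m[:]) for m in moths]
        flames = sorted(flames + scored, key=lambda pair: pair[0])[:n_moths]

        # Step 4: shrink the number of flames linearly from n_moths down to 1.
        n_flames = round(n_moths - iteration * (n_moths - 1) / max_iter)
        # z decreases linearly from -1 towards -2 over the run.
        z = -1.0 - iteration / max_iter

        for i, moth in enumerate(moths):
            # Moths beyond the current flame count all follow the last flame.
            flame = flames[min(i, n_flames - 1)][1]
            for j in range(dim):
                # Step 5: distance D between the moth and its flame.
                distance = abs(flame[j] - moth[j])
                c = (z - 1.0) * rng.random() + 1.0
                # Step 6: logarithmic-spiral move around the flame, clipped to the bounds.
                new_value = distance * math.exp(w * c) * math.cos(2 * math.pi * c) + flame[j]
                moth[j] = min(max(new_value, lower), upper)

    # Step 8: report the best position found among the flames.
    best_fitness, best_position = flames[0]
    return best_position, best_fitness

best_position, best_fitness = moth_flame_optimize(sphere, dim=3)
print("best fitness:", best_fitness)
```

Keeping the flames as a sorted list of the best positions seen so far means the reported solution can never be worse than the best of the initial random moths. For feature selection, the toy objective would be replaced by a measure of classifier performance on a candidate feature subset; how that is encoded is not specified here and is left out of the sketch.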
The fitness function for the flames, with one value per flame, is given in equation (4):

$${\rm{FN}}=\left[\begin{array}{c}{{\rm{FN}}}_{1}\\ {{\rm{FN}}}_{2}\\ \vdots \\ {{\rm{FN}}}_{p}\end{array}\right]. \tag{4}$$

The mathematical model of the moth-flame optimization technique is represented as a three-tuple in equation (5):

$${\rm{MNF}}=\left(G,H,T\right). \tag{5}$$

The function $G$ generates the random population of moths and their fitness values, as given in equation (6):

$$G:\phi \to \left\{M,{\rm{FM}}\right\}. \tag{6}$$

The function $H$ governs the movement of the moths in search of the best flame position and updates them at every iteration, as expressed in equation (7):

$$H:M\to M. \tag{7}$$

The function $T$ evaluates to true or false, indicating whether the stopping criterion is met:

$$T:M\to \left\{{\rm{true}},{\rm{false}}\right\}. \tag{8}$$

The position of a moth is updated according to equation (9):

$${M}_{j}=S\left({M}_{j},{N}_{k}\right), \tag{9}$$

where $S$ is the spiral function, ${M}_{j}$ is the $j$th moth, and ${N}_{k}$ is the $k$th flame. The logarithmic spiral path of the moth is given in equation (10):

$$S\left({M}_{j},{N}_{k}\right)=D\cdot {{\rm{e}}}^{wc}\cdot \cos \left(2\pi c\right)+{N}_{k}, \tag{10}$$

where $w$ is a constant and $c$ takes values between $-1$ and $1$. The distance $D$ between the $j$th moth and the $k$th flame, together with $c$, is given in equation (11):

$$\left\{\begin{array}{l}D=| {N}_{k}-{M}_{j}| \\ c=\left(z-1\right)\cdot {\rm{rand}}+1,\end{array}\right. \tag{11}$$

where $z$ varies between $-1$ and $-2$; a smaller value of $z$ indicates that the moth is closer to the flame.

3 Results and discussion

The ML algorithms were implemented on the diabetic retinopathy dataset retrieved from the UCI ML repository. The performance of the proposed integrated ML model is evaluated using precision, F1-score, specificity, accuracy, recall, and sensitivity, as defined in equations (12)–(16). Precision describes the correctness and is given in equation (12).
Recall/sensitivity represents completeness and is expressed in equation (13). The F1-score is described in equation (14). Accuracy characterizes overall correctness and is given in equation (15). Specificity is described in equation (16).

$$\text{Precision}=\frac{\text{PT}}{\text{PF}+\text{PT}}, \tag{12}$$

$$\text{Recall/sensitivity}=\frac{\text{PT}}{\text{NF}+\text{PT}}, \tag{13}$$

$$F\text{1-score}=\frac{2\ast \text{Precision}\ast \text{Recall}}{\text{Precision}+\text{Recall}}, \tag{14}$$

$$\text{Accuracy}=\frac{\text{PT}+\text{NT}}{\text{PF}+\text{PT}+\text{NF}+\text{NT}}, \tag{15}$$

$$\text{Specificity}=\frac{\text{NT}}{\text{PF}+\text{NT}}, \tag{16}$$

where PT = true positives, PF = false positives, NT = true negatives, and NF = false negatives.

The simulation results of the ML algorithms and the corresponding performance measures are described as follows. First, four popular ML algorithms, namely DT, NB, RF, and SVM, are applied to the dataset. Figure 2 depicts the performance of these algorithms.

Figure 2: Performance of ML algorithms on the original dataset.

From Figure 2, it can be observed that the DT classifier achieved 57% precision, recall, and F1-score, and 57.1, 54.5, and 59.2% accuracy, sensitivity, and specificity, respectively. The NB classifier achieved 64% precision, 63% recall and F1-score, and 63.2, 64.2, and 62.4% accuracy, sensitivity, and specificity, respectively. The RF classifier achieved 70% precision, 69% recall and F1-score, and 68.8, 76.2, and 63% accuracy, sensitivity, and specificity, respectively.
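Equations (12)–(16) can be evaluated directly from the four confusion-matrix counts. The following is a minimal Python sketch of that arithmetic; the function name is hypothetical, and the counts in the example are invented for illustration and are not taken from the experiments reported here.

```python
def classification_metrics(pt, pf, nt, nf):
    """Compute the measures of equations (12)-(16) from confusion-matrix counts.

    pt = true positives, pf = false positives,
    nt = true negatives, nf = false negatives.
    """
    precision = pt / (pf + pt)                           # equation (12)
    recall = pt / (nf + pt)                              # equation (13), same as sensitivity
    f1 = 2 * precision * recall / (precision + recall)   # equation (14)
    accuracy = (pt + nt) / (pf + pt + nf + nt)           # equation (15)
    specificity = nt / (pf + nt)                         # equation (16)
    return {
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,
        "f1": f1,
        "accuracy": accuracy,
        "specificity": specificity,
    }

# Hypothetical counts for a two-class problem (not from the paper's experiments).
metrics = classification_metrics(pt=80, pf=20, nt=70, nf=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts, precision is 80/100 and accuracy is 150/200, which shows how a classifier can score differently on each measure and why reporting accuracy alone can hide a weak specificity or recall.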
The SVM achieved 79% precision, 76% recall and F1-score, and 76.1, 89.4, and 65.3% accuracy, sensitivity, and specificity, respectively. When these results are compared with each other, SVM outperforms the other classifiers.

The dataset is then fed into PCA for dimensionality reduction. PCA reduced the dataset from 20 dimensions to 12 dimensions, and the reduced features are given as input to the same classifiers. Figure 3 depicts the results obtained after applying the ML algorithms to the reduced dimensions.

Figure 3: Performance of ML algorithms after reducing dimensions using PCA.

From Figure 3, it can be observed that the DT classifier achieved 67% precision, recall, and F1-score, and 67.09, 66.3, and 67.6% accuracy, sensitivity, and specificity, respectively. The NB classifier achieved 61% precision, 60% recall and F1-score, and 59.7% accuracy, sensitivity, and specificity. The RF classifier achieved 69% precision, recall, and F1-score, and 69.2, 67.8, and 70.4% accuracy, sensitivity, and specificity, respectively. The SVM achieved 71% precision, 66% recall and F1-score, and 65.8, 81.3, and 55.7% accuracy, sensitivity, and specificity, respectively. Figure 3 shows that the performance of the NB, RF, and SVM classifiers decreased when PCA was applied, whereas the performance of DT improved. The main reason for the degradation is the dimensionality reduction: these classifiers work better with more samples and dimensions.

To achieve better performance from the PCA-based classifiers, the moth-flame optimization technique is applied. This technique eliminates the attributes that negatively affect the performance of the classification model and chooses the optimal features/attributes that have a positive impact on performance.
With the moth-flame technique, the 12 dimensions were further reduced to 9. This reduction indicates that three dimensions were harming the performance of the model, and these three dimensions were eliminated by the moth-flame algorithm. Figure 4 depicts the results of the model with the PCA and moth-flame optimization techniques.

Figure 4: Performance of ML algorithms after implementation of moth-flame optimization technique.

Figure 4 shows that the DT classifier achieved 68% precision, recall, and F1-score, and 68.3, 67.4, and 69.2% accuracy, sensitivity, and specificity, respectively. The NB classifier achieved 62% precision, 63% recall and F1-score, and 61.3, 61.8, and 62.1% accuracy, sensitivity, and specificity, respectively. The RF classifier achieved 80% precision, 79% recall and F1-score, and 78.2, 85.8, and 75.1% accuracy, sensitivity, and specificity, respectively. The SVM achieved 86% precision, recall, and F1-score, and 86.3, 94.2, and 75.2% accuracy, sensitivity, and specificity, respectively. From Figure 4, it is evident that the performance of all the classifiers except NB improved dramatically with the addition of the moth-flame optimization technique, which eliminated the dimensions that negatively affect the classifiers. In the proposed model, the integration of SVM, PCA, and moth-flame outperformed all the other ML algorithms. The performance of all the ML algorithms is summarized in Table 2. In the table, PC, MF, Prscn, Rcl, F1, Acc, Sens, and Spec refer to principal component analysis, moth-flame, precision, recall, F1-score, accuracy, sensitivity, and specificity, respectively. The proposed integrated model of SVM, PCA, and moth-flame achieved higher performance than the other ML algorithms.
The high performance of this model indicates that the class labels have been classified correctly.

Table 2: Performance summary of ML algorithms

| ML algorithms | Prscn | Rcl | F1 | Acc | Sens | Spec |
|---|---|---|---|---|---|---|
| DT | 57 | 57 | 57 | 57.1 | 54.5 | 59.2 |
| NB | 64 | 63 | 63 | 63.2 | 64.2 | 62.4 |
| RF | 70 | 69 | 69 | 68.8 | 76.2 | 63 |
| SVM | 79 | 76 | 76 | 76.1 | 89.4 | 65.3 |
| DT + PC | 67 | 67 | 67 | 67.09 | 66.3 | 67.6 |
| NB + PC | 61 | 60 | 60 | 59.7 | 59.7 | 59.7 |
| RF + PC | 69 | 69 | 69 | 69.2 | 67.8 | 70.4 |
| SVM + PC | 71 | 66 | 66 | 65.8 | 81.3 | 55.7 |
| DT + PC + MF | 68 | 68 | 68 | 68.3 | 67.4 | 69.2 |
| NB + PC + MF | 62 | 63 | 63 | 61.3 | 61.8 | 62.1 |
| RF + PC + MF | 80 | 79 | 79 | 78.2 | 85.8 | 75.1 |
| SVM + PC + MF | 86 | 86 | 86 | 86.3 | 94.2 | 75.2 |

4 Conclusion

To identify diabetic retinopathy, this article proposed an integrated approach of ML algorithms and achieved high performance. The dataset retrieved from the UCI ML repository is used for the proposed approach. The key observations of the proposed integrated approach are as follows:

– From Figure 2, it is observed that when the ML algorithms are implemented individually, SVM outperforms the other ML algorithms.
– From Figure 3, it is evident that reducing the dimensions using the PCA technique negatively influenced the performance of the majority of the ML algorithms.
– From Figure 4, it is understood that the integration of SVM, PCA, and moth-flame optimization techniques improved the performance of classification and identified the class labels correctly.
– From Table 2, it is easy to interpret and compare the performances achieved by the implemented ML algorithms.

Hence, the proposed integrated approach of ML algorithms is very useful for detecting diabetic retinopathy to prevent blindness.

Open Computer Science, de Gruyter

Publisher
de Gruyter
Copyright
© 2022 Penikalapati Pragathi and Agastyaraju Nagaraja Rao, published by De Gruyter
eISSN
2299-1093
DOI
10.1515/comp-2020-0222

Abstract

1IntroductionSeveral classification techniques of machine learning (ML) algorithms were discussed, and these techniques greatly helped the stakeholders of the medical field for predicting heart disease. A model of the artificial neural network (ANN) was proposed by Dangare and Apte was outperformed with 100% accuracy [1]. To detect anomalies in hyperglycemia classification, techniques such as feedforward ANN, deep belief network, genetic algorithm (GA), support vector machine (SVM), and Bayesian neural network were proposed and implemented [2]. Some of the ML techniques were rarely implemented or not implemented at all. Besides, the accuracy of some ML techniques was lower than the accuracy obtained by DL techniques. Hence, the combined models of ML and deep learning (DL) techniques were discussed to enhance accuracy for diabetes prediction [3]. The principal component analysis (PCA) was discussed to deal with large datasets. The dimensions of these large datasets could be reduced by using PCA to observe the correlation between the attributes and for better interpretability [4]. Diabetic retinopathy would affect eyes. The current trends of disease, mechanisms, and approaches to treat diabetic retinopathy were discussed [5]. A system with ML algorithms namely, kk-nearest neighbor (KNN), variants of SVM, and NB was discussed to detect exudates in retinal images automatically. The proposed system detected the exudates with an accuracy of 98.58% that was greater than other existing techniques [6]. The classification techniques such as C4.5, Naïve Bayes (NB), and clustering technique kk-means clustering were used to detect the risk factors of diabetes disease complications. The proposed system achieved an average accuracy of 68% [7]. Moth-flame optimization algorithm was discussed, which would improve the accuracy of classification [8]. 
To detect and classify characteristics such as micro-aneurysms (MA) and hemorrhages in retinal images, a model with convolutional neural networks (CNN) was proposed. The proposed model achieved 95% accuracy for the two-class classification of the dataset size 30,000 images and 85% for the five-class classification of the dataset size 3,000 images [9]. Diseases related to heart, breast cancer, and diabetes were analyzed using ML techniques. This study revealed the significance of predicting the risk factors of diseases [10]. The aforementioned discussions exhibit the role of ML in predicting the symptoms and risk factors of different kinds of chronic diseases. At this moment, the statement “prevention is better than cure” is to be reminded. If the symptoms of the disease are identified before the occurrence, then it will help the people to take necessary precautions. Hence, the ML algorithms have great participation and impact on medical diagnosis.1.1State-of-the art literature reviewThe aforementioned works represent various ML algorithms and their corresponding accuracies in the medical diagnosis. In this section, in addition to mentioned earlier, some more state-of-the-art related works are presented as follows: PCA-based techniques were discussed in refs [11,12, 13,14], where PCA and KK-means techniques were integrated with logistic regression for predicting diabetes [11], PCA and linear discriminant analysis (LDA) were discussed for reducing dimensions of a large dataset cardiotocography [12], a deep neural network based on the PCA-firefly method was proposed to detect the signs of diabetic retinopathy at an early stage [13], and PCA-firefly-based classification model with the XGBoost classification method was discussed [14]. 
SVM-based techniques were discussed in refs [15,16], where SVM and simulated annealing (SA) were proposed for diagnosing the disease hepatitis [15], and SVM with a fruit fly optimization algorithm was proposed to classify medical data effectively [16]. Neural network-based approaches were discussed in refs [17,18], where a multilayer perceptron NN with backpropagation was selected to develop a system that predicts the risk factors of heart disease [17], and a model of deep CNN was proposed to notice and classify the diabetic retinopathy in retinal images [18]. ML algorithms were discussed in refs [19,20,21], where ANN, KK-means clustering, and random forest (RF) algorithms were proposed and implemented for predicting diabetes early. Among these algorithms, the ANN outperformed with an accuracy of 75.7% [19], the techniques such as DT, SVM, LDA, and NB. were implemented. The LDA performed well with an accuracy of 79% including hypertension and prehypertension [20], and a classification model was proposed using the techniques such as SVM, NB, KNN, and DT for predicting diabetes [21]. DL techniques were discussed in refs [22,24], where a customized deep CNN was used in the proposed model to automate the fundus images’ classification for detecting the diabetic retinopathy [22], and ensemble models of deep CNN such as Dense121, Resnet50, Dense169, Xception, and Inceptionv3 were implemented for detecting diabetic retinopathy [23], and a deep CNN model was proposed to classify fundus image and for the grading of Macular Edema [24]. GA-based approaches were discussed in refs [25,26,27], where an SVM classifier was used for dual classification, and later, these results are combined and fed into a GA to detect diabetic retinopathy [25], a GA- and SVM-based approaches were proposed to diagnose heart disease [26], and a hybrid GA and fuzzy logic classifier were proposed for diagnosing heart disease [27]. 
An ensemble-based approach were proposed for automated diagnosis and screening of diabetic retinopathy. The proposed approach provided higher accuracy [28]. A moth-flame optimization algorithm were proposed, and the performance was compared with other nature-inspired algorithms [29]. A hybrid firefly-bat optimized fuzzy ANN classifier was proposed for predicting diabetes, and it performed well than other convolutional methods [30]. A hybrid metaheuristic algorithm was proposed by techniques such as whale optimization algorithm and SA [31]. The combination of the elemental analysis of diabetic toenails and ML approaches was proposed to classify type-2 diabetes [32]. Cox proportional hazard, a regression-based method, was implemented to detect cardiovascular disease at an early stage [33]. A model was proposed for predicting the risk of gestational diabetes [34]. An RF classifier was used for predictions in the proposed algorithm DMP-MI to classify diabetes mellitus [35]. The aforementioned works show the wider implementation of various ML algorithms in medical diagnosis. It is observed that most of the previous works were carried out with the prime focus on a performance measure “accuracy” to evaluate the performance of a classifier. This article proposes an integration of SVM, PCA, and moth-flame optimization techniques for predicting the class labels of diabetic retinopathy. The proposed integrated approach evaluates the performance of a classifier using the measure “accuracy” as the same as in the previous works. Besides, measures such as sensitivity, recall, specificity, precision, and FF1-score are also used. The proposed model contributes to a comprehensive analysis of ML algorithms’ performance for the aforementioned measures and the classification of class labels.2Proposed methodology and implementationTo implement the proposed methodology, the dataset diabetic retinopathy is retrieved from the UCI ML repository. 
The techniques used, namely normalization, PCA, and SVM, are briefly described as follows. The dimensions (features) of a dataset may be measured on different scales. If the dataset is used directly for computation, the dimensions with larger magnitudes may dominate those with smaller magnitudes, and the result obtained is of little use for decision-making. It is therefore necessary to scale the data so that all the dimensions are at the same level. The normalization technique scales the data and removes anomalies existing in the data. It is a preprocessing technique generally applied to the dataset before proceeding with the analysis. The normalized data usually lie between 0 and 1.

The main function of PCA is to reduce the dimensions of a dataset, which is also referred to as dimensionality reduction. When a dataset contains many dimensions, several of them might be highly correlated. This problem is referred to as multicollinearity, and its presence affects the quality of data analysis. PCA is well suited to dealing with the multicollinearity problem. The main elements in PCA are principal components (PCs). PCA generates as many PCs as there are input dimensions, ordered by the variance they explain: the PC with the highest variance comes first, followed by the next highest, and so on.

SVM is one of the most widely used supervised ML techniques. It is largely used for classification of high-dimensional data and can be applied to both linearly and non-linearly separable problems. The key elements of the SVM are hyperplanes: data are classified by identifying the hyperplane that separates the classes. Support vectors are the data points closest to the separating hyperplane, and they determine its position. In Section 2.1, the description of the dataset is given. In Sections 2.2 and 2.3, the algorithm and the objective function for the moth-flame optimization technique are described.
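The scaling step described above can be illustrated with a short example. This is a minimal sketch that assumes min-max scaling, which matches the statement that normalized values lie between 0 and 1; a standard scaler, by contrast, rescales each feature to zero mean and unit variance. The function name `min_max_normalize` and the toy data are hypothetical and are not taken from the article.

```python
from typing import List


def min_max_normalize(rows: List[List[float]]) -> List[List[float]]:
    """Scale every column of `rows` to the range [0, 1] (min-max scaling)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for row in rows:
        scaled = []
        for value, low, high in zip(row, lows, highs):
            span = high - low
            # A constant column carries no information; map it to 0.0.
            scaled.append((value - low) / span if span else 0.0)
        normalized.append(scaled)
    return normalized


# Hypothetical toy data: column 0 is on a much larger scale than column 1.
toy = [[100.0, 0.1], [300.0, 0.5], [500.0, 0.9]]
scaled_toy = min_max_normalize(toy)
```

After scaling, both columns span the same range, so neither dominates a distance-based or margin-based classifier purely because of its units.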
The flow of activities in the proposed model is shown in Figure 1.

Figure 1: Flowchart for the proposed integrated ML approach.

As an initial step, the collected diabetic retinopathy dataset is input to the proposed model. The details of the retinal images in this dataset are used to classify the images and to decide whether any symptoms of diabetic retinopathy are present. Before the ML algorithms are applied, it is important to normalize the data. This normalization can be done by using a standard scaler. After normalization, the ML algorithms DT, NB, RF, and SVM are applied individually to the dataset. Next, the PCA technique is applied to reduce the number of features in the dataset. This feature reduction will enable all the features to be at the same level, which means that the domination of higher-order features is avoided. The reduced features are then input to the aforesaid ML algorithms, and the performance of the ML algorithms before and after PCA is observed. If high performance is observed, the most recently applied technique is retained. Otherwise, to improve the performance of the ML algorithms, the results of the previously applied techniques are fed into the proposed integrated approach, i.e., SVM + PCA + moth-flame optimization. Then, the performance of the ML algorithms is evaluated and compared with respect to the performance measures discussed in Section 1.1. From the comparative analysis, the ML approach that outperforms all the others is taken to give the most reliable classification of class labels.

2.1 Dataset description

The diabetic retinopathy dataset comprises 20 features extracted from the Messidor image set. The features extracted from an image indicate the presence or absence of diabetic retinopathy. They provide information on any detected injury/lesion or a description of the image.
All 20 features are numbered from 0 through 19 as presented in Table 1. Feature 0 holds the quality of the image. If lesions can be identified effectively in the captured image, the image is said to have good quality; otherwise, it has bad quality. This quality is represented with the binary values 1 (good quality) and 0 (bad quality). Feature 1 holds the result of pre-screening and is also binary: the value 1 represents a severe abnormality in the retina, and 0 represents no abnormality. Features 2 to 7 hold the number of microaneurysms (MA) detected. Microaneurysms cause blood leakage into the tissues of the retina. These MA counts are reported at confidence levels from 0.5 through 1. Features 8 to 15 hold the normalized values related to exudates and are represented in the same way as features 2 to 7. The normalization brings all the features to the same level. Feature 16 gives the Euclidean distance between the centers of the macula and the optic disc. Feature 17 gives the optic disc diameter. The optic disc is the starting point of the retinal blood vessels. Feature 18 holds the binary result of the classification based on the modulations AM and FM. Finally, feature 19 holds the binary class label: the value 1 represents the presence of symptoms of diabetic retinopathy, and 0 represents no symptoms.

Table 1: Description of dataset features

| Feature number | Description of feature |
|---|---|
| 0 | Image quality as a binary value: 1 = good quality, 0 = bad quality |
| 1 | Pre-screening information as a binary value: 1 = severe abnormality in the retina, 0 = no abnormality |
| 2–7 | Number of MAs detected at confidence levels 0.5 through 1, respectively. MA causes retinal blood leakage |
| 8–15 | Same as 2–7 for exudates. These are normalized to bring all the features to the same level |
| 16 | Euclidean distance between the centers of the macula and optic disc |
| 17 | Optic disc diameter |
| 18 | Binary result of the classification based on amplitude modulation (AM) and frequency modulation (FM) |
| 19 | Class label as a binary value: 1 = symptoms of diabetic retinopathy, 0 = no symptoms |

2.2 Proposed algorithm

The general working of the moth-flame optimization technique is described as follows. Moth-flame optimization is a population-based technique in which both moths and flames are candidate solutions. The moths are the agents that move through the search space, and the flames are the best positions found so far. The difference between the two lies in how they are updated at each iteration. Because the best positions are updated at each iteration, the moth never loses the best position found.
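The population-based search described above can be sketched in a few lines of Python. This is a minimal sketch assuming the commonly published form of the moth-flame update, in which each moth moves on a logarithmic spiral around a flame, the spiral amplitude is the moth-to-flame distance, and the number of flames shrinks over the iterations. The function name `moth_flame_minimize`, its parameters, the spiral constant `b = 1`, and the sphere test function are illustrative assumptions and not the authors' implementation.

```python
import math
import random
from typing import Callable, List, Tuple


def moth_flame_minimize(
    fitness: Callable[[List[float]], float],
    dim: int,
    lower: float,
    upper: float,
    n_moths: int = 20,
    max_iter: int = 100,
    b: float = 1.0,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Minimize `fitness` over [lower, upper]^dim with a basic moth-flame search."""
    rng = random.Random(seed)
    # Random initial moth positions inside the search bounds.
    moths = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_moths)]
    flames: List[Tuple[float, List[float]]] = []  # (fitness, position), best first

    for it in range(max_iter):
        scored = [(fitness(m), m[:]) for m in moths]
        # Flames keep the best n_moths positions seen so far.
        flames = sorted(flames + scored, key=lambda pair: pair[0])[:n_moths]
        # The number of flames shrinks from n_moths towards 1.
        n_flames = round(n_moths - it * (n_moths - 1) / max_iter)
        # z decreases linearly from -1 towards -2 over the iterations.
        z = -1.0 - it / max_iter
        for i, moth in enumerate(moths):
            # Moths beyond the flame count share the last remaining flame.
            flame = flames[min(i, n_flames - 1)][1]
            for d in range(dim):
                dist = abs(flame[d] - moth[d])
                c = (z - 1.0) * rng.random() + 1.0  # c lies in [z, 1]
                new_pos = dist * math.exp(b * c) * math.cos(2 * math.pi * c) + flame[d]
                moth[d] = min(max(new_pos, lower), upper)  # keep inside bounds

    # Include the moths produced by the final move before picking the best.
    scored = [(fitness(m), m[:]) for m in moths]
    best_fit, best_pos = min(flames + scored, key=lambda pair: pair[0])
    return best_pos, best_fit


def sphere(x: List[float]) -> float:
    """Hypothetical test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_pos, best_fit = moth_flame_minimize(sphere, dim=2, lower=-10.0, upper=10.0)
```

Because the flames always retain the best positions evaluated so far, the returned fitness can never be worse than that of the best initial moth. Using the search for feature selection, as in this article, would additionally require an encoding of feature subsets and a classifier-based fitness, which this sketch does not attempt.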
The steps of the proposed technique are given in Algorithm 1.

Algorithm 1: Algorithm for moth-flame optimization technique
1: Initiate the parameters.
2: Initiate the generation of moths randomly.
3: Identify the fitness functions and mark the best positions of flames.
4: Update flame numbers.
5: Calculate distance related to moth.
6: Update the positions related to moth.
7: Repeat the steps 2–6 until the expected criteria are achieved.
8: If criteria are achieved, report the best positions of moths.

2.3 Objective function

The objective function of moth-flame is given in equations (1)–(11) as follows. The initialization of moths is represented as a matrix in equation (1):

$$M=\left[\begin{array}{ccccc}{b}_{1,1}& {b}_{1,2}& \ldots & \ldots & {b}_{1,q}\\ {b}_{2,1}& {b}_{2,2}& \ldots & \ldots & {b}_{2,q}\\ \vdots & \vdots & \vdots & \vdots & \vdots \\ {b}_{p,1}& {b}_{p,2}& \ldots & \ldots & {b}_{p,q}\end{array}\right], \tag{1}$$

where $p$ = total number of moths and $q$ = total number of variables. The fitness function for moths is given in equation (2):

$$\mathrm{FM}=\left[\begin{array}{c}{\mathrm{FM}}_{1}\\ {\mathrm{FM}}_{2}\\ \ldots \\ {\mathrm{FM}}_{q}\end{array}\right]. \tag{2}$$

The initialization of flames is represented as a matrix in equation (3):

$$N=\left[\begin{array}{ccccc}{c}_{1,1}& {c}_{1,2}& \ldots & \ldots & {c}_{1,q}\\ {c}_{2,1}& {c}_{2,2}& \ldots & \ldots & {c}_{2,q}\\ \vdots & \vdots & \vdots & \vdots & \vdots \\ {c}_{p,1}& {c}_{p,2}& \ldots & \ldots & {c}_{p,q}\end{array}\right], \tag{3}$$

where $p$ = total number of flames and $q$ = total number of variables.
The fitness function for flames is given in equation (4):

$$\mathrm{FN}=\left[\begin{array}{c}{\mathrm{FN}}_{1}\\ {\mathrm{FN}}_{2}\\ \ldots \\ {\mathrm{FN}}_{q}\end{array}\right]. \tag{4}$$

The mathematical model of the moth-flame optimization technique is represented as a three-tuple in equation (5):

$$\mathrm{MNF}=(G,H,T). \tag{5}$$

$G$ generates the random population of moths and the corresponding fitness values, as given in equation (6):

$$G:\phi \to \{M,\mathrm{FM}\}. \tag{6}$$

The function $H$ decides the moth movement for finding the best position of the flame and updates it every time, as expressed in equation (7):

$$H:M\to M. \tag{7}$$

The function $T$ determines whether the termination criterion is satisfied (true) or not (false):

$$T:M\to \{\mathrm{true},\mathrm{false}\}. \tag{8}$$

The equation for updating the position is given in equation (9):

$${M}_{k}=S({M}_{j},{N}_{k}), \tag{9}$$

where $S$ = spiral function, ${M}_{j}$ = $j$th moth, and ${N}_{k}$ = $k$th flame. The logarithmic spiral path of the moth is given in equation (10):

$$S({M}_{k},{N}_{l})={N}_{k}\cdot {\mathrm{e}}^{wc}\cdot \cos (2\pi c)+{N}_{l}, \tag{10}$$

where $w$ = constant and the values of $c$ lie between $-1$ and 1. The distance between the $j$th moth and the $l$th flame is given in equation (11):

$$\left\{\begin{array}{l}D=|{N}_{l}-{M}_{j}|\\ c=(z-1)\ast \mathrm{rand}+1,\end{array}\right. \tag{11}$$

where the value of $z$ varies between $-1$ and $-2$, and a smaller value of $z$ represents that the moth is closer to the flame.

3 Results and discussion

The ML algorithms were implemented on the diabetic retinopathy dataset retrieved from the UCI ML repository. The performance of the proposed integrated ML model is evaluated using the measures precision, F1-score, specificity, accuracy, recall, and sensitivity, given in equations (12)–(16). Precision describes the correctness and is given in equation (12).
Recall/sensitivity represents the completeness and is expressed in equation (13). F1-score is described in equation (14). Accuracy characterizes the overall correctness and is given in equation (15). Specificity is described in equation (16).

$$\text{Precision}=\frac{\text{PT}}{\text{PF}+\text{PT}}, \tag{12}$$

$$\text{Recall/sensitivity}=\frac{\text{PT}}{\text{NF}+\text{PT}}, \tag{13}$$

$$F\text{1-score}=\frac{2\ast \text{Precision}\ast \text{Recall}}{\text{Precision}+\text{Recall}}, \tag{14}$$

$$\text{Accuracy}=\frac{\text{PT}+\text{NT}}{\text{PF}+\text{PT}+\text{NF}+\text{NT}}, \tag{15}$$

$$\text{Specificity}=\frac{\text{NT}}{\text{PF}+\text{NT}}, \tag{16}$$

where PT = true positives, PF = false positives, NT = true negatives, and NF = false negatives. The simulation results of the ML algorithms and the corresponding performance measures are described as follows. First, four popular ML algorithms, namely, DT, NB, RF, and SVM, are applied to the dataset. Figure 2 depicts the performance of these algorithms.

Figure 2: Performance of ML algorithms on the original dataset.

From Figure 2, it can be observed that the DT classifier has achieved 57% of precision, recall, and F1-score, and 57.1, 54.5, and 59.2% of accuracy, sensitivity, and specificity, respectively. The NB classifier has achieved 64% of precision, 63% of recall and F1-score, and 63.2, 64.2, and 62.4% of accuracy, sensitivity, and specificity, respectively. The RF classifier has achieved 70% of precision, 69% of recall and F1-score, and 68.8, 76.2, and 63% of accuracy, sensitivity, and specificity, respectively.
The SVM has achieved 79% of precision, 76% of recall and F1-score, and 76.1, 89.4, and 65.3% of accuracy, sensitivity, and specificity, respectively. When these performances are compared with each other, it is observed that the SVM outperforms the other classifiers.

The dataset is then fed into PCA for dimensionality reduction. PCA has reduced the dataset from 20 dimensions to 12 dimensions. These reduced features of the dataset are given as input to the aforementioned classifiers. Figure 3 depicts the results obtained after applying the ML algorithms to the reduced dimensions.

Figure 3: Performance of ML algorithms after reducing dimensions using PCA.

From Figure 3, it can be observed that the DT classifier has achieved 67% of precision, recall, and F1-score, and 67.09, 66.3, and 67.6% of accuracy, sensitivity, and specificity, respectively. The NB classifier has achieved 61% of precision, 60% of recall and F1-score, and 59.7% each of accuracy, sensitivity, and specificity. The RF classifier has achieved 69% of precision, recall, and F1-score, and 69.2, 67.8, and 70.4% of accuracy, sensitivity, and specificity, respectively. The SVM has achieved 71% of precision, 66% of recall and F1-score, and 65.8, 81.3, and 55.7% of accuracy, sensitivity, and specificity, respectively. Figure 3 shows that the performance of NB and SVM dropped when PCA was applied, while RF changed only marginally in accuracy (68.8 to 69.2%) but lost sensitivity (76.2 to 67.8%). The performance of DT, in contrast, improved with PCA. The main reason for the degradation of performance is dimensionality reduction. It is understood that these classifiers work better with more samples and dimensions.

To achieve better performance of the PCA-based classifiers, the moth-flame optimization technique is applied. This technique eliminates the attributes that affect the performance of the classification model negatively. It chooses the optimal features/attributes that have a positive impact on performance.
By implementing the moth-flame technique, the 12 dimensions were further reduced to 9. Based on this reduction, it is understood that three dimensions were affecting the performance of the model negatively. These three dimensions are eliminated by the moth-flame algorithm. Figure 4 depicts the result of the model with PCA and moth-flame optimization techniques.

Figure 4: Performance of ML algorithms after implementation of moth-flame optimization technique.

Figure 4 shows that the DT classifier has achieved 68% of precision, recall, and F1-score, and 68.3, 67.4, and 69.2% of accuracy, sensitivity, and specificity, respectively. The NB classifier has achieved 62% of precision, 63% of recall and F1-score, and 61.3, 61.8, and 62.1% of accuracy, sensitivity, and specificity, respectively (values as listed in Table 2). The RF classifier has achieved 80% of precision, 79% of recall and F1-score, and 78.2, 85.8, and 75.1% of accuracy, sensitivity, and specificity, respectively. The SVM has achieved 86% of precision, recall, and F1-score, and 86.3, 94.2, and 75.2% of accuracy, sensitivity, and specificity, respectively. From Figure 4, it is evident that all the classifiers improve over their PCA-only results with the addition of the moth-flame optimization technique, with the largest gains for RF and SVM; NB remains below its performance on the original dataset. This technique eliminated the dimensions that impact the performance of the classifiers negatively. It is observed that the integration of SVM, PCA, and moth-flame outperformed all the other ML algorithms. The performance of all ML algorithms is summarized in Table 2. In the table, PC, MF, Prscn, Rcl, F1, Acc, Sens, and Spec refer to principal component analysis, moth-flame, precision, recall, F1-score, accuracy, sensitivity, and specificity, respectively. It can be noticed that the proposed integrated model of SVM, PCA, and moth-flame has achieved higher performance than the other ML algorithms.
The high performance of this model indicates that most class labels are classified correctly.

Table 2: Performance summary of ML algorithms (all values in %)

| ML algorithms | Prscn | Rcl | F1 | Acc | Sens | Spec |
|---|---|---|---|---|---|---|
| DT | 57 | 57 | 57 | 57.1 | 54.5 | 59.2 |
| NB | 64 | 63 | 63 | 63.2 | 64.2 | 62.4 |
| RF | 70 | 69 | 69 | 68.8 | 76.2 | 63 |
| SVM | 79 | 76 | 76 | 76.1 | 89.4 | 65.3 |
| DT + PC | 67 | 67 | 67 | 67.09 | 66.3 | 67.6 |
| NB + PC | 61 | 60 | 60 | 59.7 | 59.7 | 59.7 |
| RF + PC | 69 | 69 | 69 | 69.2 | 67.8 | 70.4 |
| SVM + PC | 71 | 66 | 66 | 65.8 | 81.3 | 55.7 |
| DT + PC + MF | 68 | 68 | 68 | 68.3 | 67.4 | 69.2 |
| NB + PC + MF | 62 | 63 | 63 | 61.3 | 61.8 | 62.1 |
| RF + PC + MF | 80 | 79 | 79 | 78.2 | 85.8 | 75.1 |
| SVM + PC + MF | 86 | 86 | 86 | 86.3 | 94.2 | 75.2 |

4 Conclusion

To identify diabetic retinopathy, this article proposed an integrated approach of ML algorithms and achieved high performance. The dataset retrieved from the UCI ML repository is used for the proposed approach. The key observations of the proposed integrated approach are as follows:

– From Figure 2, it is observed that when the ML algorithms are implemented individually, SVM outperforms the other ML algorithms.
– From Figure 3, it is evident that the reduction of the dimensions using the PCA technique has negatively influenced the performance of the majority of the ML algorithms.
– From Figure 4, it is understood that the integration of SVM, PCA, and moth-flame optimization techniques improved the performance of classification and identified the class labels more accurately.
– From Table 2, it is easy to interpret and compare the performances achieved by the implemented ML algorithms.

Hence, the proposed integrated approach of ML algorithms is useful for detecting diabetic retinopathy to prevent blindness.
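For readers who want to recompute the measures in equations (12)–(16), the following minimal sketch derives them from the four confusion-matrix counts. The function name `classification_measures` and the example counts are hypothetical and chosen only for illustration; they are not results from this article. The sketch assumes that none of the denominators is zero.

```python
from typing import Dict


def classification_measures(pt: int, pf: int, nt: int, nf: int) -> Dict[str, float]:
    """Compute the measures of equations (12)-(16) from confusion-matrix counts.

    pt, pf, nt, nf are the numbers of true positives, false positives,
    true negatives, and false negatives. Denominators are assumed non-zero.
    """
    precision = pt / (pf + pt)                           # equation (12)
    recall = pt / (nf + pt)                              # equation (13)
    f1 = 2 * precision * recall / (precision + recall)   # equation (14)
    accuracy = (pt + nt) / (pf + pt + nf + nt)           # equation (15)
    specificity = nt / (pf + nt)                         # equation (16)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
        "specificity": specificity,
    }


# Hypothetical counts, for illustration only.
measures = classification_measures(pt=80, pf=20, nt=70, nf=30)
```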

Journal

Open Computer Science, de Gruyter

Published: Jan 1, 2022

Keywords: diabetic retinopathy; support vector machine; machine learning; moth-flame optimization; classification; measures; principal component analysis
