References

W. Louis, M. Komeili, D. Hatzinakos (2016). Continuous authentication using one-dimensional multi-resolution local binary patterns (1DMRLBP) in ECG biometrics. IEEE Transactions on Information Forensics and Security, 11.
L. Biel, O. Pettersson, L. Philipson, P. Wide (1999). ECG analysis: a new approach in human identification. Proceedings of the 16th IEEE Instrumentation and Measurement Technology Conference (IMTC/99), 1.
P. Royston (1995). Remark AS R94: a remark on Algorithm AS 181: the W-test for normality. Journal of the Royal Statistical Society, Series C (Applied Statistics), 44.
P. Spachos, J. Gao, D. Hatzinakos (2011). Feasibility study of photoplethysmographic signals for biometric identification. 2011 17th International Conference on Digital Signal Processing (DSP).
I. Rodríguez-Luján, G. Bailador, C. Ávila, A. Herrero, G. Vidal-de-Miguel (2013). Analysis of pattern recognition and dimensionality reduction techniques for odor biometrics. Knowledge-Based Systems, 52.
J. Pan, W. Tompkins (1985). A real-time QRS detection algorithm. IEEE Transactions on Biomedical Engineering, BME-32.
J. Carvalho, V. Sá, S. Magalhães, H. Santos (2015). Enrollment time as a requirement for biometric hand recognition systems.
S. Roweis, L. Saul (2000). Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500).
T. Bailey, C. Elkan (1994). Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proceedings of the International Conference on Intelligent Systems for Molecular Biology, 2.
M. Holi (2011). Electromyography analysis for person identification.
M. Tipping, C. Bishop (1999). Probabilistic principal component analysis. Journal of the Royal Statistical Society, Series B (Statistical Methodology), 61.
S. Pouryayevali, S. Wahabi, S. Hari, D. Hatzinakos (2014). On establishing evaluation standards for ECG biometrics. 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
P. McSharry, G. Clifford, L. Tarassenko, L. Smith (2003). A dynamical model for generating synthetic electrocardiogram signals. IEEE Transactions on Biomedical Engineering, 50.
S. Shapiro, M. Wilk (1965). An analysis of variance test for normality (complete samples). Biometrika, 52.
V. Sá, S. Magalhães, H. Santos (2014). Enrolment time as a requirement for biometric fingerprint recognition. International Journal of Electronic Security and Digital Forensics, 6.
G. Clifford, F. Azuaje, P. McSharry (2006). Advanced Methods and Tools for ECG Data Analysis.
J. P. Royston (1983). Some techniques for assessing multivariate normality based on the Shapiro-Wilk W. Journal of the Royal Statistical Society, Series C (Applied Statistics), 32.
K. Phua, J. Chen, T. Dat, L. Shue (2008). Heart sound as a biometric. Pattern Recognition, 41.
M. Belkin, P. Niyogi (2003). Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15.
S. Raudys, V. Valaitis, Z. Pabarskaite, G. Biziuleviciene (2015). A price we pay for inexact dimensionality reduction.
D. Reynolds, R. Rose (1995). Robust text-independent speaker identification using Gaussian mixture speaker models. IEEE Transactions on Speech and Audio Processing, 3.
Odugu, Reddy, Prasad (2015). Face recognition using fused spatial patterns.
C. Mecklin, D. Mundfrom (2005). A Monte Carlo comparison of the Type I and Type II error rates of tests of multivariate normality. Journal of Statistical Computation and Simulation, 75.
J. Lu, K. Plataniotis, A. Venetsanopoulos (2003). Regularized discriminant analysis for the small sample size problem in face recognition. Pattern Recognition Letters, 24.
P. Viola, M. Jones (2001). Rapid object detection using a boosted cascade of simple features. Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), 1.
W. Louis, M. Komeili, D. Hatzinakos (2016). Real-time heartbeat outlier removal in an electrocardiogram (ECG) biometric system. 2016 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE).
W. Louis, D. Hatzinakos, A. Venetsanopoulos (2014). One-dimensional multi-resolution local binary patterns features (1DMRLBP) for regular electrocardiogram (ECG) waveform detection. 2014 19th International Conference on Digital Signal Processing.
I. Odinaka, P.-H. Lai, A. Kaplan, J. O'Sullivan, E. Sirevaag, J. Rohrbaugh (2012). ECG biometric recognition: a comparative analysis. IEEE Transactions on Information Forensics and Security, 7.
A. Whitney (1971). A direct method of nonparametric measurement selection. IEEE Transactions on Computers, C-20.
(2005). Biometric liveness detection.
J. Lu, K. Plataniotis, A. Venetsanopoulos (2005). Regularization studies of linear discriminant analysis in small sample size scenarios with application to face recognition. Pattern Recognition Letters, 26.
(2007). Roystest: Royston's multivariate normality test.
S. Pouryayevali (2015). ECG Biometrics: New Algorithm and Multimodal Biometric System.
J.-F. Hu, Z. Mu (2013). Authentication system for biometric applications using mobile devices. Applied Mechanics and Materials, 457-458.
J. Tenenbaum, V. de Silva, J. Langford (2000). A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500).
M. Qiu, J. Zhang, J. Yang, L. Ye (2015). Fusing two kinds of virtual samples for small sample face recognition. Mathematical Problems in Engineering, 2015.
M. Thaler (1988). The Only EKG Book You'll Ever Need.
L. Breiman (1996). Bagging predictors. Machine Learning, 24.
L. Biel, O. Pettersson, L. Philipson, P. Wide (2001). ECG analysis: a new approach in human identification. IEEE Transactions on Instrumentation and Measurement, 50.
Abstract

The electrocardiogram (ECG) is a slow signal to acquire, and it is prone to noise. It can be inconvenient to collect a large number of ECG heartbeats to train a reliable biometric system; this can result in the small-sample-size phenomenon, which occurs when the number of training observations is much smaller than the dimensionality of the data being modeled. In this paper, we study the Gaussianity of ECG heartbeats and generate synthesized data to increase the number of observations. Data synthesis is based on our hypothesis, which we support experimentally, that ECG heartbeats exhibit a multivariate normal distribution; therefore, one can generate ECG heartbeats from such a distribution. The distribution deviates from Gaussianity because of internal and external factors that change ECG morphology, such as noise, diet, and physical and psychological changes, but we attempt to capture the underlying Gaussianity of the heartbeats. When this method was implemented in a biometric system and examined on the University of Toronto database of 1012 subjects, an equal error rate (EER) of 6.71% was achieved, compared with 9.35% for the same system without data synthesis. Dimensionality reduction is widely examined for the small-sample-size problem; however, our results suggest that the proposed data synthesis outperformed several dimensionality reduction techniques by at least 3.21% in EER. With a small sample size, classifier instability becomes a bigger issue, and we used a parallel classifier scheme to reduce it. Each classifier in the parallel classifier is trained with the same genuine dataset but a different imposter dataset. The parallel classifier reduced the instability of the predictors' true acceptance rate from a standard deviation of 6.52% to 1.94%.
Keywords: Pattern recognition, Electrocardiogram, Data synthesis, Outlier removal

1 Introduction

The electrocardiogram (ECG) signal is a quasi-periodic signal with a frequency of 1–1.5 heartbeats per second. It is a recording of the electrical activity of the heart. An ECG signal consists of ECG heartbeats, and each healthy heartbeat has the fiducial points P, Q, R, S, T, and U, as illustrated in Fig. 1. Heartbeats have recently been used as a biometric modality. Biometrics is the field of study that models people's identities using their physical or behavioral traits [1]. Since the millennium [2], research attention to biometrics based on signals that are available from all human beings and that are hard to spoof has increased. Some of the biomedical signals that have been used as biometrics are the electromyogram (EMG) [3], a muscle signal; the phonocardiogram (PCG) [4], the heart sound; the photoplethysmogram (PPG) [5], an organ's volumetric measure; the electroencephalogram (EEG) [6], the brain's electrical signal; and the ECG [7]. Among all these medical signals, the ECG is widely used and studied worldwide to diagnose heart problems. Therefore, apart from the extensive knowledge about the ECG signal established by the scientific community, inexpensive sensing devices to acquire the signal have been produced. For this reason, ECG biometrics can be an inexpensive system to deploy.

Biometric systems require a training stage (interchangeably called the enrollment stage) to verify/identify individuals. During the training stage, subjects' identities are modeled and stored in a database. Intuitively, the bigger the sample size (the number of observations), the better the model. However, collecting a large amount of training data can sometimes be troublesome.

*Correspondence: wlouis@ece.utoronto.ca. The Edward S. Rogers Sr. Department of Electrical & Computer Engineering, Faculty of Applied Science and Engineering, University of Toronto, Toronto, Canada.
For example, in forensic Canada Full list of author information is available at the end of the article applications, one may have few fingerprints or mug shots © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. Louis et al. EURASIP Journal on Bioinformatics and Systems Biology (2017) 2017:5 Page 2 of 10 on people’s patience in [10–12]. This enrollment session length can provide a possible range of 20–30 heartbeats. We arrived at this number based on observations from the outlier removal experiment in a subsequent section and our work in [13]. We propose two contributions in this paper: the first is to synthesize ECG heartbeats to increase the number of observations, and second, to study the Gaussianity of ECG signals. Furthermore, due to the small sample size, instability in subjects models occurs; hence, we stabilize the model by fusing several classifiers in a parallel scheme. Thispaperisorganized such thatSection2reviews the literature. Section 3 presents the examined database, method of evaluation, and the preprocessing stage along with heartbeat data synthesis and the parallel classi- fier. Section 4 provides experiments and results. Lastly, Fig. 1 ECG heartbeat with fiducial points Section 5 concludes this paper. 2 Literature review of a subject to model. Collecting training data can also be The problem of having a small sample size persists among expensive and inconvenient. For instance, an ECG signal most biometric systems. 
Several approaches are available would require minutes of clean data acquisition to con- to tackle this problem such as dimensionality reduction, struct a distinctive dataset. Sparing such amount of time data synthesis, and cascade classifiers which deal with might not be feasible. An airport is a fast-paced environ- data imbalance. ment example where requiring minutes to collect data is Most of the work in the literature apply dimension- not preferred. ality reduction techniques. In [14], the authors claimed The most common configuration to set up ECG elec- that when dimensionality reduction is used, the accuracy trodes is the 12-lead configuration which uses ten elec- increases when sample size increases; however, it starts trodes. Six of the ten electrodes are connected to the decreasing when a specific sample size is reached. Feature chest, and four electrodes are connected to the limbs. selection and feature extraction are other approaches to Misplacing the electrodes affects the acquired ECG sig- handle the small sample size issue, and they are similar nal morphology [8]. Using the 12-lead configuration as in concept to dimensionality reduction. The work in [15] a wearable device may not be very attractive due to its acknowledged the dimensionality reduction problem but inconvenient electrode setup. Other configurations such claimed that using support vector machine (SVM) can be as a 1-lead configuration [9], which collects ECG signals a viable approach since it generalizes with small sample from fingertips using three electrodes, are more appeal- size and high dimensional space. On the other hand, the ing. However, it is more prone to noise than the 12-lead work in [16] reported that SVM underperformed when configuration. compared to bagging classification in ECG biometrics. In In this paper, we tackle the problem of having a small [17], the authors examined several dimensionality reduc- sample size. 
There are two issues that give rise to this tion techniques with different feature selection methods problem. First, the signal is noisy; thus, removing noisy (Wrapper and ReliefF), feature extraction (principal com- heartbeats reduces the number of observations in the ponent analysis (PCA)), and classifiers (K-nearest neigh- dataset. Also, the ECG heartbeat is a slow signal to acquire bor, linear discriminative analysis (LDA), Naive Bayes, especially if compared to other biometrics traits such SVM, and others). It was demonstrated that the highest as video-based face recognition where it is possible to accuracy was achieved using ReliefF and PCA since they stream 30 frames per second. For practical applications, better generalize the data. In [18, 19], the authors pro- the extent of people’s patience to cooperate and pro- posed quadratic-like discriminative analysis. In this paper, vide their data for enrollment has been recently studied we compare our proposed work to several dimensionality [10–12]. reduction techniques. For this paper, we simulate a small sample size environ- Generating synthesized data is mostly examined in face ment by allocating a small number of observations to train recognition due to symmetry of the face. In [20], the a model. We chose an arbitrary number of 20 observa- authors generated mirror images of the original image tions as our baseline since we aimed to have 30–40 s of and generated extra left and right symmetrical images. In an enrollment session which we decided on from reports [21], the authors proposed a cascade classifier where each Louis et al. EURASIP Journal on Bioinformatics and Systems Biology (2017) 2017:5 Page 3 of 10 classifier was trained with a fixed number of samples to Eachsubjecthas adatarecodingof3minonaverage. reduce data imbalance. Lastly, some techniques are ori- We used the dataset of 1012 to achieve scalability in ented towards synthesizing ECG heartbeats, but they are low-performance variance. 
not for biometrics applications as in [22]. The work in [22] extracts all fiducial points in Fig. 1, and we argue that error 3.2 Method of evaluation in extracting these fiducial points negatively affects the Quantities and their calculations that are used throughout performance of a biometric system. this paper are explained in this section. False acceptance rate (FAR), false rejection rate (FRR), true acceptance rate 3 Methodology (TAR), true rejection rate (TRR), receiver operating char- Verification biometric system is the focus of this paper. acteristic (ROC) curve, and equal error rate (EER) were A verification biometric system is a binary classification the main measures used to assess the quality of the pro- problem to separate two classes: genuine and imposter. posed system. Each tested dataset has G + I samples, with The genuine class corresponds to data acquired from the G being the number of genuine heartbeats and I being the subject that needs to be modeled while the imposter class number of imposter ECG heartbeat samples. We define corresponds to data collected from subjects other than thenumberoftruepositive, nTP, asthenumberofcor- the genuine subject. The imposter class dataset is much rectly classified genuine heartbeats. Similarly, the number larger than the genuine class dataset since any subject of true negative, nTN, is defined as the number of cor- that is not genuine can be considered as an imposter. In rectly classified imposter heartbeats. Moreover, the num- two-class classification problems, classifiers need to be ber of false positive, nFP, is the number of misclassified trained with both genuine and imposter datasets in order imposter heartbeats as genuine heartbeats. Likewise, the to design a function that can separate them. If an imbal- number of false negative, nFN, is the number of misclassi- anced number of data is used, bias occurs and accuracy is fied genuine heartbeats as imposter heartbeats. Following sacrificed. 
If the number of imposter data is reduced to be these definitions: in balance with the number of genuine data, the biomet- nFP nTP ric system does not perform too well. Table 1 presents this FAR = ,FRR = 1 − (1) I G phenomena. In this paper, we propose to study the Gaussianity of Also TRR = 1 − FAR and TAR = 1 − FRR. ROC curves ECG signal then synthesize it based on a parametric measure the performance of a system in different operat- model (Gaussian) to increase sample size. The main point ing points. An ROC curve plots FRR versus FAR. Closely of increasing the sample size is to reduce the imbalance related is EER. EER is the error on the operating point for in number between genuine and imposter data. We also which FAR is equal to FRR. use a parallel classifier scheme to reduce instability in clas- sifiers. Before delving into the proposed work, the used 3.3 Preprocessing database throughout this paper along with the method of ECG signal is one among other human body-generated evaluation is presented. electrical signals. Other electrical and non-electrical sig- nals may interfere with ECG signal acquisition (e.g., EMG 3.1 University of Toronto database signal). Respiration also interferes with the acquisition Throughout the past century, clinics have collected sev- on the range of frequencies of 0.15–0.30 Hz [23]. Exter- eral ECG databases. However, most of these databases nal environment signals such as contact noise, power- are for medical purposes. In our work, we rely on the line interference (50 or 60 Hz), and electrode movements University of Toronto database (UofTDB). This database (1–10 Hz) are other sources of noise. A fourth-order was collected at the University of Toronto [9]. This paper band-pass Butterworth filter with cutoff frequencies of examined 1012 subjects. UofTDB was recorded from fin- 0.5–40 Hz was applied to the signal as a first stage of pre- gertips with single lead and with sampling rate of 200 Hz. processing. 
Afterwards, ECG signals were isolated into Table 1 Experiment illustrating data imbalance influence on accuracy # of imp. obs. 20 40 60 80 100 150 200 250 EER (%) 10.41 9.51 9.70 9.77 9.89 9.35 10.00 9.74 TRR (%) 89.59 95.38 97.50 98.21 98.80 99.32 99.56 99.70 TAR (%) 88.93 80.11 71.68 66.04 60.54 51.85 44.76 38.83 We used 20 observations for genuine data. Despite the fact that EER was not influenced greatly when number of imposter data increased, TAR has decreased significantly and TRR has increased. This suggests that the classifier became biased towards imposter data. EER, TAR, and TRR quantities and their calculations are explained in Section 3.2. TRR and TAR are calculated for the 50% decision threshold of selection between imposter and genuine classes EER equal error rate, TAR true acceptance rate, TRR true rejection rate Louis et al. EURASIP Journal on Bioinformatics and Systems Biology (2017) 2017:5 Page 4 of 10 heartbeats and were centered at the R peaks with 500-ms measurement. Log-likelihood measures quantitatively the duration from each side of the peak [16]. R peaks were likelihood that the tested data belong to the mixture. detected using Pan-Tompkins [24]. Choosing the minimum negative log-likelihood is equiva- After segmenting the signal, we removed outliers lent to choosing the maximum likelihood. using the Gaussian mixture model (GMM) online out- GMM with two components (GMM, M =2) was trained lier removal in [13]. If we model normal heartbeats, then on a dataset of normal heartbeats. GMM, M = 2was used any heartbeat with statistics significantly different from in particular due to our previous work results in [13]. The the normal heartbeat model is classified as an abnor- collection of normal heartbeats was conducted by remov- mal heartbeat. Hence, we constructed a normal heartbeat ing abnormal ECG heartbeats from the examined pool model. For the task, normal heartbeat segments were of heartbeats. 
A heartbeat that was significantly different collected to train the GMM. We used the GMM as a one- from healthy ECG morphology which contains P, Q, R, class classifier unlike the usual work in the literature which S, T, and U fiducial points was considered as an abnor- uses it as an unsupervised clustering method. GMM is a mal heartbeat. In other words, the R peak of the heart- sum of M-weighted Gaussian densities [25] given by beats were first detected by Pan-Tompkins algorithm, then these heartbeats were manually inspected to ensure they follow the morphology in Fig. 1 to decide whether P(x) = w p(x, μ , C ) (2) m m m they are normal or abnormal heartbeats. During biomet- ric system experiments, every heartbeat in the examined are the weights of the Gaussian densities, where w database was passed through this outlier removal to mea- w = 1. x is a k dimensional feature vector. There- sure heartbeat quality and to decide whether to keep (i.e., fore, the probability density function, p(x, μ , C ),is m m classify as normal) or to eliminate (i.e., classify as abnor- 1 1 T −1 − (x−μ ) C (x−μ ) mal). Figure 2 demonstrates ECG signal heartbeats before m m 2 m p(x, μ , C ) = e m m (3) k 1 and after outlier removal. Table 2 presents the EER for 2 2 (2π) (|C |) the biometric system with and without outlier removal, where μ and C are the mean vector and the covariance m m and it also reports the number of observations exam- matrix, respectively. Also, |C | is the determinant of the ined. It can be noticed that almost half the observations covariance matrix. were removed by applying this method of outlier removal. If we have a vector of 200 features (i.e., k = 200), Other outlier removal approaches might be used, but the then each Gaussian distribution is of 200 dimensions. 
The GMM-based outlier removal is an online outlier removal motivation behind using the GMM was the assumption that depends on current and previous observations only, that normal ECG heartbeats could be modeled into M and it is subject invariant. Hence, it is more desirable in Gaussian densities, each in k dimensions. practical applications. Therefore, it was used in the paper. The expectation maximization (EM) [26] algorithm was Despite the achieved high accuracy, around 50% of the used to construct the GMM. EMconsiders all training heartbeats were classified as abnormal heartbeats; conse- examples and attempts to fit a Gaussian distribution on it. quently, using such outlier removal may give rise to the The training steps would be as the following: issue of small sample size. Also, for this reason, having 30–40 s of enrollment means we would collect an average 1. Compute the probability that the training sample x of 20 clean observations, which was used as a baseline in belongs to the Gaussian m using (i) (i) (i) this paper. w p(x,μ ,C ) (i) (i) m m m P(x|m) = ,where P(x|μ , C ) is m m M (i) (i) (i) w p(x,μ ,C ) m m j j used to indicate that these values depend on the 3.4 ECG heartbeats synthesis previous iteration We hypothesize that ECG heartbeats exhibit a multi- (i+1) 1 T 2. Estimate the new weight w = P(x |m) variate Gaussian distribution. However, the influence of m t t=1 P(x |m)x internal and external factors deviate the model from (i+1) t t t=1 3. Estimate the new mean μ = P(x |m) t Gaussianity. We attempt to capture this underlying Gaus- t=1 2(i+1) P(x |m)x t t 2 t=1 sianity. Each observation consisted of 200 time samples 4. Update the variance σ = − μ P(x |m) t=1 (random variables) since the sampling rate is 200 Hz, and we segmented the heartbeats to have a 1-s duration. As where T is the number of observations in the train- mentioned earlier and as shown in Table 1, we desired to ing dataset. 
There is no specific method for termination; generate data that can be appended to the genuine dataset however, it is usually based on a heuristic approach. to reduce data imbalance and to reduce bias towards 3.3.1 Evaluation procedure imposter dataset. n×k After obtaining the Gaussian models from the train- We modeled the genuine data X ∈ R ,where n is ing data, the evaluation was based on the log-likelihood thenumberofobservationsand k = 200 is the number Louis et al. EURASIP Journal on Bioinformatics and Systems Biology (2017) 2017:5 Page 5 of 10 Fig. 2 GMM, M = 2 model outlier removal. a Before outlier removal; b after applying GMM, M = 2 outlier removal Fig. 3 Synthesized data generation from multivariate Gaussian distribution. a Real heartbeats; b synthesized heartbeats of dimensions. Therefore, an observation x with k dimen- sions has probability density p(x) ∼ N (μ, ) such that: We used Royston’s test [27, 28] for multivariate normal- ity test. It is based on Shapiro-Wilk’s test [29], a univariate 1 1 t −1 − (x−μ) (x−μ) p(x) = e (4) normality test. Royston’s test checks normality of each k/2 1/2 (2π) || variable alone using Shapiro-Wilk’s test, then it combines where μ ∈ R is the mean of X, is the covariance matrix Shapiro-Wilk statistics into one statistics test for mul- of X, || is the determinant of the covariance matrix, and tivariate distribution. The combined multiple statistics −1 is the inverse of the covariance matrix. A synthesized 2 would approximate a χ random variable when the data observation is generated by drawing a random vector from is a multivariate Gaussian distribution. If W is Shapiro- this distribution. Wilk’s test of the jth variable in the multivariate data, then A set of data synthesis is in Fig. 3. This result was not Royston’s test, R [30, 31]: surprising. Prior to making such multivariate hypothe- g 2 sis, we analyzed the Gaussianity of the ECG heartbeat. 
1 (1 − W ) − m −1 R = φ φ − (5) 2 s Table 2 Biometric system performance with outlier removal system without limiting training sample to 20 observations Parameters g, m,ands are calculated from polynomial −1 approximation. φ(.), φ (.) are the CDF and its inverse for Method EER (%) No. of observations the Gaussian distribution, respectively. If we have p vari- No outlier removal 9.44 158,984 ates, then the aggregation of R in Eq. 6 would have a χ GMM, M = 2 5.94 78,655 distribution. Louis et al. EURASIP Journal on Bioinformatics and Systems Biology (2017) 2017:5 Page 6 of 10 R From this Gaussian model, we create the synthesized ECG H = e (6) heartbeats. j=1 e is the equivalent degree of freedom and is calculated 3.5 Parallel classifier to reduce instability as: The main purpose of data generation is to increase bio- metric system performance by making use of the abun- e = (7) dance of imposter dataset. The number of real genuine 1 + (p − 1)C observations is small; we restricted it to 20 observations. where C is calculated as the average of the correlations of On the other hand, we have thousands of imposter data. R s. Furthermore, we utilized Sequential Forward Selec- Due to small number of real genuine observations, clas- tion (SFS) [32] algorithm with Royston’s test on the train- sifiers’ structures change significantly depending on the ing dataset to investigate the number of variables that imposter data that train the classifiers. We propose to constitutes a multivariate normal distribution. The algo- use a parallel classifier structure, and Fig. 4 presents the rithm we implemented for multivariate Gaussian analysis scheme for it. All classifiers within the parallel classifier is in Algorithm 1. This algorithm incorporates SFS with were trained with same set of genuine training dataset, but Shapiro-Wilk’s and Royston’s tests. each classifier was trained with a different set of imposter data. 
The mean value of the confidences of the classifiers' outputs was used to make the classification decision.

Fig. 4 Parallel classifier scheme

Louis et al., EURASIP Journal on Bioinformatics and Systems Biology (2017) 2017:5

Algorithm 1: SFS for multivariate normality selection
    Data: X ∈ R^(n×k), X = {x_1, x_2, ..., x_k}
    Result: {S : S ⊆ X, and S is multivariate normal}
    Initialization: an empty set, S = {};
    while (1) do
        while i ≤ k do
            if i = 1 then
                Apply Shapiro-Wilk on S*, S* = {S ∪ x_i};
            else
                Apply Royston's test on S*, S* = {S ∪ x_i};
            end
        end
        Find x_h such that S* has the highest normality;
        Remove x_h from X; S := {S ∪ x_h};
        k := k - 1;
    end

After running Algorithm 1, ECG heartbeats could successfully exhibit multivariate normality with more than 20 variables out of the 200 variables. In other words, around 20 out of 200 dimensions could constitute a multivariate normal distribution. This multivariate Gaussianity helps us capture the underlying Gaussianity of the heartbeats and supports our hypothesis that ECG heartbeats most likely exhibit a multivariate Gaussian distribution if no changing factors affect their morphology. Experiments based on this assumption also improved biometric system performance. In other words, we assume that ECG heartbeats for each individual exhibit a multivariate Gaussian distribution; nevertheless, changes in ECG heartbeat morphology due to diet, physical and psychological changes, and other factors deviate the signal from Gaussianity.

4 Experimentation
This section investigates three main experiments: first, it presents the biometric system improvement that results from data synthesis; second, it compares the accuracy of the biometric system with data synthesis against systems with different dimensionality reduction techniques from the literature; and lastly, it demonstrates the parallel classifier performance. Throughout all experiments, the bagging classifier was used.

There are several classification methods in the literature, and bagging [33] is one of them. In a nutshell, bagging is a machine learning technique that builds predictors on re-sampled data; the aggregated average of the predictors makes the decision. Bagging was used in particular because we observed unstable classifier prediction when we examined ECG heartbeat data. It was unstable in the sense that a slight change in the training data led to a significant change in the construction of the classifier and a significant change in accuracy. Bagging usually reduces this issue [33], and work in [34] suggests the superiority of bagging over other classifiers.

Suppose a training dataset L is populated with data {y_n, x_n, n = 1, ..., N}, where y_n is the data class and x_n is the input data. From these samples, bagging generates multiple bootstrap samples, L^(B), from L. Each bootstrap sample L^(B) is constructed by drawing N samples with replacement from L, and for each L^(B) a predictor of the class y is found. The predictor used with bagging in this paper is the simple decision tree. The final decision on the class is made by voting.

4.1 Synthesized ECG heartbeat generation
This experiment reports the improvement achieved in the biometric system's EER, TAR, and TRR quantities. Synthetic data were generated as explained in Section 3.4, added to the pool of real genuine data, and used to train a bagging classifier. Table 3 presents an experiment in which the real genuine data were restricted to 20 observations per subject.

From Table 3, it can be noticed that the best EER among the examined experiments was achieved when we trained a classifier with 220 genuine observations (200 synthesized genuine data + 20 real genuine data) and 200 imposter data. Hence, adding data synthesis improves the results. One may ask why we do not consider the TAR of the experiment with 400 synthesized data and 20 imposter data as the best result. The reason is that TAR, unlike EER, which considers both TAR and TRR, ignores TRR. The TRR for that same experiment (400 synthesized data and 20 imposter data) drops significantly below the average TRR of all experiments, to 82.28%, because the classifier is biased towards the genuine data. It is worth mentioning that the reported TAR and TRR were calculated with the operating threshold that splits the genuine from the imposter class for the bagging classifier set to 50%.

From Table 3, a trend can be noticed: increasing the number of synthesized samples does indeed improve the result. However, it improves the result only up to the point where the real genuine data start to be concealed by the abundance of synthesized data. From that point onwards, the model turns into a mostly multivariate Gaussian distribution, i.e., one that can be described by mean and standard deviation parameters alone. Such a model by itself might not be descriptive enough to classify a large number of subjects adequately, e.g., the 1012-subject UofTDB database.
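The generator of Section 3.4 is not reproduced in this extract. As a minimal sketch under the paper's Gaussianity assumption, one could fit a Gaussian to the real genuine beats and sample new ones from it; the diagonal-covariance simplification (each dimension fitted and sampled independently), the function name, and the toy data below are our own assumptions, not the paper's implementation:

```python
import random
import statistics

def synthesize_heartbeats(real_beats, n_synth, seed=0):
    """Sample synthetic heartbeats from a Gaussian fitted to real ones.

    Simplification (ours): each dimension is fitted and sampled
    independently, i.e. a diagonal-covariance Gaussian rather than
    the full multivariate model of the paper's Section 3.4.
    """
    rng = random.Random(seed)
    dims = list(zip(*real_beats))                 # transpose to per-dimension tuples
    mu = [statistics.mean(d) for d in dims]
    sigma = [statistics.stdev(d) for d in dims]
    return [[rng.gauss(m, s) for m, s in zip(mu, sigma)]
            for _ in range(n_synth)]

# Toy stand-in for 20 real genuine beats of 200 samples each
random.seed(7)
real = [[random.gauss(0.0, 1.0) for _ in range(200)] for _ in range(20)]
synth = synthesize_heartbeats(real, n_synth=200)
train_genuine = real + synth                      # 220 genuine observations, as in Table 3
```

The appended pool of 20 real plus 200 synthesized beats mirrors the best-EER configuration discussed above.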
Table 3 Experiment for 20 real genuine data. Synthetic genuine data are appended to the real genuine data in training the bagging classifier.

Number of        Number of imposter samples
synthesis        20     40     60     80     100    150    200    250
EER (%)
0                10.41  9.51   9.70   9.77   9.89   9.35   10.00  9.74
50               10.07  8.59   8.37   7.46   7.55   7.69   7.78   7.48
100              9.84   8.64   7.71   7.41   7.46   7.50   7.59   7.34
200              9.51   8.75   7.98   7.60   7.64   7.46   6.71   7.18
400              9.80   8.53   8.16   7.82   7.76   7.44   7.36   7.06
TAR (%)
0                88.93  80.11  71.68  66.04  60.54  51.85  44.76  38.83
50               93.99  89.51  85.36  82.90  80.43  74.91  69.88  66.55
100              94.58  90.54  88.04  85.75  83.78  79.79  76.20  73.08
200              95.37  91.54  89.06  87.02  85.48  82.22  79.53  76.99
400              95.51  92.12  90.00  88.61  87.21  84.37  81.85  80.08
TRR (%)
0                89.59  95.37  97.50  98.21  98.80  99.28  99.56  99.70
50               85.45  92.97  95.68  96.63  97.43  98.40  98.89  99.15
100              84.82  92.11  94.83  96.00  96.82  97.95  98.51  98.83
200              83.24  91.25  93.90  95.19  96.09  97.36  98.02  98.43
400              82.28  90.06  93.03  94.34  95.32  96.71  97.45  97.91

We considered a baseline of 20 real genuine observations, but we also conducted experiments in which the real genuine dataset has 30 and 60 observations. Table 4 tabulates the EER achieved along with the corresponding numbers of synthesized and imposter data. This table further confirms our hypothesis that adding the proposed synthesized data reduces the data imbalance and constructs a better classifier.

Table 4 Experiment for 30 (top) and 60 (bottom) real genuine observations, to: emphasize the performance improvement when we have 30 or 60 genuine observations instead of 20 (Table 3); show that data synthesis improves the results; and observe the upward improvement when the training sample size was increased using data drawn from a Gaussian distribution.

Number of        Number of imposter samples
synthesis        20    40    60    80    100   150   200   250
EER (%), training with 30 real genuine data
0                9.06  8.24  7.96  7.87  8.19  8.44  7.56  8.40
50               8.71  7.88  7.71  7.47  6.77  6.68  6.68  7.25
100              9.05  8.04  7.16  6.77  6.88  6.77  6.60  6.27
200              8.49  7.77  7.27  7.06  7.05  6.21  6.66  6.33
400              9.06  7.99  7.76  7.41  7.33  6.99  6.15  6.38
EER (%), training with 60 real genuine data
0                7.38  6.50  6.36  6.01  6.06  5.98  5.82  5.38
50               7.24  6.39  6.17  5.98  5.37  5.12  5.63  5.17
100              7.35  6.35  6.20  5.97  5.46  5.17  5.03  5.13
200              7.54  6.52  5.91  5.70  5.56  5.28  5.08  5.28
400              7.35  6.43  6.34  6.07  6.02  5.64  5.40  5.04

4.2 Comparison to dimensionality reduction
Dimensionality reduction is one of the most used techniques in the literature to deal with the small-sample-size problem [14, 15]. In this experiment, we compared the biometric system with data synthesis against biometric systems with PCA, probabilistic PCA [35], Isomap [36], Laplacian eigenmaps [37], and local linear embedding (LLE) [38]. In all of these biometric systems, 20 real genuine observations were used, and a wide range of numbers of imposter data and of reduced dimensions was examined. Table 5 tabulates the results with the examined parameters that achieved the lowest EER, while Fig. 5 shows ROC curves for the biometric systems with data synthesis and with all dimensionality reduction techniques at the parameters that achieved the lowest EER. It is pertinent to mention that all biometric systems were implemented in an identical environment using the same sets of real genuine and imposter observations.

Table 5 Dimensionality reduction techniques with parameters that achieved the lowest EER

Method           EER (%)   No. of impos.   No. of dim.
PCA              9.92      20              40
Prob. PCA        13.47     120             20
Isomap           16.16     120             10
LLE              14.82     20              50
Laplacian        13.43     250             20
Data synthesis   6.71      200             200

4.3 Parallel classifier
The bagging classifier has been investigated for ECG heartbeats due to its capability to reduce instability in predictors. Despite the reduction, some instability still exists.
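The bootstrap-and-vote procedure described in Section 4 can be sketched as follows. This is a toy illustration, not the paper's implementation: the threshold-stump base learner stands in for the decision trees the paper uses, and the helper names and 1-D toy data are our own assumptions:

```python
import random
from statistics import mean

def train_stump(sample):
    """Toy base learner (ours): threshold at the midpoint of the class means."""
    genuine = [x for x, y in sample if y == 1]
    imposter = [x for x, y in sample if y == 0]
    if not genuine or not imposter:               # degenerate bootstrap sample
        thr = mean(x for x, _ in sample)
    else:
        thr = (mean(genuine) + mean(imposter)) / 2.0
    return lambda x: 1 if x > thr else 0

def bag(train, n_predictors=50, seed=0):
    """Draw N samples with replacement per predictor, then vote at 50%."""
    rng = random.Random(seed)
    n = len(train)
    stumps = [train_stump([train[rng.randrange(n)] for _ in range(n)])
              for _ in range(n_predictors)]
    return lambda x: 1 if mean(s(x) for s in stumps) > 0.5 else 0

# Toy 1-D scores: genuine near +1, imposter near -1
random.seed(42)
train = ([(random.gauss(1.0, 0.3), 1) for _ in range(20)]
         + [(random.gauss(-1.0, 0.3), 0) for _ in range(20)])
clf = bag(train)
```

Each bootstrap sample perturbs the learner slightly, and averaging the votes smooths out the instability that a single learner would show.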
This instability is especially noticeable in the performance of individual subjects, rather than when hundreds of subjects are considered in calculating biometric system performance from a confusion matrix. Our proposed parallel classifier further reduces such instability by implementing bagging classifiers in a parallel scheme. Table 6 reports the instability and the influence of the parallel classifier in stabilizing it. It can be observed from Table 6 that without the parallel classifier, TAR has a standard deviation of 6.52% per subject and TRR of 0.61% per subject, while with the parallel classifier, TAR has a standard deviation of 1.94% per subject and TRR of 0.10% per subject. The only difference among the classifiers in the parallel scheme is that the imposter dataset differs in each classifier. Complexity can be an issue: if training one classifier takes t seconds, then training n parallel classifiers needs n × t seconds.

Table 6 Standard deviation of TAR and TRR for biometric systems with and without parallel classifier

No. of parallel classifiers   TAR standard deviation (%)   TRR standard deviation (%)
0                             ±6.52                        ±0.61
5                             ±3.63                        ±0.24
10                            ±2.43                        ±0.17
20                            ±1.94                        ±0.10

One might wonder whether the parallel classifier makes the bagging classifier a redundant stage, since both attempt the same task: an aggregate decision over different classifiers trained with different data. Nevertheless, the main difference is that in bagging we re-sample the data from the same pool, while in the parallel classifier we change the imposter data completely in each classifier. We conducted an experiment to show that the parallel classifier and bagging complement each other rather than making one redundant. The experiment was conducted on the highest-achieving result in Table 3 (i.e., 20 real genuine with 200 imposter samples and 200 synthesized data). We once created 50 parallel classifiers, each using just one decision tree (i.e., no bagging), and once we used one parallel classifier with bagging over 50 decision trees. Table 7 presents the results. From Table 7, we conclude that the parallel classifier alone does not greatly improve the results or make bagging redundant, but it increases robustness towards changes in the imposter data, as noted in Table 6.

Table 7 Experiment showing that the parallel classifier and bagging complement each other

Classifiers                                                 EER (%)
50 parallel classifiers with 1 decision tree (no bagging)   20.98
1 parallel classifier with 50 decision trees (bagging)      6.71

5 Conclusions
Two contributions have been proposed in this paper: analyzing the Gaussianity of ECG observations, and a proper and simple technique to generate synthesized ECG heartbeats. A methodology to reduce classifier instability was also presented and used. We used Sequential Forward Selection along with Shapiro-Wilk's univariate and Royston's multivariate normality tests to find a subset of ECG heartbeat variables that exhibits a multivariate normal distribution. Our analysis suggests that more than 20 variables of the ECG heartbeats have a multivariate normal distribution. These multivariate variables capture the main features of ECG heartbeats; therefore, they assist us in capturing the underlying Gaussianity of heartbeats and further support our hypothesis that ECG heartbeats exhibit a multivariate Gaussian distribution should deviating factors not occur.

ECG heartbeat synthesis was used to generate genuine subject data to increase the sample size in a verification biometric system. When only 20 real genuine heartbeats were used and 200 synthesized heartbeats were generated, the biometric system achieved an equal error rate (EER) of 6.71%, in comparison to a minimum of 9.35% when data synthesis was not utilized. The biometric system with data synthesis outperformed several other biometric systems that employed dimensionality reduction techniques: its EER was better than PCA by 3.21%, probabilistic PCA by 6.76%, Isomap by 9.45%, local linear embedding by 8.11%, and Laplacian eigenmaps by 6.72%.

Classifier instability is problematic, especially when the sample size of the data is small. Bagging is usually used to reduce this effect, so we used it; however, to further reduce instability, we proposed a parallel classifier scheme. All classifiers were trained with the same set of genuine data, while each classifier was trained with a different set of imposter data. Through this scheme, we reduced the true acceptance rate instability from a standard deviation of 6.52% to 1.94%. The proposed contributions are expected to produce promising results in other applications as well.

Currently, we exploited the Gaussianity of ECG heartbeats; nevertheless, other approaches to data generation, such as deep learning, can be researched in the future. Our preliminary results with deep learning are promising. Furthermore, finding the maximum number of synthesized data before they start concealing the real genuine data might be set up as an optimization problem; this is also left as future work.
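The parallel scheme summarized above (same genuine data per member, different imposter data per member, mean-confidence decision at the 50% threshold) can be sketched as follows. The midpoint-threshold member classifier is our own stand-in for the paper's per-member bagging classifiers, and all names and toy data are hypothetical:

```python
import random
from statistics import mean

def train_member(genuine, imposters):
    """Stand-in member classifier (ours): midpoint threshold, confidence 0/1."""
    thr = (mean(genuine) + mean(imposters)) / 2.0
    return lambda x: 1.0 if x > thr else 0.0

def parallel_classifier(genuine, imposter_pool, n_parallel=20, per_clf=20, seed=0):
    """Every member sees the same genuine data but a different imposter subset;
    the mean of the members' confidences is compared to the 50% threshold."""
    rng = random.Random(seed)
    members = [train_member(genuine, rng.sample(imposter_pool, per_clf))
               for _ in range(n_parallel)]
    return lambda x: 1 if mean(m(x) for m in members) > 0.5 else 0

# Toy 1-D scores: genuine near +1, a large pool of imposters near -1
random.seed(3)
genuine = [random.gauss(1.0, 0.3) for _ in range(20)]
imposter_pool = [random.gauss(-1.0, 0.3) for _ in range(200)]
clf = parallel_classifier(genuine, imposter_pool)
```

Unlike bagging, which bootstraps one pool, each member here is trained on a freshly drawn imposter subset, which is what gives the scheme its robustness to imposter variation.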
Authors' contributions
All the authors were involved in the analysis and contributed to this paper. All authors read and approved the final manuscript.

Competing interests
The authors declare that they have no competing interests.

Statement of declaration
The University of Toronto database examined in this paper was collected in our BioSec laboratory, and the collection followed the University of Toronto Ethics Policy as explained in Chapter 3 in [39].

Author details
The Edward S. Rogers Sr. Department of Electrical & Computer Engineering, Faculty of Applied Science and Engineering, University of Toronto, Toronto, Canada. University of Toronto, Toronto, Canada.

Received: 7 September 2016. Accepted: 6 January 2017.

References
1. B Toth, Biometric liveness detection. Inf. Secur. Bull. 10(8), 291–297 (2005)
2. L Biel, O Pettersson, L Philipson, P Wide, ECG analysis: a new approach in human identification. IEEE Trans. Instrum. Meas. 50(3), 808–812 (2001)
3. MS Holi, Electromyography analysis for person identification. Int. J. Biom. Bioinforma. (IJBB) 5(3), 172 (2011)
4. K Phua, J Chen, TH Dat, L Shue, Heart sound as a biometric. Pattern Recogn. 41(3), 906–919 (2008)
5. P Spachos, J Gao, D Hatzinakos, in International Conference on Digital Signal Processing (DSP). Feasibility study of photoplethysmographic signals for biometric identification (IEEE, 2011), pp. 1–5
6. JF Hu, ZD Mu, Authentication system for biometric applications using mobile devices. Appl. Mech. Mater. 457, 1224–1227 (2014)
7. I Odinaka, PH Lai, AD Kaplan, JA O'Sullivan, EJ Sirevaag, JW Rohrbaugh, ECG biometric recognition: a comparative analysis. IEEE Trans. Inf. Forensics Secur. 7(6), 1812–1824 (2012)
8. MS Thaler, The Only EKG Book You'll Ever Need, 5th edn. (Lippincott Williams & Wilkins, USA, 2007)
9. S Pouryayevali, S Wahabi, S Hari, D Hatzinakos, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). On establishing evaluation standards for ECG biometrics (IEEE, 2014), pp. 3774–3778
10. J Carvalho, V Sá, S Tenreiro de Magalhães, H Santos, in ICCWS 2015: The Proceedings of the 10th International Conference on Cyber Warfare and Security. Enrollment time as a requirement for biometric hand recognition systems (Academic Conferences Limited, 2015), p. 66
11. VJ Sá, ST Magalhães, HD Santos, Enrolment time as a requirement for biometric fingerprint recognition. Int. J. Electron. Secur. Digit. Forensic 6(1), 18–24 (2014)
12. V Sá, S Magalhães, H Santos, in Proceedings of the 13th European Conference on Cyber Warfare and Security. Enrolment time as a requirement for face recognition biometric systems (2014), pp. 167–171
13. W Louis, M Komeili, D Hatzinakos, in 2016 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE). Real-time heartbeat outlier removal in electrocardiogram (ECG) biometric system (2016), pp. 1–4. doi:10.1109/CCECE.2016.7726845
14. S Raudys, V Valaitis, Z Pabarskaite, G Biziuleviciene, A price we pay for inexact dimensionality reduction. Lect. Notes Comput. Sci. 9044, 289–300 (2015). Springer. http://link.springer.com/chapter/10.1007/978-3-319-16480-9_29
15. OR Devi, L Reddy, E Prasad, Face recognition using fused spatial patterns. Int. J. 4(2) (2015)
16. W Louis, D Hatzinakos, A Venetsanopoulos, in 2014 19th International Conference on Digital Signal Processing. One dimensional multi-resolution local binary patterns features (1DMRLBP) for regular electrocardiogram (ECG) waveform detection (2014), pp. 601–606. doi:10.1109/ICDSP.2014.6900735
17. I Rodriguez-Lujan, G Bailador, C Sanchez-Avila, A Herrero, G Vidal-de-Miguel, Analysis of pattern recognition and dimensionality reduction techniques for odor biometrics. Knowl.-Based Syst. 52, 279–289 (2013)
18. J Lu, KN Plataniotis, AN Venetsanopoulos, Regularization studies of linear discriminant analysis in small sample size scenarios with application to face recognition. Pattern Recogn. Lett. 26(2), 181–191 (2005)
19. J Lu, KN Plataniotis, AN Venetsanopoulos, Regularized discriminant analysis for the small sample size problem in face recognition. Pattern Recogn. Lett. 24(16), 3079–3087 (2003)
20. M Qiu, J Zhang, J Yang, L Ye, Fusing two kinds of virtual samples for small sample face recognition. Math. Probl. Eng. 2015 (2015)
21. P Viola, M Jones, in Computer Society Conference on Computer Vision and Pattern Recognition. Rapid object detection using a boosted cascade of simple features, vol. 1 (IEEE, 2001), p. 511
22. PE McSharry, GD Clifford, L Tarassenko, LA Smith, A dynamical model for generating synthetic electrocardiogram signals. IEEE Trans. Biomed. Eng. 50(3), 289–294 (2003)
23. GD Clifford, F Azuaje, P McSharry, et al., Advanced Methods and Tools for ECG Data Analysis (Artech House, London, 2006)
24. J Pan, WJ Tompkins, A real-time QRS detection algorithm. IEEE Trans. Biomed. Eng. BME-32(3), 230–236 (1985). doi:10.1109/TBME.1985.325532
25. D Reynolds, R Rose, Robust text-independent speaker identification using Gaussian mixture speaker models. IEEE Trans. Speech Audio Process. 3(1), 72–83 (1995)
26. T Bailey, C Elkan, in International Conference on Intelligent Systems for Molecular Biology. Fitting a mixture model by expectation maximization to discover motifs in biopolymers, vol. 2 (1993), pp. 28–36
27. P Royston, Remark AS R94: a remark on algorithm AS 181: the W-test for normality. J. R. Stat. Soc. Ser. C Appl. Stat. 44(4), 547–551 (1995)
28. JP Royston, Some techniques for assessing multivariate normality based on the Shapiro-Wilk W. J. R. Stat. Soc. Ser. C Appl. Stat. 32(2), 121–133 (1983). http://www.jstor.org/stable/2347291
29. SS Shapiro, MB Wilk, An analysis of variance test for normality (complete samples). Biometrika 52(3/4), 591–611 (1965)
30. CJ Mecklin, DJ Mundfrom, A Monte Carlo comparison of the Type I and Type II error rates of tests of multivariate normality. J. Stat. Comput. Simul. 75(2), 93–107 (2005)
31. A Trujillo-Ortiz, R Hernandez-Walls, K Barba-Rojo, L Cupul-Magana, Roystest: Royston's multivariate normality test (2007). MATLAB File Exchange, https://www.mathworks.com/matlabcentral/fileexchange/17811-roystest?requestedDomain=www.mathworks.com
32. AW Whitney, A direct method of nonparametric measurement selection. IEEE Trans. Comput. C-20(9), 1100–1103 (1971). doi:10.1109/T-C.1971.223410
33. L Breiman, Bagging predictors. Mach. Learn. 24(2), 123–140 (1996)
34. W Louis, M Komeili, D Hatzinakos, Continuous authentication using one-dimensional multi-resolution local binary patterns (1DMRLBP) in ECG biometrics. IEEE Trans. Inf. Forensics Secur. 11(12), 2818–2832 (2016). doi:10.1109/TIFS.2016.2599270
35. ME Tipping, CM Bishop, Probabilistic principal component analysis. J. R. Stat. Soc. Ser. B Stat. Methodol. 61(3), 611–622 (1999)
36. JB Tenenbaum, V de Silva, JC Langford, A global geometric framework for nonlinear dimensionality reduction. Science 290(5500), 2319–2323 (2000)
37. M Belkin, P Niyogi, Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 15(6), 1373–1396 (2003)
38. ST Roweis, LK Saul, Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500), 2323–2326 (2000)
39. S Pouryayevali, ECG biometrics: new algorithm and multimodal biometric system. Master's thesis, University of Toronto (2015)
EURASIP Journal on Bioinformatics and Systems Biology – Springer Journals
Published: Feb 21, 2017