Gear Pitting Fault Diagnosis Using Integrated CNN and GRU Network with Both Vibration and Acoustic Emission Signals

Xueyi Li; Jialin Li; Yongzhi Qu; David He

doi:10.3390/app9040768

Applied Sciences, Article

Xueyi Li 1, Jialin Li 1, Yongzhi Qu 2 and David He 3,*

1 School of Mechanical Engineering and Automation, Northeastern University, Shenyang 110819, China; lixueyineu@gmail.com (X.L.); jialinli_neu@163.com (J.L.)
2 School of Mechanical and Electronic Engineering, Wuhan University of Technology, Wuhan 430070, China; quwong@whut.edu.cn
3 Department of Mechanical and Industrial Engineering, University of Illinois at Chicago, Chicago, IL 60607, USA
* Correspondence: davidhe@uic.edu; Tel.: +1-312-996-3410

Received: 23 December 2018; Accepted: 19 February 2019; Published: 22 February 2019

Abstract: This paper addresses the gear pitting fault diagnosis problem and presents a method that integrates convolutional neural network (CNN) and gated recurrent unit (GRU) networks with vibration and acoustic emission signals. The presented method first trains a one-dimensional CNN with acoustic emission signals and a GRU network with vibration signals. Then the gear pitting fault features obtained by the two networks are concatenated to form a deep learning structure for gear pitting fault diagnosis. Seven different gear pitting conditions are used to test the feasibility of the presented method. The diagnosis results show that the accuracy of the presented method reaches above 98% with only a relatively small number of training samples. In comparison with the results using the CNN or the GRU network alone, the presented method gives more accurate diagnosis results. By comparing the results under different loads and learning rates, the robustness of the presented method for gear pitting fault diagnosis is demonstrated.
Moreover, the presented deep structure can be easily extended to additional sensor input signals for gear pitting fault diagnosis in the future.

Keywords: gear pitting fault diagnosis; gated recurrent unit; one-dimensional convolutional neural network; acoustic emission signal; vibration signal

Appl. Sci. 2019, 9, 768; doi:10.3390/app9040768; www.mdpi.com/journal/applsci

1. Introduction

Gearboxes are an essential part of a mechanical transmission system. The diagnosis of gear pitting faults has always been an important problem in the industry. In recent years, the development of sensing technology and the improvement of computing power have provided more tools for gear fault diagnosis.

Analysis of vibrational signals is the most common means of monitoring gear conditions. Vibrational signals have been used as a popular input in the diagnosis of gear pitting faults. Camerini et al. [1] presented an automatic vibration-based program that utilizes health and usage monitoring system data for the early diagnosis of mechanical properties of drivetrain components. There is no general indication of the minimum training collection times required to accurately describe a set of condition indicators, which largely depend on the characteristics of the distribution. Kattelus et al. [2] found that the vibration acceleration descriptor of the peak signal was related to the pitting of the gear contact. Traditional vibrational signals are significantly affected by the external environment. For example, the statistical vibration acceleration descriptor, which indicates the random peak value in the vibration signal, is more suitable for indicating tooth wear than the spectral method. Therefore, other sensors are gradually being applied to the diagnosis of gear pitting faults. Qu et al. [3] presented the use of optic fiber sensors to detect the initial gear pitting fault.
The results show that the optic fiber Bragg grating signal can effectively detect pitting faults under heavy load conditions. However, when the transmission has high structural stiffness, the performance of fiber Bragg gratings (FBGs) may be limited by relatively weak strain signals.

Acoustic emission (AE) is a non-destructive diagnostic technique. Sharma et al. [4] took advantage of the Hertz contact method to establish a relationship between the fault/defect size and the AE energy generated during the gear meshing process. The results of the study indicated that if the defect size increases, the AE level also increases. Zhou et al. [5] compared the AE data with traditional vibration data. The results showed that the AE signals were more sensitive to defect excitation and that the background noise was reduced in AE signals. The results of Elasha et al. [6] showed that AE recognizes defects earlier than vibration analysis, regardless of the tortuous transmission path. The effectiveness of AE could be limited by how close the AE sensor is to the defect.

Many methods have been developed for pitting diagnosis of gears. The most representative research is the hidden semi-Markov model [7]. In general, a hidden semi-Markov model is difficult to train. Therefore, other learning models such as support vector machine (SVM) [8] and principal component analysis (PCA) [9] were used. Sanchez et al. [10] presented a method for detecting 11 kinds of rotating machinery faults by a feature sorting method and SVM. Thirty features were calculated from the analysis of the vibration signal and the electromyography. A classification accuracy of 98.7% was achieved using SVM. Fan et al. [11] studied gear tooth surface damage diagnosis based on analyzing the vibration signal of an individual gear tooth. The characteristics of damaged and normal teeth were studied by analyzing their waveforms.
The results showed that almost all the damaged teeth were correctly detected by the proposed method, even if there were some misdiagnoses in the identification of the extent of the damage. However, the paper does not explore the exact classification of damage degree. Wang et al. [12] presented a method for diagnosing the absolute deviation of gear faults. The method built a dynamics model covering single gear faults (broken teeth and pitting corrosion) and composite gear tooth damage. The results of different broken teeth were obtained through simulation analysis. This method investigated the case of broken teeth, but the authors did not verify it on early gear pitting corrosion. Nevertheless, the above-mentioned methods require much domain expertise and prior knowledge and often rely on hand-crafted features. Often, these methods require a large amount of work in feature extraction.

The use of frequency features of vibration signals for gear fault diagnosis has been common over the past decade. Feng et al. [13] obtained the amplitude and frequency demodulation spectra by applying the Fourier transform to the amplitude envelope and the instantaneous frequency of the selected sensitive intrinsic mode functions. The planetary gearbox fault was detected based on the features shown in the demodulated spectrum. Although frequency domain features can be directly related to fault type and level, these features are usually abstract representations and require additional pre-processing [14]. By directly extracting the gear pitting fault signals in the time domain, the calculation cost and the time cost can be saved. Sun et al. [15] used a backpropagation (BP) neural network to train on gears with four typical fault modes and obtained satisfactory results. The results showed that the BP neural network could effectively perform gear fault diagnosis. However, these methods can only extract shallow features for gear fault diagnosis and are limited in mining deep features.
Fortunately, with the development of deep learning, it is possible to extract fault features directly from the raw signals. Deep learning has been rapidly developed in recent decades [16]. Qu et al. [17] integrated dictionary learning into a stacked autoencoder network for gear pitting fault diagnosis. They applied the sparse autoencoder algorithm to gear fault detection for the first time. Jiang et al. [18] presented a CNN-based deep learning method that automatically learned effective fault features directly from the raw vibrational signals and classified fault types in a single frame to provide a wind turbine gearbox diagnostic system based on end-to-end learning. Under ten operating conditions, there were 2600 samples for each health condition, and each sample contained 2000 data points. Jing et al. [19] used a convolutional neural network (CNN) to learn features directly from the frequency data of the vibrational signal. Feature learning using CNN can provide better results than manual feature extraction. Zhao et al. [20] presented a local feature-based gated recurrent unit (GRU) network to predict machine conditions. A compact spectral data acquisition instrument was used for signal acquisition with a sampling frequency of 1024 Hz and a sampling window of 512 s. The accuracy of gear failure diagnosis was 95.8%. Dong et al. [21] presented a method for training a deep model in parallel, in which different parts of the model are trained at different speeds. By splitting the deep neural network model and training the parts on different devices at different speeds, the whole training process can be sped up. The training accuracy of this method was about 70%. Chen et al. [22] used four classical deep neural networks to classify and identify fault conditions in the transmission. It was shown that the vibration signal usually contains abundant information for fault detection, control, and maintenance planning of rotating machinery. Sun et al.
[23] used a dual-tree complex wavelet transform to acquire the characteristics of multi-scale signals. The CNN was then used to automatically identify fault features from the multi-scale signal features. Their experimental results of gear fault identification showed the feasibility and effectiveness of the presented method. This method can distinguish 4 kinds of gear faults, but the classification of these 4 kinds of faults is relatively easy, and the detection of early gear pitting faults was not addressed in their paper. In a nutshell, deep learning has been used in the diagnosis of gear pitting faults and has made certain progress.

Vibration signals have been traditionally used for gear pitting fault diagnosis. Over the years, many signal processing and analysis methods for vibration signals have been developed and matured. Even though it has been reported that AE signals have certain advantages over vibration signals in early gear fault diagnosis, advanced signal processing and analysis methods for AE signals have not been well developed. Recent development in deep learning provides an excellent opportunity to integrate the AE signals and vibration signals for gear pitting fault diagnosis. In this paper, AE signals are introduced in addition to vibrational signals for gear pitting fault diagnosis. Normally, converting time domain signals into frequency domain signals requires additional preprocessing steps. The advantage of deep learning is its capability of dealing directly with the raw signals. In this paper, a one-dimensional CNN is integrated with a GRU network to process AE and vibration signals for gear pitting fault diagnosis. The combination of CNN and GRU can effectively utilize their respective advantages and obtain better results for gear pitting fault diagnosis. The method presented in this paper can effectively suppress over-fitting in gear pitting fault diagnosis.
The main contributions of this paper can be summarized as follows:

(1) The method presented in this paper directly uses the raw vibrational and AE signals to diagnose the gear pitting faults without additional feature extraction processes.
(2) This method integrates CNN with GRU to make full use of their advantages.
(3) The method combines two different kinds of sensor data, the vibration signals and the AE signals, and makes full use of different sensor signal features for gear pitting fault diagnosis.
(4) The method presented in this paper uses less training data to make an accurate diagnosis of gear pitting faults with efficient training time.

The rest of this paper is organized as follows. Section 2 describes the gear pitting fault diagnosis method presented in this paper. In Section 3, a description of the experiment setup and the data collected for the validation of the proposed method is provided. Section 4 analyzes and discusses the results. Finally, Section 5 concludes the paper.

2. The Methodology

The general procedure of the proposed method for gear pitting fault diagnosis is presented in Figure 1. The presented method is the integration of the one-dimensional CNN and the GRU network. The CNN is used to process the raw AE signals and the GRU network is used to process the vibration signals. Then the outputs of the CNN are concatenated with the outputs of the GRU network. Finally, the concatenated outputs are input into a softmax layer to perform gear pitting fault diagnosis. By using deep learning approaches such as the CNN or the GRU network, fault features are extracted automatically while the raw sensor signals are being processed.
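The fusion step just described (concatenate the CNN and GRU outputs, then apply a softmax layer) can be sketched in plain Python as follows. This is only an illustrative sketch, not the authors' implementation: the function name `fuse_and_classify`, the toy feature vectors, and the weight values are hypothetical, and three classes are used for brevity where the paper distinguishes seven pitting conditions.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(cnn_features, gru_features, weights, biases):
    """Concatenate both feature vectors and apply one softmax layer.

    weights holds one row per class; each row has one weight per fused
    feature. All values used below are hypothetical placeholders.
    """
    fused = list(cnn_features) + list(gru_features)  # feature concatenation
    scores = [sum(w * f for w, f in zip(row, fused)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)


# Toy example: 2 CNN features (from AE) and 1 GRU feature (from vibration).
cnn_out = [0.2, 0.7]
gru_out = [0.9]
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 2.0]]
b = [0.0, 0.0, 0.0]
probs = fuse_and_classify(cnn_out, gru_out, W, b)
predicted_class = probs.index(max(probs))
print(predicted_class, [round(p, 3) for p in probs])
```

In this toy setting the third class receives the largest score, so it is the predicted class. In a trained network the weights and biases would be learned jointly with the two feature extractors.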
The outputs coming out from the multiple hidden layers in a deep learning network represent fault features at different abstraction levels. The unique contribution of the paper is that it is the first attempt at developing a deep learning based approach for gear pitting fault diagnosis with both AE and vibration signals.

Figure 1. The general procedure of the presented method. AE: acoustic emission; CNN: convolutional neural network; GRU: gated recurrent unit.

2.1. One-Dimensional Convolutional Neural Network

A typical CNN consists of an input layer, an output layer, a convolution layer, and a pooling layer [24].
The convolution layer performs local feature extraction on the input feature map through the convolution kernel. Further downsampling is performed by the pooling layer. The main features of CNN are local perception, weight sharing, and pooling. In CNN, the convolutional layer and the pooling layer appear alternately. The principle of the one-dimensional CNN is shown in Figure 2.

Figure 2. The schematic of one-dimensional CNN.
Assuming that the first layer is a convolutional layer, the calculation formula of the one-dimensional convolutional layer is as follows:

$$x_j^{l} = f\left(\sum_{i=1}^{M} x_i^{l-1} * k_{ij}^{l} + b_j^{l}\right) \tag{1}$$

where $x_j^{l}$ is the jth feature map of the lth layer, $f(\cdot)$ represents the activation function, $M$ represents the number of input feature maps, $x_i^{l-1}$ represents the ith feature map of the (l-1)th layer, $*$ represents the convolution operation, $k_{ij}^{l}$ represents a trainable convolution kernel, and $b_j^{l}$ represents the jth bias of the lth layer.

With the consideration of the convergence speed and the overfitting problem, this paper uses the rectified linear unit (ReLU) activation function. ReLU has a faster convergence rate than the sigmoid in gradient descent and can effectively prevent the over-fitting problem.
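As a minimal sketch of Equation (1), the following pure-Python example computes the output feature maps of a one-dimensional convolutional layer with a ReLU activation. Several simplifications are assumptions made here for illustration: no padding is applied (the experiments later in the paper use 'same' padding), the stride is fixed at 1, the kernel is not flipped (cross-correlation, as is common in deep learning libraries), and the names `conv1d_valid` and `conv_layer` as well as all numeric values are hypothetical.

```python
def relu(values):
    """Rectified linear unit applied element-wise: f(x) = max(0, x)."""
    return [max(0.0, v) for v in values]


def conv1d_valid(signal, kernel):
    """Slide the kernel over the signal with stride 1 and no padding."""
    n_out = len(signal) - len(kernel) + 1
    return [sum(signal[t + s] * kernel[s] for s in range(len(kernel)))
            for t in range(n_out)]


def conv_layer(inputs, kernels, biases, activation=relu):
    """Equation (1): x_j = f(sum_i x_i * k_ij + b_j).

    inputs  : M input feature maps, each a list of floats
    kernels : kernels[i][j] links input map i to output map j
    biases  : one bias per output feature map
    """
    outputs = []
    for j, bias in enumerate(biases):
        # Contribution of every input feature map to output map j.
        partial = [conv1d_valid(x, kernels[i][j]) for i, x in enumerate(inputs)]
        summed = [sum(column) + bias for column in zip(*partial)]
        outputs.append(activation(summed))
    return outputs


# Toy example: one input map, two output maps, kernel width 2.
x = [[1.0, 2.0, 3.0, 4.0]]
k = [[[1.0, -1.0], [0.5, 0.5]]]
feature_maps = conv_layer(x, k, biases=[0.0, 1.0])
print(feature_maps)  # [[0.0, 0.0, 0.0], [2.5, 3.5, 4.5]]
```

The first kernel computes differences of neighbouring samples, which are all negative here and are therefore clipped to zero by ReLU; the second kernel averages neighbouring samples and adds its bias.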
The ReLU activation function is as follows:

$$f(x) = \max(0, x) \tag{2}$$

After the pooling layer is connected to the convolution layer, the feature map is downsampled according to a certain pooling strategy to obtain a lower resolution feature map. The most commonly used pooling strategy is maximum pooling. Maximum pooling reduces the number of output nodes and enhances the robustness of the network to input characteristics. The (l+1)th layer is the pooling layer. It is calculated as follows:

$$x_j^{l+1} = f\left(\mathrm{down}\left(x_j^{l}\right) + b_j^{l+1}\right) \tag{3}$$

where $\mathrm{down}(\cdot)$ is a downsampling function.

2.2. Gated Recurrent Unit Network

The GRU network is an optimized structure of the recurrent neural network (RNN) [25]. When the input information grows beyond a certain length, a plain RNN cannot connect to the relevant information. The GRU network is aimed at solving the problems of long-range dependence and gradient disappearance of the RNN. The GRU neural network, with fewer gates and better efficiency, is directly selected for the diagnosis of gear pitting faults. Note that, similar to GRU, a recurrent unit in RNN called long short-term memory (LSTM) can also be used. Both LSTM and GRU have the same goal of tracking long-term dependencies effectively while mitigating the vanishing/exploding gradient problems. As pointed out in Chung et al. [26], after evaluating LSTM and GRU units on the tasks of polyphonic music modeling and speech signal modeling, they found GRU to be comparable to LSTM. Therefore, the GRU is used in this paper as a recurrent unit in the same way as LSTM. For this reason, it is expected that LSTM will give similar results to GRU.

RNNs are widely used in the field of natural language processing. Unlike traditional feedforward neural networks, RNN introduces directional loops that can handle correlated inputs. As such, it can be used to process sequence data.
The basic structure of an RNN is shown in Figure 3. In Figure 3, x, h, and s represent the input, output, and hidden states, respectively. U, V, and W represent the weight matrices between the input and hidden layers, the hidden layers and outputs, and the hidden layers, respectively.

Figure 3. Expansion model of the recurrent neural network.

The specific update process of the GRU unit is as follows. First, the two gates in the GRU that control the direction of the data flow are r and z. The update gate in the GRU neural network is calculated in Equation (4):

$$z_t = \sigma\left(W_z h_{t-1} + U_z x_t + b_z\right) \tag{4}$$

In Equation (4), $z_t$ represents the update gate, $h_{t-1}$ represents the output of the previous neuron, $x_t$ represents the input of the current neuron, $W_z$ represents the weight of the update gate, $U_z$ represents the weight of the current neuron, and $\sigma$ represents the sigmoid function. The update gate $z_t$ is computed from $h_{t-1}$ and $x_t$ and then passed through the sigmoid function. For the update gate $z_t$, the larger the value, the more information from the previous neuron is retained. If $z_t$ is close to 1, it is equivalent to copying the previous hidden layer information to the current layer. It can learn long distance dependence.
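The effect of the update gate is easiest to see in a complete GRU step. The sketch below is a scalar illustration only: besides the update gate of Equation (4), it uses the reset gate, candidate state, and final interpolation that the paper defines in Equations (5) to (7), read here as h_t = (1 - z_t) * h~_t + z_t * h_{t-1}. Real GRU layers use weight matrices and vectors; the function name `gru_step` and all parameter values are hypothetical.

```python
import math


def sigmoid(a):
    """Logistic sigmoid function."""
    return 1.0 / (1.0 + math.exp(-a))


def gru_step(x_t, h_prev, p):
    """One scalar GRU update; p is a dict of hypothetical weights and biases."""
    z = sigmoid(p["W_z"] * h_prev + p["U_z"] * x_t + p["b_z"])  # update gate, Eq. (4)
    r = sigmoid(p["W_r"] * h_prev + p["U_r"] * x_t + p["b_r"])  # reset gate, Eq. (5)
    # Candidate state built from the reset-gated previous output, Eq. (6).
    h_cand = math.tanh(p["W_h"] * (r * h_prev) + p["U_h"] * x_t + p["b_h"])
    # Interpolate between the candidate and the previous state, Eq. (7).
    return (1.0 - z) * h_cand + z * h_prev


params = {"W_z": 0.5, "U_z": 0.5, "b_z": 0.0,
          "W_r": 0.5, "U_r": 0.5, "b_r": 0.0,
          "W_h": 0.8, "U_h": 0.4, "b_h": 0.1}
h_prev, x_t = 0.8, 0.3

# Update gate pushed towards 1: the previous state is (almost) copied.
keep = gru_step(x_t, h_prev, {**params, "b_z": 20.0})

# Reset gate near 1 and update gate near 0: behaves like a plain tanh RNN.
rnn_like = gru_step(x_t, h_prev, {**params, "b_z": -20.0, "b_r": 20.0})
plain_rnn = math.tanh(params["W_h"] * h_prev + params["U_h"] * x_t + params["b_h"])
print(keep, rnn_like, plain_rnn)
```

Forcing the gates with large biases is only a device for exposing the two limiting cases; in training, the gate values are learned from the data.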
The reset gate in the GRU neural network is calculated in Equation (5):

$$r_t = \sigma\left(W_r h_{t-1} + U_r x_t + b_r\right) \tag{5}$$

In Equation (5), $r_t$ represents the reset gate, $h_{t-1}$ represents the output of the previous neuron, $x_t$ represents the input of the current neuron, $W_r$ represents the weight of the reset gate, $U_r$ represents the weight of the current neuron, and $\sigma$ represents the sigmoid function. The reset gate $r_t$ is computed from $h_{t-1}$ and $x_t$ and then passed through the sigmoid function. For the reset gate, a value of 0 means that the information from the previous neuron is discarded.

The candidate output value of the GRU hidden layer is given in Equation (6):

$$\tilde{h}_t = \tanh\left(W_h \left[r_t \odot h_{t-1}\right] + U_h x_t + b_h\right) \tag{6}$$

In Equation (6), $\tilde{h}_t$ represents the output value to be determined in this neuron, $h_{t-1}$ represents the output of the previous neuron, $x_t$ represents the input of the current neuron, $W_h$ represents the weight applied to the reset-gated previous output, and $\tanh(\cdot)$ represents the hyperbolic tangent function. $r_t$ is used to control how much memory needs to be retained.

Finally, $z_t$ controls how much information is forgotten from the hidden layer at the previous step and how much hidden layer information $\tilde{h}_t$ of the current layer needs to be added. The output $h_t$ is obtained in Equation (7), and the hidden layer information of the last output is directly obtained.

$$h_t = (1 - z_t) \odot \tilde{h}_t + z_t \odot h_{t-1} \tag{7}$$

In Equation (7), if the value of $r_t$ is 1 and the value of $z_t$ is 0, the GRU unit is equivalent to a standard RNN, which can handle short-range dependencies.

3. Gear Test Experimental Setup and Data Processing

Raw AE signals and vibration signals collected from gear pitting fault experiments were used to validate the effectiveness of the presented method for the diagnosis of gear pitting faults. The experiments were carried out on a gearbox test rig. The raw vibrational signals and AE signals of seven different gear pitting conditions were collected during the experiments. The gearbox test rig is shown in Figure 4.
It consists of two 45 kW Siemens servos, one of the servos is the driving motor, and the other is the loading motor. An acceleration sensor and an AE sensor were mounted on the surface of the gearbox housing. The main parameters of the gearbox are shown in Table 1. Appl. Sci. 2019, 9, x FOR PEER REVIEW 7 of 16 Figure 4. Picture of the gearbox test rig. Figure 4. Picture of the gearbox test rig. Table 1. The major parameters of the gearbox. Gear Parameter Driving Gear Driving Gear Tooth number 72 40 Module (mm) 3 3 Pitch diameter (mm) 120 120 Base circle diameter (mm) 202.974 112.763 Pressure angle (°) 20 20 Tooth width (mm) 85 85 The gear speed was set to 1000 RPM, and 100 Nm torque was used in the experiments. The vibrational signals were collected with a sampling rate of 10.24 kHz. The AE signals were collected with a sampling rate of 51.2 kHz. Table 2 shows the seven gear pitting conditions. Condition 1 represents a normal gear. In Condition 2, the pitting is about 10% of the area of a middle tooth, and the adjacent two teeth are normal. Condition 3 has a pitting of about 30% of the area of the middle gear tooth, and the adjacent two teeth are normal. Under Condition 4, the middle gear tooth pitting is about 50% of the area, and the adjacent two teeth are normal. Under Condition 5, the middle gear tooth pitting is about 50% of the area, the upper tooth pitting is about 10% of the area, and the lower tooth is normal. Under Condition 6, the pitting of the middle gear tooth is about 50% of the area, and the adjacent two teeth pitting is about 10% of their area. Under condition 7, the middle gear tooth pitting is about 50% of the area, upper tooth pitting is about 30% of the area, and the lower tooth pitting is about 10% of the area. Figure 5 shows pictures of the gear pitting degree under each pitting condition. Table 2. The approximate percentage of pitting area under seven gear conditions. 
Gear Condition Upper tooth Middle tooth Lower tooth Condition1 Normal Normal Normal Condition2 Normal 10% Normal Condition3 Normal 30% Normal Condition4 Normal 50% Normal Appl. Sci. 2019, 9, 768 7 of 15 Table 1. The major parameters of the gearbox. Gear Parameter Driving Gear Driving Gear Tooth number 72 40 Module (mm) 3 3 Pitch diameter (mm) 120 120 Base circle diameter (mm) 202.974 112.763 Pressure angle ( ) 20 20 Tooth width (mm) 85 85 The gear speed was set to 1000 RPM, and 100 Nm torque was used in the experiments. The vibrational signals were collected with a sampling rate of 10.24 kHz. The AE signals were collected with a sampling rate of 51.2 kHz. Table 2 shows the seven gear pitting conditions. Condition 1 represents a normal gear. In Condition 2, the pitting is about 10% of the area of a middle tooth, and the adjacent two teeth are normal. Condition 3 has a pitting of about 30% of the area of the middle gear tooth, and the adjacent two teeth are normal. Under Condition 4, the middle gear tooth pitting is about 50% of the area, and the adjacent two teeth are normal. Under Condition 5, the middle gear tooth pitting is about 50% of the area, the upper tooth pitting is about 10% of the area, and the lower tooth is normal. Under Condition 6, the pitting of the middle gear tooth is about 50% of the area, and the adjacent two teeth pitting is about 10% of their area. Under condition 7, the middle gear tooth pitting is about 50% of the area, upper tooth pitting is about 30% of the area, and the lower tooth pitting is about 10% of the area. Figure 5 shows pictures of the gear pitting degree under each pitting condition. Table 2. The approximate percentage of pitting area under seven gear conditions. Gear Condition Upper Tooth Middle Tooth Lower Tooth Appl. Sci. 
2019, 9, x FOR PEER REVIEW 8 of 16 Condition 1 Normal Normal Normal Condition 2 Normal 10% Normal Condition5 10% 50% Normal Condition 3 Normal 30% Normal Condition 4 Normal 50% Normal Condition6 10% 50% 10% Condition 5 10% 50% Normal Condition 6 10% 50% 10% Condition7 30% 50% 10% Condition 7 30% 50% 10% Figure 5. Pitting degree of driven gears. Figure 5. Pitting degree of driven gears. The sample raw vibrational signals of the gears are shown in Figure 6. As shown in Figure 6, Conditions 1 and 3 have relatively distinct spikes and show slightly different from the remaining five vibrational signals. The raw vibration signals of the remaining five conditions are not significantly different. Figure 6. Raw vibrational signals of the gear pitting fault conditions. The sample AE signals of gears are shown in Figure 7. As can be seen from Figure 7, there are no significant differences among raw AE signals of the seven conditions. It is almost impossible for the naked eye to distinguish the difference in the pitting conditions of the gears from the AE signals. Appl. Sci. 2019, 9, x FOR PEER REVIEW 8 of 16 Condition5 10% 50% Normal Condition6 10% 50% 10% Condition7 30% 50% 10% Appl. Sci. 2019, 9, 768 Figure 5. Pitting degree of driven gears. 8 of 15 The sample raw vibrational signals of the gears are shown in Figure 6. As shown in Figure 6, The sample raw vibrational signals of the gears are shown in Figure 6. As shown in Figure 6, Conditions 1 and 3 have relatively distinct spikes and show slightly different from the remaining Conditions 1 and 3 have relatively distinct spikes and show slightly different from the remaining five five vibrational signals. The raw vibration signals of the remaining five conditions are not vibrational signals. The raw vibration signals of the remaining five conditions are not significantly different. significantly different. Figure 6. Raw vibrational signals of the gear pitting fault conditions. Figure 6. 
Figure 7. Raw AE signals of the gear pitting fault conditions.

The number of samples of the vibration data and of the AE data under each condition was 1000, with 800 for training, 150 for validation, and 50 for testing. Each condition had the same number of samples. The CNN consisted of 4 convolutional layers, a choice based on experience. The numbers of channels in the four layers were set to 32, 64, 128, and 128, respectively. The kernel size of all convolutional layers was set to 7 and the stride to 1. 'Same' padding was used in order to maintain the data size. The pool size was set to 2 for all the pooling layers. For pooling, the strides were set to None and the padding to 'valid'. The AE signals used in the CNN contained 3072 features per sample. The GRU network used 6 stacked GRU layers for training. The cell sizes of the GRU layers were set to 256, 256, 128, 128, 64, and 64, respectively. The vibration signals used in the GRU network contained 616 features per sample. The batch size was set to 256. Both the kernel initializer and the recurrent initializer used the He normal method [25]. The loss function was categorical cross-entropy. A stochastic gradient descent algorithm was used as the optimizer. Except for the last layer, the ReLU function was used as the activation function. In the last layer, a softmax function was used to classify the gear pitting faults. An NVIDIA GeForce GTX 1080 Ti graphics card was used in the PC for training. The general procedure of the data processing using the presented method is shown in Figure 8.
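To give a feel for the sizes implied by the configuration above, the following sketch estimates the output length of the CNN branch and the number of trainable parameters in the convolutional and recurrent layers. It is a rough estimate under several assumptions that the text does not state: one pooling layer follows each convolutional layer, the AE input has a single channel, the GRU receives one feature per time step, and each GRU gate has a single bias vector (some library implementations add a second one). Dense and output layers are not counted, and all function names are hypothetical.

```python
def conv1d_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights plus one bias per output channel of a 1D convolution."""
    return kernel * c_in * c_out + c_out

def gru_params(input_dim: int, units: int) -> int:
    """Three gates, each with input weights, recurrent weights, and one bias."""
    return 3 * (units * (input_dim + units) + units)

def cnn_branch(length: int, channels, kernel: int = 7, pool: int = 2, c_in: int = 1):
    """Return (output length, output channels, parameter count) of the conv stack."""
    total = 0
    for c_out in channels:
        # 'same' padding with stride 1 keeps the sequence length unchanged.
        total += conv1d_params(kernel, c_in, c_out)
        # ASSUMPTION: 'valid' pooling after every conv layer, stride = pool size.
        length //= pool
        c_in = c_out
    return length, c_in, total

def gru_branch(units_per_layer, input_dim: int = 1) -> int:
    """Parameter count of a stack of GRU layers."""
    total = 0
    for units in units_per_layer:
        total += gru_params(input_dim, units)
        input_dim = units  # the next layer consumes this layer's hidden state
    return total

cnn_len, cnn_ch, cnn_total = cnn_branch(3072, [32, 64, 128, 128])
gru_total = gru_branch([256, 256, 128, 128, 64, 64])
print(f"CNN branch: output {cnn_len} x {cnn_ch}, {cnn_total} parameters")
print(f"GRU branch: {gru_total} parameters")
```

Under these assumptions the convolutional stack has roughly 0.19 million parameters and the GRU stack roughly 0.90 million, even though the CNN handles the input with five times as many features per sample.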
Figure 8. The general procedure of the data processing using the presented method.

4. Results and Discussions

The validation results are provided in Table 3. Table 3 shows that when the CNN with AE signals alone was used to diagnose the gear pitting faults, a high training accuracy was obtained. However, the method overfitted severely, and as a result the testing accuracy was as low as 74.57%. When the same CNN with AE signals was combined with the GRU network with vibration signals, not only were good training and validation results obtained, but an accurate gear pitting fault diagnosis result of 98.29% was also obtained for testing. The results show that, in comparison with the other methods, the method presented in this paper obtains more accurate diagnostic results.

Table 3. Gear pitting fault diagnosis accuracy of the presented method and the other methods.

| Gear Pitting Fault Diagnosis Method | Training Accuracy | Validation Accuracy | Testing Accuracy |
| --- | --- | --- | --- |
| Proposed method: CNN with AE signals + GRU network with vibration signals | 1.00000 | 0.97333 | 0.98286 |
| CNN with vibration signals + GRU network with AE signals | 1.00000 | 0.96952 | 0.98000 |
| CNN with vibration signals + CNN with AE signals | 1.00000 | 0.96667 | 0.95714 |
| CNN with vibration signals alone | 1.00000 | 0.92952 | 0.91429 |
| GRU network with vibration signals alone | 1.00000 | 0.90190 | 0.89714 |
| CNN with AE signals alone | 1.00000 | 0.68571 | 0.74571 |
| GRU network with AE signals alone | 1.00000 | 0.56190 | 0.61714 |

The results in Table 3 indicate that the combination of the CNN with AE signals and the GRU network with vibration signals, or the combination of the CNN with vibration signals and the GRU network with AE signals, gave more accurate results than the following three methods: (1) using CNNs with both vibration and AE signals, (2) using the CNN with vibration signals alone, and (3) using the CNN with AE signals alone.

For the other methods in Table 3, the following discussion considers why some of them could not achieve a good gear pitting fault diagnosis accuracy. For the GRU network with vibration signals alone, a possible reason is that the vibration signals were strongly affected by background noise, and the GRU has the ability to memorize the signals and hence possibly retain the noisy features.
So the atypical features caused by the interference may have been regarded as typical features of the gear pitting faults, which might reduce the effectiveness of the network. For the GRU network with AE signals alone, since the sampling frequency of the AE signals is much higher than that of the vibration signals, the volume of sampled AE data was large. In order to process the AE signals efficiently using the GRU network, the AE signal data was downsampled, so the data was partially distorted. As a result, the accuracy of the gear pitting fault diagnosis was low.

In Table 4, the gear pitting fault diagnosis accuracies for each pitting fault condition obtained by the proposed method and the other methods are provided.

Table 4. Gear pitting fault diagnosis accuracy for each fault condition.

| Method | Condition 1 | Condition 2 | Condition 3 | Condition 4 | Condition 5 | Condition 6 | Condition 7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Proposed method: CNN with AE signals + GRU network with vibration signals | 100% | 94% | 100% | 100% | 100% | 94% | 100% |
| CNN with vibration signals + GRU network with AE signals | 100% | 96% | 98% | 100% | 100% | 94% | 98% |
| CNN with vibration signals + CNN with AE signals | 100% | 94% | 98% | 100% | 100% | 78% | 100% |
| CNN with vibration signals alone | 86% | 82% | 94% | 100% | 100% | 82% | 96% |
| GRU network with vibration signals alone | 84% | 82% | 86% | 100% | 100% | 82% | 94% |
| CNN with AE signals alone | 100% | 82% | 74% | 84% | 52% | 34% | 96% |
| GRU network with AE signals alone | 98% | 46% | 42% | 66% | 40% | 44% | 96% |

From Table 4, the method presented in this paper achieves 100% diagnosis accuracy for five gear pitting fault conditions. For the other two conditions, the diagnosis accuracy reached 94%. The presented method gives much better results than the last four methods in Table 4, which use only one type of signal. In comparison with the second and third methods in Table 4, the performance of the presented method is slightly better.

The method presented in this paper uses a CNN to process the AE data and a GRU network to process the vibration data.
The reason is that the sampling frequency of the AE sensor is about 5 times that of the vibration sensor, so the number of features extracted from the AE signals is larger than that from the vibration signals. As discussed in Section 2 of this paper, the number of parameters of a CNN is relatively small, and the number of parameters of the GRU network is relatively large. Hence, it is computationally beneficial to use the CNN to process the relatively large volume of AE data and the GRU network to process the relatively small volume of vibration data. If the GRU network is used to process the larger volume of AE data, the dimensionality of the data has to be reduced, which may result in loss of effective diagnostic information. Therefore, the method proposed in this paper, which uses the CNN to process the AE signals and the GRU network to process the vibration signals, gives the best gear pitting fault diagnosis performance.

To test the robustness of the proposed method for gear pitting fault diagnosis under different loading conditions, diagnosis results were obtained by the proposed method at a constant speed of 1000 RPM with different loads and are provided in Table 5. To obtain the results in Table 5, the hyperparameters of the CNN and the GRU network remained the same as those used for obtaining the results in Table 3. From Table 5, it can be seen that the training accuracy reaches 100% for all the loads. Since the training accuracies are not significantly different from the testing accuracies, there is no sign of overfitting in the results in Table 5. The testing accuracy is at least 94.86% in all cases, a good gear pitting fault diagnosis result. The training loss in each case is very low. The training time ranges from about 612 s to about 1033 s, with an average of 838 s. As the load increases, there is no indication of a consistent pattern of change in fault diagnosis accuracy.
This result shows that the performance of the presented method remains stable under different loads and demonstrates the robustness and adaptability of the method for gear pitting fault diagnosis.

Table 5. The gear pitting fault diagnosis results obtained by the proposed method at a constant speed of 1000 rpm with different loads.

| Working Condition | Training Accuracy | Validation Accuracy | Testing Accuracy | Training Loss | Training Time (s) |
| --- | --- | --- | --- | --- | --- |
| 1000 RPM_50N | 1.0000 | 0.99333 | 0.98857 | 0.00015 | 612.25 |
| 1000 RPM_100N | 1.0000 | 0.97333 | 0.98286 | 0.00014 | 1033.06 |
| 1000 RPM_200N | 1.0000 | 0.94762 | 0.95429 | 0.00014 | 1017.29 |
| 1000 RPM_300N | 1.0000 | 0.98095 | 0.98286 | 0.00009 | 903.42 |
| 1000 RPM_400N | 1.0000 | 0.93333 | 0.94857 | 0.00018 | 730.88 |
| 1000 RPM_500N | 1.0000 | 0.96571 | 0.96857 | 0.00016 | 732.92 |

It is well known that among all the hyperparameters in deep learning, the learning rate is one of the most critical, and it has a great influence on model performance. In order to test the effect of the learning rate on the gear pitting fault diagnosis performance of the presented method, 20 different learning rates from 0.4 to 2.3 in increments of 0.1 were tested. The testing results are provided in Table 6. As shown in Table 6, a training accuracy of 100% was obtained for all the tested learning rates. The validation accuracy is above 95.3% in all cases, with a standard deviation of 0.0093. The testing accuracy is above 93.3% in all cases (the lowest value in Table 6 is 96.29%), with a standard deviation of 0.0056. The training loss is small for all 20 tested learning rates. The average training time is about one thousand seconds. In summary, the presented method performs well over a large span of learning rates. This result once again verifies the effectiveness and robustness of the presented method for gear pitting fault diagnosis.

Table 6. The gear pitting fault diagnosis results under different learning rates.

| Learning Rate | Training Accuracy | Validation Accuracy | Testing Accuracy | Training Loss | Training Time (s) |
| --- | --- | --- | --- | --- | --- |
| 0.4 | 1.0000 | 0.95524 | 0.96286 | 0.00029 | 926.86 |
| 0.5 | 1.0000 | 0.96762 | 0.97429 | 0.00011 | 2769.37 |
| 0.6 | 1.0000 | 0.97143 | 0.97429 | 0.00031 | 689.65 |
| 0.7 | 1.0000 | 0.96952 | 0.98286 | 0.00012 | 1309.01 |
| 0.8 | 1.0000 | 0.96381 | 0.98286 | 0.00028 | 861.12 |
| 0.9 | 1.0000 | 0.96286 | 0.97714 | 0.00011 | 1664.57 |
| 1.0 | 1.0000 | 0.95619 | 0.97714 | 0.00035 | 439.49 |
| 1.1 | 1.0000 | 0.95524 | 0.96857 | 0.00021 | 672.88 |
| 1.2 | 1.0000 | 0.96857 | 0.98000 | 0.00015 | 735.77 |
| 1.3 | 1.0000 | 0.96286 | 0.97429 | 0.00012 | 772.78 |
| 1.4 | 1.0000 | 0.96095 | 0.97143 | 0.00019 | 854.72 |
| 1.5 | 1.0000 | 0.97333 | 0.98286 | 0.00014 | 1033.06 |
| 1.6 | 1.0000 | 0.95905 | 0.97714 | 0.00012 | 705.93 |
| 1.7 | 1.0000 | 0.97143 | 0.98000 | 0.00025 | 673.77 |
| 1.8 | 1.0000 | 0.97905 | 0.97429 | 0.00007 | 1756.74 |
| 1.9 | 1.0000 | 0.95429 | 0.97714 | 0.00031 | 830.91 |
| 2.0 | 1.0000 | 0.97238 | 0.97143 | 0.00008 | 1493.51 |
| 2.1 | 1.0000 | 0.97238 | 0.98000 | 0.00010 | 839.48 |
| 2.2 | 1.0000 | 0.95333 | 0.96857 | 0.00035 | 923.91 |
| 2.3 | 1.0000 | 0.99143 | 0.98571 | 0.00015 | 912.59 |

The confusion matrix of the obtained results is provided in Figure 9. As can be seen from Figure 9, the classification accuracy is 100% except for Condition 2 and Condition 6. Three cases of Condition 2 were incorrectly diagnosed as Condition 6. Coincidentally, three cases of Condition 6 were also incorrectly diagnosed: two as Condition 2 and one as Condition 7. Overall, the proposed method is accurate in classifying the gear pitting faults.

Figure 9. The confusion matrix by the presented method for the testing set.

To show the effectiveness of the concatenated features obtained by using the CNN to extract features from the AE signals and the GRU network to extract features from the vibration signals, samples were processed for t-SNE visualization. The 3D result of the t-SNE visualization is shown in Figure 10, and the 2D result is shown in Figure 11. It can be seen from the two figures that the concatenated features of the seven gear pitting conditions were accurately clustered. The clear clusters formed by the concatenated features obtained by the proposed method, shown in the 3D and 2D pictures, indicate the effectiveness of the proposed method for extracting features from the AE and vibration signals for gear pitting fault diagnosis.
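The misclassifications reported for Figure 9 can be cross-checked against the per-condition accuracies in Table 4 and the overall testing accuracy in Table 3. The sketch below rebuilds the testing confusion matrix from the counts given in the text (50 testing samples per condition, three errors for Condition 2, three for Condition 6) and derives both figures. The helper names are illustrative only and are not taken from the authors' code.

```python
N_CONDITIONS = 7
SAMPLES_PER_CONDITION = 50  # testing samples per condition, as stated in the text

def build_confusion(misclassified):
    """Start from a perfect diagonal and move the listed errors off it.

    `misclassified` maps (true condition, predicted condition) to a count,
    using the 1-based condition numbers of the paper.
    """
    matrix = [[0] * N_CONDITIONS for _ in range(N_CONDITIONS)]
    for i in range(N_CONDITIONS):
        matrix[i][i] = SAMPLES_PER_CONDITION
    for (true_c, pred_c), count in misclassified.items():
        matrix[true_c - 1][true_c - 1] -= count
        matrix[true_c - 1][pred_c - 1] += count
    return matrix

def per_class_accuracy(matrix):
    """Fraction of each true condition (row) that was predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]

def overall_accuracy(matrix):
    """Correct predictions (diagonal) divided by all testing samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Errors described for Figure 9: rows are true conditions, columns are predictions.
errors = {(2, 6): 3, (6, 2): 2, (6, 7): 1}
cm = build_confusion(errors)
print([f"{a:.0%}" for a in per_class_accuracy(cm)])
print(f"{overall_accuracy(cm):.5f}")
```

With six errors out of 350 testing samples, the reconstruction gives 94% for Conditions 2 and 6, 100% for the others, and an overall accuracy of 0.98286, consistent with the first rows of Tables 3 and 4.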
Figure 10. The visualization of three-dimensional features of the gear pitting conditions.

Figure 11. The visualization of two-dimensional features of the gear pitting conditions.

Future research will include extending the developed method to fault diagnosis of other rotating components, such as bearings, involving multiple heterogeneous sensor signals such as motor current, torque, strain gauge, vibration, and AE signals. Future research will also include investigating the influence of noise and other external environmental conditions on the sensor signals and, consequently, on their effective measurement. Testing with a much larger set of samples should also be investigated in future research.

5. Conclusions

In this paper, a new method based on a one-dimensional CNN and a GRU network for gear pitting fault diagnosis was presented.
By comparing with the CNN or the GRU network alone, the results showed that the presented method has higher diagnostic accuracy for gear pitting faults. Moreover, the method can achieve more than 98% accuracy with only a small number of training samples, which demonstrates the effectiveness of the presented method. The robustness of the presented method for the diagnosis of gear pitting faults was verified by comparing the training results under different loads and different learning rates.

Author Contributions: Conceptualization, D.H. and X.L.; methodology, X.L.; software, X.L. and J.L.; validation, X.L. and J.L.; resources, Y.Q. and D.H.; data curation, Y.Q. and D.H.; writing – original draft preparation, X.L.; writing – review and editing, D.H.

Funding: This research was funded in part by NSFC, grant number 51675089.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Camerini, V.; Coppotelli, G.; Bendisch, S. Fault Detection in Operating Helicopter Drivetrain Components Based on Support Vector Data Description. Aerosp. Sci. Technol. 2018, 73, 48–60.
2. Kattelus, J.; Miettinen, J.; Lehtovaara, A. Detection of Gear Pitting Failure Progression with on-Line Particle Monitoring. Tribol. Int. 2018, 118, 458–464.
3. Qu, Y.Z.; Zhang, H.L.; Liu, H.; Zhao, C.F.; Tan, Y.G.; Zhou, Z.D. On Research of Incipient Gear Pitting Fault Detection Using Optic Fiber Sensors. In Proceedings of the 2018 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), Houston, TX, USA, 14–17 May 2018.
4. Sharma, R.B.; Parey, A. Modelling of Acoustic Emission Generated Due to Pitting on Spur Gear. Eng. Fail. Anal. 2018, 86, 1–20.
5. Zhou, L.; Fang, D.; Mba, D.; Faris, E. A Comparative Study of Helicopter Planetary Bearing Diagnosis with Vibration and Acoustic Emission Data. In Proceedings of the 2017 IEEE International Conference on Prognostics and Health Management (ICPHM), Dallas, TX, USA, 19–21 June 2017.
6. Elasha, F.; Greaves, M.; Mba, D.; Fang, D. A Comparative Study of the Effectiveness of Vibration and Acoustic Emission in Diagnosing a Defective Bearing in a Planetry Gearbox. Appl. Acoust. 2017, 115, 181–195.
7. Dong, M.; He, D.; Prashant, B.; Jonathan, K. Equipment Health Diagnosis and Prognosis Using Hidden Semi-Markov Models. Int. J. Adv. Manuf. Technol. 2006, 30, 738–749.
8. Saravanan, N.; Siddabattuni, V.K.; Ramachandran, K. A Comparative Study on Classification of Features by SVM and PSVM Extracted Using Morlet Wavelet for Fault Diagnosis of Spur Bevel Gear Box. Expert Syst. Appl. 2008, 35, 1351–1366.
9. Aouabdi, S.; Taibi, M.; Bouras, S.; Boutasseta, N. Using Multi-Scale Entropy and Principal Component Analysis to Monitor Gears Degradation Via the Motor Current Signature Analysis. Mech. Syst. Signal Process. 2017, 90, 298–316.
10. Sanchez, R.V.; Lucero, P.; Macancela, J.C.; Cerrada, M.; Vasquez, R.E.; Pacheco, F. Multi-Fault Diagnosis of Rotating Machinery by Using Feature Ranking Methods and SVM-Based Classifiers. In Proceedings of the 2017 International Conference on Sensing, Diagnostics, Prognostics, and Control (SDPC), Shanghai, China, 16–18 August 2017; pp. 105–110.
11. Fan, Q.R.; Zhou, Q.; Wu, C.Q.; Guo, M. Gear Tooth Surface Damage Diagnosis Based on Analyzing the Vibration Signal of an Individual Gear Tooth. Adv. Mech. Eng. 2017.
12. Wang, G.B.; Deng, W.H.; Du, X.Y.; Li, X.J. The Absolute Deviation Rank Diagnostic Approach to Gear Tooth Composite Fault. Shock Vib. 2017.
13. Feng, Z.P.; Zhang, D.; Zuo, M.J. Planetary Gearbox Fault Diagnosis Via Joint Amplitude and Frequency Demodulation Analysis Based on Variational Mode Decomposition. Appl. Sci. 2017, 7, 775.
14. Qu, Y.Z.; Zhang, Y.; He, M.; He, D.; Jiao, C.; Zhou, Z.D. Gear Pitting Fault Diagnosis Using Disentangled Features from Unsupervised Deep Learning. In Proceedings of the 2018 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), Houston, TX, USA, 14–17 May 2018.
15. Sun, S.Y.; Wang, Y. Fault Diagnosis of Gear Box Based on BP Neural Network. Adv. Comput. Electron. Mechatron. 2014, 667, 349–352.
16. You, Q.Z.; Bhatia, S.; Luo, J.B. A Picture Tells a Thousand Words - About You! User Interest Profiling from User Generated Visual Content. Signal Process. 2016, 124, 45–53.
17. Qu, Y.Z.; He, M.; Deutsch, J.; He, D. Detection of Pitting in Gears Using a Deep Sparse Autoencoder. Appl. Sci. 2017, 7, 515.
18. Jiang, G.Q.; He, H.B.; Yan, J.; Xie, P. Multiscale Convolutional Neural Networks for Fault Diagnosis of Wind Turbine Gearbox. IEEE Trans. Ind. Electron. 2018.
19. Jing, L.Y.; Zhao, M.; Li, P.; Xu, X.Q. A Convolutional Neural Network Based Feature Learning and Fault Diagnosis Method for the Condition Monitoring of Gearbox. Measurement 2017, 111, 1–10.
20. Zhao, R.; Wang, D.Z.; Yan, R.Q.; Mao, K.Z.; Shen, F.; Wang, J.J. Machine Health Monitoring Using Local Feature-Based Gated Recurrent Unit Networks. IEEE Trans. Ind. Electron. 2018, 65, 1539–1548.
21. Dong, H.; Li, S.; Xu, D.C.; Ren, Y.; Zhang, D. Gear Training: A New Way to Implement High-Performance Model-Parallel Training. arXiv 2018, arXiv:1806.03925.
22. Chen, Z.Q.; Chen, D.; Li, C.; Sanchez, R.V.; Qin, H.F. Vibration-Based Gearbox Fault Diagnosis Using Deep Neural Networks. J. Vibroeng. 2017, 19, 2475–2496.
23. Sun, W.F.; Yao, B.; Zeng, N.Y.; Chen, B.Q.; He, Y.C.; Cao, X.C.; He, W.P. An Intelligent Gear Fault Diagnosis Methodology Using a Complex Wavelet Enhanced Convolutional Neural Network. Materials 2017, 10, 790.
24. Ren, S.Q.; He, K.M.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015.
25. He, K.M.; Zhang, X.Y.; Ren, S.Q.; Sun, J. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. In Proceedings of the IEEE International Conference on Computer Vision, Chile, 7–13 December 2015.
26. Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. arXiv 2014, arXiv:1412.3555.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).


Publisher: Multidisciplinary Digital Publishing Institute
Copyright: © 1996-2019 MDPI (Basel, Switzerland) unless otherwise stated
ISSN: 2076-3417
DOI: 10.3390/app9040768

Abstract

Applied Sciences, Article

Gear Pitting Fault Diagnosis Using Integrated CNN and GRU Network with Both Vibration and Acoustic Emission Signals

Xueyi Li 1, Jialin Li 1, Yongzhi Qu 2 and David He 3,*

1 School of Mechanical Engineering and Automation, Northeastern University, Shenyang 110819, China; lixueyineu@gmail.com (X.L.); jialinli_neu@163.com (J.L.)
2 School of Mechanical and Electronic Engineering, Wuhan University of Technology, Wuhan 430070, China; quwong@whut.edu.cn
3 Department of Mechanical and Industrial Engineering, University of Illinois at Chicago, Chicago, IL 60607, USA
* Correspondence: davidhe@uic.edu; Tel.: +1-312-996-3410

Received: 23 December 2018; Accepted: 19 February 2019; Published: 22 February 2019

Abstract: This paper addresses the gear pitting fault diagnosis problem and presents a method that integrates convolutional neural network (CNN) and gated recurrent unit (GRU) networks with vibration and acoustic emission signals. The presented method first trains a one-dimensional CNN with acoustic emission signals and a GRU network with vibration signals. The gear pitting fault features obtained by the two networks are then concatenated to form a deep learning structure for gear pitting fault diagnosis. Seven different gear pitting conditions are used to test the feasibility of the presented method. The diagnosis results show that the accuracy of the presented method reaches above 98% with only a relatively small number of training samples. In comparison with the results obtained using the CNN or the GRU network alone, the presented method gives more accurate diagnosis results. A comparison of the results under different loads and learning rates demonstrates the robustness of the presented method for gear pitting fault diagnosis. Moreover, the presented deep structure can be easily extended to additional sensor input signals for gear pitting fault diagnosis in the future.
Keywords: gear pitting fault diagnosis; gated recurrent unit; one-dimensional convolutional neural network; acoustic emission signal; vibration signal

Appl. Sci. 2019, 9, 768; doi:10.3390/app9040768; www.mdpi.com/journal/applsci

1. Introduction

Gearboxes are an essential part of a mechanical transmission system, and the diagnosis of gear pitting faults has long been an important problem in industry. In recent years, the development of sensing technology and the improvement of computing power have provided more tools for gear fault diagnosis. Analysis of vibrational signals is the most common means of monitoring gear conditions, and vibrational signals have been a popular input for the diagnosis of gear pitting faults. Camerini et al. [1] presented an automatic vibration-based program that utilizes health and usage monitoring system data for the early diagnosis of mechanical properties of drivetrain components. There is no general indication of the minimum training collection times required to accurately describe a set of condition indicators, which largely depend on the characteristics of the distribution. Kattelus et al. [2] found that the vibration acceleration descriptor of the peak signal was related to the pitting of the gear contact. Traditional vibrational signals are significantly affected by the external environment. For example, the statistical vibration acceleration descriptor, which indicates the random peak value in the vibration signal, is more suitable for indicating tooth wear than the spectral method. Other sensors have therefore gradually been applied to the diagnosis of gear pitting faults. Qu et al. [3] presented the use of optic fiber sensors to detect the initial gear pitting fault. The results show that the optic fiber Bragg grating signal can effectively detect pitting faults under heavy load conditions.
However, when the transmission has high structural stiffness, the performance of fiber Bragg gratings (FBGs) may be limited by relatively weak strain signals.

Acoustic emission (AE) is a non-destructive diagnostic technique. Sharma et al. [4] used the Hertz contact method to establish a relationship between the fault/defect size and the AE energy generated during the gear meshing process. The results of the study indicated that the AE level increases as the defect size increases. Zhou et al. [5] compared AE data with traditional vibration data. The results showed that the AE signals were more sensitive to defect excitation and contained less background noise. The results of Elasha et al. [6] showed that AE recognizes defects earlier than vibration analysis, regardless of the tortuous transmission path. The effectiveness of AE can, however, be limited by how close the AE sensor is to the defect.

Many methods have been developed for pitting diagnosis of gears. The most representative research is the hidden semi-Markov model [7]. In general, a hidden semi-Markov model is difficult to train. Therefore, other learning models such as the support vector machine (SVM) [8] and principal component analysis (PCA) [9] were used. Sanchez et al. [10] presented a method for detecting 11 kinds of rotating machinery faults using a feature ranking method and SVM. Thirty features were calculated from the analysis of the vibration signal and the electromyography. A classification accuracy of 98.7% was achieved using SVM. Fan et al. [11] studied gear tooth surface damage diagnosis based on analyzing the vibration signal of an individual gear tooth. The characteristics of damaged and normal teeth were studied by analyzing their waveforms. The results showed that almost all the damaged teeth were correctly detected by the proposed method, although there were some misdiagnoses in identifying the extent of the damage.
However, the paper does not explore the exact classification of the damage degree. Wang et al. [12] presented an absolute deviation rank method for diagnosing gear tooth faults. The method set up dynamics models for single gear faults (broken teeth and pitting corrosion) and for composite gear tooth damage. The results for different broken teeth were obtained through simulation analysis. This method investigated the case of broken teeth, but the authors did not verify it on early pitting corrosion of gears. Nevertheless, the above-mentioned methods require much domain expertise and prior knowledge and often rely on hand-crafted features, which demand a large amount of work in feature extraction.

The use of frequency features of vibration signals for gear fault diagnosis has been common over the past decade. Feng et al. [13] obtained the amplitude and frequency demodulation spectra by applying the Fourier transform to the amplitude envelope and the instantaneous frequency of the selected sensitive intrinsic mode functions. The planetary gearbox fault was detected based on the features shown in the demodulated spectrum. Although frequency domain features can be directly related to fault type and level, these features are usually abstract representations and require additional pre-processing [14]. Extracting the gear pitting fault signals directly in the time domain saves both calculation cost and time cost. Sun et al. [15] trained a backpropagation (BP) neural network on four typical gear fault modes and obtained satisfactory results. The results showed that the BP neural network could effectively perform gear fault diagnosis. These methods can only extract shallow features for gear fault diagnosis and are limited in mining deep features.

Fortunately, with the development of deep learning, it is possible to extract fault features directly from the raw signals. Deep learning has developed rapidly in recent decades [16].
Qu et al. [17] integrated dictionary learning into a stacked autoencoder network for gear pitting fault diagnosis. They applied the sparse autoencoder algorithm to gear fault detection for the first time. Jiang et al. [18] presented a CNN-based deep learning method that automatically learned effective fault features directly from the raw vibrational signals and classified fault types in a single framework, providing an end-to-end learning-based diagnostic system for wind turbine gearboxes. Under ten operating conditions, there were 2600 samples for each health condition, and each sample contained 2000 data points. Jing et al. [19] used a convolutional neural network (CNN) to learn features directly from the frequency data of the vibrational signal. Feature learning using CNN can provide better results than manual feature extraction. Zhao et al. [20] presented a local feature-based gated recurrent unit (GRU) network to predict machine conditions. A compact spectral data acquisition instrument was used for signal acquisition with a sampling frequency of 1024 Hz and a sampling window of 512 s. The accuracy for gear failure was 95.8%. Dong et al. [21] presented a method for training deep models in parallel, in which different parts of a model can be trained at different speeds. By splitting the deep neural network model and training the parts on different devices at different speeds, the whole training process can be accelerated. The training accuracy of this method was about 70%. Chen et al. [22] used four classical deep neural networks to classify and identify fault conditions in the transmission. It was shown that the vibration signal usually contains abundant information for fault detection, control, and maintenance planning of rotating machinery. Sun et al. [23] used a dual-tree complex wavelet transform to acquire the characteristics of multi-scale signals. The CNN was then used to automatically identify fault features from multi-scale signal features.
This method can distinguish four kinds of gear faults, but the classification of these four kinds of faults is relatively easy, and the detection of early gear pitting faults was not addressed in their paper. Their experimental results of gear fault identification showed the feasibility and effectiveness of the presented method. In a nutshell, deep learning methods have been used in the diagnosis of gear pitting faults and have made certain progress.

Vibration signals have traditionally been used for gear pitting fault diagnosis. Over the years, many signal processing and analysis methods for vibration signals have been developed and matured. Even though it has been reported that AE signals have certain advantages over vibration signals in early gear fault diagnosis, advanced signal processing and analysis methods for AE signals have not been well developed. Recent developments in deep learning provide an excellent opportunity to integrate AE signals and vibration signals for gear pitting fault diagnosis. In this paper, AE signals are introduced in addition to vibrational signals for gear pitting fault diagnosis. Normally, converting time domain signals into frequency domain signals requires additional preprocessing steps. The advantage of deep learning is its capability to deal directly with the raw signals. In this paper, a one-dimensional CNN is integrated with a GRU network to process AE and vibration signals for gear pitting fault diagnosis. The combination of CNN and GRU can effectively utilize their respective advantages and obtain better results for gear pitting fault diagnosis. The method presented in this paper can effectively suppress over-fitting in gear pitting fault diagnosis.

The main contributions of this paper can be summarized as follows:

(1) The method presented in this paper directly uses the raw vibrational and AE signals to diagnose gear pitting faults without additional feature extraction processes.
(2) This method integrates CNN with GRU to make full use of their advantages.
(3) The method combines two different kinds of sensor data, the vibration signals and the AE signals, and makes full use of different sensor signal features for gear pitting fault diagnosis.
(4) The method presented in this paper uses less training data to make an accurate diagnosis of gear pitting faults with efficient training time.

The rest of this paper is organized as follows. Section 2 describes the gear pitting fault diagnosis method presented in this paper. Section 3 describes the experimental setup and the data collected for the validation of the proposed method. Section 4 analyzes and discusses the results. Finally, Section 5 concludes the paper.

2. The Methodology

The general procedure of the proposed method for gear pitting fault diagnosis is presented in Figure 1. The presented method integrates the one-dimensional CNN and the GRU network. The CNN is used to process the raw AE signals, and the GRU network is used to process the vibration signals. The outputs of the CNN are then concatenated with the outputs of the GRU network. Finally, the concatenated outputs are input into a softmax layer to perform gear pitting fault diagnosis. By using deep learning approaches such as the CNN or GRU network, fault features are extracted automatically while the raw sensor signals are being processed.
The outputs of the multiple hidden layers in a deep learning network represent fault features at different abstract levels. The unique contribution of the paper is that it is the first attempt to develop a deep learning based approach for gear pitting fault diagnosis with both AE and vibration signals.

Figure 1. The general procedure of the presented method. AE: acoustic emission; CNN: convolutional neural network; GRU: gated recurrent unit.

2.1. One-Dimensional Convolutional Neural Network

A typical CNN consists of an input layer, an output layer, a convolution layer, and a pooling layer [24].
The convolution layer performs local feature extraction on the input feature map through the convolution kernel. Further downsampling is performed by the pooling layer. The main features of CNN are local perception, weight sharing, and pooling. In CNN, the convolutional layer and the pooling layer appear alternately. The principle of the one-dimensional CNN is shown in Figure 2.

Figure 2. The schematic of one-dimensional CNN.
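The local feature extraction and downsampling described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: it assumes a single input channel, a hypothetical three-tap kernel, a ReLU nonlinearity, and the cross-correlation convention that deep learning frameworks commonly call "convolution". The names `conv1d_same`, `relu`, and `max_pool1d` are illustrative only.

```python
def conv1d_same(signal, kernel, bias=0.0):
    """Slide `kernel` over `signal` with stride 1 and zero 'same' padding."""
    pad_left = len(kernel) // 2
    pad_right = len(kernel) - 1 - pad_left
    padded = [0.0] * pad_left + list(signal) + [0.0] * pad_right
    return [
        sum(padded[i + j] * kernel[j] for j in range(len(kernel))) + bias
        for i in range(len(signal))
    ]


def relu(values):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, pool_size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    usable = len(values) - len(values) % pool_size
    return [max(values[i:i + pool_size]) for i in range(0, usable, pool_size)]


# Toy input standing in for a short raw-signal segment (values are made up).
signal = [0.0, 1.0, -2.0, 3.0, 0.5, -1.0, 2.0, 0.0]
kernel = [1.0, 0.0, -1.0]  # hypothetical kernel, not a trained one

feature_map = relu(conv1d_same(signal, kernel, bias=0.1))
pooled = max_pool1d(feature_map, pool_size=2)
print(len(signal), len(feature_map), len(pooled))
```

With 'same' padding the feature map keeps the input length, and each pooling stage with pool size 2 halves it, which is why alternating convolution and pooling layers progressively shorten the sequence while the number of channels can grow.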
Assuming that the first layer is a convolutional layer, the calculation formula of the one-dimensional convolutional layer is as follows:

$$x_j^{l} = f\left(\sum_{i=1}^{M} x_i^{l-1} * k_{ij}^{l} + b_j^{l}\right) \qquad (1)$$

where $x_j^{l}$ is the jth feature map of the lth layer, $f(\cdot)$ represents the activation function, $M$ represents the number of input feature maps, $x_i^{l-1}$ represents the ith feature map of the (l-1)th layer, $*$ represents the convolution operation, $k_{ij}^{l}$ represents a trainable convolution kernel, and $b_j^{l}$ represents the jth bias of the lth layer.

With the consideration of the convergence speed and overfitting problem, this paper uses the rectified linear unit (ReLU) activation function. ReLU has a faster convergence rate than the sigmoid in gradient descent and can effectively prevent the over-fitting problem.
The ReLU activation function is as follows:

$$f(x) = \max(0, x) \qquad (2)$$

The pooling layer is connected after the convolution layer, and the feature map is downsampled according to a certain pooling strategy to obtain a lower resolution feature map. The most commonly used pooling strategy is maximum pooling. Maximum pooling reduces the number of output nodes and enhances the robustness of the network to input characteristics. Assuming that the (l+1)th layer is the pooling layer, it is calculated as follows:

$$x_j^{l+1} = f\left(\mathrm{down}\left(x_j^{l}\right) + b_j^{l+1}\right) \qquad (3)$$

where $\mathrm{down}(\cdot)$ is a downsampling function.

2.2. Gated Recurrent Unit Network

The GRU network is an optimized structure of the recurrent neural network (RNN) [25]. When the input information grows to a certain length, the RNN cannot connect to the relevant information. The GRU network is aimed at solving the problems of long-range dependence and gradient disappearance of the RNN. The GRU neural network, which has fewer gate structures and better efficiency, is directly selected for the diagnosis of gear pitting faults. Note that, similar to the GRU, a recurrent unit in RNN called long short term memory (LSTM) can also be used. Both LSTM and GRU have the same goal of tracking long-term dependencies effectively while mitigating the vanishing/exploding gradient problems. Chung et al. [26] evaluated LSTM and GRU units on the tasks of polyphonic music modeling and speech signal modeling and found the GRU to be comparable to LSTM. Therefore, the GRU is used in this paper as the recurrent unit, and LSTM is expected to give results similar to those of the GRU.

RNNs are widely used in the field of natural language processing. Unlike traditional feedforward neural networks, an RNN introduces directional loops that can handle correlated inputs. It can therefore be used to process sequence data.
The basic structure of an RNN is shown in Figure 3. In Figure 3, x, h, and s represent the input, output, and hidden states, respectively. U, V, and W represent the weight matrices between the input and hidden layers, between the hidden layers and outputs, and between the hidden layers, respectively.

Figure 3. Expansion model of the recurrent neural network.

The specific update process of the GRU unit is as follows. First, the two gates in the GRU that control the direction of the data flow are r and z. The update gate in the GRU neural network is calculated in Equation (4):

$$z_t = \sigma\left(W_z h_{t-1} + U_z x_t + b_z\right) \qquad (4)$$

In Equation (4), $z_t$ represents the update gate, $h_{t-1}$ represents the output of the previous neuron, $x_t$ represents the input of the current neuron, $W_z$ represents the weight of the update gate, $U_z$ represents the weight of the current neuron, and $\sigma$ represents the sigmoid function. The update gate $z_t$ is computed from $h_{t-1}$ and $x_t$ and then passed through the sigmoid function. The larger the value of $z_t$, the more information from the previous neuron is retained. If $z_t$ is close to 1, it is equivalent to copying the previous hidden layer information to the current layer, which allows the network to learn long-distance dependence.
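The behaviour of the update gate in Equation (4) can be sketched in a few lines of pure Python. This is only an illustration under stated assumptions: the weights, biases, and the candidate state `h_candidate` are made-up values rather than trained parameters, the helper names are hypothetical, and the blending step follows the convention of Equation (7), in which $z_t$ weights the previous hidden state.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(matrix, vector):
    """Plain matrix-vector product for small lists of lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def update_gate(h_prev, x_t, W_z, U_z, b_z):
    """Equation (4): z_t = sigmoid(W_z h_{t-1} + U_z x_t + b_z), element-wise."""
    wh = matvec(W_z, h_prev)
    ux = matvec(U_z, x_t)
    return [sigmoid(a + b + c) for a, b, c in zip(wh, ux, b_z)]


def blend(z_t, h_prev, h_candidate):
    """Equation (7) convention: h_t = (1 - z_t) * candidate + z_t * h_{t-1}."""
    return [(1.0 - z) * c + z * h for z, h, c in zip(z_t, h_prev, h_candidate)]


# Made-up example values (hidden size 2, input size 3).
h_prev = [0.5, -0.3]
x_t = [1.0, 0.2, -0.4]
W_z = [[0.1, 0.0], [0.0, 0.1]]
U_z = [[0.2, 0.0, 0.1], [0.0, 0.3, 0.0]]
h_candidate = [-1.0, 1.0]  # stands in for the candidate state of Equation (6)

z_neutral = update_gate(h_prev, x_t, W_z, U_z, b_z=[0.0, 0.0])
z_keep = update_gate(h_prev, x_t, W_z, U_z, b_z=[6.0, 6.0])  # pushes z_t toward 1

print(z_neutral, blend(z_neutral, h_prev, h_candidate))
print(z_keep, blend(z_keep, h_prev, h_candidate))
```

With the large positive bias the gate saturates near 1 and the blended state stays almost identical to `h_prev`, which is the "copying" behaviour described above; with a zero bias the gate sits near 0.5 and the new state is a mix of old and candidate values.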
The reset gate in the GRU neural network is calculated in Equation (5):

$$r_t = \sigma\left(W_r h_{t-1} + U_r x_t + b_r\right) \qquad (5)$$

In Equation (5), $r_t$ represents the reset gate, $h_{t-1}$ represents the output of the previous neuron, $x_t$ represents the input of the current neuron, $W_r$ represents the weight of the reset gate, $U_r$ represents the weight of the current neuron, and $\sigma$ represents the sigmoid function. The reset gate $r_t$ is computed from $h_{t-1}$ and $x_t$ and then passed through the sigmoid function. When the value of the reset gate is 0, the information from the previous neuron is discarded.

The candidate output value of the GRU hidden layer is given in Equation (6):

$$\tilde{h}_t = \tanh\left(W_h \left[r_t \odot h_{t-1}\right] + U_h x_t + b_h\right) \qquad (6)$$

In Equation (6), $\tilde{h}_t$ represents the output value to be determined in this neuron, $h_{t-1}$ represents the output of the previous neuron, $x_t$ represents the input of the current neuron, $W_h$ represents the weight applied to the reset previous output, and $\tanh(\cdot)$ represents the hyperbolic tangent function. $r_t$ is used to control how much memory needs to be retained.

Finally, $z_t$ controls how much information is forgotten from the hidden layer of the previous step and how much of the hidden layer information $\tilde{h}_t$ of the current step needs to be added. The output $h_t$ is obtained in Equation (7), which directly gives the hidden layer information of the final output:

$$h_t = (1 - z_t) \odot \tilde{h}_t + z_t \odot h_{t-1} \qquad (7)$$

In Equation (7), if the value of $r_t$ is 1 and the value of $z_t$ is 0, the GRU unit is equivalent to a standard RNN, which can handle short-range dependencies.

3. Gear Test Experimental Setup and Data Processing

Raw AE signals and vibration signals collected from gear pitting fault experiments were used to validate the effectiveness of the presented method for the diagnosis of gear pitting faults. The experiments were carried out on a gearbox test rig. The raw vibrational signals and AE signals of seven different gear pitting conditions were collected during the experiments. The gearbox test rig is shown in Figure 4.
It consists of two 45 kW Siemens servos, one of the servos is the driving motor, and the other is the loading motor. An acceleration sensor and an AE sensor were mounted on the surface of the gearbox housing. The main parameters of the gearbox are shown in Table 1. Appl. Sci. 2019, 9, x FOR PEER REVIEW 7 of 16 Figure 4. Picture of the gearbox test rig. Figure 4. Picture of the gearbox test rig. Table 1. The major parameters of the gearbox. Gear Parameter Driving Gear Driving Gear Tooth number 72 40 Module (mm) 3 3 Pitch diameter (mm) 120 120 Base circle diameter (mm) 202.974 112.763 Pressure angle (°) 20 20 Tooth width (mm) 85 85 The gear speed was set to 1000 RPM, and 100 Nm torque was used in the experiments. The vibrational signals were collected with a sampling rate of 10.24 kHz. The AE signals were collected with a sampling rate of 51.2 kHz. Table 2 shows the seven gear pitting conditions. Condition 1 represents a normal gear. In Condition 2, the pitting is about 10% of the area of a middle tooth, and the adjacent two teeth are normal. Condition 3 has a pitting of about 30% of the area of the middle gear tooth, and the adjacent two teeth are normal. Under Condition 4, the middle gear tooth pitting is about 50% of the area, and the adjacent two teeth are normal. Under Condition 5, the middle gear tooth pitting is about 50% of the area, the upper tooth pitting is about 10% of the area, and the lower tooth is normal. Under Condition 6, the pitting of the middle gear tooth is about 50% of the area, and the adjacent two teeth pitting is about 10% of their area. Under condition 7, the middle gear tooth pitting is about 50% of the area, upper tooth pitting is about 30% of the area, and the lower tooth pitting is about 10% of the area. Figure 5 shows pictures of the gear pitting degree under each pitting condition. Table 2. The approximate percentage of pitting area under seven gear conditions. 
Gear Condition Upper tooth Middle tooth Lower tooth Condition1 Normal Normal Normal Condition2 Normal 10% Normal Condition3 Normal 30% Normal Condition4 Normal 50% Normal Appl. Sci. 2019, 9, 768 7 of 15 Table 1. The major parameters of the gearbox. Gear Parameter Driving Gear Driving Gear Tooth number 72 40 Module (mm) 3 3 Pitch diameter (mm) 120 120 Base circle diameter (mm) 202.974 112.763 Pressure angle ( ) 20 20 Tooth width (mm) 85 85 The gear speed was set to 1000 RPM, and 100 Nm torque was used in the experiments. The vibrational signals were collected with a sampling rate of 10.24 kHz. The AE signals were collected with a sampling rate of 51.2 kHz. Table 2 shows the seven gear pitting conditions. Condition 1 represents a normal gear. In Condition 2, the pitting is about 10% of the area of a middle tooth, and the adjacent two teeth are normal. Condition 3 has a pitting of about 30% of the area of the middle gear tooth, and the adjacent two teeth are normal. Under Condition 4, the middle gear tooth pitting is about 50% of the area, and the adjacent two teeth are normal. Under Condition 5, the middle gear tooth pitting is about 50% of the area, the upper tooth pitting is about 10% of the area, and the lower tooth is normal. Under Condition 6, the pitting of the middle gear tooth is about 50% of the area, and the adjacent two teeth pitting is about 10% of their area. Under condition 7, the middle gear tooth pitting is about 50% of the area, upper tooth pitting is about 30% of the area, and the lower tooth pitting is about 10% of the area. Figure 5 shows pictures of the gear pitting degree under each pitting condition. Table 2. The approximate percentage of pitting area under seven gear conditions. Gear Condition Upper Tooth Middle Tooth Lower Tooth Appl. Sci. 
2019, 9, x FOR PEER REVIEW 8 of 16 Condition 1 Normal Normal Normal Condition 2 Normal 10% Normal Condition5 10% 50% Normal Condition 3 Normal 30% Normal Condition 4 Normal 50% Normal Condition6 10% 50% 10% Condition 5 10% 50% Normal Condition 6 10% 50% 10% Condition7 30% 50% 10% Condition 7 30% 50% 10% Figure 5. Pitting degree of driven gears. Figure 5. Pitting degree of driven gears. The sample raw vibrational signals of the gears are shown in Figure 6. As shown in Figure 6, Conditions 1 and 3 have relatively distinct spikes and show slightly different from the remaining five vibrational signals. The raw vibration signals of the remaining five conditions are not significantly different. Figure 6. Raw vibrational signals of the gear pitting fault conditions. The sample AE signals of gears are shown in Figure 7. As can be seen from Figure 7, there are no significant differences among raw AE signals of the seven conditions. It is almost impossible for the naked eye to distinguish the difference in the pitting conditions of the gears from the AE signals. Appl. Sci. 2019, 9, x FOR PEER REVIEW 8 of 16 Condition5 10% 50% Normal Condition6 10% 50% 10% Condition7 30% 50% 10% Appl. Sci. 2019, 9, 768 Figure 5. Pitting degree of driven gears. 8 of 15 The sample raw vibrational signals of the gears are shown in Figure 6. As shown in Figure 6, The sample raw vibrational signals of the gears are shown in Figure 6. As shown in Figure 6, Conditions 1 and 3 have relatively distinct spikes and show slightly different from the remaining Conditions 1 and 3 have relatively distinct spikes and show slightly different from the remaining five five vibrational signals. The raw vibration signals of the remaining five conditions are not vibrational signals. The raw vibration signals of the remaining five conditions are not significantly different. significantly different. Figure 6. Raw vibrational signals of the gear pitting fault conditions. Figure 6. 
The sample AE signals of the gears are shown in Figure 7. There are no significant differences among the raw AE signals of the seven conditions, and it is almost impossible to distinguish the pitting conditions of the gears from the AE signals with the naked eye.

Figure 7. Raw AE signals of the gear pitting fault conditions.

The number of samples of the vibration data and of the AE data under each condition was 1000, with 800 for the training set, 150 for validation, and 50 for testing. Each condition had the same number of samples. The CNN used 4 convolutional layers, a choice based on experience. The number of channels in the four layers was set to 32, 64, 128, and 128, respectively. The kernel size of all convolutional layers was set to 7 and the stride to 1. 'Same' padding was used in order to maintain the data size. The pool size was set to 2 for all the pooling layers. For pooling, the strides were set to none and the padding to 'valid'. The AE signals used in the CNN contained 3072 features per sample. The GRU network used 6 stacked GRUs for training. The cell size of the GRU layers was set to 256, 256, 128, 128, 64, and 64, respectively. The vibration signals used in the GRU network contained 616 features per sample. The batch size was set to 256. Both the kernel initializer and the recurrent initializer used the he-normal [25] method. The loss function was categorical cross-entropy. A stochastic gradient descent algorithm was used as the optimizer. Except for the last layer, the ReLU function was used as the activation function. In the last layer, a softmax function was used to classify the gear pitting faults. An NVIDIA GeForce GTX 1080 Ti graphics card was used in the PC for training. The general procedure of the data processing using the presented method is shown in Figure 8.
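The layer settings above can be checked with a short shape calculation. The following Python sketch traces the sequence length of one AE sample through the CNN branch and totals the data split. It assumes that each convolutional layer is followed by one pooling layer and that a pooling stride of none means the stride equals the pool size (the Keras convention); neither is stated explicitly in the text, and the helper names are illustrative only.

```python
# Sketch: trace the sequence length through the CNN branch and total the data split.
# Assumptions (not stated in the paper): one pooling layer follows every
# convolutional layer, and a pooling stride of "none" means stride == pool size.

AE_FEATURES = 3072                 # features per AE sample (from the text)
CHANNELS = [32, 64, 128, 128]      # channels of the 4 convolutional layers
CONDITIONS = 7                     # gear pitting conditions
SPLIT_PER_CONDITION = {"train": 800, "validation": 150, "test": 50}


def conv1d_same_length(length: int, stride: int = 1) -> int:
    """Output length of a 1-D convolution with 'same' padding."""
    return -(-length // stride)    # ceil(length / stride)


def pool1d_valid_length(length: int, pool_size: int = 2) -> int:
    """Output length of 'valid' pooling whose stride equals the pool size."""
    return (length - pool_size) // pool_size + 1


def trace_cnn(length: int, channels: list) -> list:
    """Return (length, channels) after each conv + pool stage."""
    shapes = []
    for ch in channels:
        length = conv1d_same_length(length)
        length = pool1d_valid_length(length)
        shapes.append((length, ch))
    return shapes


cnn_shapes = trace_cnn(AE_FEATURES, CHANNELS)
split_totals = {name: n * CONDITIONS for name, n in SPLIT_PER_CONDITION.items()}

for stage, (length, ch) in enumerate(cnn_shapes, start=1):
    print(f"after conv/pool stage {stage}: length={length}, channels={ch}")
print(split_totals)
```

Under these assumptions the AE branch ends with 192 time steps of 128 channels, and the split gives 5600 training, 1050 validation, and 350 testing samples over the seven conditions.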
Figure 8. The general procedure of the data processing using the presented method.

4. Results and Discussions

The validation results are provided in Table 3. Table 3 shows that when the CNN with AE signals alone was used to diagnose the gear pitting faults, a high diagnostic accuracy was obtained for training. Although the training accuracy is very high, the method overfits seriously, and as a result the testing accuracy is as low as 74.57%. Using the same CNN with AE signals together with GRUs with vibration signals, not only were good training and validation results obtained, but an accurate gear pitting fault diagnosis result of 98.29% was also obtained for testing. The results show that, in comparison with the other methods, the method presented in this paper obtains more accurate diagnostic results.

Table 3. Gear pitting fault diagnosis accuracy of the presented method and the other methods.
Gear Pitting Fault Diagnosis Method                                         Training Accuracy   Validation Accuracy   Testing Accuracy
Proposed method: CNN with AE signals + GRU network with vibration signals   1.00000             0.97333               0.98286
CNN with vibration signals + GRU network with AE signals                    1.00000             0.96952               0.98000
CNN with vibration signals + CNN with AE signals                            1.00000             0.96667               0.95714
CNN with vibration signals alone                                            1.00000             0.92952               0.91429
GRU network with vibration signals alone                                    1.00000             0.90190               0.89714
CNN with AE signals alone                                                   1.00000             0.68571               0.74571
GRU network with AE signals alone                                           1.00000             0.56190               0.61714

The results in Table 3 indicate that the combination of CNN with AE signals and GRU with vibration signals, or the combination of CNN with vibration signals and GRU with AE signals, gave more accurate results than the following three methods: (1) using CNN with both vibration and AE signals, (2) using CNN with vibration signals alone, and (3) using CNN with AE signals alone. For the other methods in Table 3, the following discussion considers why some of them could not achieve a good gear pitting fault diagnosis accuracy. For the GRU network with vibration signals alone, a possible reason is that the vibration signals were strongly affected by background noise, and the GRU, which can memorize the signals, possibly retained the noisy features.
So the atypical features caused by the interference were regarded as typical features of the gear pitting faults, which might reduce the effectiveness of the method. For the GRU network with AE signals alone, since the sampling frequency of the AE signals is much higher than that of the vibration signals, the volume of sampled AE data was huge. In order to process the AE signals efficiently using the GRU network, the AE signal data was downsampled, so the data was partially distorted. As a result, the accuracy of the gear pitting fault diagnosis was low. In Table 4, the gear pitting fault diagnosis accuracies for each pitting fault condition obtained by the proposed method and the other methods are provided.

Table 4. Gear pitting fault diagnosis accuracy for each fault condition.

Method                                                                      Accuracy for Each Fault Condition
                                                                            1      2      3      4      5      6      7
Proposed method: CNN with AE signals + GRU network with vibration signals   100%   94%    100%   100%   100%   94%    100%
CNN with vibration signals + GRU network with AE signals                    100%   96%    98%    100%   100%   94%    98%
CNN with vibration signals + CNN with AE signals                            100%   94%    98%    100%   100%   78%    100%
CNN with vibration signals alone                                            86%    82%    94%    100%   100%   82%    96%
GRU network with vibration signals alone                                    84%    82%    86%    100%   100%   82%    94%
CNN with AE signals alone                                                   100%   82%    74%    84%    52%    34%    96%
GRU network with AE signals alone                                           98%    46%    42%    66%    40%    44%    96%

From Table 4, the method presented in this paper achieves 100% diagnosis accuracy for five gear pitting fault conditions. For the other two conditions, the diagnosis accuracy reached 94%. The presented method gives much better results than the last four methods in Table 4, each of which uses only one type of signal. In comparison with the second and the third methods in Table 4, the performance of the presented method is slightly better.

The method presented in this paper uses CNN to process AE data and GRU network to process vibration data.
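A rough count of trainable parameters illustrates the trade-off behind this pairing. The sketch below is not the authors' code: it assumes a single input channel for the CNN, one scalar input per time step for the first GRU layer, and the classic GRU formulation with one bias vector per gate block. The dense and softmax layers are left out, so the totals are indicative only.

```python
# Sketch: compare trainable-parameter counts of the two feature extractors.
# Assumptions (hypothetical, not given in the paper): the CNN input has 1 channel,
# the first GRU layer receives 1 value per time step, and each GRU gate block
# has input weights, recurrent weights, and a single bias vector.

KERNEL_SIZE = 7
CNN_CHANNELS = [32, 64, 128, 128]          # from the text
GRU_UNITS = [256, 256, 128, 128, 64, 64]   # from the text


def conv1d_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights plus biases of one 1-D convolutional layer."""
    return kernel_size * in_channels * out_channels + out_channels


def gru_params(input_dim: int, units: int) -> int:
    """Three gate blocks (update, reset, candidate) of one GRU layer."""
    return 3 * (units * (input_dim + units) + units)


def cnn_stack_params(in_channels: int, channels: list) -> int:
    total = 0
    for out_channels in channels:
        total += conv1d_params(KERNEL_SIZE, in_channels, out_channels)
        in_channels = out_channels      # next layer reads this layer's output
    return total


def gru_stack_params(input_dim: int, units_per_layer: list) -> int:
    total = 0
    for units in units_per_layer:
        total += gru_params(input_dim, units)
        input_dim = units               # next layer reads this layer's output
    return total


cnn_total = cnn_stack_params(1, CNN_CHANNELS)
gru_total = gru_stack_params(1, GRU_UNITS)

print("CNN convolutional parameters:", cnn_total)
print("GRU recurrent parameters:    ", gru_total)
```

With these assumed input sizes the six GRU layers hold nearly five times as many parameters as the four convolutional layers (900,480 against 186,944), which is in line with the argument for giving the longer AE samples to the CNN.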
This pairing was chosen because the sampling frequency of the AE sensor is about 5 times that of the vibration sensor, so the number of features extracted from the AE signals is larger than that extracted from the vibration signals. As discussed in Section 2 of this paper, the number of parameters of a CNN is relatively small, and the number of parameters of the GRU network is relatively large. Hence, it is computationally beneficial to use the CNN to process the relatively large volume of AE data and the GRU network to process the relatively small volume of vibration data. If the GRU network is used to process the larger volume of AE data, the dimensionality of the data has to be reduced, and this reduction may result in a loss of effective diagnostic information. Therefore, the method proposed in this paper, which uses the CNN to process the AE signals and the GRU network to process the vibration signals, gives the best gear pitting fault diagnosis performance.

To test the robustness of the proposed method for gear pitting fault diagnosis under different loading conditions, the diagnosis results were obtained by the proposed method at a constant speed of 1000 rpm with different loads and are provided in Table 5. To obtain the results in Table 5, the hyperparameters of the CNN and GRU network remained the same as those used for obtaining the results in Table 3. From Table 5, it can be seen that the training accuracy reaches 100% for all the loads. Since the training accuracy is not significantly different from the testing accuracy, there is no sign of serious overfitting in the results in Table 5. The lowest testing accuracy is about 94.86%, so a good gear pitting fault diagnosis result is achieved in every case. The loss in each case is very low. As can be seen from Table 5, the training time in each case is not far from the average of 838 s. As the load increases, there is an indication of a pattern of changes in fault diagnosis accuracy.
This result shows that the performance of the presented method remains stable under different loads and demonstrates the robustness and adaptability of the method for gear pitting fault diagnosis.

Table 5. The gear pitting fault diagnosis results obtained by the proposed method at a constant speed of 1000 rpm with different loads.

Working Condition   Training Accuracy   Validation Accuracy   Testing Accuracy   Training Loss   Training Time (s)
1000 RPM_50N        1.0000              0.99333               0.98857            0.00015         612.25
1000 RPM_100N       1.0000              0.97333               0.98286            0.00014         1033.06
1000 RPM_200N       1.0000              0.94762               0.95429            0.00014         1017.29
1000 RPM_300N       1.0000              0.98095               0.98286            0.00009         903.42
1000 RPM_400N       1.0000              0.93333               0.94857            0.00018         730.88
1000 RPM_500N       1.0000              0.96571               0.96857            0.00016         732.92

It is well known that, among all the parameters of deep learning, the learning rate is one of the most critical, and it has a great influence on the performance of the model. In order to test the effect of the learning rate on the gear pitting fault diagnosis performance of the presented method, 20 different learning rates with an increment of 0.1 were tested. The testing results are provided in Table 6. As shown in Table 6, in the range from 0.4 to 2.3, a training accuracy of 100% was obtained for all the tested learning rates. The validation accuracy is above 95.3% in every case, with a standard deviation of 0.0093. The testing accuracy is above 93.3%, with a standard deviation of 0.0056. It can be seen from Table 6 that the training loss is small for all 20 tested learning rates. The average training time is about one thousand seconds. In summary, the presented method performs well over a large span of learning rates. This result once again verifies the effectiveness and robustness of the presented method for gear pitting fault diagnosis.

Table 6. The gear pitting fault diagnosis results under different learning rates.
Learning Rate   Training Accuracy   Validation Accuracy   Testing Accuracy   Training Loss   Training Time (s)
0.4             1.0000              0.95524               0.96286            0.00029         926.86
0.5             1.0000              0.96762               0.97429            0.00011         2769.37
0.6             1.0000              0.97143               0.97429            0.00031         689.65
0.7             1.0000              0.96952               0.98286            0.00012         1309.01
0.8             1.0000              0.96381               0.98286            0.00028         861.12
0.9             1.0000              0.96286               0.97714            0.00011         1664.57
1.0             1.0000              0.95619               0.97714            0.00035         439.49
1.1             1.0000              0.95524               0.96857            0.00021         672.88
1.2             1.0000              0.96857               0.98000            0.00015         735.77
1.3             1.0000              0.96286               0.97429            0.00012         772.78
1.4             1.0000              0.96095               0.97143            0.00019         854.72
1.5             1.0000              0.97333               0.98286            0.00014         1033.06
1.6             1.0000              0.95905               0.97714            0.00012         705.93
1.7             1.0000              0.97143               0.98000            0.00025         673.77
1.8             1.0000              0.97905               0.97429            0.00007         1756.74
1.9             1.0000              0.95429               0.97714            0.00031         830.91
2.0             1.0000              0.97238               0.97143            0.00008         1493.51
2.1             1.0000              0.97238               0.98000            0.00010         839.48
2.2             1.0000              0.95333               0.96857            0.00035         923.91
2.3             1.0000              0.99143               0.98571            0.00015         912.59

The confusion matrix of the obtained results is provided in Figure 9. As can be seen from Figure 9, the classification accuracy is 100% except for Condition 2 and Condition 6. Three cases of Condition 2 were incorrectly diagnosed as Condition 6. Coincidentally, three cases of Condition 6 were also incorrectly diagnosed: two as Condition 2 and one as Condition 7. Overall, the proposed method is accurate in classifying the gear pitting faults.

Figure 9. The confusion matrix by the presented method for the testing set.

To show the effectiveness of the concatenated features, obtained by using the CNN to extract features from the AE signals and the GRU network to extract features from the vibration signals, samples were processed for t-SNE visualization. The 3D result of the t-SNE visualization is shown in Figure 10. The 2D result of the t-SNE visualization is shown in Figure 11. It can be seen from the two figures that the concatenated features of the seven gear pitting conditions were accurately clustered. The clear clusters formed by the concatenated features obtained by the proposed method, shown in the 3D and 2D pictures, indicate the effectiveness of the proposed method for extracting features from the AE and vibration signals for gear pitting fault diagnosis.

Figure 10. The visualization of three-dimensional features of the gear pitting conditions.

Figure 11. The visualization of two-dimensional features of the gear pitting conditions.

Future research will include extending the developed method to fault diagnosis of other rotating components, such as bearings, involving multiple heterogeneous sensor signals such as motor current, torque, strain gauge, vibration, and AE signals. Future research will also include investigation of the influence of noise and other external environmental conditions on the sensor signals and, consequently, on their method of effective measurement. Testing with a much larger set of samples should also be investigated in future research.

5. Conclusions

In this paper, a new method based on one-dimensional CNN and GRU for gear pitting fault diagnosis was presented.
By comparing with the CNN or GRU network alone, the results showed that the presented method has higher diagnostic accuracy for gear pitting faults. Moreover, the method can achieve more than 98% accuracy with only a small number of training samples, which demonstrates the effectiveness of the presented method. The robustness of the presented method for the diagnosis of gear pitting faults was verified by comparing the training results under different loads and different learning rates.

Author Contributions: Conceptualization, D.H. and X.L.; methodology, X.L.; software, X.L. and J.L.; validation, X.L. and J.L.; resources, Y.Q. and D.H.; data curation, Y.Q. and D.H.; writing (original draft preparation), X.L.; writing (review and editing), D.H.

Funding: This research was funded in part by NSFC, grant number 51675089.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Camerini, V.; Coppotelli, G.; Bendisch, S. Fault Detection in Operating Helicopter Drivetrain Components Based on Support Vector Data Description. Aerosp. Sci. Technol. 2018, 73, 48–60.
2. Kattelus, J.; Miettinen, J.; Lehtovaara, A. Detection of Gear Pitting Failure Progression with on-Line Particle Monitoring. Tribol. Int. 2018, 458–464.
3. Qu, Y.Z.; Zhang, H.L.; Liu, H.; Zhao, C.F.; Tan, Y.G.; Zhou, Z.D. On Research of Incipient Gear Pitting Fault Detection Using Optic Fiber Sensors. In Proceedings of the 2018 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), Houston, TX, USA, 14–17 May 2018.
4. Sharma, R.B.; Parey, A. Modelling of Acoustic Emission Generated Due to Pitting on Spur Gear. Eng. Fail. Anal. 2018, 86, 1–20.
5. Zhou, L.; Fang, D.; Mba, D.; Faris, E. A Comparative Study of Helicopter Planetary Bearing Diagnosis with Vibration and Acoustic Emission Data. In Proceedings of the 2017 IEEE International Conference on Prognostics and Health Management (ICPHM), Dallas, TX, USA, 19–21 June 2017.
6. Elasha, F.; Greaves, M.; Mba, D.; Fang, D. A Comparative Study of the Effectiveness of Vibration and Acoustic Emission in Diagnosing a Defective Bearing in a Planetry Gearbox. Appl. Acoust. 2017, 115, 181–195.
7. Dong, M.; He, D.; Prashant, B.; Jonathan, K. Equipment Health Diagnosis and Prognosis Using Hidden Semi-Markov Models. Int. J. Adv. Manuf. Technol. 2006, 30, 738–749.
8. Saravanan, N.; Siddabattuni, V.K.; Ramachandran, K. A Comparative Study on Classification of Features by Svm and Psvm Extracted Using Morlet Wavelet for Fault Diagnosis of Spur Bevel Gear Box. Expert Syst. Appl. 2008, 35, 1351–1366.
9. Aouabdi, S.; Taibi, M.; Bouras, S.; Boutasseta, N. Using Multi-Scale Entropy and Principal Component Analysis to Monitor Gears Degradation Via the Motor Current Signature Analysis. Mech. Syst. Signal Process. 2017, 90, 298–316.
10. Sanchez, R.V.; Lucero, P.; Macancela, J.C.; Cerrada, M.; Vasquez, R.E.; Pacheco, F. Multi-Fault Diagnosis of Rotating Machinery by Using Feature Ranking Methods and Svm-Based Classifiers. In Proceedings of the 2017 International Conference on Sensing, Diagnostics, Prognostics, and Control (SDPC), Shanghai, China, 16–18 August 2017; pp. 105–110.
11. Fan, Q.R.; Zhou, Q.; Wu, C.Q.; Guo, M. Gear Tooth Surface Damage Diagnosis Based on Analyzing the Vibration Signal of an Individual Gear Tooth. Adv. Mech. Eng. 2017.
12. Wang, G.B.; Deng, W.H.; Du, X.Y.; Li, X.J. The Absolute Deviation Rank Diagnostic Approach to Gear Tooth Composite Fault. Shock Vib. 2017.
13. Feng, Z.P.; Zhang, D.; Zuo, M.J. Planetary Gearbox Fault Diagnosis Via Joint Amplitude and Frequency Demodulation Analysis Based on Variational Mode Decomposition. Appl. Sci. 2017, 7, 775.
14. Qu, Y.Z.; Zhang, Y.; He, M.; He, D.; Jiao, C.; Zhou, Z.D. Gear Pitting Fault Diagnosis Using Disentangled Features from Unsupervised Deep Learning. In Proceedings of the 2018 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), Houston, TX, USA, 14–17 May 2018.
15. Sun, S.Y.; Wang, Y. Fault Diagnosis of Gear Box Based on Bp Neural Network. Adv. Comput. Electron. Mechatron. 2014, 667, 349–352.
16. You, Q.Z.; Bhatia, S.; Luo, J.B. A Picture Tells a Thousand Words—About You! User Interest Profiling from User Generated Visual Content. Signal Process. 2016, 124, 45–53.
17. Qu, Y.Z.; He, M.; Deutsch, J.; He, D. Detection of Pitting in Gears Using a Deep Sparse Autoencoder. Appl. Sci. 2017, 7, 515.
18. Jiang, G.Q.; He, H.B.; Yan, J.; Xie, P. Multiscale Convolutional Neural Networks for Fault Diagnosis of Wind Turbine Gearbox. IEEE Trans. Ind. Electron. 2018.
19. Jing, L.Y.; Zhao, M.; Li, P.; Xu, X.Q. A Convolutional Neural Network Based Feature Learning and Fault Diagnosis Method for the Condition Monitoring of Gearbox. Measurement 2017, 111, 1–10.
20. Zhao, R.; Wang, D.Z.; Yan, R.Q.; Mao, K.Z.; Shen, F.; Wang, J.J. Machine Health Monitoring Using Local Feature-Based Gated Recurrent Unit Networks. IEEE Trans. Ind. Electron. 2018, 65, 1539–1548.
21. Dong, H.; Li, S.; Xu, D.C.; Ren, Y.; Zhang, D. Gear Training: A New Way to Implement High-Performance Model-Parallel Training. arXiv 2018, arXiv:1806.03925.
22. Chen, Z.Q.; Chen, D.; Li, C.; Sanchez, R.V.; Qin, H.F. Vibration-Based Gearbox Fault Diagnosis Using Deep Neural Networks. J. Vibroeng. 2017, 19, 2475–2496.
23. Sun, W.F.; Yao, B.; Zeng, N.Y.; Chen, B.Q.; He, Y.C.; Cao, X.C.; He, W.P. An Intelligent Gear Fault Diagnosis Methodology Using a Complex Wavelet Enhanced Convolutional Neural Network. Materials 2017, 10, 790.
24. Ren, S.Q.; He, K.M.; Girshick, R.; Sun, J. Faster R-Cnn: Towards Real-Time Object Detection with Region Proposal Networks. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015.
25. He, K.M.; Zhang, X.Y.; Ren, S.Q.; Sun, J. Delving Deep into Rectifiers: Surpassing Human-Level Performance on Imagenet Classification. In Proceedings of the IEEE International Conference on Computer Vision, Chile, 7–13 December 2015.
26. Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv 2014, arXiv:1412.3555.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Journal

Applied Sciences – Multidisciplinary Digital Publishing Institute

Published: Feb 22, 2019
