Hindawi
International Transactions on Electrical Energy Systems
Volume 2022, Article ID 4139379, 13 pages
https://doi.org/10.1155/2022/4139379

Research Article

Joint Feature Selection of Power Load in Time Domain and Frequency Domain Based on Whale Optimization Algorithm

Feng Hu, Mengran Zhou, Mei Li, and Kai Bian

School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan 232001, China

Correspondence should be addressed to Feng Hu; hufeng0106@163.com

Received 2 February 2022; Accepted 5 April 2022; Published 22 April 2022

Academic Editor: Pawan Sharma

Copyright © 2022 Feng Hu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Accurate classification of power load is the premise of demand-side response capability evaluation, which is of great significance for demand-side management. Therefore, this study proposes a selection method of joint characteristics of power load in time domain and frequency domain based on whale optimization algorithm. The electrical measurement data of six typical electrical equipment are selected to form the original power load identification dataset. Firstly, the time domain features and frequency domain features of the power load data are extracted, and then the joint characteristics of the power load are screened by whale optimization algorithm (WOA). Finally, the selected feature information is used as the input to verify the performance of WOA on the joint characteristics of power load under back propagation neural network (BP), extreme learning machine (ELM), support vector machine (SVM), k-nearest neighbor (KNN), decision tree (DT), and naive Bayesian (NB) classifiers.
)e results show that WOA can eﬀectively screen out the 15 most helpful feature attributes for power load identiﬁcation, which can not only improve the accuracy of power load identiﬁcation but also eﬀectively reduce the time cost of the algorithm, which is of reference value for further demand-side response strategy. At the same time, it is of great signiﬁcance for intelligent dispatching of power system and improving the economic operation of the power system. the future development of the power industry. )e research 1. Introduction on the power load identiﬁcation is helpful to provide users With the vigorous development of energy Internet [1, 2], the with eﬃcient energy services and realize green and intelli- development of industrial Internet [3, 4] and energy revo- gent power consumption [13]. lution [5, 6] has made great changes in the power industry At present, many scholars have carried out in-depth and put forward more stringent requirements for the in- research on the problem of power load identiﬁcation, and a teraction and response of power supply and demand. To series of research results have emerged [14–16]. Aiming at accurately evaluate the response ability of the power demand the shortcomings of long training time and low recognition side [7], it is necessary to accurately identify the power load accuracy of existing algorithms, Huang et al. [17] proposed a load identiﬁcation algorithm based on long-short-term [8, 9], which is an important prerequisite for mastering the electricity consumption situation and electricity consump- memory-back propagation (LSTM-BP). On the basis of tion behavior of users. In September 2020, China put for- normalization and principal component analysis (PCA), the ward the Dual Carbon Targets at the United Nations General LSTM-BP power load identiﬁcation model was constructed, Assembly [10, 11], that is, to achieve “Carbon Peak” by 2030 and the model was veriﬁed by the REDD dataset. 
)e results and achieve “carbon neutrality” by 2060. In December 2020, show that the proposed method has higher stability and China further proposed measures to achieve the “Dual accuracy than the traditional load identiﬁcation algorithm. Carbon Targets” at the Climate Ambitious Summit [12]. It Considering that load identiﬁcation is a key link in the power can be seen that energy saving and emission reduction and structure analysis, Yin et al. [18]proposed a similarity cal- reﬁned electricity consumption are the inevitable trends of culation method for spatial convex hull overlap rate, 2 International Transactions on Electrical Energy Systems ﬁelds such as spectral analysis [26], satellite remote sensing introduced transfer learning to identify unknown loads, and veriﬁed the performance of this method in the PLAID [27], and medical auxiliary diagnosis [28]. By embedding the simulated annealing (SA) algorithm into the WOA, Mafarja dataset. Since power load modeling has attracted more at- tention, Arif et al. [19] systematically reviewed the load et al. [29] proposed a new method for feature selection. )e modeling and identiﬁcation technology and proposed the performance of the proposed method was evaluated on 18 future research direction, focusing on the problems and new standard benchmark datasets in the UCI database, which trends in load modeling and identiﬁcation, so as to meet the proved that the WOA had the ability to search the feature growing interest of industry and academia. By combining space and select the attributes with the largest amount of underdetermined decomposition and feature ﬁltering, Wu information for the classiﬁcation task. To overcome the problem of high dimensionality of hyperspectral data, et al. [20] proposed a noninvasive load identiﬁcation al- gorithm, which uses a two-step iterative shrinkage threshold Kumar et al. 
[30] proposed a band screening method for hyperspectral images based on WOA, and veriﬁed the ef- algorithm to obtain the optimal solution. )en, according to the unique harmonic component of each power load, a fectiveness of band screening on three benchmark datasets (Indian Pines, Pavia, and Salinas). To solve the problem of feature ﬁlter is established to ﬁlter the decomposed power ﬂow, so as to realize load identiﬁcation. Based on the analysis feature selection in high-dimensional data, Too et al. [31] of the actual measured current waveform of grid-connected proposed a new variant of WOA based on spatial boundary equipment by power monitor, Beck et al. [21] proposed a strategy to play the role of ﬁnding potential features from practical power load identiﬁcation method. A set of features high-dimensional feature space. )e eﬀectiveness of this was extracted by using the current physical component method is veriﬁed on 16 high-dimensional datasets collected based on power theory decomposition, and the high-pre- by Arizona State University. Considering these factors, this paper proposes a new cision identiﬁcation of power load was realized by using artiﬁcial neural network and the nearest neighbor search. feature selection method for the power load, that is, using WOA to realize the selection of the joint features in time )e plug load accounts for one-third of the total energy consumption of commercial buildings. )erefore, Tekler domain and frequency domain, to ensure the accuracy of power load identiﬁcation. Firstly, the authors brieﬂy de- et al. [22] proposed a near real-time identiﬁcation method for plug load in oﬃce space. 
)is method uses low-frequency scribe the ﬂow chart of the power load feature selection power data (1/60 Hz), extracts power characteristics such as method proposed in this paper, and then introduce the power, average power, and power delta, and ﬁnally realizes source of experimental data, the types of time domain, and the identiﬁcation accuracy of up to 93% with the help of the frequency domain features of power load, as well as the Bagging algorithm. In view of the limitations of the tradi- application of WOA in the selection of power load joint tional household load identiﬁcation method, Yu et al. [23] features. Next, the authors analyze the experimental results. Based on the comparison of six kinds of electrical equipment designed an event detector based on steady-state segmen- tation and a linear discriminant classiﬁer group based on power data, diﬀerent classiﬁers are used to classify the original power data. )en, the time domain features and multi-feature global similarity to realize the accurate iden- tiﬁcation of household power load, and veriﬁed the eﬀec- frequency domain features of the original power data are tiveness of the load identiﬁcation method under the REDD extracted and used for the identiﬁcation of power load. On dataset. In order to overcome the challenges in improving this basis, WOA is used to screen the joint features of power the accuracy of linear load and nonlinear load identiﬁcation, load, and the eﬀectiveness of feature selection is veriﬁed by Le et al. [24] proposed a new idea for power load identiﬁ- six classiﬁers. In addition, the authors also compare the cation, that is, generating a new transient feature based on proposed method with PCA for feature extraction to further Hilbert transform (HT), and then combining sequence to verify the reliability of the model. Finally, the authors sequence LSTM (Seq2Seq LSTM) to achieve load identiﬁ- summarize this study and prospect future research work. cation. 
Subsequently, the feasibility of the proposed method was veriﬁed on BLUED dataset and PLAID dataset. For the 2. Proposed Methodology for Power purpose of improving the real-time performance of power Load Identification load identiﬁcation, Hamdi et al. [25] ﬁltered the original power signal to reduce the data dimension, and then )e speciﬁc experimental process is shown in Figure 1. )e extracted the maximum value and its location, average value, authors select the line current, real power, reactive power, ﬁnal value, and area under the curve. Finally, the load and apparent power of six typical electrical equipment such identiﬁcation was realized by template matching. as clothes washer (CWE), kitchen dishwasher (DWE), In general, the traditional identiﬁcation methods of kitchen convection wall oven (WOE), clothes dryer (CDE), power load are mainly realized by the combination of time kitchen fridge (FGE), and instant hot water unit (HTE) from domain analysis, frequency domain analysis, feature ex- )e Almanac of Minutely Power dataset Version 2 traction and classiﬁer; therefore, how to select the appro- (AMPds2) [32]. Firstly, the original power load dataset is priate features is crucial. In feature selection, swarm obtained by dividing the original data into days, and then the intelligence optimization algorithm can often show excellent invalid data (samples with all data of 0 in a day) are performance, so it is widely used in feature selection. Whale eliminated to obtain an eﬀective and available power load optimization algorithm (WOA), as one of the swarm in- dataset. 
Subsequently, the time domain features and fre- telligence optimization algorithms, is widely used in many quency domain features of power load data are extracted, International Transactions on Electrical Energy Systems 3 Prepare data set AMPds2 Data type Electrical equipment CWE DWE WOE I P Select the power load data of some typical electrical equipment CDE FGE HTE Q S Data segmentation in days Eliminate invalid data Samples with all data of 0 in one day Obtain an effective and available power load data set Extract the time-domain and frequency- 16 time domain features and domain features of power load samples 13 frequency domain features WOA selection of time-frequency domain features Classifiers BP ELM SVM Construction of power load identification model using classifiers KNN DT NB Figure 1: Flow chart of the joint feature selection of power load in time domain and frequency domain based on whale optimization algorithm. Six kinds of typical electrical equipment are selected as and the joint features (time domain and frequency domain) of power load are formed. )en, WOA is used to screen the the research object, and the line current (I), active power (P), characteristics of the power load. Finally, the selected feature reactive power (Q), and apparent power (S) of electrical information is used as the input of diﬀerent classiﬁers to equipment are taken as the input information. For 730 days identify the type of power load, to verify the feasibility of of power data (sampled once per minute and the cumulative WOA for feature selection of the power load. length of data is 730 × 24 × 60 � 1051200), ﬁrstly, taking the time length of one day as the basic unit, divide the original power data (I, P, Q, and S) of each electrical equipment into 2.1. Data Acquisition. To verify the eﬀectiveness and feasi- 730 data, and combine the power data of each equipment bility of the method proposed in this paper, AMPds2 is used into one data sample. 
)en, the samples with all power data as the original data, which records the consumption of water, of 0 (that is, the electrical equipment is not used) are electricity, and natural gas in a residential building in screened out, and ﬁnally, the power datasets of six typical Burnaby, Canada, in two years (730 days in total). Among electrical equipment is formed. )e power dataset includes them, the power-related measurement parameters mainly 276 CWE samples, 250 DWE samples, 730 WOE samples, include line voltage, line current, line frequency, active 349 CDE samples, 730 FGE samples, and 730 HTE samples, power, reactive power, apparent power, etc. In particular, with a total of 3065 valid samples. In the process of power AMPds2 has been pre-cleaned at the time of release, which is data analysis, the authors select 70% of the samples as the very suitable for engineers and scientists in power, energy, training set and 30% of the remaining samples as the test set, construction, and other industries to test the performance of that is, the number of training set samples is 2142 and the the model in a real environment. number of test set samples is 923. In addition, when 4 International Transactions on Electrical Energy Systems constructing the load identiﬁcation model, the authors in- Step 2: Parameters initialization of WOA. A feature troduce 5-fold cross-validation to improve the reliability and combination is randomly selected from the joint fea- eﬀectiveness of the model. ture of time domain and frequency domain of power load as the initial whale position, and the parameters of WOA are set, including the number of groups N, the 2.2. Joint Features of Time Domain and Frequency Domain. 
maximum number of iterations T, the selection of )e analysis of power signals can be mainly divided into time contraction bounding mechanism, and the probability domain analysis [33], frequency domain analysis [34], and p of spiral position update (p is the random number on time-frequency domain analysis [35]. Considering the [0, 1], and the initial value is set by random function). complexity of power load, this paper extracts the time do- Step 3: Calculate the ﬁtness value of each whale indi- main features and frequency domain features of power data vidual according to formula (1), ﬁnd the best whale to form the joint features of power load, and on this basis, ∗ individual X in the current group, and save the results. WOA is introduced for feature screening. In this paper, 16 Step 4: When p< 0.5, if A< 1, update the spatial po- time domain features and 13 frequency domain features are sition of the individual of the current whale group mainly used, of which the time domain features include 10 according to formula. dimensional statistical indexes and 6 dimensionless statis- X � X − A × C × X − X , (2) tical indexes. )e calculation formulas of time domain t+1 t t t features and frequency domain features are shown in Where A and C are coeﬃcient vectors, X is the best Table 1. spatial position of the current whale group, X is the In particular, the authors extract the time domain fea- individual spatial position of the current whale group, tures and frequency domain features of power data (I, P, Q, and t is the number of current iterations. and S) of electrical equipment, respectively, so that 29 × 4 � 116 power load features can be obtained for each sample. In particular, the coeﬃcient vectors A and C are cal- culated as follows: A � 2a × r − a, 2.3. Description of WOA for Power Load Feature Screening. (3) Inspired by the predatory behavior of humpback whales, C � 2r, Mirjalili et al. 
[36] proposed a whale optimization algorithm based on the simulation of the process of humpback whale Where a � 2 − (2t/T) is a constant, and its value de- swarm enclosure, hunting and attacking prey. )e optimi- creases linearly from 2 to 0; r is a random vector, and its zation algorithm has the advantages of less parameters, fast value range is [0, 1]. convergence speed, and strong global optimization ability. If A≥ 1, the individual position X of the whale rand )e basic idea of using WOA for the screening of the joint group is randomly selected from the current group, and features of the power load time domain and the frequency the spatial position of the individual of the current domain is to determine the parameters to be optimized whale group is updated according to the following according to the power load time domain and frequency formula: domain joint features screening problem, that is, the joint characteristics of the time domain and frequency domain of X � X − A × C × X − X , (4) t+1 rand rand t the power load, and the spatial position of each individual in the whale group contains a set of screening features. )e Where X is the position randomly selected from the rand ﬁtness function is used to measure the spatial position of the current whale population. individual, and the whale foraging strategy is used to con- Step 5: when p≥ 0.5, update the spatial position of the tinuously update the whale individual position until the best current whale population according to formula. whale spatial position is obtained, which is the best set of ∗ bl ∗ (5) screening features for the optimization problem. X � X − X e cos(2πl) + X , t+1 t t t )e joint feature screening process of power load in the time domain and frequency domain is as follows: where b is the deﬁned logarithmic spiral shape constant and l is a random number in the range [− 1, 1]. Step 1: Deﬁne the ﬁtness function. 
Since WOA is a Step 6: Calculate the classiﬁcation error of the test set process to solve the minimum value, the classiﬁcation corresponding to each whale group individual as the error of the test set of the power load identiﬁcation ﬁtness value, ﬁnd the best whale group individual X in model is taken as the ﬁtness function of this paper, that the current group, and save the results. is, the objective function is: N At the same time, judge whether the termination correct E � 1 − , (1) conditions are met. If so, go to step 7 and output the test optimization result; otherwise, make t � 1, update A, B and C at the same time, and repeat steps 4 to 6 above. Where N is the number of correctly predicted correct samples in the test set, and N is the total number of At the same time, determine whether to meet the test samples in the test set. termination conditions, if satisﬁed, then go to Step 7 International Transactions on Electrical Energy Systems 5 Table 1: Description of time domain features and frequency domain features. Features Computing formula Mean value Mean � (1/N) x i�1 ����������� � Root mean square RMS � (1/N) x i�1 i ��� N 2 Square root amplitude SRA � ((1/N) |x | ) i�1 Absolute mean AM � (1/N) |x | i�1 i N 3 Skewness Skewness � (1/N) (x − Mean) i�1 i N 4 Kurtosis Kurtosis � (1/N) (x − Mean) i�1 N 2 Variance DX � (1/N) (x − Mean) i�1 Maximum value Max � maxx , x , . . . , x , i � 1, 2, . . . , N 1 2 i Time domain features Minimum value Min � minx , x , . . . , x , i � 1, 2, . . . 
, N 1 2 i Peak-to-peak value PPV � Max − Min1 Waveform index WI � (RMS/AM) Peak index Peak Index � (Max/RMS) Pulse index Pulse Index � (Max/AM) Margin index MI � (Max/SRA) N 3 Skewness index SI � (1/N) ((x − Mean)/DX) i�1 i N 4 Kurtosis index KI � ((1/N) ((x − Mean)/DX) ) − 3 i�1 Frequency mean FM � (1/N) f i�1 i ������������������ � N 2 Frequency standard deviation FSD � (1/N) (f − FM) i�1 i √���� � 3 N 3 DCS � (1/N × FSD ) (f − FM) 1 i�1 i 2 N 4 DCS � (1/N × FSD ) (f − FM) 2 i�1 i ������������������������������ � N 2 DCS � (1/N) [Ts × (i − 1) − FC] × f 3 i i�1 Dispersion or concentration of DCS � (DCS /FC) 4 3 spectrum Frequency domain N 3 DCS � ( ⟨{[Ts × (i − 1)] − FC} × f ⟩/(DCS × N)) 5 i i�1 3 features (Ts � 60) N 4 DCS � ( ⟨ [Ts × (i − 1)] − FC × f ⟩/(DCS × N)) { } 6 i�1 i 3 ���������������� � ����� DCS � ( |[Ts × (i − 1)] − FC| × f /( DCS × N)) 7 i�1 i 3 N N Frequency center FC � ( Ts × (i − 1) × f / f ) i�1 i i�1 i ����������������������������� N 2 N Root mean square frequency RMSF � ( [Ts × (i − 1)] × f / f ) i�1 i i�1 i �������������������������������������������� � N N 4 2 PCMFB � ( [Ts × (i − 1)] × f / [Ts × (i − 1)] × f ) 1 i i Index of position change of i�1 i�1 main frequency band N 2 N N 4 PCMFB � ( [Ts × (i − 1)] × f /( f × [Ts × (i − 1)] × f )) 2 i�1 i i�1 i i�1 i Where x represents the sampling value of power parameters of electrical equipment in one day, N represents the length of power data (N � 24 × 60 � 1440 in this paper), and f represents the frequency calculated by discrete Fourier transform (DFT). and output the optimization results; otherwise, let identiﬁcation model, so as to realize the identiﬁcation of t � t + 1, update a, A, C, l, and p simultaneously, and power load types. In addition, in the process of building the repeat Step 4 to Step 6 above. 
power load identiﬁcation model, the authors introduce 10- fold cross validation to increase the reliability and eﬀec- Step 7: Output the individual ﬁtness value of the op- ∗ tiveness of diﬀerent classiﬁcation models. And in the process timal whale population and its spatial location X , of ﬁve independent experiments, we randomly divide the which is the best set of screening features. samples, so that the composition of the test set samples is diﬀerent in each experiment. 2.4. Load Identiﬁcation Model. For the purpose of classiﬁ- cation of power load data, the basic classiﬁcation algorithms, 3. Results and Discussion such as back propagation (BP) neural network, extreme learning machine (ELM), support vector machine (SVM), 3.1.PresentationofRawPowerData. A total of 3065 eﬀective k-nearest neighbor (KNN), decision tree (DT), and naive samples of six types of typical electrical equipment are se- Bayesian (NB), were used to construct the power load lected from the AMPds2. )e authors select a day as an 6 International Transactions on Electrical Energy Systems frequency domain of the power load sample. Since each load example to display the power data of six kinds of electrical equipment more intuitively. In the process of displaying the sample contains four electrical parameters, I, P, Q, and S, the dimension of the time domain feature of a sample is power load data, considering the diﬀerence of current and power data, the authors display these four power parameters, 16 × 4 � 64, the dimension of the frequency domain feature respectively, as shown in Figure 2. of a sample is 13 × 4 � 52, and the dimension of the joint It can be seen that the load curves of diﬀerent electrical feature of the time domain and frequency domain of a equipment have certain diﬀerences. For example, the current sample is (16 + 13) × 4 � 116. 
For the purpose of comparing and functional parameters of WOE and CDE are relatively the performance of diﬀerent features in power load iden- large, and their power curves are mainly concentrated in the tiﬁcation, the authors take the time domain features, fre- quency domain features, and joint features as the input use period. )e power data of the refrigerator are relatively small, but their duration is very long. At the same time, the vectors, feed them into diﬀerent classiﬁers to construct the load identiﬁcation models, and take the accuracy of the test authors also note that the electrical parameters of some equipment also have a certain overlap, and the discrimi- set as the evaluation index of the model. Five independent experiments were carried out, and the mean and standard nation is not large. )erefore, it is necessary to use the signal analysis method to achieve more accurate identiﬁcation. deviation of the ﬁve experiments were calculated. )e classiﬁcation performance of diﬀerent classiﬁers was drawn, as shown in Figure 3. 3.2.ClassiﬁcationofOriginalPowerSignals. In order to more In Figure 3, the authors show the mean value of classi- intuitively understand the diﬀerence of power data of dif- ﬁcation accuracy in the form of a broken line graph, and the ferent electrical equipment, BP, ELM, SVM, KNN, DT, and standard deviation in the form of an error band, so that the NB are used as classiﬁers, and the original power data are authors can clearly see the classiﬁcation accuracy and stability used as input, that is, all power data (I, P, Q, and S) of a of diﬀerent classiﬁers. When observing the performance of sample within a day are used as input. )e classiﬁcation diﬀerent features, the authors ﬁnd that the joint features of accuracy and analysis time under diﬀerent classiﬁers are time domain and frequency domain have the best eﬀect on observed, and ﬁve independent experiments are repeated. 
power load identiﬁcation, followed by the performance of )e relevant results are shown in Table 2. At the same time, power load identiﬁcation using frequency domain features the core parameters of the classiﬁers are described in the alone, and the performance of power load identiﬁcation table. using time domain features alone is the worst. )is shows First of all, the authors observe the accuracy. It can be that the frequency domain features of power data of diﬀerent found that the eﬀect of the six classiﬁers is not particularly electrical equipment are more discriminating than the time good. )e average recognition rate of the ﬁve independent domain features. At the same time, the joint features of time experiments using NB as the classiﬁer is 73.44, while the domain and frequency domain can eﬀectively integrate their average recognition rate of the other classiﬁer is less than advantages and achieve the best identiﬁcation eﬀect. At the 50.00%, which shows that the accuracy of electrical equip- same time, the authors also ﬁnd an interesting phenomenon. ment identiﬁcation using original power data is low. In other Although the classiﬁcation eﬀects of diﬀerent features are words, the direct use of original power data cannot meet the quite diﬀerent, the average classiﬁcation accuracy of any requirements of accurate identiﬁcation of electrical equip- feature exceeds 90.00%, which is much higher than the ment. Subsequently, the authors focus on the analysis time. classiﬁcation accuracy of the original power data. )is shows It can be seen that the analysis time of diﬀerent classiﬁers is that the feature extraction method described in Table 1 is quite diﬀerent. )e analysis time of ELM is the shortest, with eﬀective and feasible for the identiﬁcation of power load. an average of only 0.4310 s, while the analysis time of KNN Taking the joint feature as an example, when looking at the and DT is about 2 s. 
However, the analysis time of BP and performance of diﬀerent classiﬁers, the authors ﬁnd that the SVM is much larger, which takes about 200 s. )e most classiﬁcation accuracy of NB and SVM is high (the average surprising thing is that the analysis time of NB reaches accuracy is 99.78% and 99.72%, respectively), while the 4963.2828 s. )e long analysis time of BP, SVM, and NB is performance of ELM is poor (the average accuracy is mainly due to the fact that the dimension of original power 95.84%). In addition, the authors also noticed that the data reaches 1140 × 4 � 5760. Considering the actual re- standard deviation of the classiﬁcation accuracy of NB in the quirements of accuracy and speed for power load identiﬁ- ﬁve independent experiments was 0, which maintained ex- cation, the authors must process the original power data to cellent classiﬁcation stability. )e standard deviation of the achieve better identiﬁcation performance. classiﬁcation accuracy of ﬁve ELM experiments was 0.68%, and the stability was not as good as NB. 3.3. Classiﬁcation of Time Domain Features and Frequency While recording the classiﬁcation accuracy, the authors DomainFeatures. According to the lack of low accuracy and also analyze the classiﬁcation time of diﬀerent features under diﬀerent classiﬁers, calculate the mean and standard devi- slow analysis speed of the original power signal, the time domain features and frequency domain features of the ation of the analysis time, and display it in a bar chart with error bars, as shown in Figure 4. 
original power signal are extracted according to the feature extraction method described in Table 1, so that the authors can obtain the time domain features, frequency domain features, and the joint features of the time domain and

Figure 2: Power data curve of 6 kinds of electrical equipment. [Four panels: P (W), Q (var), I (A), and S (V·A) versus time (min) for CDE, CWE, FGE, DWE, HTE, and WOE.]

Because of the large difference in the analysis time of different classifiers, the authors enlarge the analysis time of BP, ELM, SVM, KNN, and DT, so that the time cost of different classifiers can be seen more intuitively. First, the analysis time of frequency domain features is the shortest, followed by that of time domain features, and the analysis time of joint features is the longest. This is because the dimension of frequency domain features is the lowest (13 × 4 = 52 dimensions), and the dimension of joint features is the highest (29 × 4 = 116 dimensions). The higher the feature dimension, the longer the analysis time. Second, the authors focus on the time cost of different classifiers. The analysis time of NB is clearly much longer than that of the other algorithms. When the NB algorithm is used to analyze the joint features of power load, the average analysis time of five independent experiments is 54.7381 s, which is only 1.10% of the analysis time of the original power data, greatly reducing the time cost of the algorithm. The analysis time of the other five algorithms is lower: the analysis time of SVM is 2.5546 s, and that of KNN is the shortest, only 0.0334 s. Thus, feature extraction of the original power data can not only improve the accuracy of power load identification but also reduce the time cost of the algorithm.

Table 2: Classification results of original power signals.

Accuracy (%)
Classifier   Test 1      Test 2      Test 3      Test 4      Test 5      Mean value   Standard deviation
BP           39.44       42.36       49.51       44.75       43.01       43.81        3.72
ELM          38.14       40.85       45.39       41.82       41.28       41.50        2.60
SVM          46.56       46.72       46.30       45.91       46.20       46.34        0.32
KNN          25.24       25.79       25.46       27.95       23.84       25.66        1.48
DT           49.14       49.14       49.14       49.14       49.14       49.14        0.00
NB           73.44       73.44       73.44       73.44       73.44       73.44        0.00

Time (s)
Classifier   Test 1      Test 2      Test 3      Test 4      Test 5      Mean value   Standard deviation
BP           227.1572    217.4501    222.0936    222.8179    218.7004    221.6438     3.8154
ELM          0.4152      0.4418      0.4476      0.4367      0.4139      0.4310       0.0155
SVM          211.3685    222.0861    223.4372    214.3196    216.0319    217.4487     5.1511
KNN          1.8922      1.9682      1.7896      1.9046      1.8769      1.8863       0.0643
DT           2.1488      2.1372      2.0935      2.1837      2.0789      2.1284       0.0425
NB           4970.3256   4975.1241   4961.7400   4950.3201   4958.9041   4963.2828    9.7391

Description of core parameters of the classifier:
(1) N_BP = 28: N_BP represents the number of hidden layer neurons in BP.
(2) N_ELM = 28: N_ELM represents the number of hidden layer neurons in ELM.
(3) C = 2, g = 1: C represents the penalty parameter, and g represents the parameter of the kernel function.
(4) K = 5: K represents the number of nearest neighbors in the input for classifying each point when predicting.
(5) DT_MAX = 10: DT_MAX represents the maximal number of decision splits (or branch nodes) per tree.
(6) DN = kernel: DN represents the distribution used to model the data, and "kernel" refers to kernel smoothing density estimate.

Figure 3: Classification accuracy of different power load features under different classifiers. [Bar chart by classifier (BP, ELM, SVM, KNN, DT, NB) for time domain features, frequency domain features, and joint features.]

3.4. Selection of Joint Features by Using WOA.
In order to extract the most effective features from the joint features, WOA is used to screen the features to further improve the analysis accuracy and reduce the analysis time. In the process of screening the joint features of power load using WOA, a KNN classification model is established by using the selected feature data (KNN is selected as the classifier to reduce the time of WOA iterative feature screening). The set of feature variables corresponding to the minimum classification error value is the final screening result. In particular, during feature selection using WOA, the number of groups N (the population size) is set to 5, and the maximum number of iterations T is set to 100.

Because WOA is an algorithm that searches for a minimum value, the error rate is taken as its objective function in the process of load feature screening, that is, the case with the minimum error rate is selected. Figure 5 shows the variation trend of the classification error of WOA in the process of screening load features. It can be seen from the figure that the classification error falls from the initial 1.8498% to the minimum value of 0.6529% after 29 iterations. At this point, the number of features screened is 15, which are I_DCS1, P_RMS, P_MI, P_DCS3, Q_Min, Q_PPV, Q_KI, Q_FM, Q_DCS2, Q_DCS3, Q_DCS5, S_SI, S_DCS3, S_DCS5, and S_DCS (index missing in the source). In particular, the letter before the underscore denotes an electrical parameter of the electrical equipment, and the part after the underscore denotes the specific feature of that parameter. By observing these 15 features, the authors find that they are mainly concentrated in the quantities indicating the dispersion or concentration of the spectrum.
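The screening loop described in Section 3.4 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The population size (5) and the iteration limit (100) come from the text, and K = 5 is borrowed from the KNN setting in Table 2. Everything else is assumed for illustration: the paper does not say how whale positions are turned into feature subsets, so a 0.5 threshold is used here; the coefficient A is drawn once per whale rather than per dimension; the spiral constant is fixed at 1; and the data are a synthetic two-class toy set, with the error rate measured on a held-out toy split. The names `knn_error`, `woa_select`, and `make_data` are hypothetical.

```python
import math
import random


def knn_error(mask, train, test, k=5):
    """Error rate of a k-NN classifier restricted to the features where mask is True."""
    idx = [j for j, keep in enumerate(mask) if keep]
    if not idx:  # an empty feature subset cannot classify anything
        return 1.0
    wrong = 0
    for x, y in test:
        # squared Euclidean distance to every training sample on the kept features
        dists = sorted((sum((x[j] - t[j]) ** 2 for j in idx), label) for t, label in train)
        votes = [label for _, label in dists[:k]]
        pred = max(set(votes), key=votes.count)  # majority vote among the k neighbours
        wrong += pred != y
    return wrong / len(test)


def woa_select(n_features, fitness, n_whales=5, max_iter=100, seed=0):
    """Minimise fitness(mask) with a simplified whale optimization loop."""
    rng = random.Random(seed)

    def to_mask(pos):
        return [p > 0.5 for p in pos]  # assumed binarisation rule (not from the paper)

    whales = [[rng.random() for _ in range(n_features)] for _ in range(n_whales)]
    scores = [fitness(to_mask(w)) for w in whales]
    b = min(range(n_whales), key=scores.__getitem__)
    best_pos, best_err = whales[b][:], scores[b]
    history = []
    for t in range(max_iter):
        a = 2.0 - 2.0 * t / max_iter  # decreases linearly from 2 towards 0
        for i, w in enumerate(whales):
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # encircle the best whale when |A| < 1, otherwise explore around a random whale
                ref = best_pos if abs(A) < 1 else whales[rng.randrange(n_whales)]
                new = [ref[j] - A * abs(C * ref[j] - w[j]) for j in range(n_features)]
            else:
                ell = rng.uniform(-1.0, 1.0)  # spiral update around the best whale
                new = [abs(best_pos[j] - w[j]) * math.exp(ell) * math.cos(2 * math.pi * ell)
                       + best_pos[j] for j in range(n_features)]
            whales[i] = [min(1.0, max(0.0, v)) for v in new]  # keep positions in [0, 1]
            err = fitness(to_mask(whales[i]))
            if err < best_err:  # keep the best subset seen so far
                best_pos, best_err = whales[i][:], err
        history.append(best_err)
    return to_mask(best_pos), best_err, history


def make_data(n, rng):
    """Toy two-class data: features 0 and 1 carry the class, features 2 to 5 are noise."""
    data = []
    for i in range(n):
        label = i % 2
        x = [rng.gauss(4.0 * label, 1.0), rng.gauss(-4.0 * label, 1.0)]
        x += [rng.gauss(0.0, 3.0) for _ in range(4)]
        data.append((x, label))
    return data


data_rng = random.Random(1)
train_set, test_set = make_data(60, data_rng), make_data(40, data_rng)
mask, err, history = woa_select(6, lambda m: knn_error(m, train_set, test_set))
print("selected features:", [j for j, keep in enumerate(mask) if keep])
print("error rate: %.4f" % err)
```

The `history` list plays the role of the curve in Figure 5: it records the best error rate found up to each iteration, so it can only stay flat or decrease. With the real 116-dimensional joint features, `fitness` would wrap whatever KNN model and data split the study actually used.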
Figure 4: Analysis time of different power load features under different classifiers. [Bar chart: time (s) by classifier (BP, ELM, SVM, KNN, DT, NB) for time domain features, frequency domain features, and joint features, with an enlarged view for BP, ELM, SVM, KNN, and DT.]

Figure 5: Error rate under different iterations by WOA. [Line plot: error rate (%) versus number of iterations (0 to 100), with the minimum marked at (29, 0.6529).]

3.5. Classification of WOA Screening Features.
The effectiveness and reliability of the features screened by WOA are further verified. The selected feature data are used as the input information of power load identification. The identification models of power load types are constructed by using BP, ELM, SVM, KNN, DT, and NB, respectively. Five independent experiments are carried out, and the mean and standard deviation of the six classification models are calculated. The classification accuracy and analysis time are shown in Figure 6.

Figure 6: Results of WOA screening features under different classifiers. [Chart: accuracy (%) and time (s) by classifier (BP, ELM, SVM, KNN, DT, NB).]

The authors first pay attention to the accuracy of the classification model. When the features selected by WOA are used for power load identification, every classifier performs very well. Even ELM, which has the worst classification effect, achieves an average recognition rate of 97.94%, and the average recognition accuracy of the other five classification algorithms is about 99.00% (NB achieves 100.00% recognition accuracy of power load when it is used as the classifier). Therefore, WOA can effectively screen out the most effective features for power load identification from the joint features of the time domain and frequency domain. In addition, the authors observe the analysis time. The analysis time of the six classifiers has decreased to varying degrees, mainly because WOA carries out load feature selection and reduces the dimension of the data. Among them, the analysis time of KNN is the shortest, only 0.0107 s, while the analysis time of NB is still the longest, 6.3631 s. The analysis time of the NB classifier is now 11.62% of the analysis time of the joint feature data, and 0.13% of the analysis time of the original power data. In summary, using WOA to filter the features of power load can not only effectively improve the accuracy of power load identification but also greatly reduce the time cost of the algorithm model.

To show the effectiveness of power load identification more intuitively, the authors take the DT classifier as an example and use a confusion matrix to show the identification results of power load, as shown in Table 3. The table shows the prediction for each kind of electrical equipment. Among the 83 CWE samples in the test set, 80 are correctly classified, 1 is wrongly predicted as DWE, and 2 are wrongly predicted as FGE. Among the 75 DWE samples, 73 are correctly classified, and 2 are wrongly predicted as CWE. Among the 220 WOE samples, 218 are correctly classified, 1 is wrongly predicted as DWE, and 1 is wrongly predicted as CDE. Among the 105 CDE samples, 102 are correctly classified, and 3 are wrongly predicted as WOE. All 220 FGE samples are correctly classified. Among the 220 HTE samples, 218 are correctly classified, and 2 are wrongly predicted as WOE.

Table 3: Confusion matrix of recognition results under the DT classifier.

                 Predicted class
Actual class     CWE    DWE    WOE    CDE    FGE    HTE
CWE              80     1      0      0      2      0
DWE              2      73     0      0      0      0
WOE              0      1      218    1      0      0
CDE              0      0      3      102    0      0
FGE              0      0      0      0      220    0
HTE              0      0      2      0      0      218

3.6. Comparison with Traditional Strategies.
PCA is a common and effective feature extraction and dimension reduction method, which is widely used in pattern recognition. Therefore, it is necessary to compare the feature selection idea proposed in this paper with the feature extraction effect of PCA. Before PCA, the authors normalize the feature data, namely, the original features are normalized to the interval [0,1]. In the process of PCA, the authors set the cumulative contribution threshold to 95%. Subsequently, the obtained principal component information is fed into the six classifiers to construct the power load identification model, and five independent experiments are carried out. The identification performance of PCA combined with different classifiers for power load is statistically analyzed, as shown in Figure 7. To show the results of the five experiments more clearly, the authors enlarge some details.

Figure 7: Load identification accuracy of feature extraction using PCA. [Chart: accuracy (%) by classifier (BP, ELM, SVM, KNN, DT, NB) for Tests 1 to 5, with enlarged details.]

It can be seen from the figure that the performance of PCA combined with different classifiers for power load identification differs considerably. For example, the average recognition accuracy of PCA combined with BP is only 36.62%, and the average classification accuracy of PCA combined with ELM or SVM is only about 68%. PCA combined with the NB classifier performs best, reaching an average recognition accuracy of 94.03%. However, compared with the results of feature selection with WOA, even the best PCA-based strategy (PCA with NB, 94.03%) is 3.91 percentage points lower than the ELM classifier (97.94%), which has the worst classification effect on the features selected by WOA. Therefore, compared with the traditional dimension reduction strategy of PCA, WOA performs better in feature selection of power load.

4. Conclusions
Accurate classification of power load is the premise of demand-side response capability evaluation, which is of great significance for demand-side management. Power load identification is essentially a supervised learning and forecasting process. How to obtain the characteristics used for load identification is very important to ensure the accuracy of power load identification. Therefore, a feature selection method based on WOA is proposed in this paper to select the joint features of power load in the time domain and frequency domain. The electrical measurement data of six typical electrical equipment are selected to form the original power load identification dataset. Firstly, the time domain features and frequency domain features of the power load data are extracted. Then, on the basis of these features, WOA is used to further screen the most useful feature combination for power load identification. Finally, the selected feature information is used as input to verify the effect of WOA on the joint features of power load under the BP, ELM, SVM, KNN, DT, and NB classifiers. The research results show that, compared with the original power data, extracting the time domain and frequency domain features of power data can improve the accuracy of power load identification. Compared with the time domain features of power load, the frequency domain features are more discriminating, and the effect of joint features (the combination of time domain and frequency domain) is better. Using WOA to screen the joint features, the 15 feature attributes that are most helpful for power load identification can be selected from the original 116 power load features. The average classification accuracy of the 15 feature attributes under the six classifiers is 99.16%, and the average analysis time is 1.4531 s. In addition, compared with the traditional PCA feature extraction strategy, WOA performs better in feature selection of power load. In summary, WOA is used to screen the joint features of power load, and the accurate identification of power load is realized, which provides a basis for further developing differentiated demand-side response strategies and for fine evaluation of demand-side response potential. At the same time, it is of great significance for intelligent dispatching of the power system and improving the economic operation of the power system.

At present, the research work in this paper still has certain limitations, mainly reflected in the following two aspects. On the one hand, the current load data source is residential load, and the number of load types is relatively small. In the future, the authors will introduce industrial load and commercial load to further enrich the load types. On the other hand, the power load feature selection method proposed in this paper is based on the extraction of time domain features and frequency domain features of the load, and the feature extraction is a cumbersome process. Therefore, in the future, the authors will explore the feasibility of power load identification from the perspective of power signal decomposition.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request. In addition, data can be obtained by visiting http://ampds.org/.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Acknowledgments
This work was supported by the Key Projects of Natural Science Research in Anhui Universities (grant numbers KJ2021A0470, KJ2021A0471); the University-level Key Projects of Anhui University of Science and Technology (grant number xjzd2020-06); the Talent Introduction Fund of Anhui University of Science and Technology (grant number 13200404); the Young Talent Project of Anhui University of Science and Technology (grant number 2020023); the National Key R&D Program of China (grant number 2020YFB1314100); the Energy Internet Joint Fund of Anhui Province (grant number 2008085UD06); and the Anhui Science and Technology Major Project (grant number 201903a07020013).

References
[1] Y. Liu, X. Zhang, H. Sun, and H. Feng, “Discussion on application of big data in electricity market in background of energy Internet,” Automation of Electric Power Systems, vol. 45, pp. 1–10, 2021.
[2] Q. Sun, J. Hu, J. Hu, and H. Zhang, “Triple play of energy Internet with Chinese characteristics and its self-mutual-group collaboration control technology framework,” Proceedings of the Chinese Society of Electrical Engineering, vol. 41, pp. 40–51, 2021.
[3] M. Tabaa, F. Monteiro, H. Bensag, and A. Dandache, “Green industrial Internet of things from a smart industry perspectives,” Energy Reports, vol. 6, pp. 430–446, 2020.
[4] J.-Q. Li, F. R. Yu, G. Deng, C. Luo, Z. Ming, and Q. Yan, “Industrial Internet: a survey on the enabling technologies, applications, and challenges,” IEEE Communications Surveys & Tutorials, vol. 19, no. 3, pp. 1504–1526, 2017.
[5] D. Gu, “Dedication to clean power and promotion of the energy revolution,” Engineering, vol. 6, no. 12, pp. 1331–1332, 2020.
[6] K. Zhou, S. Yang, and Z. Shao, “Energy Internet: the business perspective,” Applied Energy, vol. 178, pp. 212–222, 2016.
[7] S. Davarzani, I. Pisica, G. A. Taylor, and K. J. Munisami, “Residential demand response strategies and applications in active distribution network management,” Renewable and Sustainable Energy Reviews, vol. 138, Article ID 110567, 2021.
[8] D. Ma, B. Sun, and C. Liu, “Short-term cooling and heating power load prediction method based on multi-weather information,” Dianwang Jishu/Power System Technology, vol. 45, pp. 1015–1022, 2021.
[9] Y. Li, X. Liu, F. Xing et al., “Daily peak load prediction based on correlation analysis and bi-directional long short-term memory network,” Dianwang Jishu/Power System Technology, vol. 45, pp. 2719–2730, 2021.
[10] X. Tan, J. Liu, Z. Xu, L. Yao, G. Ji, and B. Shan, “Power supply and demand balance during the 14th five-year plan period under the goal of carbon emission peak and carbon neutrality,” Electric Power, vol. 54, pp. 1–6, 2021.
[11] Y. Zhang, H. Dai, X. Wu, R. Chen, and N. Zhang, “Development trends and key issues of China’s integrated energy services,” Electric Power, vol. 54, pp. 1–10, 2021.
[12] B. Yu, G. Zhao, R. An, J. Chen, J. Tan, and X. Li, “Research on China’s CO2 emission pathway under carbon neutral target,” J. Beijing Inst. Technol. Sci. Ed., vol. 23, pp. 14–24, 2021.
[13] K. P. Marimuthu, D. Durairaj, and S. Karthik Srinivasan, “Development and implementation of advanced metering infrastructure for efficient energy utilization in smart grid environment,” International Transactions on Electrical Energy Systems, vol. 28, no. 3, Article ID e2504, 2018.
[14] K. P. Swain and M. De, “Identification and mitigation of congestion in distribution system utilizing proximity index-based demand response,” International Transactions on Electrical Energy Systems, vol. 30, 2020.
[15] G. De Carne, S. Bruno, M. Liserre, and M. La Scala, “Distributed online load sensitivity identification by smart transformer and industrial metering,” IEEE Transactions on Industry Applications, vol. 55, no. 6, pp. 7328–7337, 2019.
[16] S. Ghosh, A. Chatterjee, and D. Chatterjee, “Improved non-intrusive identification technique of electrical appliances for a smart residential system,” IET Generation, Transmission & Distribution, vol. 13, no. 5, pp. 695–702, 2019.
[17] L. Huang, S. Chen, Z. Ling, Y. Cui, and Q. Wang, “Non-invasive load identification based on LSTM-BP neural network,” Energy Reports, vol. 7, pp. 485–492, 2021.
[18] B. Yin, L. Zhao, X. Huang, Y. Zhang, and Z. Du, “Research on non-intrusive unknown load identification technology based on deep learning,” International Journal of Electrical Power & Energy Systems, vol. 131, Article ID 107016, 2021.
[19] A. Arif, Z. Wang, J. Wang, B. Mather, H. Bashualdo, and D. Zhao, “Load modeling-A review,” IEEE Transactions on Smart Grid, vol. 9, no. 6, pp. 5986–5999, 2018.
[20] X. Wu, X. Han, and K. X. Liang, “Event-based non-intrusive load identification algorithm for residential loads combined with underdetermined decomposition and characteristic filtering,” IET Generation, Transmission & Distribution, vol. 13, no. 1, pp. 99–107, 2019.
[21] Y. Beck and R. Machlev, “Harmonic loads classification by means of currents’ physical components,” Energies, vol. 12, no. 21, p. 4137, 2019.
[22] Z. D. Tekler, R. Low, Y. Zhou, C. Yuen, L. Blessing, and C. Spanos, “Near-real-time plug load identification using low-frequency power data in office spaces: experiments and applications,” Applied Energy, vol. 275, Article ID 115391, 2020.
[23] J. Yu, Y. Gao, Y. Wu, D. Jiao, C. Su, and X. Wu, “Non-intrusive load disaggregation by linear classifier group considering multi-feature integration,” Applied Sciences, vol. 9, no. 17, p. 3558, 2019.
[24] T.-T.-H. Le, S. Heo, and H. Kim, “Toward load identification based on the Hilbert transform and sequence to sequence long short-term memory,” IEEE Transactions on Smart Grid, vol. 12, no. 4, pp. 3252–3264, 2021.
[25] M. Hamdi, H. Messaoud, and N. Bouguila, “A new approach of electrical appliance identification in residential buildings,” Electric Power Systems Research, vol. 178, Article ID 106037, 2020.
[26] P. Manoharan and P. K. L. N. Boggavarapu, “Improved whale optimization based band selection for hyperspectral remote sensing image classification,” Infrared Physics & Technology, vol. 119, 2021.
[27] Q.-T. Bui, M. V. Pham, Q.-H. Nguyen, L. X. Nguyen, and H. M. Pham, “Whale Optimization Algorithm and Adaptive Neuro-Fuzzy Inference System: a hybrid method for feature selection and land pattern classification,” International Journal of Remote Sensing, vol. 40, no. 13, pp. 5078–5093, 2019.
[28] G. I. Sayed, A. Darwish, and A. E. Hassanien, “Binary whale optimization algorithm and binary moth flame optimization with clustering algorithms for clinical breast cancer diagnoses,” Journal of Classification, vol. 37, no. 1, pp. 66–96, 2020.
[29] M. M. Mafarja and S. Mirjalili, “Hybrid Whale Optimization Algorithm with simulated annealing for feature selection,” Neurocomputing, vol. 260, pp. 302–312, 2017.
[30] B. L. N. Phaneendra Kumar and P. Manoharan, “Whale optimization-based band selection technique for hyperspectral image classification,” International Journal of Remote Sensing, vol. 42, no. 13, pp. 5105–5143, 2021.
[31] J. Too, M. Mafarja, and S. Mirjalili, “Spatial bound whale optimization algorithm: an efficient high-dimensional feature selection approach,” Neural Computing and Applications, vol. 33, no. 23, pp. 16229–16250, 2021.
[32] S. Makonin, B. Ellert, I. V. Bajić, and F. Popowich, “Electricity, water, and natural gas consumption of a residential house in Canada from 2012 to 2014,” Scientific Data, vol. 3, no. 1, Article ID 160037, 2016.
[33] Y. Zhang, J. Cheng, J. Su, L. Zheng, X. Qiu, and W. Shang, “Time-domain resonant characteristics between the disturbances and the RS422 communication signals in Tesla pulse driver, and analysis on the caused RS422 communication interference,” IET Science, Measurement & Technology, vol. 14, no. 8, pp. 905–913, 2020.
[34] J. Li, G. Qin, Y. Li, and X. Ruan, “Research on power quality disturbance identification and classification technology in high noise background,” IET Generation, Transmission & Distribution, vol. 13, no. 9, pp. 1661–1671, 2019.
[35] C.-K. Lee and Y.-J. Shin, “Detection and assessment of I&C cable faults using time-frequency R-CNN-based reflectometry,” IEEE Transactions on Industrial Electronics, vol. 68, no. 2, pp. 1581–1590, 2021.
[36] S. Mirjalili and A. Lewis, “The whale optimization algorithm,” Advances in Engineering Software, vol. 95, pp. 51–67, 2016.
International Transactions on Electrical Energy Systems – Hindawi Publishing Corporation
Published: Apr 22, 2022