Detecting Anomalies in Meteorological Data Using Support Vector Regression
Lee, Min-Ki;Moon, Seung-Hyun;Yoon, Yourim;Kim, Yong-Hyuk;Moon, Byung-Ro
2018-06-26 00:00:00
Hindawi, Advances in Meteorology, Volume 2018, Article ID 5439256, 14 pages
https://doi.org/10.1155/2018/5439256

Research Article

Min-Ki Lee, Seung-Hyun Moon, Yourim Yoon, Yong-Hyuk Kim, and Byung-Ro Moon

School of Computer Science and Engineering, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea
Department of Computer Engineering, College of Information Technology, Gachon University, 1342 Seongnam-daero, Sujeong-gu, Seongnam-si, Gyeonggi-do 13120, Republic of Korea
School of Software, Kwangwoon University, 20 Kwangwoon-ro, Nowon-gu, Seoul 01897, Republic of Korea

Correspondence should be addressed to Byung-Ro Moon; moon@snu.ac.kr

Received 20 February 2018; Accepted 10 May 2018; Published 26 June 2018

Academic Editor: Alastair Williams

Copyright © 2018 Min-Ki Lee et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Significant errors exist in automated meteorological data, and identifying them is very important. In this paper, we present a novel method for determining abnormal values in meteorological observations based on support vector regression (SVR). SVR is used to predict the observation value from a spatial perspective. The difference between the estimated value and the actual observed value determines if the observed value is abnormal or not. In addition, SVR input variables are deliberately selected to improve SVR performance and shorten computing time. In the selection process, a multiobjective genetic algorithm is used to optimize the two objective functions.
In experiments using real-world data sets collected from accredited agencies, the proposed estimation method using SVR reduced the RMSE by an average of 45.44% while maintaining competitive computing times compared to baseline estimators.

1. Introduction

Meteorological observations play an important role in weather forecasting, disaster warning, and policy formulation in agriculture and various industries [1–4]. In addition, meteorological observations are used for efficient operation of alternative energy sources such as solar power, hydropower, and wind power [5–7]. In recent years, as climate change due to global warming has accelerated, the extent of damage due to abnormal weather phenomena is increasing and becoming more difficult to predict. Consequently, there is a greater need for accurate and quantitative climate data based on meteorological observations. The collection of meteorological information, which was previously done manually, has been automated in line with computational advances. An automatic weather station (AWS) is an automated system that allows a computer to observe and collect numerical values of multiple meteorological elements. The development of AWS has enabled (i) real-time information retrieval, (ii) reduced maintenance costs, (iii) increased accuracy of observations, (iv) a larger amount of data, and (v) easier weather observations in poorly accessible regions.

However, meteorological data gathered by AWS often include errors, and unusual values can be observed for a variety of reasons. Causes of unusual values include sensor malfunction, hardware error, power supply error, ambient environment change, and, in some rare circumstances, abnormal weather phenomena. Therefore, a quality control procedure is required for the collected data. Abnormal data identified during the quality control process are examined thoroughly by an expert and may become the subject of further research. If an abnormality is detected due to an error in the measurement process, it is necessary to replace the observed value with a corrected value [8–11]. Quality control of meteorological observations can also be regarded as anomaly detection [12] because anomalous values are of substantial interest to researchers. As the installation of AWS is expanding, and the amount and types of collected data are increasing, a fast and reliable quality control algorithm must be developed.

Quality control is achieved by several methods, ranging from simple discrimination using criteria related to physical limits to relatively complex discrimination related to spatiotemporal relationships with other observations [13]. Daly et al. [14] performed quality control of meteorological data using climate statistics and spatial interpolation, and Sciuto et al. [15] proposed a spatial quality control procedure for daily rainfall data using a neural network. We propose a spatial quality control method, which uses values obtained from observational stations surrounding the target observational station to determine spatial compatibility and estimate the value of the observation station. Whether an observed value is abnormal can then be determined from its difference from the estimated value. The developed spatial quality control method uses support vector regression (SVR) and a genetic algorithm. It can be applied to a wide range of meteorological elements and reflects the geographic and climatic characteristics of observation stations by learning from past data through SVR. During preprocessing of the SVR, the input variables, that is, the surrounding observation stations, are selected according to two objective functions: similarity and spatial dispersion. Because these two objective functions can conflict, multiobjective optimization is required to optimize them simultaneously. This is effectively performed by the genetic algorithm, which improves SVR performance and reduces execution time in this study. To verify the performance of the proposed method, we applied it to observational data measured by the Korea Meteorological Administration (KMA) for one year in 2014. Experiments on real-world data sets show that the performance of the proposed method is superior to previous methods such as the Cressman method [16] and the Barnes method [17], which have previously been used for spatial quality control.

The remainder of this paper is organized as follows. Section 2 describes the problem that we attempt to solve in this study. Spatial quality control is defined, and we describe real-world data sets used to test the performance of the proposed methods. Section 3 introduces the methods previously used in spatial quality control, Section 4 describes an estimation model using SVR, Section 5 describes the algorithm for selecting SVR input variables, and Section 6 presents the experimental results using the real-world dataset. Finally, Section 7 discusses our conclusions.

2. Problem Description

2.1. Data Sets. This study covers data from 572 AWSs operated by KMA in South Korea. Figure 1 shows the locations of the target AWSs.

Figure 1: Locations of the 572 automatic weather stations (AWSs) in South Korea [18] (panels (a) and (b)).

Table 1: Meteorological elements in automatic weather station (AWS) data.

    Meteorological element    Unit
    Wind direction            °
    Wind speed                m/s
    Temperature               °C
    Humidity                  %
    Atmospheric pressure      hPa
    Hourly precipitation      mm
    Rainfall occurrence       0 or 1

Target data include meteorological information measured every 1 minute from January 1, 2014, at 00:00 to December 31, 2014, at 23:59. In one year, 525,600 pieces of observational data are collected for each meteorological element at each station. We selected seven major meteorological elements for analysis: 10-minute average wind direction, 10-minute average wind speed, 1-minute average temperature, 1-minute average humidity, 1-minute average pressure, 1-hour cumulative amount of precipitation, and rainfall occurrence.
Table 1 shows the types and units of meteorological elements used in this study.

The wind direction is expressed in degrees; however, treating degrees as ordinary numbers leads to large errors in the algorithm's internal arithmetic. For example, 1° and 359° differ only by 2°, but arithmetically, they differ by 358°. In addition, the average of the two wind directions should be regarded as 0°, but arithmetically it is 180°. Therefore, to avoid these problems, wind direction was converted into a two-dimensional unit vector, as expressed in [8]:

$$\mathbf{v} = \left(\cos\left(\theta \cdot \frac{\pi}{180}\right),\ \sin\left(\theta \cdot \frac{\pi}{180}\right)\right), \tag{1}$$

where $\mathbf{v}$ is the transformed two-dimensional vector and $\theta$ is the original wind direction in degrees. In the quality control process, the two components of the vector are processed separately. When a quantitative comparison of wind direction is required, the wind direction represented by the vector must be converted back to degrees:

$$\theta = \operatorname{atan2}(v, u) \cdot \frac{180}{\pi}, \tag{2}$$

where $u$ and $v$ are the first and second components of $\mathbf{v}$, respectively, and atan2 is defined as follows:

$$\operatorname{atan2}(y, x) = \begin{cases} \arctan\frac{y}{x}, & x > 0 \\ \arctan\frac{y}{x} + \pi, & x < 0,\ y \ge 0 \\ \arctan\frac{y}{x} - \pi, & x < 0,\ y < 0 \\ +\frac{\pi}{2}, & x = 0,\ y > 0 \\ -\frac{\pi}{2}, & x = 0,\ y < 0 \\ \text{undefined}, & x = 0,\ y = 0. \end{cases} \tag{3}$$

2.2. Quality Control. During quality control, when the observed value is determined as not normal, it is then classified as "suspect" or "error" according to its level of abnormality. "Suspect" means that the value is likely to be abnormal, and "error" indicates that it is definitely anomalous.

2.2.1. Basic Quality Control. Basic quality control is a relatively simple quality control procedure performed in real time. Abnormalities are determined using only the observed value itself or observations from a short time before and after. The data used in this study were first filtered through the following four basic quality control procedures. Each test was performed sequentially. If any test failed, the data were classified as an error, and subsequent tests were not performed. Each test and the numerical criteria are the same as those used by KMA.

(i) Physical limit test: If the observed value is higher or lower than the physically possible upper or lower limit, respectively, it is classed as an error. The physical limit test is performed on all meteorological elements. Table 2 shows the physical limits of each meteorological element, which are based on World Meteorological Organization (WMO) standards [19].

(ii) Step test: The step test is performed for wind speed, temperature, humidity, and atmospheric pressure. If the difference between the current observation value and the value one minute prior is more than a certain value, it is classed as an error. Table 3 shows the maximum variation of each meteorological element.

(iii) Persistence test: The persistence test is performed for wind speed, temperature, humidity, and atmospheric pressure. A value is classified as an error when the accumulated change in the observed value within 60 minutes is smaller than a certain value. Table 4 shows the minimum variation within 60 minutes for each meteorological element.

(iv) Internal consistency test: The internal consistency test is performed for pairs of wind direction and wind speed data and pairs of precipitation and rainfall occurrence data. If any one of the factors in each pair is determined to be an error in another test, the other factor is also perceived as an error. Also, if the rainfall occurrence value is 0 but the precipitation value is not 0, both values are classed as suspects.

Table 2: Limits for physical limit test.

    Meteorological element    Lower limit    Upper limit
    Wind direction            0°             360°
    Wind speed                0 m/s          75 m/s
    Temperature               −80°C          60°C
    Humidity                  1              100
    Atmospheric pressure      500 hPa        1080 hPa
    Precipitation             0              400
    Rainfall occurrence       0              1

Table 3: Maximum amount of change for step test.

    Meteorological element    Maximum amount of change
    Wind speed                10 m/s
    Temperature               1°C
    Humidity                  10
    Atmospheric pressure      2 hPa

Table 4: Minimum amount of change for persistence test.

    Meteorological element    Minimum amount of change
    Wind speed                0.5 m/s
    Temperature               0.1°C
    Humidity                  1.0
    Atmospheric pressure      0.1 hPa

Table 5 shows the percentages of normal, error, and suspect values, respectively, after performing each test on the KMA dataset. If the observed meteorological element is not available due to an absence of observational equipment, or if the observed value is missing, it is classified as uninspected. All subsequent experiments were performed only on data determined as normal after basic quality control.

Table 5: Results of basic quality control.

    Meteorological element    Normal (%)    Limit        Step         Persistence    Consistency    Consistency    Uninspected (%)
                                            error (%)    error (%)    error (%)      suspect (%)    error (%)
    Wind direction            81.97         1.68e-3      N/A          N/A            0.00           2.80           15.23
    Wind speed                81.90         2.85e-4      1.36e-3      8.47e-1        0.00           2.80           14.45
    Temperature               96.41         3.22e-3      7.73e-3      1.02e-1        N/A            N/A            3.48
    Humidity                  54.47         2.23e-2      3.38e-3      5.43           N/A            N/A            40.07
    Atmospheric pressure      38.30         1.83e-1      5.72e-5      3.69e-2        N/A            N/A            61.48
    Hourly precipitation      93.08         0.00         N/A          N/A            1.94           0.00           4.98
    Rainfall occurrence       93.08         2.34e-4      N/A          N/A            1.94           0.00           4.98

2.2.2. Spatial Quality Control. Spatial quality control determines whether the observation data of the target station are abnormal based on the values of other observation stations around the target station. It is sometimes referred to as a spatial consistency test [13]. Because this test is based on a large amount of data, it involves more time and resources than basic quality control. Therefore, spatial quality control is often performed in quasi-real-time. A typical spatial quality control process is as follows:

(i) Estimate the value of the target station using the values of surrounding observation stations.

(ii) If the difference between the observed and the predicted value of the target station is greater than the critical value, the observation is considered abnormal.

The meteorological elements of the KMA dataset, excluding rainfall occurrence, consist of continuous values; therefore, the predicted value can be estimated naturally via the interpolation or regression model. Rainfall occurrence has a value of 0 or 1, so the predicted value is taken as 0 if the estimated value is less than 0.5, and 1 if the estimated value is 0.5 or more. The acceptable range for the difference between the observed value and the predicted value is generally determined using the standard deviation of the surrounding stations, which we set to the observation stations within 30 km of the target station. If the standard deviation is 0, because the observation values of all neighboring stations are the same, it is difficult to determine the acceptable range; therefore, the test is not performed. In the KMA dataset, this was often the case for elements such as precipitation and rainfall occurrence, which are always zero during periods without rain. Moreover, if there are fewer than three stations within 30 km, spatial quality control does not proceed because reliable standard deviations cannot be calculated. Also, observations that are missing or identified as abnormal during basic quality control are not considered for spatial quality control.

If the tolerance of the difference between the observed and predicted value is the same, the accuracy of the predicted value estimation will determine the reliability of spatial quality control. In this study, we aim for more accurate spatial prediction and thus improved spatial quality control performance. Traditional spatial prediction methods include spatial interpolation methods such as the Cressman method [16] and the Barnes method [17]. However, these methods do not reflect the geographical features of each region because they depend only on relative position to estimate the predicted value. Here, we propose a method that improves the accuracy of estimates by using supervised learning techniques to overcome the shortcomings of existing methods.

3. Previous Methods

This section describes the spatial interpolation methods used in this study: the Cressman method and the Barnes method. The two methods have been slightly modified by the KMA to detect meteorological anomalies in South Korea. Actual observations are compared with estimates generated by the spatial interpolation methods. If there is a significant difference between observed and predicted values, the observation is classed as "suspect" or "error" according to the degree of difference.

3.1. Cressman Method. The Cressman method performs spatial interpolation on a two-dimensional distribution of meteorological elements. Meteorological elements at each station are irregularly distributed in two dimensions and converted into estimated values of the grid points at regular intervals. In this study, the grid interval is 0.2° for both longitude and latitude. The estimated values of the grid points are called the background field and are calculated with respect to the effective radius $r$. The effective radius is the control parameter describing the maximum station distance when estimating each grid point. Let $z_i$ be the observed value of the station $i$, and let $d_{ei}$ denote the distance between the grid point $e$ and the station $i$. Then, $Z_r(e)$, the estimated value of the grid point $e$, is the weighted average of the observations within the effective radius $r$ (Figure 2):

$$Z_r(e) = \frac{\sum_i w_r(i) \cdot z_i}{\sum_i w_r(i)}, \tag{4}$$

where $w_r(i)$, the weight of the station $i$, depends only on the distance:

$$w_r(i) = \begin{cases} \dfrac{r^2 - d_{ei}^2}{r^2 + d_{ei}^2} & \text{if } d_{ei} \le r \\ 0 & \text{otherwise.} \end{cases} \tag{5}$$

To obtain $Z(i)$, the estimated value of a station $i$, the estimates of the four closest grid points from the station are averaged.
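The Cressman weighting of Equations (4) and (5) can be illustrated with a minimal sketch. The function names and the convention of returning `None` when no station lies inside the effective radius are assumptions made for illustration; they are not taken from the KMA implementation.

```python
def cressman_weight(distance_km, radius_km):
    """Weight of Equation (5): (r^2 - d^2) / (r^2 + d^2) inside the radius, else 0."""
    if distance_km > radius_km:
        return 0.0
    r2 = radius_km ** 2
    d2 = distance_km ** 2
    return (r2 - d2) / (r2 + d2)


def cressman_estimate(observations, distances_km, radius_km):
    """Weighted average of Equation (4) for one grid point.

    observations[i] is z_i and distances_km[i] is d_ei for station i.
    Returns None (an assumed convention) when every weight is zero.
    """
    weights = [cressman_weight(d, radius_km) for d in distances_km]
    total = sum(weights)
    if total == 0.0:
        return None
    return sum(w * z for w, z in zip(weights, observations)) / total
```

A station located exactly on the grid point receives weight 1, and the weight falls to 0 at the effective radius, so distant stations contribute little to the background field.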
Traditional spatial prediction methods include estimates of the four closest grid points from the station are Advances in Meteorology 5 Longitude If |z − Z(i)| is greater than 3 · σ , z is classified as an error. i i i If (z − Z(i)) is greater than 2 · σ , z is classified as a suspect. i i i 4. SVR-Based Approach In this section, we propose a method using support vector z regression (SVR) to overcome the spatial prediction limitations Latitude of the Cressman and Barnes methods for a target observation Z (e) station from a spatial quality control perspective. )e support vector machine (SVM) is a supervised machine learning method developed by Vapnik and Lerner in 1963 [20]. In the 1990s, nonlinear classification using SVM became popular as an alternative to artificial neural networks, and SVM generally has less overfitting problems than artificial neural networks. In the SVM, learning proceeds in the direction of maximizing the margin of the support vector, which is a hyperplane that divides each class of the given data. During early research, only linear Grid point classification was possible; however, nonlinear classification Observations became feasible by mapping data to a higher dimensional space Effective radius using a kernel function. A typical nonlinear kernel function is Figure 2: Calculation of the Z (e), the estimated value of the grid a radial basis function (RBF). )e RBF function transforms the point e in the Cressman method. Only observations of stations original space into an infinite dimension Hilbert space. In this located within the effective radius r are used. In this example, z 1 study, the RBF function was used because the RBF function is and z are used to calculate Z (e), but z is not used. 2 r 3 better, on average, than the linear or polynomial function. Support vector regression (SVR) was proposed, which uses SVM for regression with continuous values as the output [21]. averaged. 
After calculating the estimates of all the stations, Preliminary study on meteorological elements has shown that the background field can be recalculated using the estimates the estimation capability of SVR is superior to other machine instead of the observations. )e estimates of the stations can learning techniques [8]. In this study, the SVMlight [22] library also be recalculated over the new background field. )is was used for C language implementation. process can be repeated as many times as desired. We set the )e input and output of the SVR model for spatial effective radius to 50 km, 30 km, and 10 km and updated the quality control are as follows: background field and the estimates of the stations. Let σ be the standard deviation of the observations at all i (i) Input: observations of stations surrounding the stations located within the final effective radius of the station i. target station If |z − Z(i)| is greater than 3 · σ , z is classified as an error. If i i i (ii) Output: observation value of the target station. |z − Z(i)| is greater than 2 · σ , z is classified as a suspect. i i i In the input, values that are missing or classified as errors during basic quality control are replaced by the temporally 3.2. Barnes Method. )e Barnes method is a statistical closest values. )e wind speed converted into a 2D vector technique that can derive accurate two-dimensional distri- representation was learned and tested by two separate models bution from randomly distributed data in space. It is similar, for each dimension. Once the model is learned from the values in many respects, to the Cressman method, but instead uses of the target station and the surrounding observation stations a Gaussian function in the weight function: in the past, the predicted value of the target station can be estimated for the input that has not been learned. 
Because past −d ⎧ ⎪ ij exp if d ≤ r observation values are not labeled as normal or abnormal with ij 2r respect to spatial quality control, they are learned regardless of w (i) � (6) normal and abnormal values. )erefore, this approach assumes 0 otherwise, that most observations are normal and abnormalities are few. Once the predicted value of the target station is estimated, where d is the distance between stations i and j. )e KMA ij the process of determining whether the observed value of the uses observations without using grid points when calculating target station is normal is the same as the Cressman method the estimates by the Barnes method: or Barnes method. Let z and Z(i) be the observations value and the estimated value by SVR of station i, respectively, and w (i) · z r i Z(i) � , (7) let σ be the standard deviation of the observations from all w (i) stations within a radius of 30 km of station i. If |z − Z(i)| is where r is set to 30 km. )e process of determining whether or greater than 3 · σ , z is classified as an error. If |z − Z(i)| is i i i not observations are normal is almost identical to that of the greater than 2 · σ , z is classified as a suspect. i i Cressman method. Let σ be the standard deviation of the )e SVR model can implicitly capture the geographic observations at all stations located within 30 km of the station i. characteristics of the target station while learning past data. 6 Advances in Meteorology (a) (b) Figure 3: Examples of neighbor selection. (a) Neighbor stations with high spatial dispersion. (b) Neighbor stations with low spatial dispersion. )rough this process, the combination of each station and To measure the similarity of stations according to their each meteorological element has its own specific model. )is meteorological elements, the time series values of the elements is an advantage of SVR over non-ML approaches. 
However, are expressed as vectors, and the distance between them is an approach based on machine learning also has its draw- measured in various ways. We used the L distance, L dis- 1 2 backs; specifically, that it takes a long time to learn. A tance, Pearson correlation coefficient, and mutual information method to overcome this is introduced in Section 5. to measure the similarity between two vectors. After the dis- tance of all the station pairs was calculated, the smallest value 5. Selecting Neighboring Stations was zeroed and the largest value was normalized to one. )e L distance, known as the Manhattan distance or taxicab distance, )e input of the SVR model uses the observations of between two vectors x and y was calculated as follows: neighboring AWSs within a certain radius of the target AWS. However, if there are too many neighbors, the learning time of ‖x − y‖ � x − y , (8) 1 i i SVR becomes too long. Also, some neighboring stations act as i�1 noise instead of helping to estimate the value of the target where x is the i-th element of x. We used (1−L distance) as i 1 station. )erefore, it is necessary to select the best core a similarity measure to ensure that larger measurement was neighbors to estimate the value of the target station while given to two vectors which were more similar )e L dis- reducing the number of neighbors used in the input. tance, known as the Euclidean distance, between two vectors x and y was calculated as follows: ���������� 5.1. Similarity and Spatial Dispersion. Two criteria were applied to select key neighbors. )e first considered the ‖x − y‖ � x − y . (9) 2 i i similarity of the observations between the target station and i�1 the neighbor station. Observations at locations with similar meteorological phenomena are helpful in deriving observa- We used (1−L distance) as a similarity measure to tions at the target site. 
)e second considered how widespread ensure that larger measurement was given to two vectors the neighboring stations were in space. If one constructs a core which were more similar. )e Pearson correlation coefficient neighborhood at stations concentrated in a narrow area, the is used to measure the degree of the linear relationship model cannot be flexible to various situations. For example, between two variables. It has a value 1 when there is a perfect if there is a peculiar meteorological phenomenon within positive linear correlation and −1 when there is a perfect a narrow area (e.g., a local storm), the estimate will be misled. negative linear correlation. )e Pearson correlation co- Spatial dispersion ensures statistical robustness of the model. efficient is calculated as follows, where x � x /n: Figure 3 shows two different choices of neighboring stations. i�1 When the amount of rainfall in target station is estimated, the x − x y − y i�1 i i amount of rainfall in neighboring stations is used. If localized ���������������������� r � . xy (10) 2 2 n n heavy rain happens in an area including neighboring stations x − x y − y i i i�1 i�1 with low spatial dispersion, the estimated amount of rainfall will be inclined to be very high even though target station is Mutual information measures the mutual dependence out of influence of localized heavy rain. On the other hand, it between two random variables X and Y. It quantifies the reflects overall surrounding circumstances of rainfall when reduction in uncertainty of one of the variables due to spatial dispersion of neighboring stations is high.” knowing the other. 
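The distance- and correlation-based measures of Equations (8)-(10), together with the normalization step, might be computed as in the following sketch. The function names are hypothetical, and the handling of the degenerate case in which all pairwise distances are equal is an assumption that the text does not discuss.

```python
import math


def l1_distance(x, y):
    """Manhattan distance of Equation (8)."""
    return sum(abs(a - b) for a, b in zip(x, y))


def l2_distance(x, y):
    """Euclidean distance of Equation (9)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson(x, y):
    """Pearson correlation coefficient of Equation (10)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    denominator = (math.sqrt(sum((a - mean_x) ** 2 for a in x))
                   * math.sqrt(sum((b - mean_y) ** 2 for b in y)))
    if denominator == 0:
        raise ValueError("correlation is undefined for a constant series")
    return numerator / denominator


def distances_to_similarities(distances):
    """Min-max normalize pairwise distances to [0, 1], then return 1 - distance."""
    lowest, highest = min(distances), max(distances)
    if highest == lowest:
        return [1.0 for _ in distances]  # assumption: all pairs equally similar
    return [1.0 - (d - lowest) / (highest - lowest) for d in distances]
```

After normalization, the closest station pair has similarity 1 and the most distant pair has similarity 0, which puts the distance-based scores on the same "larger is more similar" footing as the correlation.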
Mutual information is calculated as Advances in Meteorology 7 non-dominated set E =Ø; initialize population P; repeat assign a fitness value to each solution in P; select 2N parents from P; create N offspring applying crossover on the parents; mutate offspring; repair offspring; local-optimize offspring; P offspring; update E; remove n solutions from P; add n solutions from E to P; until stopping condition; return E; Figure 4: )e framework of our hybrid multiobjective genetic algorithm. follows, where p(x, y) is the joint probability function of x converge. )e hybrid genetic algorithm solves this problem by and y and p(x) and p(y) are the marginal probability combining the local optimization algorithm with the GA. density functions of x and y, respectively: Several successful attempts have been made to solve multi- objective problems using GA [25–29]. Among them, NSGA-II p(x, y) I(X; Y) � p(x, y) log . by Deb [30] is the most well-known. To maximize the function (11) p(x)p(y) y∈Y x∈X f , f , . . . , f with n number of objects, if solution x and 1 2 n solution y satisfy the following condition, then it can be said We computed the mutual information from the ob- that solution y dominates solution x: served frequency of two vectors, x and y, assuming that these f (x) ≤ f (y) ∀i and ∃j : f (x) < f (y). (13) vectors constitute an independent and identically distributed i i j j sample of (X, Y). When a solution is not dominated by any other solution, the As a measure of spatial dispersion, we used the average of solution is called Paretooptimal. To improve an objective function the geographical distance from the nearest station [23]. If the in a Paretooptimal solution, one has to sacrifice another objective set of target stations and selected neighbors is x, and d is x x i j function. )e multiobjective genetic algorithm (MOGA) does not the normalized geographic distance between the two stations output one solution but several Paretooptimal solutions. 
)e final x and x , then the spatial dispersion is calculated as i j solution selection is performed by the decision-maker. min d x ∈x x ∈x,x ≠x x x i j i j i j dispersion(x) � . (12) 1 x ∈x 5.3. Our GA Framework. In this study, we tested the SVR )e larger the spatial dispersion is, the better the with several Paretooptimal solutions for each meteorological neighborhood selection is. )e two criteria of similarity and element, and selected the best solution on average. Figure 4 spatial dispersion often conflict. In general, similarities in shows the structure of the GA used in this study. climatic characteristics are often due to geographic prox- (i) Encoding: In a genetic algorithm, one solution is imity. )erefore, the key neighborhood screening problem is expressed as a set of genes, or a chromosome. Here, a multiobjective optimization problem that simultaneously one chromosome is represented by a one- optimizes two or more objectives that are not independent of dimensional binary string. Each gene corresponds each other. In this study, we solve the multiobjective opti- to one station. If the value of the gene is “0,” the mization problem using genetic algorithms. observation value of the corresponding station is not used as the input of the SVR. If it is “1,” it is selected 5.2. Multiobjective Genetic Algorithm. )e genetic algorithm as an input of the SVR. (GA) is a global optimization technique developed by Holland, (ii) Fitness: )is indicates the validity of the solution for which mimics the natural evolution of biological selection a given problem. When the individual objective [24]. It is used to find a solution with high (or low) fitness while function is f , f , . . . , f , the fitness value of solution 1 2 n repeating a genetic operation that imitates processes such as x is calculated as selection, crossover, and mutation, which are important ele- ments of evolution. 
GA is a type of metaheuristic that does not f(x) � w f (x) + w f + · · · + w f (x), (14) 1 1 2 2 n n depend substantially on the nature of the problem. It can search all ranges and is less likely to fall into a local optimum. where w , . . . , w are nonnegative and w � 1, 1 n i Pure GA is disadvantageous in that it takes a long time to each weight w is randomly set for every i 8 Advances in Meteorology updates every time a new solution is created. In other generation, not as a fixed value. )is allows the algorithm to search for various Paretooptimal so- words, the solution that is dominated by the new solution is removed from the existing nondominant lutions [31]. )is method is more intuitive than the algorithm that uses Pareto ranking-based fitness solution archive, and the new solution is stored in the evaluation [32] and easier to be combined with archive when it is a nondominant solution. As sur- a local optimization algorithm [33]. In this problem, vival of good solutions within a population can result n � 2, and f (x) and f (x) correspond to similarity in a good solution for the next generation, some of 1 2 and spatial dispersion, respectively, as described in the population are replaced by solutions in non- Section 5. dominant solution archive. In this algorithm, 20% of the entire population was randomly replaced with (iii) Population: Population is a set of chromosomes. solutions in nondominant solution archive. Chromosomes in the population interact each (x) Stopping condition: )e genetic algorithm stops other to generate new solutions and cull existing solutions. In this study, the size of the population when 1,000 generations have passed. was set to 50. )e initial population consisted of 50 Figure 5 describes the whole spatial quality control randomly generated chromosomes. process including neighbor selection. (iv) Selection: )is is the operator used to select the parent chromosome for the crossover. 
To mimic the principle of survival of the fittest in nature, chro- 6. Experimental Results mosomes with high fitness are selected with high probability. In this study, roulette-wheel selection, In this section, (i) detailed good parameters are selected, (ii) one of the most widely used selection operators, was the performances of the estimation methods are compared, used. )e probability that the best fitness solution and (iii) the results of the proposed spatial quality control will be selected is four times the probability that the procedure are presented using meteorological data collected lowest fitness solution will be selected. by the KMA for a year in 2014. )e root mean square error (RMSE) was used as a measure to compare the accuracy of (v) Crossover: A key operator of the genetic algorithm. each estimation method. RMSE is a standard metrics for In inheriting the features of the parents, we expect dealing with errors between model-estimated values and that the different advantageous traits combine to observed values in a real environment, including meteo- produce an offspring chromosome that is superior rology [34]. If θ is the observed vector and θ is the estimated to the parents. In this study, we used a two-point vector, then the RMSE of θ is calculated as crossover with two cut points. ���������� (vi) Mutation: )is statistically modifies a portion of the (15) RMSE(θ ) � E(θ − θ) . offspring chromosome to increase solution diversity and prevent premature convergence. In this study, )e lower the RMSE value, the better the model estimate. each gene was flipped with a probability of 10%. As the accuracy of estimates should be based on normal (vii) Repair: After crossover and mutation, offspring observations, only observations classed as normal by the still may not meet the constraints of the problem. model are used to calculate RMSE. 
When comparing the )at is, the number of genes with a value of “1” in RMSE of two or more models, only those observations the chromosome may be different from the determined as normal by all models were used. number of stations to be selected. If the number of Performance evaluation of SVR estimation models was genes with a value of “1” is insufficient, we repeat achieved through 10-fold cross-validation. All data were di- the process of changing the value of the randomly vided into 10 folds, of which 9 were used as the training set, selected gene among genes with a value of “0” to and the other was used as the test set. Learning and test are “1.” On the other hand, if the number of genes with performed 10 times so that each fold can be used as a test set. a value of “0” is insufficient, we repeat the process Due to there being 7 meteorological elements in 572 AWSs, of changing randomly selected genes to “0” among and 10 models must be learned each time, a total of 40,040 genes with a value of “1.” models were created for each experiment. )e entire training (viii) Local optimization: )is exchanges the values of set consists of 473,040 data sets. Because of the large number two genes whose fitness value increases when of models and the overly long total execution time, we exchanged. )is process is repeated until the ex- sampled 5,000 data sets and used them as training sets for the change of any two gene values can no longer in- parameter optimization experiments. We then describe the crease the fitness value. change in performance and time caused by increasing the size (ix) Replacement and elitism: We used a generational of the training set once the final parameter is determined. GA to generate offspring as large as the size of the )e experiment was performed on an Intel i7 quad-core population and replace the entire population. Among 2.93 GHz CPU. Each experiment used only one core. 
Exper- the solutions found so far, nondominant solutions iments with a long execution time were performed by dividing that are closest to the Paretooptimal are stored in an each of the seven machines by observatories, and the total external archive. )is nondominant solution archive execution time included the execution time of each machine. Advances in Meteorology 9 Select the best neighbors by GA Calculate the standard deviation σ of Estimate the observation of target AWS by SVR observations of neighboring AWSs z : observed value |z – Z(i)| Z(i): estimated value >2σ i, ≤2σ >3σ i i ≤3σ Error Normal Suspect Figure 5: )e proposed spatial quality control process. Table 7: Accuracy of estimates for each similarity measure. Table 6: Accuracy of estimates according to wind direction rep- 1 2 Meteorological element L L PCC MI 1 2 resentation (RMSE). Wind direction 104.574 105.624 102.786 101.424 Representation RMSE Wind speed 1.228 1.224 1.306 1.317 Degree 92.17 Temperature 1.241 1.241 1.327 1.319 Vector 68.28 Humidity 8.085 8.086 8.757 8.829 Atmospheric pressure 6.497 6.497 8.134 7.256 Hourly precipitation 1.074 1.065 1.066 1.155 Rainfall occurrence 0.151 0.151 0.152 0.157 6.1. Representation of Wind Direction. Section 2.1 describes 1 2 Pearson correlation coefficient. Mutual information. the process of converting wind direction expressions from degrees to 2D vectors. Table 6 compares the accuracy of SVR estimates for each wind direction representation. )e accuracy 6.3. Selecting Neighboring Stations. In Section 5, we proposed of the estimate is much higher when expressed in terms of MOGA to select input variables to improve SVR performance vector expression than degrees. )us, all subsequent experi- and speed. Figure 7 shows the accuracy of estimates based on ments used a vector expression to represent wind direction. the number of neighboring stations selected by MOGA. 
)e greater the number of parameters (over a certain amount), the worse the performance of the SVR and the longer it takes to 6.2. Similarity Measure. Section 5 describes four measures train. )e optimal number of neighboring stations with the used to calculate the similarity between two observation best performance differs with the meteorological element. stations. To compare the usefulness of each measure, the Table 8 shows the optimal number of neighboring stations accuracy of the estimates predicted by the Madsen-Allerup according to each meteorological element. All subsequent method [35] is examined. )e Madsen-Allerup technique experiments were fixed using the optimal number of selects the stations similar to the target station and then uses neighbors. Table 9 compares the estimation accuracy of SVR the observed values of selected stations to obtain the estimate when neighboring stations were selected randomly, with the of the target station; therefore, the higher the quality of the accuracy of SVR when neighboring stations were selected similarity measure, the more accurate the estimate. Table 7 using MOGA. We confirm that selection of neighbors using shows the estimation accuracy of the Madsen-Allerup method MOGA improves the estimation performance of SVR. for each similarity measure. In all subsequent experiments, we used the highest quality similarity measure for each meteo- rological element. Figure 6 shows the connected station pairs 6.4. Comparison of Estimation Models. Table 10 shows the with a similarity greater than 0.5. accuracy of estimates for each estimation model. Estimation 10 Advances in Meteorology (a) (b) (c) (d) (e) (f) (g) Figure 6: Similarity map for different meteorological elements. (a) Wind direction. (b) Wind speed. (c) Temperature. (d) Humidity. (e) Atmospheric pressure. (f) Precipitation. (g) Rainfall occurrence. using SVR model is better than that using Cressman or be inspected in real time. 
For example, in our test data, there Barnes algorithms. Hourly precipitation does not show are 572 stations, and they collect 7 kinds of meteorological much improvement compared to other meteorological el- observation data. It takes about 5.77 seconds to inspect every ements. Because there are many more days without rain than data from every station using the Cressman method, and it should be executed in every minute. Moreover, the execu- with rain, there is rather sparse data distribution for rainy days, which results in learning difficulties. tion time becomes more important as the number of stations Table 11 shows the execution time of spatial quality and the kind of data get bigger and the time interval for control according to each estimation model. )e execution collecting data gets shortened. time might be considered of little importance as a single Spatial quality control is fastest using the Barnes algo- process of spatial quality control can be executed in a very rithm, but the accuracy of the estimation is very poor. Spatial small time. But if a quality contol process should be per- quality control using SVR is approximately 6 times faster formed in a centralized single facility, a large number of than that using the Cressman algorithm, but more time is meteorological data from every observational station need to required to learn the SVR model. 
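The 5.77-second figure and the roughly sixfold speed-up quoted above can be checked with a short calculation. The following is a minimal sketch, assuming that one inspection sweep processes one value per element per station serially and using the per-observation times reported in Table 11; the function and variable names are illustrative and do not come from the paper.

```python
# Per-observation inspection times in seconds, taken from Table 11.
TIME_PER_OBSERVATION = {
    "Cressman": 1.442e-3,
    "Barnes": 8.427e-5,
    "SVR": 2.303e-4,
}

N_STATIONS = 572   # AWSs in the test data
N_ELEMENTS = 7     # meteorological elements per station
INTERVAL_S = 60.0  # assumption: a full sweep is needed once per minute


def sweep_time(seconds_per_observation, n_stations=N_STATIONS, n_elements=N_ELEMENTS):
    """Time to inspect one value of every element at every station (serial)."""
    return seconds_per_observation * n_stations * n_elements


def budget_report():
    """Return {estimator: (sweep seconds, fraction of the collection interval)}."""
    report = {}
    for name, seconds in TIME_PER_OBSERVATION.items():
        total = sweep_time(seconds)
        report[name] = (total, total / INTERVAL_S)
    return report


if __name__ == "__main__":
    for name, (total, share) in budget_report().items():
        print(f"{name:9s} {total:6.3f} s per sweep ({share:.1%} of the interval)")
    speedup = TIME_PER_OBSERVATION["Cressman"] / TIME_PER_OBSERVATION["SVR"]
    print(f"SVR vs Cressman speed-up: {speedup:.2f}x")
```

Under these assumptions, the Cressman sweep comes to about 5.77 seconds of each one-minute interval, and the SVR sweep is a little more than six times shorter. The same function can be reused to see how the budget tightens when the number of stations or elements grows or the collection interval shrinks.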
However, as it does not give weight to more recent data in the learning process, there is no need to learn the model every time the spatial quality control is performed. If the model uses sufficient previous data, the performance of spatial quality control is not adversely affected, even if the learning cycle for model updates is only once a week or a month.

Figure 7: Accuracy of estimates according to the number of selected neighboring stations (RMSE). (a) Wind direction. (b) Wind speed. (c) Temperature. (d) Humidity. (e) Atmospheric pressure. (f) Precipitation. (g) Rainfall occurrence. (Each panel plots RMSE against the number of neighbors, from 2 to 20.)

Table 8: Optimal number of neighboring stations per meteorological element.

| Meteorological element | Optimal number of neighbors |
| Wind direction | 7 |
| Wind speed | 3 |
| Temperature | 11 |
| Humidity | 20 |
| Atmospheric pressure | 3 |
| Hourly precipitation | 8 |
| Rainfall occurrence | 10 |

Table 9: Comparison of SVR estimation accuracy with neighboring stations selected randomly or by MOGA.

| Weather element | Random | MOGA |
| Wind direction | 50.390 | 48.499 |
| Wind speed | 2.523 | 2.513 |
| Temperature | 0.970 | 0.902 |
| Humidity | 5.216 | 5.038 |
| Atmospheric pressure | 1.066 | 1.063 |
| Hourly precipitation | 0.847 | 0.762 |
| Rainfall occurrence | 0.028 | 0.026 |

Table 10: Comparison of estimation accuracy based on estimation model (RMSE).

| Meteorological element | Cressman | Barnes | SVR |
| Wind direction | 53.568 | 75.470 | 48.341 |
| Wind speed | 2.347 | 2.315 | 2.179 |
| Temperature | 1.180 | 2.583 | 0.880 |
| Humidity | 6.755 | 12.767 | 4.582 |
| Atmospheric pressure | 5.663 | 11.601 | 0.847 |
| Hourly precipitation | 0.583 | 0.833 | 0.583 |
| Rainfall occurrence | 0.071 | 0.137 | 0.021 |

Table 11: Execution time for spatial quality control based on estimation model.

| Estimator | Average time spent in learning one model (second) | Average time spent in determining one observation (second) |
| Cressman | - | 1.442e-3 |
| Barnes | - | 8.427e-5 |
| SVR | 6.839 | 2.303e-4 |

Table 12: Accuracy of estimated values based on the size of the training set (RMSE).

| Meteorological element | 5000 | 10000 | 15000 | 20000 | 25000 | 30000 |
| Wind direction | 43.820 | 42.831 | 42.298 | 41.948 | 41.691 | 41.481 |
| Wind speed | 2.363 | 2.365 | 2.367 | 2.369 | 2.369 | 2.370 |
| Temperature | 0.902 | 0.879 | 0.870 | 0.863 | 0.860 | 0.857 |
| Humidity | 4.710 | 4.330 | 4.130 | 3.998 | 3.904 | 3.831 |
| Atmospheric pressure | 0.871 | 0.837 | 0.817 | 0.807 | 0.797 | 0.785 |
| Hourly precipitation | 0.763 | 0.746 | 0.736 | 0.732 | 0.727 | 0.724 |
| Rainfall occurrence | 0.026 | 0.025 | 0.024 | 0.024 | 0.024 | 0.024 |

Figure 8: Average time spent on learning one model depending on the size of the training set. (x-axis: training set size, 5000 to 30000; y-axis: time in seconds.)

Figure 9: Time spent on determining one value depending on the size of the training set. (x-axis: training set size, 5000 to 30000; y-axis: time in milliseconds.)

6.5. Size of Training Set.
In general, the higher the number of training samples in the SVR, the higher the accuracy of the estimate but the longer the learning time. Table 12 shows the accuracy of estimates based on the number of training samples. Exceptionally, in the case of wind speed, the performance tends to decrease as the number of training samples increases. Figure 7 also shows that, for wind speed, the fewer the input variables of the SVR, the better the performance. In the present model structure, it is difficult to learn wind speed; thus, overfitting seems to occur if the model becomes overly complicated.

Figure 8 shows the learning time according to the number of training samples, and Figure 9 shows the time taken purely for spatial quality control, excluding learning time. Theoretically, the time taken to test the SVR model is not affected by the size of the training set, but as the training set grows, the complexity of the model becomes larger (e.g., the number of support vectors increases), and the time required for the test also increases. However, as the number of samples increases, the increase in test time gradually decreases. The test time is expected not to increase after the number of samples reaches a certain point. Experiments on all observation stations using 30,000 samples took approximately 15 days on seven machines. Due to time limitations, we could not experiment with more samples, but there seems to be room for further performance improvement. In this study, all the stations were analyzed together, but the burden of the learning time would not be as great if each test were conducted separately for each observation station.

6.6. Result of Spatial Quality Control. Table 13 shows the results of applying the proposed spatial quality control procedure to actual data. As described in Section 2, the spatial quality control applies only to observations that are determined as normal during basic quality control. Therefore, values that did not pass the basic quality control are classed as uninspected during spatial quality control. The high ratio of uninspected observations of humidity and atmospheric pressure is due to the lack of measuring instruments for those elements in many observation stations.

Table 13: Results of the proposed spatial quality control method.

| Meteorological element | Normal (%) | Suspect (%) | Error (%) | Uninspected (%) |
| Wind direction | 72.9 | 6.32 | 6.31e-1 | 20.2 |
| Wind speed | 75.7 | 3.49 | 6.56e-1 | 20.2 |
| Temperature | 93.8 | 3.98e-1 | 9.08e-2 | 5.67 |
| Humidity | 52.8 | 2.08e-1 | 5.33e-2 | 47.0 |
| Atmospheric pressure | 36.4 | 4.75e-4 | 1.60e-4 | 63.6 |
| Hourly precipitation | 87.1 | 8.73e-1 | 2.99 | 9.04 |
| Rainfall occurrence | 89.2 | 1.38 | 4.16e-1 | 9.04 |

7. Conclusion

In this study, we proposed a method to detect abnormalities in meteorological data using SVR. First, the value at the target station was predicted using observations made in the surrounding area, and then any abnormality was detected by checking whether the observation differs from the predicted value by more than a predetermined range. SVR was used to create a model to predict the value at observation stations. In addition, we used MOGA to select SVR input variables to improve model performance and to reduce computation time.

Experiments on actual weather data collected by KMA show that using SVR is more accurate than the existing Cressman or Barnes methods for estimating the value at an observation station. Therefore, more accurate anomaly detection is possible through more accurate predictions. If the model can be learned in advance on a fixed cycle rather than every time, the proposed method has an acceptable execution time. A limitation of the method is that preaccumulated data are required, but it was confirmed through experiments that data collected over approximately one year provide sufficiently high performance.

In future study, the proposed method can be applied to additional meteorological elements such as sunshine duration and cloud height. Other valuable research could examine whether state-of-the-art learning techniques such as deep learning can yield more accurate predictions than SVR, which was not attempted here due to limitations of the system environment. In addition to accurate predictions, additional studies are required on the acceptable difference between the observation and the estimate, which we set using the standard deviation. Furthermore, it will be interesting to compare anomaly detection based on unsupervised learning with that based on prediction using supervised learning.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by Research for the Forecasting Technology and Its Application, through the National Institute of Meteorological Sciences of Korea, in 2015 (NIMR-2012-B-1). This work was also partly supported by BK21 Plus for Pioneers in Innovative Computing (Department of Computer Science and Engineering, SNU) funded by the National Research Foundation of Korea (NRF) (21A20151113068), by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2018-2017-0-01630) supervised by the IITP (Institute for Information & Communications Technology Promotion) and by a grant (KCG-01-2017-05) through the Disaster and Safety Management Institute funded by Korea Coast Guard of the Korean government. The Institute of Computer Technology (ICT) at the Seoul National University provides research facilities for this study.

References

[1] J. H. Seo and Y. H. Kim, "Genetic feature selection for very short-term heavy rainfall prediction," in Convergence and Hybrid Information Technology, pp. 312–322, Springer, Berlin, Germany, 2012.
[2] Y. H. Kim and Y. Yoon, "Spatiotemporal pattern networks of heavy rain among automatic weather stations and very-short-term heavy-rain prediction," Advances in Meteorology, vol. 2016, Article ID 4063632, 13 pages, 2016.
[3] P. Cortez and A. D. J. R. Morais, "A data mining approach to predict forest fires using meteorological data," in Proceedings of the 13th Portuguese Conference of Artificial Intelligence, pp. 512–523, Guimarães, Portugal, December 2007.
[4] A. Stoppa and U. Hess, "Design and use of weather derivatives in agricultural policies: the case of rainfall index insurance in Morocco," in Proceedings of the International Conference on Agricultural Policy Reform and the WTO: Where are We Heading, Capri, Italy, June 2003.
[5] M. Kubik, P. J. Coker, J. F. Barlow, and C. Hunt, "A study into the accuracy of using meteorological wind data to estimate turbine generation output," Renewable Energy, vol. 51, pp. 153–158, 2013.
[6] H. Yang, L. Lu, and J. Burnett, "Weather data and probability analysis of hybrid photovoltaic–wind power generation systems in Hong Kong," Renewable Energy, vol. 28, no. 11, pp. 1813–1824, 2003.
[7] S. A. Kalogirou, "Applications of artificial neural-networks for energy systems," Applied Energy, vol. 67, no. 1, pp. 17–35, 2000.
[8] M. K. Lee, S. H. Moon, Y. H. Kim, and B. R. Moon, "Correcting abnormalities in meteorological data by machine learning," in Proceedings of the 2014 IEEE International Conference on Systems, Man and Cybernetics (SMC), pp. 888–893, San Diego, CA, USA, April 2014.
[9] N. Y. Kim, Y. H. Kim, Y. Yoon, H. H. Im, R. K. Choi, and Y. H. Lee, "Correcting air-pressure data collected by MEMS sensors in smartphones," Journal of Sensors, vol. 2015, Article ID 245498, 10 pages, 2015.
[10] Y.-H. Kim, J.-H. Ha, Y. Yoon et al., "Improved correction of atmospheric pressure data obtained by smartphones through machine learning," Computational Intelligence and Neuroscience, vol. 2016, Article ID 9467878, 12 pages, 2016.
[11] J.-H. Ha, Y.-H. Kim, H.-H. Im, N.-Y. Kim, S. Sim, and Y. Yoon, "Error correction of meteorological data obtained with mini-AWSs based on machine learning," Advances in Meteorology, vol. 2018, Article ID 7210137, 8 pages, 2018.
[12] V. Chandola, A. Banerjee, and V. Kumar, "Anomaly detection: a survey," ACM Computing Surveys (CSUR), vol. 41, no. 3, pp. 1–58, 2009.
[13] J. Estévez, P. Gavilán, and J. V. Giráldez, "Guidelines on validation procedures for meteorological data from automatic weather stations," Journal of Hydrology, vol. 402, no. 1, pp. 144–154, 2011.
[14] C. Daly, W. Gibson, M. Doggett, J. Smith, and G. Taylor, "A probabilistic-spatial approach to the quality control of climate observations," in Proceedings of the 14th AMS Conference on Applied Climatology, pp. 13–16, Seattle, WA, USA, January 2004.
[15] G. Sciuto, B. Bonaccorso, A. Cancelliere, and G. Rossi, "Quality control of daily rainfall data with neural networks," Journal of Hydrology, vol. 364, no. 1, pp. 13–22, 2009.
[16] G. P. Cressman, "An operational objective analysis system," Monthly Weather Review, vol. 87, no. 10, pp. 367–374, 1959.
[17] S. L. Barnes, "A technique for maximizing details in numerical weather map analysis," Journal of Applied Meteorology, vol. 3, no. 4, pp. 396–409, 1964.
[18] J.-H. Seo, Y. H. Lee, and Y.-H. Kim, "Feature selection for very short-term heavy rainfall prediction using evolutionary computation," Advances in Meteorology, vol. 2014, Article ID 203545, 15 pages, 2014.
[19] M. Jarraud, Guide to Meteorological Instruments and Methods of Observation (WMO-No. 8), World Meteorological Organisation, Geneva, Switzerland, 2008.
[20] V. Vapnik and A. Lerner, "Pattern recognition using generalized portrait method," Automation and Remote Control, vol. 24, pp. 774–780, 1963.
[21] H. Drucker, C. J. Burges, L. Kaufman et al., "Support vector regression machines," Advances in Neural Information Processing Systems, vol. 9, pp. 155–161, 1997.
[22] T. Joachims, "SVMlight: support vector machine," 1999, http://svmlight.joachims.org.
[23] P. J. Clark and F. C. Evans, "Distance to nearest neighbor as a measure of spatial relationships in populations," Ecology, vol. 35, no. 4, pp. 445–453, 1954.
[24] J. H. Holland, Adaptation in Natural and Artificial Systems: an Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, MIT Press, Cambridge, MA, USA, 1992.
[25] E. Zitzler and L. Thiele, "Multiobjective evolutionary algorithms: a comparative case study and the strength Pareto approach," IEEE Transactions on Evolutionary Computation, vol. 3, no. 4, pp. 257–271, 1999.
[26] C. M. Fonseca and P. J. Fleming, "Multiobjective optimization and multiple constraint handling with evolutionary algorithms. I. A unified formulation," IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans, vol. 28, no. 1, pp. 26–37, 1998.
[27] C. A. Coello, "An updated survey of GA-based multiobjective optimization techniques," ACM Computing Surveys (CSUR), vol. 32, no. 2, pp. 109–143, 2000.
[28] L. Xiujuan and S. Zhongke, "Overview of multi-objective optimization methods," Journal of Systems Engineering and Electronics, vol. 15, no. 2, pp. 142–146, 2004.
[29] A. Konak, D. W. Coit, and A. E. Smith, "Multi-objective optimization using genetic algorithms: a tutorial," Reliability Engineering and System Safety, vol. 91, no. 9, pp. 992–1007, 2006.
[30] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, "A fast and elitist multiobjective genetic algorithm: NSGA-II," IEEE Transactions on Evolutionary Computation, vol. 6, no. 2, pp. 182–197, 2002.
[31] T. Murata and H. Ishibuchi, "MOGA: Multi-objective genetic algorithms," in Proceedings of the 1995 IEEE International Conference on Evolutionary Computation, vol. 1, p. 289, Perth, Australia, 1995.
[32] C. M. Fonseca and P. J. Fleming, "Multiobjective genetic algorithms," in Proceedings of the IEE Colloquium on Genetic Algorithms for Control Systems Engineering (Digest No. 1993/130), London, UK, 1993.
[33] H. Ishibuchi and T. Murata, "Multi-objective genetic local search algorithm," in Proceedings of the IEEE International Conference on Evolutionary Computation, pp. 119–124, Anchorage, AK, USA, 1996.
[34] T. Chai and R. R. Draxler, "Root mean square error (RMSE) or mean absolute error (MAE)? – Arguments against avoiding RMSE in the literature," Geoscientific Model Development, vol. 7, no. 3, pp. 1247–1250, 2014.
[35] P. Allerup, H. Madsen, and F. Vejen, "A comprehensive model for correcting point precipitation," Hydrology Research, vol. 28, no. 1, pp. 1–20, 1997.
Advances in Meteorology
Hindawi Publishing Corporation