Development of Stacked Long Short-Term Memory Neural Networks with Numerical Solutions for Wind Velocity Predictions
Wei, Chih-Chiang
2020-07-23 00:00:00
Hindawi
Advances in Meteorology
Volume 2020, Article ID 5462040, 18 pages
https://doi.org/10.1155/2020/5462040

Research Article

Chih-Chiang Wei
Department of Marine Environmental Informatics and Center of Excellence for Ocean Engineering, National Taiwan Ocean University, Keelung City, Taiwan

Correspondence should be addressed to Chih-Chiang Wei; ccwei@ntou.edu.tw
Received 27 December 2019; Accepted 8 July 2020; Published 23 July 2020
Academic Editor: Theodore Karacostas

Copyright © 2020 Chih-Chiang Wei. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Taiwan, being located on a path in the west Pacific Ocean where typhoons often strike, is often affected by typhoons. The accompanying strong winds and torrential rains make typhoons particularly damaging in Taiwan. Therefore, we aimed to establish an accurate wind speed prediction model for future typhoons, allowing for better preparation to mitigate a typhoon’s toll on life and property. For more accurate wind speed predictions during a typhoon episode, we used cutting-edge machine learning techniques to construct a wind speed prediction model. To ensure model accuracy, we used, as variable input, simulated values from the Weather Research and Forecasting model of the numerical weather prediction system in addition to adopting deeper neural networks that can deepen neural network structures in the construction of estimation models. Our deeper neural networks comprise multilayer perceptron (MLP), deep recurrent neural networks (DRNNs), and stacked long short-term memory (LSTM).
These three model-structure types differ by their memory capacity: MLPs are model networks with no memory capacity, whereas DRNNs and stacked LSTM are model networks with memory capacity. A model structure with memory capacity can analyze time-series data and continue memorizing and learning along the time axis. The study area is northeastern Taiwan. Results showed that MLP, DRNN, and stacked LSTM prediction error rates increased with prediction time (1–6 hours). Comparing the three models revealed that model networks with memory capacity (DRNN and stacked LSTM) were more accurate than those without memory capacity. A further comparison of model networks with memory capacity revealed that stacked LSTM yielded slightly more accurate results than did DRNN. Additionally, we determined that in the construction of the wind speed prediction model, the use of numerically simulated values reduced the error rate by approximately 30%. These results indicate that the inclusion of numerically simulated values in wind speed prediction models enhanced their prediction accuracy.

1. Introduction

A typhoon is a severe natural disaster that affects tropical and subtropical coastal countries, and it occurs most frequently in the northwestern Pacific Ocean. Taiwan is to the east of the Eurasian Continent and at the western side of the Pacific Ocean; its climate is intermediate between tropical and subtropical. Thus, typhoons frequently occur in Taiwan, generally in summer and fall. Typhoons affecting Taiwan typically develop at the sea surface southeast of Taiwan, and most typhoons are accompanied by torrential rains and strong winds [1]. Such rain and wind add to the damage from typhoons, posing a great threat to the transportation, economic, agricultural, and fishery activities in Taiwan even if the typhoon itself does not hit Taiwan. Therefore, typhoons constitute a serious natural disaster in Taiwan.

For example, the 2015 Typhoon Soudelor was the most destructive typhoon that occurred in Taiwan in recent history, with gust intensity exceeding 12 on the Beaufort wind force scale (32.7 m/s). Its strong gusts caused widespread damage to infrastructure, affecting gas supply, power and utilities, transportation and communication, and weather radar stations. Electricity was cut off in approximately 4.5 million households simultaneously during Typhoon Soudelor, the greatest recorded number in recent history. The economic loss from the typhoon was estimated to be as high as US$76 million [2].

Therefore, we aimed to establish an accurate wind speed prediction model for future typhoons, allowing for better preparation to mitigate a typhoon’s toll on life and property. In this study, cutting-edge machine learning (ML) techniques were used to improve predictive accuracy. In general, ML algorithms learn from a huge dataset, improving their ability to identify patterns in the data. Specifically, ML involves creating algorithms to make predictions from sets of unknown data. Given their ability to perform parameter adjustment and achieve optimization through self-learning, neural network algorithms in ML are particularly powerful [3]. Such algorithms have been extensively used in recent wind speed prediction models, and these models are increasingly data-driven due to developments in ML [4–11]. In the development of neural networks, multilayer perceptron (MLP) networks are a classic approach that is often used and compared with other neural network models. For example, Wei [12] compared the accuracy of MLP with that of adaptive network-based fuzzy-inference-system neural networks in the construction of typhoon wind speed prediction models.

Deep learning has become possible due to the exponential increase in computing power in recent years. This approach is the further derivation of multiple neural layers from the original neural layers of a model. Such derivation improves an algorithm’s ability to learn, better approximating the complex neural network structure of a human being. For example, Hu et al. [13] formulated multilayer deep neural networks that were trained using data from data-rich wind plants. These networks extracted wind speed patterns, and the mapping was finely tuned using data from newly constructed wind plants. Tiancheng et al. [14] proposed a sandstorm prediction method, called the improved naive Bayesian convolutional neural network classification algorithm, that considered the effects of both atmospheric movement and ground factors on sandstorm occurrence.

In the field of neural networks, recurrent neural networks (RNNs), which can analyze sequential (or time-series) data, have recently been developed [15–20]. RNNs are connectionist models with the ability to selectively pass information across sequence steps while processing sequential data one element at a time [21]. Therefore, RNNs are important, especially in the analysis of sequential data. A particular type of RNN is long short-term memory (LSTM), a class of artificial neural networks where connections between units form a directed cycle. LSTM was introduced in [22] primarily to overcome the problem of diminishing gradients. LSTM creates an internal network state that allows it to exhibit dynamic temporal behavior. Unlike feedforward neural networks, RNNs can use their internal memory to process arbitrary sequences of input [23]. To model the time series of wind speed data, Byeon et al. [24] developed the LSTM for the prediction of typhoon wind speeds. Hence, feature enhancement from RNNs has been explored in wind prediction.

The widespread application of ensembles in numerical weather prediction (NWP) has helped researchers improve weather forecasts [25–30]. Numerical models can be used to calculate all climate parameters through atmospheric dynamics and numerical methods, allowing for simulated results to be explained in terms of physical relationships. NWP models simulate the atmosphere on user-defined grid scales as a moving fluid. Through several types of parameterizations, NWP models also account for the influence of subgrid physical processes on grid-resolved motions [31, 32]. NWP models, such as the Weather Research and Forecasting model (WRF), have been increasingly popular as a low-cost alternative source of data for such assessments of climate parameters [33]. WRF offers a wide variety of physical and dynamical elements to choose from; these elements must be put together to form model configurations, with which the model can be run [34]. However, because of imperfect models and uncertain initial boundary atmospheric conditions, errors exist in the NWP output [35, 36].

Recent studies have applied the NWP model to typhoons and tropical cyclones [37–39]. However, the prediction of severe meteorological phenomena (such as typhoons), which result from a multiplicity of mutually interacting multiscale processes, remains a major challenge for NWP systems [25, 40, 41]. These models are typically unable to predict wind intensity with satisfactory accuracy, even at short forecast times and a high horizontal resolution [42, 43]. Some studies have tried combining NWP with ML models to develop an integrated climate prediction model. For example, Zhao et al. [44] evaluated the performance and enhanced accuracy of a day-ahead wind-power forecasting system. The system comprised artificial neural networks and an NWP model.

Furthermore, to enhance the predictive accuracy of constructed ML models, other researchers have used numerically simulated results as input data for the construction of ML models. For example, Zhao et al. [45] developed the ARIMAX model, where wind speed results from the WRF simulation were chosen as an exogenous input variable. However, to the best of our knowledge, in the literature on short-term wind speed prediction for typhoons (or tropical cyclones), numerically simulated values have seldom been used as an input variable for ML prediction models. Therefore, in relation to the construction of an ML-based wind speed prediction model, we evaluated the improvements to predictive accuracy afforded by the use of numerically simulated values (by comparing their use and nonuse).

Our study therefore has two primary aims: (1) develop an ML- and neural network-based wind speed prediction model and compare the predictive accuracy of various neural network-based algorithms and (2) evaluate the improvements to predictive accuracy afforded by the use of numerical solutions obtained from NWP models (by comparing their use and nonuse) in a typhoon-surface wind-speed prediction model.

2. Methodology and Algorithms

Figure 1 illustrates the flow of the construction, involving NWP numerical solutions, of our neural network-based typhoon wind-velocity prediction model. In the first stage, we collected data on the typhoon characteristics and ground wind speed of the research area’s historical typhoon events.
In the second step, wind speed solutions were obtained from a wind field simulation of typhoons, involving an NWP numerical model. In our study, we also employed a WRF numerical model to simulate circulation distribution, thus obtaining the wind speed values in the research area. Two datasets could be built, one comprising typhoon data and the other comprising wind simulation results from the NWP model. The datasets were split into testing and training-validation subsets. The training-validation set was used for the learning of several ML-based wind velocity prediction models, and the testing set was used for the identification of the optimal prediction model among these models. The best model among a set of ML neural network-based models, involving MLP, DRNN, and stacked LSTM, was determined. According to [46], real-time dynamics constitute the most challenging aspect of wind speed forecasting. We determined the stacked LSTM model to be the best for wind speed forecasting due to its appropriate handling of long- and short-term time dependency. In the final stage, the forecast accuracy of the stacked LSTM model was evaluated against other neural networks. We also evaluated whether forecast accuracy in ML-based models improved when NWP numerical solutions were used as input.

Figure 1: Flowchart of neural network-based typhoon wind velocity prediction model using NWP numerical solutions. Flowchart steps: collect data associated with historical typhoon events; perform the wind field simulation of typhoons with an NWP numerical model; refine data of typhoon characteristics and surface wind speed observations; generate wind-speed simulation solutions; preprocess the datasets and split data into training-validation and testing subsets; conduct the neural networks-based wind speed prediction model using MLP, DRNN, and stacked LSTM; predict wind speed using different neural networks-based prediction models; evaluate the forecast accuracy in ML models using different neural networks-based algorithms; evaluate the forecast accuracy in ML models using inputs with and without NWP numerical solutions; verify the optimal neural networks-based wind velocity prediction models.

2.1. Frameworks Underlying the Proposed Neural Networks. In this section, we describe the neural network-based architectures, which used the MLP, DRNN, and stacked LSTM algorithms, that were adopted for model construction. As illustrated in Figure 2(a), the MLP is a typical type of feedforward backpropagation neural network that uses the processing units placed in the input, hidden, and output layers [47–49]. Each unit (with an associated weight) in a layer is connected to the units in adjacent layers [50, 51]. In the study, to enhance learning efficacy (and, by implication, approximation and prediction accuracy), we added hidden layers to a simple MLP neural network; the MLP was trained through backpropagation. Generally, the weight updates between layers are calculated in terms of the stochastic gradient descent [52]. Specifically,

$$\Delta w_{ij}(t+1) = \beta\,\Delta w_{ij}(t) + \eta\,\frac{\partial E}{\partial w_{ij}}, \tag{1}$$

where $w_{ij}(t)$ is the weight set connecting layers $i$ and $j$ at time $t$, $\Delta w_{ij}$ is the weight correction, $\eta$ is the learning rate, $\beta$ is a momentum coefficient, and $E$ is a cost function that indicates the difference between the target and predicted values. In particular, $\eta$ and $\beta$ are hyperparameters for adjusting the spacing of weight correction.

Figure 2: Architecture of (a) MLP, (b) DRNN, and (c) stacked LSTM neural networks.

As illustrated in Figure 2(b), the multilayers of the RNN structure comprise an input layer, multiple recurrent layers, and an output layer. When the length of the recurrent layer is 1, the framework is a simple RNN. Here, the recurrent network is based on the networks developed by [53]. In the RNN, the hidden units are connected to context units; in the successive time step, the units feed back into the hidden units. The hidden state at any time step can contain information from an (almost) arbitrarily long context window [21]. The DRNN model framework has multiple recurrent layers before the forwarding to a dropout layer and output layer at the final

the first LSTM layer produces sequence vectors used as the input of the subsequent LSTM layer. Moreover, the LSTM layer receives feedback from its previous time step, thus allowing for the capturing of data patterns. The dropout layer also excludes 10% of the neurons to avoid overfitting.

The basic structure of LSTM, as illustrated in the LSTM layer in Figure 2(c), comprises an input gate $i_t$, output gate $o_t$, forget gate $f_t$, and memory cell $c_t$. A single LSTM layer has a second-order RNN architecture that excels at storing sequential short-term memories and retrieving them at many time steps later [55]. An LSTM network is identical to a standard RNN, with the exception of the summation units in the hidden layer being replaced by memory blocks [56]. Equations (2)–(6) describe how output values are updated at each step [22, 46, 57]. Specifically,

$$f_t = \sigma\left(W_f \cdot x_t + U_f \cdot h_{t-1} + b_f\right), \tag{2}$$

$$i_t = \sigma\left(W_i \cdot x_t + U_i \cdot h_{t-1} + b_i\right), \tag{3}$$

$$o_t = \sigma\left(W_o \cdot x_t + U_o \cdot h_{t-1} + b_o\right), \tag{4}$$

c = f · c + i · σ