Daily rainfall estimates considering seasonality from a MODWT-ANN hybrid model

J. Hydrol. Hydromech., 69, 2021, 1, 13–28
DOI: 10.2478/johh-2020-0043
ISSN 1338-4333
©2021. This is an open access article distributed under the Creative Commons Attribution NonCommercial-NoDerivatives 4.0 License

Evanice Pinheiro Gomes (1), Claudio José Cavalcante Blanco (2*)

(1) Civil Engineering Graduate Program, Federal University of Pará – PPGEC/ITEC/UFPA, Av. Augusto Corrêa, 01, 66075-110, Belém, Brazil.
(2) School of Environmental and Sanitary Engineering, Federal University of Pará – FAESA/ITEC/UFPA, Av. Augusto Corrêa, 01, 66075-110, Belém, Brazil.
(*) Corresponding author. Tel.: +55 91 3201-8859. E-mail: blanco@ufpa.br

Abstract: Analyses based on precipitation data may be limited by the quality of the data, the size of the available historical series and the efficiency of the adopted methodologies; these factors are especially limiting when conducting analyses at the daily scale. Thus, methodologies are sought to overcome these barriers. The objective of this work is to develop a hybrid model through the maximum overlap discrete wavelet transform (MODWT) to estimate daily rainfall in homogeneous regions of the Tocantins-Araguaia Hydrographic Region (TAHR) in the Amazon (Brazil). Data series from the Climate Prediction Center morphing (CMORPH) satellite products and rainfall data from the National Water Agency (ANA) were divided into seasonal periods (dry and rainy), which were adopted to train the model and for model forecasting. The results show that the hybrid model had a good performance when forecasting daily rainfall using both databases, indicated by the Nash–Sutcliffe efficiency coefficients (0.81–0.95), thus, the hybrid model is considered to be potentially useful for modelling daily rainfall.

Keywords: Artificial Intelligence; Climate Prediction Center morphing; Dry and rainy periods; Amazon.
INTRODUCTION

Analyses based on available precipitation data may be limited by the quality of the data, the size of the available historical series and the efficiency of the adopted methodologies. Daily precipitation forecasting models offer one way to overcome these limitations. Models used for forecasting variables such as precipitation can be classified as conceptual or empirical. Conceptual models are based on equations representing the physical processes and evolution of the atmospheric phenomena that make up the climate system, and may include the atmosphere, hydrosphere, biosphere and geosphere; global circulation models are one example (Cuo et al., 2011; Golding, 2014). Empirical models fit calculated values to observed values mathematically, without representing the physical behaviour of hydrological processes (Siad et al., 2019). Examples of these models are autoregressive models (ARs), moving averages (MAs), autoregressive moving averages (ARMAs), artificial neural networks (ANNs), decomposition wavelets (DWs) and support vector regression (SVR).

The implementation of these daily precipitation forecasting models presents difficulties due to the large number of variables (climatic and geomorphological) that influence precipitation trends and the lack of understanding about the role of precipitation trends in controlling the distribution of the variables (Frumau et al., 2011; Gupta et al., 2017; Osborn et al., 2016). Therefore, due to the complexity of this problem, models incorporating artificial intelligence (AI) methods are potentially useful approaches for simulating the precipitation process (Fahimi et al., 2017; Nourani et al., 2014). According to Sulaiman et al. (2018), this utility is due to the remarkable flexibility of AI methods in modelling highly nonlinear systems and stochastic patterns, and these methods do not require prior knowledge of the behaviour of measurement processes. In fact, these methods rely primarily on information derived from existing hydrological and weather series and use a “black box” approach to simulate the underlying processes. According to Tealab et al. (2017), the dynamic behaviour of most time series and their autoregressive moving averages create the challenge of predicting nonlinear time series that contain inherited moving average terms using AI methodologies such as neural networks. They also emphasized the importance of formulating new neural network models with deep learning and hybrid methodologies.

In attempts to improve the prediction of data such as precipitation, the combination of models has proven to be very advantageous. The use of DWs to filter ANN input data is a good example of this combination. One of the first studies on this modelling technique was carried out by Partal and Kisi (2007). They established a predictive model integrating DWs and neuro-fuzzy techniques with daily precipitation data in the Mediterranean region; this hybrid is considered a solid basis for modelling precipitation processes in the region. Kuo et al. (2010) investigated the seasonal predictability of rainfall using the DW-ANN approach with seasonal data on rainfall and Pacific Ocean sea surface temperatures (SSTs), revealing a high coherence of DWs and ANNs in predicting the variables studied.

There are several studies in the literature on the application of hybrid models to forecast daily precipitation. Partal and Cigizoglu (2009) decomposed time series of meteorological variables using DWs and combined them with ANNs (feed-forward back-propagation, FFBP). The authors estimated daily rainfall at various points in Turkey and achieved the best results with the combination of DWs and ANNs when using ten levels of decomposition. Kisi and Shiri (2011) adopted genetic programming (GP) and neuro-fuzzy (NF) techniques, isolated and combined with DWs, without reporting the mother wavelet adopted and using only three levels of decomposition, to forecast daily precipitation in the Aegean region of western Turkey. The authors found that the models combined with wavelets improved the forecast in comparison with the models used in isolation. Kisi and Cimen (2011) investigated the accuracy of the DW-hybrid model and of support vector regression models (SVRs) for monthly flow forecasting in the eastern Black Sea region. The results showed that the hybrid models with DWs and SVRs improved the flow forecast at the studied stations, reducing the average absolute errors by up to 46% when compared with the single SVR model, considering only three levels of wavelet decomposition. Partal et al. (2015) used three types of ANNs (feed-forward back-propagation, FFBP; radial basis function, RBF; and generalized regression neural network, GRNN) integrated with DWs to forecast daily precipitation at five points in Turkey. They decomposed the meteorological time series with DWs into ten levels using the Haar mother wavelet and concluded that the DW model with the FFBP was the most suitable for estimating precipitation in the studied locations. He et al. (2015) and Ramana et al. (2013) also used DWs with ANNs to forecast precipitation. Both adopted a monthly rainfall scale and different DWs: the first adopted the maximum overlapping discrete wavelet transform (MODWT) in addition to other methodologies with great computational requirements, and the second used only the conventional DW. However, neither considered the boundary conditions of the DWs.

DWs, when used alone, can study the frequency of a signal, making it possible to evaluate the behaviour of a time series. In this sense, Zeri et al. (2019) used DWs to estimate the variance of different frequencies present in a precipitation signal using a rainfall dataset from the state of Tocantins in Central Brazil. Their results showed that the northern region of the study area was under greater exposure to interannual variability in precipitation. The main advantages of using DWs in the studies mentioned are that DWs filter the input data (removing noise), decompose the data at various levels, and improve the inputs to the proposed ANN model. However, none of the studies mentioned considered the boundary conditions of DWs, and previous studies rarely used the MODWT, which considers the boundary conditions and reduces error in the estimates.

Despite the good results, there is still a gap regarding the use of wavelet functions and the levels of decomposition, and most research is not concerned with examining different families of available mother wavelets, which can improve the decomposition of the adopted time series. Combining DWs and ANNs is nevertheless a popular technique for estimating variables, including precipitation. The arrangement and proper interpretation of the estimates are not always carefully examined; specifically, authors of previous work do not consider the boundary conditions (a source of error) that affect conventional DWs. This lack of consideration has generated estimates not consistent with reality (Du et al., 2017; Zhang et al., 2015).

Thus, this study contributes to filling these gaps by evaluating the use of four families of wavelets, different levels of decomposition and the boundary conditions that involve DWs. The objective is the development of a hybrid model involving the MODWT and ANNs (MODWT-ANN) that considers the boundary conditions in the decomposition of the historical series. Such a hybridization configuration is still little explored for daily rainfall estimates, a fact that motivates this work. In addition, different rainfall databases are adopted, namely the point observations from the Brazilian Water Agency (ANA) and those of the Climate Prediction Center morphing (CMORPH) products, allowing the models to be applied even in the absence of point rainfall monitoring by stations, since CMORPH data provide both spatially distributed information and point precipitation. Therefore, the use of this hybrid approach is novel, as it does not use conventional DWs coupled with ANNs and instead considers the boundary conditions in the estimation of daily precipitation. In addition, the modelling of daily amounts of precipitation is interesting because, on this scale (daily), the precipitation patterns are more complex (a long time series with values of zero indicates a drought) and have greater versatility in their application.

MATERIALS AND METHODS

Study area and database

The Tocantins-Araguaia Hydrographic Region (TAHR) is located between 0° 30' S and 18° 05' S and between 45° 45' W and 56° 20' W (Figure 1). Its configuration is elongated in a south-north direction, following the predominant direction of the main watercourses (Tocantins and Araguaia Rivers), which unite in the northern part of the region; from there, the river is called the Tocantins River until it reaches Marajó Bay. The TAHR's total area is 918,822 km², covering parts of the Central-West, North and Northeast Regions of Brazil. According to the Brazilian Institute of Geography and Statistics (IBGE) (2014), the region occupies 11% of the national territory, including areas in the states of Goiás (21.4%), Tocantins (30.2%), Pará (30.3%), Maranhão (3.3%), Mato Grosso (14.7%) and the Federal District (0.1%). The studied region is important for the development of the country due to the Tucuruí Hydroelectric Power Plant (HPP), which provides electricity to most of the Brazilian regions. The region is also important for mining and agribusiness.

Gomes et al. (2018) identified three homogeneous regions of precipitation in the TAHR (Figure 1) by means of fuzzy c-means clustering, presenting the rainfall rates of the basin as decreasing from north to south, with an average of 2,838 mm in the northern, central and south-eastern basins, an average of 1,990 mm in the south basin, and an overall average of 1,989 mm of annual rainfall. Thus, in applying the hybrid model for daily precipitation estimations, one rainfall gauge station was adopted in each of the homogeneous regions of the TAHR, as the rainfall rates differ among locations and these differences may influence the model results. This approach was applied to demonstrate that the proposed models can be used in any of the three homogeneous regions. The rainfall gauge stations presented an average daily precipitation (ADP) of approximately 5.0 mm (Table 1 and Figure 1). Daily precipitation data from the CMORPH product were obtained for each station location. The choice of stations prioritized series with minimum failures (an average failure of 0.2% in the total observed data), and the period of daily observations was 19 years, from 1998 to 2016. The estimated satellite precipitation data come from the CMORPH product and are available from the National Oceanic and Atmospheric Administration (NOAA) and the National Centers for Environmental Prediction (NCEP). The rainfall recorded by the ANA is point-based and is recorded every 24 hours. The information produced by CMORPH has a spatial resolution of 8 km (at the equator) and is recorded every 30 minutes. These differences served as a motivation to use both databases and allowed for the possibility of data substitution in the absence of point monitoring, which is common in some places in the Amazon.

Table 1. Data from the rainfall gauge stations from ANA (E1, E2 and E3) and average daily precipitation data from the CMORPH product.

ID   Rainfall gauge station   Latitude   Longitude   Number of failures   Failures (%)   ADP ANA (mm)   ADP CMORPH (mm)
E1   Badajós                  –2.51      –47.77      1                    0.0            6.3            5.5
E2   Colônia                  –7.88      –48.88      2                    0.0            4.6            4.2
E3   Palmeirópolis            –13.04     –48.40      98                   2.0            4.0            3.1

Period: 19 years (1998–2016) for all stations. ID = Identification of the rainfall gauge stations; ADP = Average daily precipitation.

Wavelet decomposition – WD

Wavelet decomposition (WD) is a mathematical technique used for time-series analysis and forecasting in many fields, such as climatology, geophysics and hydrology (Rivera et al., 2012). The central idea of WD is signal decomposition at different time scales, defined by a set of basis functions $\varphi_{a,b}(t)$, which can be generated by translating and scaling the mother wavelet $\psi(t)$ according to Equation (1):

$$\varphi_{a,b}(t) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t-b}{a}\right), \quad a > 0,\ -\infty < b < \infty \qquad (1)$$

where a is the scaling parameter that adjusts the wavelet expansion, and b determines the location of the wavelet (Daubechies, 1992). The mother wavelet can be thought of as a short-lived wave that grows and decays over a limited period, which is crucial for the good performance of the transformation. Depending on the wavelet chosen, the method will filter out specific information during the process, revealing information from the original data, such as trends, disintegration points and discontinuities, that the raw signal does not expose (Holdefer and Severo, 2015).

Wavelet theory is divided into two types of wavelet transforms (WTs): the continuous wavelet transform (CWT) and the discrete wavelet transform (DWT) (Addison et al., 2001; Daubechies, 1992); however, as hydrometeorological data are generally recorded in discrete time intervals, the DWT is preferably adopted in the decomposition of time series (Mehr et al., 2014; Ramana et al., 2013). Among the existing WTs, the maximum overlapping discrete wavelet transform (MODWT) has been highlighted for time-series decomposition because it can account for the boundary conditions (BCs) involved in the decomposition of data and so avoid errors that would otherwise propagate through the entire development of the proposed forecasting model. Percival and Walden (2000), Bašta (2014) and Du et al. (2017) demonstrate how BCs influence the decomposition of time series and how BCs can produce incorrect predictions if they are not treated properly.

However, the use of the DWT in the presence of BCs presents three problems: a) future data: when the adopted WT requires future observations of the time series to perform the decomposition in the present, the decomposition becomes unfeasible if the future data are not available; therefore, WTs that incorporate future data should be avoided (as in the case of the DWT); b) inadequate selection of decomposition levels: selecting a very long wavelet filter and a very high level of decomposition leaves few wavelet and scale coefficients (free of BC-related uncertainties) to calibrate the forecast model; c) partitioning of the calibration and validation set: the time series records used in the calibration and validation of the model may not be sufficient to allow adequate training of the model. More details can be found in Percival and Walden (2000).
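The “future data” problem listed under item a) of the Wavelet decomposition section can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not the authors' implementation: it uses the level-1 Haar wavelet filter in its MODWT scaling, (1/2, –1/2), treats the series as circular (a periodic boundary), and the function names and rainfall values are hypothetical. It changes only the last sample of a short series and reports which wavelet coefficients change as a result.

```python
def haar_modwt_level1(x):
    """Level-1 Haar MODWT wavelet coefficients with a periodic boundary.

    Assumed filter: the rescaled Haar wavelet filter (1/2, -1/2).
    The index t-1 wraps around modulo N, so W[0] uses the LAST sample.
    """
    n = len(x)
    return [0.5 * x[t] - 0.5 * x[(t - 1) % n] for t in range(n)]


def boundary_affected(x, new_last):
    """Indices whose coefficient changes when only the last sample changes."""
    before = haar_modwt_level1(x)
    after = haar_modwt_level1(x[:-1] + [new_last])
    return [t for t, (a, b) in enumerate(zip(before, after)) if a != b]


# Hypothetical daily rainfall values (mm), for illustration only.
rain = [0.0, 2.5, 10.0, 0.0, 4.0, 1.5]

# Index 5 changes because it is the modified day itself; index 0 changes
# only through the wrap-around, i.e. the earliest coefficient depends on
# the most recent ("future") observation.
print(boundary_affected(rain, new_last=20.0))  # prints [0, 5]
```

In this sketch the coefficient at index 0 is the boundary-affected one: it mixes the first day with the last day of the record. Longer filters and higher decomposition levels widen this contaminated stretch at the start of the series, which is why such coefficients are discarded before a forecasting model is calibrated.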
70°0'0"W 60°0'0"W 50°0'0"W 40°0'0"W RR AP AM PA MA CE RN PB PI PE AC SEAL RO MT TO BA Southern America DF GO MG ES MS SP RJ PR SC RS 70°0'0"W 60°0'0"W 50°0'0"W 40°0'0"W Coordinate System: WGS-1984. Projection: Transverse Mercator. Datum: D-WGS-1984. Source: IBGE/Gomes et al. (2018). Fig. 1. Homogeneous precipitation regions in the TAHR and the locations of rainfall gauge stations used in the application of the hybrid model for daily rainfall estimates. 30°0'0"S 20°0'0"S 10°0'0"S 0°0'0" 30°0'0"S 20°0'0"S 10°0'0"S 0°0'0" Evanice Pinheiro Gomes, Claudio José Cavalcante Blanco Maximum overlap discrete wavelet transform – MODWT et al. (2010), there are several types of ANNs whose purpose is to train the behaviour of a given variable. The main differences The MODWT definition is derived from the DWT defini- among these types lie in the type of network architecture and the learning process characteristics of each type. The best tion. The DWT filter is () h and the scale filter is () g , jk , jk , known types are perceptrons (Ps), multilayer perceptrons where k =1,..., L; L is the length of the filter and j is the level of (MLPs), Adalines (A) and radial basis networks (RBNs), which decomposition. The wavelet filter in the MODWT (h ) and can be expressed in mathematical terms by Equations (6–7): jk , the scale filter in the MODWT (g ) are defined as jk , uw=⋅ x−θ (6) ii i =1 h g jk ,  jk , h = and g = (Percival and Walden, jk , jk , j /2 j /2 2 2 yg = u (7) ( ) 2000). 
Then, the MODWT wavelet coefficients of level j are defined as the convolution of the time series (Xt), and the filters where w is the weight associated with the i-th entry; x are the i i in the MODWT are obtained through Equations (2) and (3): network entries; θ is the activation threshold; u is the result of the difference between the linear combiner and the activation k −1   Wh = X (2) threshold; g(u) is the activation function; and y is the final value jt,,  j k tk − modN k =0 produced by the neuron from a set of input signals. The ANN k −1 used refers to the time-delay neural network (TDNN), proposed  (3) Vg jt , = X j, k t − k modN k =0 by Lang and Hinton (1988). This network involves feedforward architecture (without feedback from the first layer of the neu-   where W is the wavelet coefficient; V is the scale coeffi- jt , jt , rons’ outputs), where the prediction of later values from time t associated with the process behaviour is computed as a function cient; and modN is the operation module when treating the of the knowledge of their previous values (Equation 8): historical series as periodic, with periods equal to N, and K can be obtained by Equation (4). x(t) = f (x(t–1), x(t–2), ..., x(t–n )) (8) where n is the order of the predictor, i.e., the number of past KK =− 21 −1+1 (4) () ( ) measurements (samples) that will be required to estimate the value x(t). This type of arrangement (Figure 2) is called a fo- K is the wavelet and scale coefficient number affected by BC, cused time-lagged feedforward network (TDNN), where the time delay acts as memory, ensuring that previous samples that for the decomposition level J and a wavelet filter length level reflect the temporal behaviour of the process are always insert- . Thus, through this equation, it is possible to obtain the wave- ed into the network without the need for feedback from the let and scale coefficients “corrected by the limits”, i.e., those network outputs. 
that avoid adding uncertainty to the wavelets and the scale coefficients due to the “future data” problem (Bašta, 2014; Percival and Walden, 2000). The MODWT uses a high pass filter () h to calculate its wavelet coefficients and applies an iterative construction of the time series ( t), which can be re- constructed using Equation (5).   X = WV jt,, + jt (5) The MODWT decomposition is performed on a series of data, and the type of filter (wavelet), the level of decomposition and the limit are selected; the limit can be either periodic or reflective. If the limit is periodic, the resulting wavelet and scale coefficients are calculated without duplicating the original series, treating (Xt) as if it were circular. If the limit is reflec- tive, a new series is reflected at twice the length of the original Fig. 2. PMC topology with time-delayed entries. series. In the present study, a periodic limit was adopted, and three types of wavelet families, Haar, Daubechies (d4 and d6) The adopted TDNN has three layers with the following and Least Asymmetric (la14), were selected, as they are the configuration: TDNN (a, b, c, d), with a hidden layer, where a most usual in data series analyses (Guimarães Santos e Silva, is the number of neurons in the input layer; b is the network 2014; Maheswaran and Khosa, 2012; Sang, 2012; Santos et al., delay; c is the number of neurons in the hidden layer; and d is 2019) and for carrying out diversified decompositions. the number of neurons in the output layer, in which the inputs are formed by the daily precipitation values. TDNN training Artificial neural network – ANN consists of inputting the data, the parameters and the weights of the neurons to be adjusted, according to the behaviour of the ANNs are models inspired by the functioning of biological time series, by the process of successive approximations. Thus, neurons and aim to learn a particular system and reproduce it. 
the following parameters were defined in this study: a) a hidden layer; b) the stopping criterion for a maximum number of 1,000 The construction of an ANN is obtained by the interconnections iterations; c) the learning rate, ranging from 0.025 to 1.0; d) the of neurons, which are layered and are initially connected by providing a stimulus to the model (the data inputs), then calcu- Levenberg-Marquardt algorithm, a learning algorithm; e) the lating the output and adjusting the weights until the desired hidden layer transfer function (log sigmoid); and f) the linear output is achieved (dos Santos et al., 2016). According to Silva output function. 16 Daily rainfall estimates considering seasonality from a MODWT-ANN hybrid model To observe the influence of seasonality on the model re- repeatedly with different types of neural network structures sponse, daily rainfall data from the ANA and CMORPH sta- until a setting that best fit each daily data estimate was found. tions were organized into the rainy period (RP, November, Past records were trained to predict future days, considering a December, January, February, March and April) and the dry 14-year training period and a 5-year forecast. The initial num- period (DP, May, June, July, August, September and October). ber of days needed for the data to be considered for network These seasonal periods correspond to those identified by Falck entry ranged from 2 to 30 days. The number of neurons found et al. (2015), with six months for each period at the TAHR. The in the hidden layer ranged from 1 to 45, and only one output RP was composed of 3,444 daily precipitation data points col- neuron was found. Therefore, the hybrid model that was finally lected from November to April of 1998 to 2016 and was divid- established ensured the best predictive ability. ed into 2,584 days for calibration (02/1996 – 01/2012) and 860 days for validation (02/2012 – 12/2016). 
The DP was composed Performance criteria of 3,496 daily precipitation data points (May to October of 1998 to 2016) and was divided into 2,624 days for calibration Model performance was assessed using statistical parame- (05/1998 – 06/2012) and 872 days for validation (07/2012 – ters, which are used to quantify the agreement between ob- 10/2016). The series was organized such that 75% of the data served and estimated data. In this study, we used three classic was used for network calibration and 25% of the data was used criteria, the mean square error (MSE), the Nash–Sutcliffe coef- for series validation and delay in the network structure for- ficient (Nash) and the Bias, represented by Equations (10), (11) mation (Nourani et al., 2017; Shoaib et al., 2016). In addition to and (12), respectively, in addition to statistics such as the dis- these considerations, the input data for the tests with the ANNs persion, mean, standard deviation and extreme values of the were selected according to the result of the MODWT decompo- data. The equations are as follows: sition, which will be discussed later. n 2 In ANN processing, the data were standardized (Equation MSE=− X Y ) (10) obs i =1 (9)) because, according to Silva et al. (2010), when working n with ANNs, standardization implies scaling the samples to the dynamic range of the hidden layer activation functions (typical- YY − () obs est Nash =− 1 (11) ly represented by the logistic or hyperbolic tangent function) to YX − () avoid neuron saturation; this method was also adopted by obs Yaseen et al. (2016) and Nourani et al. (2017). The calculation is as follows: 1 Bias=− Y Y (12) () obs est i =1 PP − i min P = (9) pad where n is the number of samples, Y is the observed precipi- PP − obs max min tation, Y is the estimated precipitation and is the average est observed precipitation. 
The best-performing models are those where P is the standardized precipitation; P is the precipita- pad i with low MSE values and Nash values close to 1 (Chai and tion to be standardized; and P and P are the smallest and min max Draxler, 2014; Nash and Sutcliffe, 1970). largest values observed in the precipitation series, respectively. In short, the methodology consisted of the following: - obtaining the data series organized in a certain period (dry or WD-ANN hybrid modelling rainy); - data standardization (Equation (9)), wavelet decomposition The hybrid model consisted of the WD-ANN combination, and training with different ANN configurations. with the original precipitation series passing through a decom- - simulation of the data series by an ANN and a comparison of position filter (WD), which acted as a signal filter and sought to the inputs (observed) with the output (simulated) results. transform and correct the input data. These inputs were then subjected to ANN training, in which the inputs were evaluated Fig. 3. Scheme of the proposed methodology for obtaining daily rainfall estimates using WD–ANN. Evanice Pinheiro Gomes, Claudio José Cavalcante Blanco For each rainfall gauge station, an ANN was defined that ever, when choosing a filter, other parameters are also associat- served to estimate the daily data in the dry and rainy periods ed with it. Thus, according to the tests performed in this study, from the ANA and CMORPH data series. A scheme of the the factors that most influenced the simulations were the level adopted methodology can be observed in Figure 3. of decomposition and the length of the wavelet, the best model, with a medium level (6) and more smoothness adjustment and RESULTS AND DISCUSSION consideration of the boundary conditions, performed a moderate Wavelet decomposition by the MODWT and admissible adjustment to the decomposition of the precipi- tation data. 
The longer filter length (14) did not present higher The MODWT was used to decompose the precipitation se- quality estimations and could remove a much larger number of ries and generate the “limit-corrected” wavelet and scale coeffi- wavelet coefficients adjusted by the BCs, compromising the cients. This decomposition was accomplished by adopting three amount of input data in the simulation of the model with the RNA. types of wavelet families and three levels of decomposition, denoted as j (4, 6 and 8). The filter length varied according to the adopted wavelet family (Bašta, 2014). The maximum level of decomposition was defined as six (J = 6), and the length max of the wavelet filters (K) was 2 for the Haar wavelet, 4 for the d4, and 14 for the la14. Thus, using a maximum level of de- composition equal to six and a K equal to 4, we found that K = (2 –1)(4 – 1) + 1; therefore, K was equal to 190 coefficients affected by the limit of j (this practice was also adopted, for j = 4 and 8 and for K = 2 and 14). Therefore, the first 190 records of input data from the stations were removed after the decom- position was completed. Thus, limit-corrected wavelets and scale coefficients were defined before the input variables were selected for calibration and model validation using ANNs. The smallest errors and the highest Nash value resulting from the validation of the use of ANNs indicated the best MODWT- ANN structure for forecasting daily precipitation (Table 2). The large MSE and Nash values obtained after ANN pro- cessing showed that the wavelet decomposition by the Haar filter was not accurate in decomposing the daily precipitation series (Table 2). This result may be related to the fact that the Haar filter has decomposition properties that are aimed at series a) Haar, j=4. 
that show very sudden changes (Mallat, 2009), which does not happen in homogeneous rainfall series because in dry or rainy periods (lack of rain or constant rain), there are no considerable sudden changes. The Daubechies wavelet (d4) was able to decompose the seasonality element of the time series more efficiently than the Haar filter. The results of the d4 for levels 6 and 8 showed small errors and a high Nash value. The good performance of the Daubechies wavelet (d4) in the decomposition of variables can be explained, according to Maheswaran and Khosa (2012), by the ability of the d4 to smooth the signal and locate the timing and frequency of the data. These abilities are necessary when analysing precipitation series, which present temporal complications.

The la14 wavelet (which was less asymmetric), when combined with the ANN, also showed good results, with small errors and a high Nash value. The performance of the la14 in relation to the d4, using lengths (L) 6 and 8, did not differ considerably. This finding shows that the increase in the filter length, in this case, did not induce considerable improvements in the estimations and that the use of the d4 filter with level 6 is sufficient to obtain good decomposition of the signal. As shown in Figure 4, the decomposition of the historical series of station E1 (Figure 1) by the d4 filter managed to better smooth the precipitation signal (X) than did the other filters, showing more details of the signal with the wavelet coefficient (W), with six levels, in relation to the Haar, which incorporated four levels (W). The best filter, according to Zhang et al. (2015), is the one that captures the most decomposition characteristics of the studied series, that is, the one in which the signal is best represented in the decomposition. How-

Fig. 4. Decomposition by the MODWT, using the Haar, d4 and la14 wavelet filters, in which each W series represents the wavelet coefficients on a specific scale, and V represents the scale coefficients (panel b: d4, j = 6).

To avoid errors and bypass the BCs, it is necessary to choose adequate wavelets and sufficient input data for training and forecasting (Du et al., 2017; Ramírez-Hernández et al., 2016). Thus, in the selection of the precipitation series, three wavelet filters and three decomposition levels were tested by removing the values that interfere with the coefficients affected by the j limit and by adjusting the division of the number of data used in the calibration and validation. Of the 3,444 days of the rainy season, 190 were removed, leaving 3,254, which were further divided into 2,440 days used for calibration and 814 days used for validation. In the dry period, there were 3,496 days (of which 190 were removed), divided into 2,480 days used for calibration and 826 days used for validation. This division corresponds to approximately six and a half years for calibration and two years and two months for validation. With these divisions, it was possible to filter the data series, leaving them free of uncertainties related to BCs and adjusting adequate numbers of input data for the training and validation of the neural networks.

Neural network simulations

Two neurons in the input layer were the most favourable and recurrent in the ANN tests, while the most frequent network delay was determined to be 4 days. In the series consisting of the dry and rainy periods, at the three stations involved (Figure 1), the delays were 2, 4, 5, 6, 10 and 12 days. Another observation concerns the tendency of networks to require a smaller number of neurons in the input layer (ranging from 2 to 12) and a higher number of neurons in the hidden layer (between 10 and 35) (Table 2). This fact shows that the adopted ANN, with 10 or more hidden-layer neurons, performs better than the other ANNs when estimating daily rainfall in the TAHR.

The configurations of the networks are quite different for each station. This observation can be verified when comparing the values of the number of neurons in the input and output layers and the delay of each station in the different periods (Table 2). This difference may be explained by the uneven occurrence of rainfall in the rainy and dry periods and by the influences of different biomes on each station. Station E1 is in a region of the Amazon biome, E2 is located in a transition area between the Cerrado and Amazon biomes, and E3 is in the Cerrado (Brazilian biome). The interactions between the meteorological systems that operate in each biome and the altitude of each station reflect the behaviour of the spatial and temporal precipitation values in the basins, as observed in studies by Levy et al. (2017), Oliveira-Junior et al. (2017) and Teodoro et al. (2016). Zeri et al. (2019) investigated the interannual variability of drought-associated rainfall using the wavelet transform and found that the northern region of the state of Tocantins is influenced more by interannual variability in drought events than the surrounding regions are, indicating that some stations are affected by continuous drought events. In fact, the occurrence of rainfall is quite heterogeneous among the stations.

From the performance coefficients obtained from the data validation, it can be observed that the hybrid model made satisfactory predictions for both the ANA and CMORPH series: the MSE values were small (less than 0.88) and the Nash coefficients were close to 1 when evaluating the series of the three stations considered in the study, E1, E2 and E3 (Figure 1).
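The calibration and validation split and the two performance measures used in this section (MSE and the Nash coefficient) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the assumption that the boundary-affected values sit at the start of the series is ours (the text only states how many values were removed), and the calibration length is passed in directly because the text reports the resulting counts rather than a split rule.

```python
from statistics import fmean


def trim_and_split(series, n_boundary, n_calibration):
    """Drop boundary-affected samples, then split the rest in time order.

    Assumption: the affected samples are the first `n_boundary` values.
    Returns (calibration, validation) as two lists.
    """
    kept = list(series)[n_boundary:]
    return kept[:n_calibration], kept[n_calibration:]


def mse(observed, simulated):
    """Mean squared error between two equally long sequences."""
    return fmean([(o - s) ** 2 for o, s in zip(observed, simulated)])


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency.

    A value of 1 means a perfect fit; 0 means the simulation is no better
    than always predicting the observed mean.
    """
    observed = list(observed)
    mean_obs = fmean(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


# Day counts reported in the text for the rainy season:
# 3,444 days, 190 removed, 2,440 used for calibration.
rainy_days = range(3444)  # placeholder values, not the ANA or CMORPH data
calibration, validation = trim_and_split(rainy_days, 190, 2440)
print(len(calibration), len(validation))  # 2440 814
```

With the dry-season counts from the text (3,496 days, 190 removed, 2,480 for calibration), the same call leaves 826 days for validation, which matches the reported figure.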
The exception to the satisfactory performance was the use of the Haar wavelet (Table 2).

Table 2. MODWT-ANN model results.

Result of the MODWT-ANN model - ANA data

| Season | Wavelet | Level (j) | Filter length (L) | ANN | MSE | Nash |
|---|---|---|---|---|---|---|
| E1 - Rainy | Haar | 4 | 2 | 7,2,12,1 | 0.55 | 0.5 |
| E1 - Rainy | d4 | 6 | 4 | 7,2,12,1 | 0.15 | 0.91 |
| E1 - Rainy | d4 | 8 | 4 | 7,2,12,1 | 0.17 | 0.91 |
| E1 - Rainy | la14 | 6 | 14 | 7,2,12,1 | 0.18 | 0.91 |
| E1 - Dry | Haar | 4 | 2 | 2,4,10,1 | 0.72 | 0.56 |
| E1 - Dry | d4 | 6 | 4 | 2,4,10,1 | 0.14 | 0.83 |
| E1 - Dry | d4 | 8 | 4 | 2,4,10,1 | 0.17 | 0.83 |
| E1 - Dry | la14 | 6 | 14 | 2,4,10,1 | 0.20 | 0.82 |
| E2 - Rainy | Haar | 4 | 2 | 2,4,15,1 | 0.27 | 0.71 |
| E2 - Rainy | d4 | 6 | 4 | 2,4,15,1 | 0.14 | 0.90 |
| E2 - Rainy | d4 | 6 | 6 | 2,4,15,1 | 0.15 | 0.89 |
| E2 - Rainy | la14 | 6 | 14 | 2,4,15,1 | 0.19 | 0.88 |
| E2 - Dry | Haar | 4 | 2 | 2,4,15,1 | 0.72 | 0.57 |
| E2 - Dry | d4 | 6 | 4 | 2,4,15,1 | 0.13 | 0.90 |
| E2 - Dry | d4 | 8 | 4 | 2,4,15,1 | 0.17 | 0.88 |
| E2 - Dry | la14 | 6 | 14 | 2,4,15,1 | 0.20 | 0.89 |
| E3 - Rainy | Haar | 4 | 2 | 3,4,20,1 | 0.51 | 0.78 |
| E3 - Rainy | d4 | 6 | 4 | 3,4,20,1 | 0.17 | 0.91 |
| E3 - Rainy | d4 | 6 | 4 | 3,4,20,1 | 0.32 | 0.89 |
| E3 - Rainy | la14 | 6 | 14 | 3,4,20,1 | 0.18 | 0.89 |
| E3 - Dry | Haar | 4 | 2 | 4,6,35,1 | 0.89 | 0.56 |
| E3 - Dry | d4 | 6 | 4 | 4,6,35,1 | 0.12 | 0.81 |
| E3 - Dry | d4 | 8 | 4 | 4,6,35,1 | 0.13 | 0.80 |
| E3 - Dry | la14 | 6 | 14 | 4,6,35,1 | 0.14 | 0.79 |

Result of the MODWT-ANN model - CMORPH data

| Season | Wavelet | Level (j) | Filter length (L) | ANN | MSE | Nash |
|---|---|---|---|---|---|---|
| E1 - Rainy | Haar | 4 | 2 | 2,7,12,1 | 0.47 | 0.52 |
| E1 - Rainy | d4 | 6 | 4 | 2,7,12,1 | 0.16 | 0.94 |
| E1 - Rainy | d4 | 8 | 4 | 2,7,12,1 | 0.26 | 0.93 |
| E1 - Rainy | la14 | 6 | 14 | 2,7,12,1 | 0.18 | 0.94 |
| E1 - Dry | Haar | 4 | 2 | 2,4,10,1 | 0.72 | 0.88 |
| E1 - Dry | d4 | 6 | 4 | 2,4,10,1 | 0.18 | 0.83 |
| E1 - Dry | d4 | 8 | 4 | 2,4,10,1 | 0.21 | 0.98 |
| E1 - Dry | la14 | 6 | 14 | 2,4,10,1 | 0.21 | 0.99 |
| E2 - Rainy | Haar | 4 | 2 | 6,6,10,1 | 0.47 | 0.47 |
| E2 - Rainy | d4 | 6 | 4 | 6,6,10,1 | 0.13 | 0.95 |
| E2 - Rainy | d4 | 8 | 6 | 6,6,10,1 | 0.16 | 0.95 |
| E2 - Rainy | la14 | 6 | 14 | 6,6,10,1 | 0.18 | 0.96 |
| E2 - Dry | Haar | 4 | 2 | 2,5,12,1 | 0.81 | 0.68 |
| E2 - Dry | d4 | 6 | 4 | 2,5,12,1 | 0.13 | 0.85 |
| E2 - Dry | d4 | 8 | 4 | 2,5,12,1 | 0.14 | 0.85 |
| E2 - Dry | la14 | 6 | 14 | 2,5,12,1 | 0.16 | 0.85 |
| E3 - Rainy | Haar | 4 | 2 | 2,4,20,1 | 0.68 | 0.74 |
| E3 - Rainy | d4 | 6 | 4 | 2,4,20,1 | 0.07 | 0.95 |
| E3 - Rainy | d4 | 8 | 4 | 2,4,20,1 | 0.18 | 0.95 |
| E3 - Rainy | la14 | 6 | 14 | 2,4,20,1 | 0.14 | 0.95 |
| E3 - Dry | Haar | 4 | 2 | 4,4,25,1 | 0.88 | 0.64 |
| E3 - Dry | d4 | 6 | 4 | 4,4,25,1 | 0.14 | 0.85 |
| E3 - Dry | d4 | 8 | 4 | 4,4,25,1 | 0.16 | 0.84 |
| E3 - Dry | la14 | 6 | 14 | 4,4,25,1 | 0.18 | 0.84 |

Regarding the daily rainfall estimates, in general, the rainy period was better estimated by the hybrid model. This result may be related to the greater number of zeroes in the dry period; zeroes indicate scarce information in the samples, which may cause them to be insufficient for the neural network to learn and, consequently, insufficient for simulating future data. This issue may also be related to the interannual variation in precipitation, as the sequences of dry days require much more information from previous days than sequences of rainy days; this explanation was suggested in the work of Wilks (1999). In addition, Hellassa and Souag-Gamane (2019) demonstrated that removing noise in the precipitation series through wavelet decomposition induces a significant improvement in the daily rainfall estimations, and, depending on the adopted wavelet function, rainy periods may have a better response than dry periods. Thus, while the wavelet method decomposes the signal and incorporates the seasonal characteristics of the rainfall series, long periods of dry days need more attention in the modelling of daily rainfall. Despite this, and considering the results, the developed hybrid model estimated the daily rainfall in the evaluated locations well, but on some days, the precipitation was underestimated by the model, as evidenced in the graphs comparing the observed and estimated precipitation values (Figures 5, 6 and 7) for both the ANA and CMORPH data.

Other studies that used CMORPH data to forecast rainfall in the TAHR obtained good results with the introduction of satellite data, such as those reported by Falck et al. (2015) and Germano et al. (2017). In the study by Germano et al. (2017), the CMORPH series, when inserted in the forecast model (stochastic - SREM2D), overestimated the daily rainfall in the region. Thus, the input data may respond differently in each model. However, underestimation of daily rainfall was not observed in the literature for the analysed region.

Statistical parameters of the observed and simulated data

The statistical parameters of the observed and simulated data from ANA and CMORPH are shown in Table 3. Good correlation is obtained between the observed and simulated data, with r values ranging between 0.81 and 0.96, with the exception of the data from the E2 station in the dry season (r = 0.64). The total numbers of dry days in the simulated series are all below those observed. This result arises because the ANN model approximates the zero values rather than reproducing them, so the simulated series contain more rainy days (values above zero) than the observed series. The rainfall gauge station with the most rainy days is E1, with 496 and 568 days, and the station with the fewest days, E3, recorded 104 and 153 days, according to the ANA and CMORPH data, respectively.

Table 3. Observed and simulated series statistics.

Rainfall gauge stations with ANA data

| Station | Data | Dry days | Rainy days | µ | ẟ | Max | r | Bias |
|---|---|---|---|---|---|---|---|---|
| E1 - Rainy | Obs | 318 | 496 | 8.7 | 14.9 | 100 | 0.96 | 1.7 |
| E1 - Rainy | Sim | 180 | 633 | 8.0 | 13.9 | 99 | | |
| E1 - Dry | Obs | 584 | 242 | 2.9 | 8.3 | 120 | 0.81 | 2.05 |
| E1 - Dry | Sim | 442 | 384 | 2.9 | 6.9 | 100 | | |
| E2 - Rainy | Obs | 368 | 444 | 7.2 | 15.2 | 123 | 0.97 | 0.74 |
| E2 - Rainy | Sim | 58 | 756 | 6.6 | 14.1 | 119 | | |
| E2 - Dry | Obs | 706 | 121 | 1.6 | 6.9 | 82 | 0.92 | 1.82 |
| E2 - Dry | Sim | 554 | 272 | 1.4 | 5.8 | 81 | | |
| E3 - Rainy | Obs | 437 | 377 | 7.4 | 14.9 | 119 | 0.91 | 0.53 |
| E3 - Rainy | Sim | 125 | 689 | 7.2 | 13.2 | 119 | | |
| E3 - Dry | Obs | 722 | 104 | 1.1 | 4.5 | 34 | 0.92 | 3.5 |
| E3 - Dry | Sim | 603 | 223 | 0.9 | 3.3 | 30 | | |

Rainfall gauge stations with CMORPH data

| Station | Data | Dry days | Rainy days | µ | ẟ | Max | r | Bias |
|---|---|---|---|---|---|---|---|---|
| E1 - Rainy | Obs | 246 | 568 | 6.8 | 11.2 | 89 | 0.96 | 0.86 |
| E1 - Rainy | Sim | 150 | 664 | 6.0 | 10.5 | 86 | | |
| E1 - Dry | Obs | 534 | 292 | 2.0 | 4.9 | 39 | 0.89 | 1.5 |
| E1 - Dry | Sim | 300 | 526 | 1.6 | 4.1 | 37 | | |
| E2 - Rainy | Obs | 223 | 591 | 6.4 | 10.6 | 74 | 0.93 | 2.0 |
| E2 - Rainy | Sim | 64 | 750 | 5.2 | 9.5 | 72 | | |
| E2 - Dry | Obs | 664 | 162 | 1.3 | 5.1 | 75 | 0.64 | 1.5 |
| E2 - Dry | Sim | 307 | 519 | 1.9 | 4.8 | 73 | | |
| E3 - Rainy | Obs | 296 | 518 | 4.7 | 8.6 | 74 | 0.90 | 0.49 |
| E3 - Rainy | Sim | 73 | 741 | 4.5 | 7.1 | 69 | | |
| E3 - Dry | Obs | 673 | 153 | 0.9 | 4.2 | 56 | 0.85 | 3.2 |
| E3 - Dry | Sim | 570 | 256 | 0.7 | 3.6 | 50 | | |

r - correlation coefficient; ẟ - standard deviation; µ - average; Max - maximum. The r and Bias values apply to each pair of observed (Obs) and simulated (Sim) series.

The means, standard deviations and maximum values of the observed series were higher than those of the simulated series, which is related to underestimation by the model, as confirmed in Figures 5, 6 and 7 and in Table 3. It is important to highlight that, in the dry period, the ANA series, despite having higher averages (indicating more rainy days than did the CMORPH series), presented an equal Nash coefficient for rainfall gauge station E1 (0.83) and a lower Nash coefficient for station E3 (0.81 versus 0.85) compared to the CMORPH data (Table 2). In addition, the simulations using ANA data for the dry period showed greater bias errors, of 2.05, 1.82 and 3.5 at stations E1, E2 and E3, respectively. In the dry period, using the CMORPH data, the Bias values were equal to 1.5 (E1), 1.5 (E2) and 3.2 (E3) (Table 3). Therefore, the ANA series did not provide an advantage over the CMORPH data in the dry period.

In the rainfall estimate for station E1, the Nash value was closer to 1.0 for the rainy season than it was for the dry season for both the ANA and CMORPH series (Figure 5). The results obtained for the rainfall gauge station E2 were similar to those obtained for station E1, but the differences in the Nash values were greater between the dry and rainy periods for both data series (ANA and CMORPH). However, the simulations performed well, with Nash values between 0.72 and 0.95, when considering data from ANA and CMORPH (Figure 6). At rainfall gauge station E3, the highest Nash value occurred in the rainy season (ANA series), with an increase of 0.1 between the rainy and dry periods, indicating that the proposed ANN had a better response for the rainy season than it did for the dry season (Figure 7).

Fig. 5. Graph of the observed and estimated precipitation values and the dispersion for the rainfall gauge station E1. Panels: a, b) rainy period, CMORPH; c, d) dry period, CMORPH; e, f) rainy period, ANA; g, h) dry period, ANA.

Fig. 6. Graph of the observed and estimated precipitation values and the dispersion for the rainfall gauge station E2. Panels: a, b) rainy period, CMORPH; c, d) dry period, CMORPH; e, f) rainy period, ANA; g, h) dry period, ANA.

Fig. 7. Graph of the observed and estimated precipitation values and the dispersion for the rainfall gauge station E3. Panels: a, b) rainy period, CMORPH; c, d) dry period, CMORPH; e, f) rainy period, ANA; g, h) dry period, ANA.

The behaviour of precipitation in the TAHR is quite diverse, as demonstrated by Gomes et al. (2018), who identified three homogeneous precipitation regions. The models proposed here were applied to each of these regions to account for the regional differences. The results obtained were well diversified and confirmed mainly by the structure of the ANN, which required a much larger number of neurons (25 and 35) in the hidden layer (c) at the E3 station, which is located in the southern part of the TAHR, and lower numbers of neurons in the central and northern areas (Table 2). Chau and Wu (2010), making daily rainfall forecasts with ANNs, also pointed out that a greater or lesser frequency of precipitation directly affected the performance of the ANNs. This relationship between the behaviour of the daily precipitation and the network structure shows the importance of prior knowledge of the conditions of the studied variable, especially in places with large dimensions such as the TAHR. In addition, although we used only three rainfall gauge stations and one type of ANN, when considering model limitations, our study had the advantage of incorporating the regional rainfall characteristics of homogeneous regions that were represented by each of the three rainfall gauge stations distributed in the area of the TAHR.

Seasonal precipitation indexes

The rainfall indexes of stations E1, E2 and E3 were well-defined in space and time. In relation to this characteristic, we observed that the biggest errors resulting from bias occurred at station E3 (3.5 and 3.2), which had the lowest rainfall index, both for ANA and CMORPH data (Table 3).

Although the rainy period was better simulated with biases between 0.49 and 1.7, the extreme values in the three homogeneous regions were not as well estimated. This result can be explained by the fact that this type of occurrence is influenced by atmospheric events, which are not considered by the model; the model is based only on the observed historical series. According to Nerantzaki and Papalexiou (2019), the lower predictability of extreme precipitation events is still a challenge in the literature, and specific methods are required for the modelling of extreme events.

Regarding the simulated forecasts for the seasonal periods (each spanning six months), we highlighted the possibility of their use in long-term planning, mainly for activities that require precipitation to be established in advance, such as agriculture, water supply and energy generation. There are still adjustments to be considered, such as testing at other points of the TAHR, the use of different types of ANN, and introducing other climatic variables.

The performances of the models previously adjusted for each rainfall gauge station were also evaluated by the simulation of 1,460 daily rainfall data, corresponding to the years 2013 to 2016. With these simulations, the model's ability to estimate the daily seasonal averages and the maximum length of the dry and rainy periods was evaluated. The quantile-quantile graph was also used to verify whether the distribution of the generated synthetic series was adjusted to the observed data series. This approach was intended to verify whether the model could generate daily rainfall data predictions and the maximum number of consecutive days with or without rainfall; these predictions are crucial for the management of water demands. The metrics are plotted for all observations and simulations for each rainfall gauge station. As the model already considers seasonality and evaluates the rainy and dry periods of the studied region separately, the graphs are organized in this way.

Fig. 8. Daily and seasonal averages (µ) observed and simulated for each rainfall gauge station: E1 (a, b), E2 (c, d) and E3 (e, f). Panels a, c and e show the rainy season; panels b, d and f show the dry season.

The daily averages of each month of the dry and rainy periods and the seasonal averages of each year are organized in the graphs in Figure 8. Thus, for the average daily precipitation, the reproduced quantities showed percentage errors between the simulated and observed values of 13%, 9% and 9% in the rainy season and 14%, 10% and 11% in the dry season for the rainfall gauge stations E1, E2 and E3, respectively. For the seasonal averages, the differences were 8%, 7% and 4% in the rainy season and 9%, 12% and 10% in the dry season for the rainfall gauge stations E1, E2 and E3, respectively. These results show that the maximum daily and seasonal average rainfall amounts were underestimated for all rainfall gauge stations in the dry season and were better simulated in the rainy season (Figure 8). This result reinforces the idea defended by Wilks (1989) that the predictions of daily models depend on the total amount of precipitation in a month and on the monthly and seasonal transitions. Therefore, higher amounts of monthly rainfall, which result from higher numbers of rainy days, promote better results. This average quantification of rainfall helps in the planning of forest management, as rainfall volumes far above or below a certain value can favour or hinder the development of vegetation (Bonal et al., 2016).

Figure 9 shows the maximum number of rainy and dry days, the maximum amount of precipitation occurring in the dry and rainy periods, and the driest and rainiest months observed and simulated between 2013 and 2016. The model overestimated the number of rainy days and underestimated the number of dry days. However, there was no discrepancy between the maximum rainfall values; thus, the model had good performance when simulating extreme rainfall. Among the wet months (NDJFMA), the months in which the most rainfall was recorded were February (E1 and E3) and January (E2). Among the dry months (MJJASO), July and August presented greater droughts than did the other months, mainly at the rainfall gauge stations E2 and E3, which registered rainfall amounts less than 2 mm, and the maximum number of consecutive dry days was above 80.

Fig. 9. Maximum length of dry or rainy periods and maximum rainfall observed and simulated in the rainy and dry seasons for each rainfall gauge station: E1 (a, b), E2 (c, d) and E3 (e, f).

The variability of rainfall in the Amazon region is quite dynamic, and according to Wang et al. (2018), who evaluated the behaviour of precipitation in South America in the rainy period (DJFMAM), the total precipitation can vary greatly in the months from December to May. Reichle et al. (2017) indicated that the precipitation bias is strong in many tropical and subtropical areas, especially in South America, during December, January and February (DJF). In addition, in the studies by Gloor et al. (2013), Espinoza et al. (2016) and Latrubesse et al. (2017), there are increasing trends towards the occurrence of extreme seasonal droughts and floods in this region. Therefore, the current work, even if in a simplified way, can assist in future estimates of prolonged droughts and intense rains. These estimates are important for activities such as agribusiness, water supply and energy generation.

Figure 10 shows the quantile-quantile graphs for each rainfall gauge station in the different periods (dry and rainy). According to Gnanadesikan (2011), the closer the points are to the diagonal line, the closer the generated data is to the theoretical distribution and the better the model performance is. Thus, from the graphical results, it is evident that the model better simulates the rainy periods than the dry periods; in the rainy periods, the simulated and observed distributions fit the line better than in the dry periods (Figure 10a, 10c and 10e). This graphical approach confirms what had already been perceived by the previous analyses in this study regarding the inferior performance of the MODWT-ANN model for the dry period.

Fig. 10. Quantile-quantile graphs of the observations and simulations in the wet and dry periods for the stations E1 (a, b), E2 (c, d) and E3 (e, f).

CONCLUSIONS

The performance of the MODWT-ANN model was evaluated for daily precipitation forecasts using different databases. The model presented better results when the Daubechies filter (d4) was applied in the decomposition of the precipitation series and the model was coupled to the ANN, mainly for the rainy season series. For the application of this technique, the input information is crucial for the model performance and must be tested numerous times, and the comparison of the observed and estimated values allows a good simulation to be confirmed. For the daily series, more than nine neurons in the hidden layer are more favourable in building the ANN. The initial input amounts and the network delays are also determining factors in the model responses. Considering the data types, both the ANA and CMORPH series performed well when estimating daily rainfall. However, regarding the CMORPH series, the simulation of the rainy period was better than that of the dry period.

The results show the benefits of combining wavelet decomposition with an ANN when estimating daily rainfall. Moreover, the good reproduction of the daily rainfall averages and of the lengths of consecutive dry and wet days on seasonal scales shows the potential for the model to be used in the control of floods and droughts, in the quantification of favourable rainfall conditions for agriculture and in the supply of water resources. The good results of this hybrid approach stimulate the development of new research on this subject. For instance, one way to improve the results achieved is to test other types of ANNs and to incorporate variables such as temperature and humidity into the developed model. These variables also influence the production of rainfall and were not analysed in the present study.

Acknowledgements. The authors thank the ANA and NOAA for the available precipitation data. The authors would also like to thank the Coordination for the Improvement of Higher Education Personnel of Brazil (CAPES), Finance Code 001. The second author would like to thank CNPq for funding the research with a productivity grant (Process 303542/2018-7). We would also like to thank the office for research (PROPESP) and Foundation for Research Development (FADESP) of the Federal University of Pará through grant no. PAPQ 2019 and 2020.

REFERENCES

Addison, P.S., Murray, K.B., Watson, J.N., 2001. Wavelet transform analysis of open channel wake flows. Journal of Engineering Mechanics, 127, 58–70.
Bašta, M., 2014. Additive decomposition and boundary conditions in wavelet-based forecasting approaches. Acta Oeconomica Pragensia, 2014, 48–70.
Bonal, D., Burban, B., Stahl, C., Wagner, F., Hérault, B., 2016. The response of tropical rainforests to drought - lessons from recent research and future prospects. Annals of Forest Science, 73, 27–44. DOI: 10.1007/s13595-015-0522-5
Chai, T., Draxler, R.R., 2014. Root mean square error (RMSE) or mean absolute error (MAE)? – Arguments against avoiding RMSE in the literature. Geoscientific Model Development, 7, 1247–1250. DOI: 10.5194/gmd-7-1247-2014
Chau, K.W., Wu, C.L., 2010. A hybrid model coupled with singular spectrum analysis for daily rainfall prediction. Journal of Hydroinformatics, 12, 458–473. DOI: 10.2166/hydro.2010.032
Cuo, L., Pagano, T.C., Wang, Q.J., 2011. A review of quantitative precipitation forecasts and their use in short-to-medium streamflow forecasting. Journal of Hydrometeorology, 12, 713–728.
Daubechies, I., 1992. Ten Lectures on Wavelets. Society for Industrial and Applied Mathematics, Philadelphia. https://doi.org/10.1137/1.9781611970104
dos Santos, T.S., Mendes, D., Torres, R.R., 2016. Artificial neural networks and multiple linear regression model using principal components to estimate rainfall over South America. Nonlinear Processes in Geophysics, 23, 13–20. DOI: 10.5194/npg-23-13-2016
Du, K., Zhao, Y., Lei, J., 2017. The incorrect usage of singular spectral analysis and discrete wavelet transform in hybrid models to predict hydrological time series. J. Hydrol., 552, 44–51. DOI: 10.1016/j.jhydrol.2017.06.019
Espinoza, J.C., Segura, H., Ronchail, J., Drapeau, G., Gutierrez-Cori, O., 2016. Evolution of wet-day and dry-day frequency in the western Amazon basin: Relationship with atmospheric circulation and impacts on vegetation. Water Resources Research, 52, 8546–8560. https://doi.org/10.1002/2016WR019305
Fahimi, F., Yaseen, Z.M., El-shafie, A., 2017. Application of soft computing based hybrid models in hydrological variables modeling: a comprehensive review. Theoretical and Applied Climatology, 128, 875–903. https://doi.org/10.1007/s00704-016-1735-8
Falck, A.S., Maggioni, V., Tomasella, J., Vila, D.A., Diniz, F.L., 2015. Propagation of satellite precipitation uncertainties through a distributed hydrologic model: A case study in the Tocantins–Araguaia basin in Brazil. Journal of Hydrology, 527, 943–957. http://dx.doi.org/10.1016/j.jhydrol.2015.05.042
Frumau, K.A., Bruijnzeel, L.A., Tobón, C., 2011. Precipitation measurement and derivation of precipitation inclination in a windy mountainous area in northern Costa Rica. Hydrological Processes, 25, 499–509. https://doi.org/10.1002/hyp.7860
Germano, M.F., Vitorino, M.I., Cohen, J.C.P., Costa, G.B., Souto, J.I.D.O., Rebelo, M.T.C., de Sousa, A.M.L., 2017. Analysis of the breeze circulations in Eastern Amazon: an observational study. Atmospheric Science Letters, 18, 67–75. https://doi.org/10.1002/asl.726
Gloor, M.R.J.W., Brienen, R.J., Galbraith, D., Feldpausch, T.R., Schöngart, J., Guyot, J.L., Phillips, O.L., 2013. Intensification of the Amazon hydrological cycle over the last two decades. Geophysical Research Letters, 40, 1729–1733. https://doi.org/10.1002/grl.50377
Gnanadesikan, R., 2011. Methods for Statistical Data Analysis of Multivariate Observations. John Wiley & Sons. DOI: 10.1002/9781118032671
Golding, B.W., 2014. Regional prediction models. In: North, G., Pyle, J., Zhang, F. (Eds.): Encyclopedia of Atmospheric Sciences. 2nd Edition. Academic Press, p. 2008.
Gomes, E.P., Blanco, C.J.C., Pessoa, F.C.L., 2018. Regionalization of precipitation with determination of homogeneous regions via fuzzy c-means. Revista Brasileira de Recursos Hídricos, 23. https://doi.org/10.1590/2318-0331.231820180079
Guimarães Santos, C.A., Silva, G.B.L.D., 2014. Daily streamflow forecasting using a wavelet transform and artificial neural network hybrid models. Hydrological Sciences Journal, 59, 312–324. http://dx.doi.org/10.1080/02626667.2013.800944
Gupta, A., Kamble, T., Machiwal, D., 2017. Comparison of ordinary and Bayesian kriging techniques in depicting rainfall variability in arid and semi-arid regions of north-west India. Environmental Earth Sciences, 76, 512. https://doi.org/10.1007/s12665-017-6814-3
He, X., Guan, H., Qin, J., 2015. A hybrid wavelet neural network model with mutual information and particle swarm optimization for forecasting monthly rainfall. Journal of Hydrology, 527, 88–100. http://dx.doi.org/10.1016/j.jhydrol.2015.04.047
Hellassa, S., Souag-Gamane, D., 2019. Improving a stochastic multi-site generation model of daily rainfall using discrete wavelet de-noising: a case study to a semi-arid region. Arabian Journal of Geosciences, 12, 53. https://doi.org/10.1007/s12517-018-4168-0
Holdefer, A.E., Severo, D.L., 2015. Análise por ondaletas sobre níveis de rios submetidos à influência de maré. Revista Brasileira de Recursos Hídricos, 20, 192–201. DOI: 10.21168/rbrh.v20n1.p192-201
IBGE – INSTITUTO BRASILEIRO DE GEOGRAFIA E ESTATÍSTICA. Cobertura do uso da terra do Brasil (Land use coverage in Brazil). Rio de Janeiro: IBGE, 2014. Available from: https://www.ibge.gov.br/geocienciasnovoportal/informacoes-ambientais/cobertura-e-uso-da-terra (accessed in 13 Sept. 2017)
Kisi, O., Cimen, M., 2011. A wavelet-support vector machine conjunction model for monthly streamflow forecasting. Journal of Hydrology, 399, 132–140.
Kisi, O., Shiri, J., 2011. Precipitation forecasting using wavelet-genetic programming and wavelet-neuro-fuzzy conjunction models. Water Resources Management, 25, 3135–3152. https://doi.org/10.1007/s11269-011-9849-3
Kuo, C.C., Gan, T.Y., Yu, P.-S., 2010. Wavelet analysis on the variability, teleconnectivity, and predictability of the seasonal rainfall of Taiwan. Monthly Weather Review, 138, 162–175.
Lang, K.J., Hinton, G.E., 1988. The development of the time-delay neural network architecture for speech recognition. Technical Report CMU-CS-88-152.
Latrubesse, E.M., Arima, E.Y., Dunne, T., Park, E., Baker, V.R., d’Horta, F.M., Ribas, C.C., 2017. Damming the rivers of the Amazon basin. Nature, 546, 363–369. https://doi.org/10.1038/nature22333
Levy, M.C., Cohn, A., Lopes, A.V., Thompson, S.E., 2017. Addressing rainfall data selection uncertainty using connections between rainfall and streamflow. Scientific Reports, 7, 219. DOI: 10.1038/s41598-017-00128-5
Maheswaran, R., Khosa, R., 2012. Comparative study of different wavelets for hydrologic forecasting. Computers & Geosciences, 46, 284–295. https://doi.org/10.1016/j.cageo.2011.12.015
Mallat, S., 2009. A Wavelet Tour of Signal Processing. Academic Press, 832 p. https://doi.org/10.1016/B978-0-12-374370-1.X0001-8
Mehr, A.D., Kahya, E., Bagheri, F., Deliktas, E., 2014. Successive-station monthly streamflow prediction using neuro-wavelet technique. Earth Science Informatics, 7, 217–229. DOI: 10.1007/s12145-013-0141-3
Nash, J.E., Sutcliffe, J.V., 1970. River flow forecasting through conceptual models part I – A discussion of principles. Journal of Hydrology, 10, 282–290. http://doi.org/10.1016/0022-1694(70)90255-6
Nerantzaki, S.D., Papalexiou, S.M., 2019. Tails of extremes: Advancing a graphical method and harnessing big data to assess precipitation extremes. Advances in Water Resources, 134, Article Number: 103448.
Nourani, V., Baghanam, A.H., Adamowski, J., Kisi, O., 2014. Applications of hybrid wavelet–artificial intelligence models in hydrology: a review. Journal of Hydrology, 514, 358–377. https://doi.org/10.1016/j.jhydrol.2014.03.057
Nourani, V., Andalib, G., Sadikoglu, F., 2017. Multi-station streamflow forecasting using wavelet denoising and artificial intelligence models. Procedia Computer Science, 120, 617–624. DOI: 10.1016/j.procs.2017.11.287
Oliveira-Junior, J.F.D., Xavier, F.M.G., Teodoro, P.E., Gois, G.D., Delgado, R.C., 2017. Cluster analysis identified rainfall homogeneous regions in Tocantins State, Brazil. Bioscience Journal, 33, 333–340. https://doi.org/10.14393/BJ-v33n2-32739
Osborn, T.J., Wallace, C.J., Harris, I.C., Melvin, T.M., 2016. Pattern scaling using ClimGen: monthly-resolution future climate scenarios including changes in the variability of precipitation. Climatic Change, 134, 353–369. https://doi.org/10.1007/s10584-015-1509-9
Partal, T., Kişi, Ö., 2007. Wavelet and neuro-fuzzy conjunction model for precipitation forecasting. Journal of Hydrology, 342, 199–212. https://doi.org/10.1016/j.jhydrol.2007.05.026
Partal, T., Cigizoglu, H.K., 2009. Prediction of daily precipitation using wavelet-neural networks. Hydrological Sciences Journal, 54:2, 234–246. DOI: 10.1623/hysj.54.2.234
Partal, T., Cigizoglu, H.K., Kahya, E., 2015. Daily precipitation predictions using three different wavelet neural network algorithms by meteorological data. Stochastic Environmental Research and Risk Assessment, 29, 1317–1329. https://doi.org/10.1007/s00477-015-1061-1
Percival, D.B., Walden, A.T., 2000. Wavelet methods for time series analysis. Cambridge Series in Statistical and Probabilistic Mathematics. 1st ed. Cambridge University Press, Cambridge.
Ramana, R.V., Krishna, B., Kumar, S.R., Pandey, N.G., 2013. Monthly rainfall prediction using wavelet neural network analysis. Water Resources Management, 27, 3697–3711. https://doi.org/10.1007/s11269-013-0374-4
Ramírez-Hernández, J., Infante-Prieto, S.O., Villa-Angulo, R., Hallack-Alegría, M., 2016. La influencia del efecto de borde en el pronóstico de precipitaciones utilizando DWT diádica, MODWT, ANN y ANFIS. Tecnología y ciencias del agua, 73, 93–113.
Reichle, R.H., Liu, Q., Koster, R.D., Draper, C.S., Mahanama, S.P., Partyka, G.S., 2017. Land surface precipitation in MERRA-2. Journal of Climate, 30, 1643–1664. https://doi.org/10.1175/JCLI-D-16-0570.1
Rivera, D., Lillo, M., Uvo, C.B., Billib, M., Arumí, J.L., 2012. Forecasting monthly precipitation in Central Chile: a self-organizing map approach using filtered sea surface temperature. Theoretical and Applied Climatology, 107, 1–13. https://doi.org/10.1007/s00704-011-0453-5
Sang, Y.F., 2012. A practical guide to discrete wavelet decomposition of hydrologic time series. Water Resources Management, 26, 3345–3365. https://doi.org/10.1007/s11269-012-0075-4
Santos, C.A., Freire, P.K., Silva, R.M.D., Akrami, S.A., 2019. Hybrid wavelet neural network approach for daily inflow forecasting using Tropical Rainfall Measuring Mission data. Journal of Hydrologic Engineering, 24, Article Number: 04018062. https://doi.org/10.1061/(ASCE)HE.1943-5584.0001725
Shoaib, M., Shamseldin, A.Y., Melville, B.W., Khan, M.M., 2016. A comparison between wavelet based static and dynamic neural network approaches for runoff prediction. Journal of Hydrology, 535, 211–225. http://dx.doi.org/10.1016/j.jhydrol.2016.01.076
Siad, S.M., Iacobellis, V., Zdruli, P., Gioia, A., Stavi, I., Hoogenboom, G., 2019. A review of coupled hydrologic and crop growth models. Agricultural Water Management, 224, Article Number: 105746.
Silva, I.D., Spatti, D.H., Flauzino, R.A., 2010. Redes neurais artificiais para engenharia e ciências aplicadas. Artliber, São Paulo, Brasil, 646 p.
Sulaiman, S.O., Shiri, J., Shiralizadeh, H., Kisi, O., Yaseen, Z.M., 2018. Precipitation pattern modeling using cross-station perception: regional investigation. Environmental Earth Sciences, 77, 709. https://doi.org/10.1007/s12665-018-7898-0
Tealab, A., Hefny, H., Badr, A., 2017. Forecasting of nonlinear time series using ANN. Future Computing and Informatics Journal, 2, 39–47. https://doi.org/10.1016/j.fcij.2017.05.001
Teodoro, P.E., de Oliveira-Júnior, J.F., Da Cunha, E.R., Correa, C.C.G., Torres, F.E., Bacani, V.M., Ribeiro, L.P., 2016. Cluster analysis applied to the spatial and temporal variability of monthly rainfall in Mato Grosso do Sul State, Brazil. Meteorology and Atmospheric Physics, 128, 197–209. DOI: 10.1007/s00703-015-0408-y
Wang, X.Y., Li, X., Zhu, J., Tanajura, C.A., 2018. The strengthening of Amazonian precipitation during the wet season driven by tropical sea surface temperature forcing. Environmental Research Letters, 13, Article Number: 094015. https://doi.org/10.1088/1748-9326/aadbb9
Wilks, D.S., 1989. Conditioning stochastic daily precipitation models on total monthly precipitation. Water Resources Research, 25, 1429–1439. https://doi.org/10.1029/WR025i006p01429
Wilks, D.S., 1999. Interannual variability and extreme-value characteristics of several stochastic daily precipitation models. Agricultural and Forest Meteorology, 93, 153–169. https://doi.org/10.1016/S0168-1923(98)00125-7
Yaseen, Z.M., Jaafar, O., Deo, R.C., Kisi, O., Adamowski, J., Quilty, J., El-Shafie, A., 2016. Stream-flow forecasting using extreme learning machines: A case study in a semi-arid region in Iraq. Journal of Hydrology, 542, 603–614. http://dx.doi.org/10.1016/j.jhydrol.2016.09.035
Zhang, X., Peng, Y., Zhang, C., Wang, B., 2015. Are hybrid models integrated with data preprocessing techniques suitable for monthly streamflow forecasting? Some experiment evidences. J. Hydrol., 530, 137–152. http://dx.doi.org/10.1016/j.jhydrol.2015.09.047
Zeri, M., Cunha-Zeri, G., Gois, G., Lyra, G.B., Oliveira-Júnior, J.F., 2019. Exposure assessment of rainfall to interannual variability using the wavelet transform. International Journal of Climatology, 39, 568–578. https://doi.org/10.1002/joc.5812

Received 15 January 2020
Accepted 13 November 2020

Journal of Hydrology and Hydromechanics, de Gruyter

Daily rainfall estimates considering seasonality from a MODWT-ANN hybrid model



Publisher
de Gruyter
Copyright
© 2021 Evanice Pinheiro Gomes et al., published by Sciendo
ISSN
0042-790X
eISSN
0042-790X
DOI
10.2478/johh-2020-0043

Abstract

J. Hydrol. Hydromech., 69, 2021, 1, 13–28
DOI: 10.2478/johh-2020-0043
ISSN 1338-4333
©2021. This is an open access article distributed under the Creative Commons Attribution NonCommercial-NoDerivatives 4.0 License.

Daily rainfall estimates considering seasonality from a MODWT-ANN hybrid model

Evanice Pinheiro Gomes (1), Claudio José Cavalcante Blanco (2*)

(1) Civil Engineering Graduate Program, Federal University of Pará – PPGEC/ITEC/UFPA, Av. Augusto Corrêa, 01, 66075-110, Belém, Brazil.
(2) School of Environmental and Sanitary Engineering, Federal University of Pará – FAESA/ITEC/UFPA, Av. Augusto Corrêa, 01, 66075-110, Belém, Brazil.
* Corresponding author. Tel.: +55 91 3201-8859. E-mail: blanco@ufpa.br

Abstract: Analyses based on precipitation data may be limited by the quality of the data, the size of the available historical series and the efficiency of the adopted methodologies; these factors are especially limiting when conducting analyses at the daily scale. Thus, methodologies are sought to overcome these barriers. The objective of this work is to develop a hybrid model through the maximum overlap discrete wavelet transform (MODWT) to estimate daily rainfall in homogeneous regions of the Tocantins-Araguaia Hydrographic Region (TAHR) in the Amazon (Brazil). Data series from the Climate Prediction Center morphing (CMORPH) satellite products and rainfall data from the National Water Agency (ANA) were divided into seasonal periods (dry and rainy), which were adopted to train the model and for model forecasting. The results show that the hybrid model performed well when forecasting daily rainfall using both databases, as indicated by the Nash–Sutcliffe efficiency coefficients (0.81–0.95); thus, the hybrid model is considered to be potentially useful for modelling daily rainfall.

Keywords: Artificial Intelligence; Climate Prediction Center morphing; Dry and rainy periods; Amazon.

INTRODUCTION

Analyses based on available precipitation data may be limited by the quality of the data, the size of the available historical series and the efficiency of the adopted methodologies. Daily precipitation forecasting models offer an alternative for overcoming these limitations. Models used for forecasting variables, such as precipitation, can be classified as conceptual or empirical. Conceptual models are based on equations representing the physical processes and evolution of atmospheric phenomena that make up the climate system and may include the atmosphere, hydrosphere, biosphere and geosphere; for example, global circulation models are conceptual models (Cuo et al., 2011; Golding, 2014). Empirical models make mathematical adjustments of calculated values to observed values without relating the physical behaviour of hydrological processes (Siad et al., 2019). Examples of these models are autoregressive models (ARs), moving averages (MAs), autoregressive moving averages (ARMAs), artificial neural networks (ANNs), decomposition wavelets (DWs), support vector regression (SVR), etc.

The implementation of these daily precipitation forecasting models presents difficulties due to the large number of variables (climatic and geomorphological) that influence precipitation trends and the lack of understanding about the role of precipitation trends in controlling the distribution of the variables (Frumau et al., 2011; Gupta et al., 2017; Osborn et al., 2016). Therefore, due to the complexity of this problem, models incorporating artificial intelligence (AI) methods are potentially useful approaches for simulating the precipitation process (Fahimi et al., 2017; Nourani et al., 2014). According to Sulaiman et al. (2018), this utility is due to the remarkable flexibility of AI methods in modelling highly nonlinear systems and stochastic patterns, and these methods do not require prior knowledge of the behaviour of measurement processes. In fact, these methods rely primarily on information that is derived from existing hydrological and weather series sets and use a “black box” approach to simulate underlying processes.

According to Tealab et al. (2017), the dynamic behaviour of most time series and their autoregressive moving averages create the challenge of predicting nonlinear time series that contain inherited moving average terms using AI methodologies such as neural networks. They also emphasized the importance of formulating new neural network models with deep learning and hybrid methodologies. In attempts to improve the prediction of data, such as precipitation data, the combination of models has proven to be very advantageous. The use of DWs to filter ANN input data is a good example of this combination. One of the first studies on this modelling technique was carried out by Partal and Kisi (2007). They established a predictive model integrating DWs and neuro-fuzzy techniques with daily precipitation data in the Mediterranean region; this hybrid is considered a solid basis for modelling precipitation processes in the region. Kuo et al. (2010) investigated the seasonal predictability of rainfall using the DW-ANN approach, using seasonal data on rainfall and Pacific Ocean sea surface temperatures (SSTs) and revealing a high coherence of DWs and ANNs in predicting the variables studied.

There are several studies in the literature on the application of hybrid models to forecast daily precipitation. Partal and Cigizoglu (2009) decomposed time series of meteorological variables using DWs and combined them with ANNs (feed-forward back-propagation, FFBP). The authors estimated daily rainfall at various points in Turkey. They achieved the best results with the combination of DWs and ANNs when using ten levels of decomposition. Kisi and Shiri (2011) adopted genetic programming (GP) and neuro-fuzzy (NF) techniques, isolated and combined with DWs, without informing the adopted mother wavelet, using only three levels of decomposition to forecast daily precipitation in the Aegean region of western Turkey. The authors found that the models combined with wavelets improved the forecast in comparison with the models used in isolation. Kisi and Cimen (2011) investigated the accuracy of the DW-hybrid model and of support vector machines (SVRs) for monthly flow forecasting in the eastern Black Sea region. The results showed that the hybrid models with DWs and SVRs improved the flow forecast at the studied stations, reducing the average absolute errors by up to 46% when compared with the single SVR model, considering only three levels of wavelet decomposition. Partal et al. (2015) used three types of ANNs (feed-forward back-propagation, FFBP; radial basis function, RBF; and generalized regression neural network, GRNN) integrated with DWs to forecast daily precipitation at five points in the territory of Turkey. They decomposed the meteorological time series, using the DWs, into ten levels with the Haar mother wavelet and concluded that the DW model with the FFBP was the most suitable for estimating precipitation in the studied locations. He et al. (2015) and Ramana et al. (2013) also used DWs with ANNs to forecast precipitation. The authors adopted a monthly rainfall scale and different DWs, with the first adopting the maximum overlapping discrete wavelet transform (MODWT) in addition to other methodologies with great computational requirements. The second model used only the conventional DW. However, neither considered the boundary conditions of the DWs.

DWs, when used alone, can study the frequency of a signal, making it possible to evaluate the behaviour of a time series. In this sense, Zeri et al. (2019) used DWs to estimate the variance of different frequencies present in a precipitation signal using a rainfall dataset from the state of Tocantins in Central Brazil. Their results showed that the northern region of the study area was under greater exposure to interannual variability in precipitation. The main advantages of using DWs in the various studies mentioned are that DWs filter the input data (removing noise), decompose the data at various levels, and improve the inputs in the proposed ANN model. However, none of the studies mentioned considered the boundary conditions of DWs, and previous studies rarely used the MODWT, which considers the boundary conditions and reduces error in the estimates.

Despite generating good results, there is still a gap regarding the use of wavelet functions and the levels of decomposition; most research is not concerned with examining different families of available mother wavelets, which can improve the decomposition of the adopted time series. However, it is a popular technique to use a combination of DWs and ANNs in the estimation of variables, including precipitation. The arrangement and proper interpretation of the estimates are not always carefully examined; specifically, authors of previous work do not consider the boundary conditions (which give rise to errors) that conventional DWs involve. This lack of consideration has generated estimates not consistent with reality (Du et al., 2017; Zhang et al., 2015).

Thus, this study contributes to filling these gaps by evaluating the use of four families of wavelets, different levels of decomposition and boundary conditions that involve DWs. The objective is the development of a hybrid model involving the MODWT and ANNs (MODWT-ANN) that considers the boundary conditions in the decomposition of the historical series. Such a hybridization configuration is still little explored for daily rainfall estimates, a fact that motivates this work. In addition, different rainfall databases are adopted, such as the point observations of the Brazilian Water Agency (ANA) and those of the Climate Prediction Center morphing (CMORPH) products, allowing the models to be applied even in the absence of point rainfall monitoring by stations, since CMORPH data have the advantage of providing both spatialized information and point precipitation. Therefore, the use of this hybrid approach is novel, as it does not use conventional DWs coupled with ANNs and instead considers the boundary conditions in the estimation of daily precipitation. In addition, the modelling of daily amounts of precipitation is interesting because, on this scale (daily), the precipitation patterns are more complex (a long time series with values of zero indicates a drought) and have greater versatility in their application.

MATERIALS AND METHODS

Study area and database

The Tocantins-Araguaia Hydrographic Region (TAHR) is located between 0° 30' S and 18° 05' S and between 45° 45' W and 56° 20' W (Figure 1). Its configuration is elongated in a south-north direction, following the predominant direction of the main watercourses (Tocantins and Araguaia Rivers), which unite in the northern part of the region; from there, the river is called the Tocantins River until it reaches Marajó Bay. The TAHR's total area is 918,822 km², covering parts of the Central-West, North and Northeast Regions of Brazil. According to the Brazilian Institute of Geography and Statistics (IBGE) (2014), the region occupies 11% of the national territory, including areas in the states of Goiás (21.4%), Tocantins (30.2%), Pará (30.3%), Maranhão (3.3%), Mato Grosso (14.7%) and the Federal District (0.1%). The studied region is important for the development of the country due to the Tucuruí Hydroelectric Power Plant (HPP), which provides electricity to most of the Brazilian regions. The region is also important in mining and agribusiness.

Gomes et al. (2018) identified three homogeneous regions of precipitation in the TAHR (Figure 1) by means of fuzzy c-means clustering, presenting the rainfall rates of the basin as decreasing from north to south, with an average of 2,838 mm in the northern, central and south-eastern basins, an average of 1,990 mm in the south basin, and an overall average of 1,989 mm of annual rainfall. Thus, in applying the hybrid model for daily precipitation estimations, one rainfall gauge station was adopted in each of the homogeneous regions of the TAHR, as the rainfall rates differ among locations, and these differences may influence the model results. This technique was applied to demonstrate that the proposed models can be used in any of the three homogeneous regions. The rainfall gauge stations presented an average daily precipitation (ADP) of approximately 5.0 mm (Table 1 and Figure 1). Daily precipitation data from the CMORPH product were obtained for each station location. The choice of stations prioritized series with minimum failures (an average failure of 0.2% in the total observed data), and the period of daily observations was 19 years, from 1998 to 2016. The estimated satellite precipitation data come from the CMORPH product and are available from the National Oceanic and Atmospheric Administration (NOAA) and National Center for Environmental Prediction (NCEP). The rainfall recorded by the ANA is a point measurement and is recorded every 24 hours. The information produced by CMORPH has a spatial resolution of 8 km (at the equator) and is recorded every 30 minutes. These differences served as a motivation to use both databases and allowed for the possibility of data substitution in the absence of point monitoring, which is common in some places in the Amazon.

Table 1. Data from the rainfall gauge stations from ANA (E1, E2 and E3) and average daily precipitation data from the CMORPH product.

ID   Rainfall gauge station   Latitude   Longitude   Number of failures   Failures (%)   ADP ANA (mm)   ADP CMORPH (mm)   Period
E1   Badajós                  –2.51      –47.77      1                    0.0            6.3            5.5               19 years (1998–2016)
E2   Colônia                  –7.88      –48.88      2                    0.0            4.6            4.2               19 years (1998–2016)
E3   Palmeirópolis            –13.04     –48.40      98                   2.0            4.0            3.1               19 years (1998–2016)

ID = Identification of the rainfall gauge stations; ADP = Average daily precipitation.

Fig. 1. Homogeneous precipitation regions in the TAHR and the locations of rainfall gauge stations used in the application of the hybrid model for daily rainfall estimates. Coordinate System: WGS-1984. Projection: Transverse Mercator. Datum: D-WGS-1984. Source: IBGE/Gomes et al. (2018).

Wavelet decomposition – WD

Wavelet decomposition (WD) is a mathematical technique used for time-series analysis and forecasting in many fields, such as climatology, geophysics and hydrology (Rivera et al., 2012). The central idea of WD is signal decomposition at different time scales, defined as a set of basic functions $\phi_{a,b}(t)$, which can be generated by translating and scaling the mother wavelet $\psi(t)$ according to Equation (1):

\phi_{a,b}(t) = \frac{1}{\sqrt{a}}\,\psi\left(\frac{t-b}{a}\right), \quad a > 0,\ -\infty < b < \infty \qquad (1)

where $a$ is the scaling parameter that adjusts the wavelet expansion, and $b$ determines the location of the wavelet (Daubechies, 1992). The mother wavelet can be thought of as a short-lived wave that grows and decays over a limited period, which is crucial for the good performance of the transformation. Depending on the wavelet chosen, the method will filter out specific information during the process, revealing information from the original data, such as trends, disintegration points and discontinuities that the raw signal does not expose (Holdefer and Severo, 2015).

Wavelet theory is divided into two types of wavelet transforms (WTs): the continuous wavelet transform (CWT) and the discrete wavelet transform (DWT) (Addison et al., 2001; Daubechies, 1992); however, as hydrometeorological data are generally recorded in discrete time intervals, the discrete wavelet transform (DWT) is preferably adopted in the decomposition of time series (Mehr et al., 2014; Ramana et al., 2013). Among the existing WTs, the maximum overlapping discrete wavelet transform has been highlighted in the use of time-series decompositions. This technique is highlighted due to the potential of the MODWT to consider the boundary conditions (BCs) that involve the decomposition of data and avoid errors that can be introduced in the entire development of the proposed forecasting model. In their works, Percival and Walden (2000), Bašta (2014) and Du et al. (2017) demonstrate how BCs influence the decomposition of time series and how BCs can produce incorrect predictions if they are not treated properly.

However, the use of the DWT associated with BCs presents three problems:
a) future data: when the adopted WT requires observations of future time series to perform decomposition in the present, the decomposition becomes unfeasible if the future data are not available. Therefore, a WT that incorporates future data should be avoided (as in the case of the DWT);
b) inadequate selection of decomposition levels: selecting a very long wavelet filter and a very high level of decomposition leaves few scale and wavelet coefficients (free of BC-related uncertainties) to calibrate the forecast model;
c) partitioning of the calibration and validation set: when the time series records used in the calibration and validation of the model are not sufficient to allow adequate training of the model.
More details can be found in Percival and Walden (2000).

Maximum overlap discrete wavelet transform – MODWT

The MODWT definition is derived from the DWT definition. The DWT wavelet filter is $h_{j,k}$ and the scale filter is $g_{j,k}$, where $k = 1, ..., L$; $L$ is the length of the filter and $j$ is the level of decomposition. The wavelet filter in the MODWT ($\tilde{h}_{j,k}$) and the scale filter in the MODWT ($\tilde{g}_{j,k}$) are defined as $\tilde{h}_{j,k} = h_{j,k}/2^{j/2}$ and $\tilde{g}_{j,k} = g_{j,k}/2^{j/2}$ (Percival and Walden, 2000). Then, the MODWT wavelet coefficients of level $j$ are defined as the convolution of the time series ($X_t$) and the filters in the MODWT, and are obtained through Equations (2) and (3):

\tilde{W}_{j,t} = \sum_{k=0}^{K_j-1} \tilde{h}_{j,k}\, X_{t-k \bmod N} \qquad (2)

\tilde{V}_{j,t} = \sum_{k=0}^{K_j-1} \tilde{g}_{j,k}\, X_{t-k \bmod N} \qquad (3)

where $\tilde{W}_{j,t}$ is the wavelet coefficient; $\tilde{V}_{j,t}$ is the scale coefficient; mod $N$ is the modulo operation used when treating the historical series as periodic, with period equal to $N$; and $K_j$ can be obtained by Equation (4):

K_j = (2^{j} - 1)(K - 1) + 1 \qquad (4)

$K_j$ is the number of wavelet and scale coefficients affected by the BCs for the decomposition level $j$ and a wavelet filter of length $K$ (the filter length denoted $L$ above). Thus, through this equation, it is possible to obtain the boundary-corrected wavelet and scale coefficients, i.e., those that avoid adding uncertainty to the wavelet and scale coefficients due to the “future data” problem (Bašta, 2014; Percival and Walden, 2000). The MODWT uses a high-pass filter ($\tilde{h}$) to calculate its wavelet coefficients and applies an iterative construction of the time series ($X_t$), which can be reconstructed using Equation (5):

X_t = \tilde{W}_{j,t} + \tilde{V}_{j,t} \qquad (5)

The MODWT decomposition is performed on a series of data, and the type of filter (wavelet), the level of decomposition and the boundary are selected; the boundary can be either periodic or reflective. If the boundary is periodic, the resulting wavelet and scale coefficients are calculated without duplicating the original series, treating ($X_t$) as if it were circular. If the boundary is reflective, a new series is reflected at twice the length of the original series. In the present study, a periodic boundary was adopted, and three types of wavelet families, Haar, Daubechies (d4 and d6) and Least Asymmetric (la14), were selected, as they are the most usual in data series analyses (Guimarães Santos and Silva, 2014; Maheswaran and Khosa, 2012; Sang, 2012; Santos et al., 2019) and for carrying out diversified decompositions.

Artificial neural network – ANN

ANNs are models inspired by the functioning of biological neurons and aim to learn a particular system and reproduce it. An ANN is constructed by interconnecting neurons, which are arranged in layers; the model is first given a stimulus (the data inputs), the output is then calculated, and the weights are adjusted until the desired output is achieved (dos Santos et al., 2016). According to Silva et al. (2010), there are several types of ANNs whose purpose is to train the behaviour of a given variable. The main differences among these types lie in the type of network architecture and the learning process characteristics of each type. The best known types are perceptrons (Ps), multilayer perceptrons (MLPs), Adalines (A) and radial basis networks (RBNs), which can be expressed in mathematical terms by Equations (6–7):

u = \sum_{i=1}^{n} w_i \cdot x_i - \theta \qquad (6)

y = g(u) \qquad (7)

where $w_i$ is the weight associated with the i-th entry; $x_i$ are the network entries; $\theta$ is the activation threshold; $u$ is the result of the difference between the linear combiner and the activation threshold; $g(u)$ is the activation function; and $y$ is the final value produced by the neuron from a set of input signals. The ANN used is the time-delay neural network (TDNN) proposed by Lang and Hinton (1988). This network has a feedforward architecture (without feedback from the neurons' outputs to the first layer), where the prediction of later values from time $t$ associated with the process behaviour is computed as a function of the knowledge of their previous values (Equation 8):

x(t) = f(x(t-1), x(t-2), ..., x(t-n)) \qquad (8)

where $n$ is the order of the predictor, i.e., the number of past measurements (samples) that will be required to estimate the value $x(t)$. This type of arrangement (Figure 2) is called a focused time-lagged feedforward network (TDNN), where the time delay acts as memory, ensuring that previous samples that reflect the temporal behaviour of the process are always inserted into the network without the need for feedback from the network outputs.

Fig. 2. MLP topology with time-delayed inputs.

The adopted TDNN has three layers with the following configuration: TDNN (a, b, c, d), with a hidden layer, where a is the number of neurons in the input layer; b is the network delay; c is the number of neurons in the hidden layer; and d is the number of neurons in the output layer, in which the inputs are formed by the daily precipitation values. TDNN training consists of inputting the data, the parameters and the weights of the neurons to be adjusted, according to the behaviour of the time series, by the process of successive approximations. Thus, the following parameters were defined in this study: a) a hidden layer; b) the stopping criterion for a maximum number of 1,000 iterations; c) the learning rate, ranging from 0.025 to 1.0; d) the Levenberg-Marquardt algorithm, a learning algorithm; e) the hidden layer transfer function (log sigmoid); and f) the linear output function.

To observe the influence of seasonality on the model response, daily rainfall data from the ANA and CMORPH stations were organized into the rainy period (RP, November, December, January, February, March and April) and the dry period (DP, May, June, July, August, September and October). These seasonal periods correspond to those identified by Falck et al. (2015), with six months for each period at the TAHR. The RP was composed of 3,444 daily precipitation data points collected from November to April of 1998 to 2016 and was divided into 2,584 days for calibration (02/1996 – 01/2012) and 860 days for validation (02/2012 – 12/2016). The DP was composed of 3,496 daily precipitation data points (May to October of 1998 to 2016) and was divided into 2,624 days for calibration (05/1998 – 06/2012) and 872 days for validation (07/2012 – 10/2016). The series was organized such that 75% of the data was used for network calibration and 25% of the data was used for series validation and delay in the network structure formation (Nourani et al., 2017; Shoaib et al., 2016). In addition to these considerations, the input data for the tests with the ANNs were selected according to the result of the MODWT decomposition, which will be discussed later.

In ANN processing, the data were standardized (Equation (9)) because, according to Silva et al. (2010), when working with ANNs, standardization implies scaling the samples to the dynamic range of the hidden layer activation functions (typically represented by the logistic or hyperbolic tangent function) to avoid neuron saturation; this method was also adopted by Yaseen et al. (2016) and Nourani et al. (2017). The calculation is as follows:

P_{pad} = \frac{P_i - P_{min}}{P_{max} - P_{min}} \qquad (9)

where $P_{pad}$ is the standardized precipitation; $P_i$ is the precipitation to be standardized; and $P_{min}$ and $P_{max}$ are the smallest and largest values observed in the precipitation series, respectively.

WD-ANN hybrid modelling

The hybrid model consisted of the WD-ANN combination, with the original precipitation series passing through a decomposition filter (WD), which acted as a signal filter and sought to transform and correct the input data. These inputs were then subjected to ANN training, in which the inputs were evaluated repeatedly with different types of neural network structures until a setting that best fit each daily data estimate was found. Past records were trained to predict future days, considering a 14-year training period and a 5-year forecast. The initial number of days needed for the data to be considered for network entry ranged from 2 to 30 days. The number of neurons found in the hidden layer ranged from 1 to 45, and only one output neuron was found. Therefore, the hybrid model that was finally established ensured the best predictive ability.

Performance criteria

Model performance was assessed using statistical parameters, which are used to quantify the agreement between observed and estimated data. In this study, we used three classic criteria, the mean square error (MSE), the Nash–Sutcliffe coefficient (Nash) and the Bias, represented by Equations (10), (11) and (12), respectively, in addition to statistics such as the dispersion, mean, standard deviation and extreme values of the data. The equations are as follows:

MSE = \frac{1}{n} \sum_{i=1}^{n} (Y_{obs} - Y_{est})^2 \qquad (10)

Nash = 1 - \frac{\sum_{i=1}^{n} (Y_{obs} - Y_{est})^2}{\sum_{i=1}^{n} (Y_{obs} - \bar{Y}_{obs})^2} \qquad (11)

Bias = \frac{1}{n} \sum_{i=1}^{n} (Y_{obs} - Y_{est}) \qquad (12)

where $n$ is the number of samples, $Y_{obs}$ is the observed precipitation, $Y_{est}$ is the estimated precipitation and $\bar{Y}_{obs}$ is the average observed precipitation. The best-performing models are those with low MSE values and Nash values close to 1 (Chai and Draxler, 2014; Nash and Sutcliffe, 1970).

In short, the methodology consisted of the following:
- obtaining the data series organized in a certain period (dry or rainy);
- data standardization (Equation (9)), wavelet decomposition and training with different ANN configurations;
- simulation of the data series by an ANN and a comparison of the inputs (observed) with the output (simulated) results.

Fig. 3. Scheme of the proposed methodology for obtaining daily rainfall estimates using WD–ANN.

For each rainfall gauge station, an ANN was defined that served to estimate the daily data in the dry and rainy periods from the ANA and CMORPH data series. A scheme of the adopted methodology can be observed in Figure 3.

RESULTS AND DISCUSSION

Wavelet decomposition by the MODWT

The MODWT was used to decompose the precipitation series and generate the boundary-corrected wavelet and scale coefficients. This decomposition was accomplished by adopting three types of wavelet families and three levels of decomposition, denoted as j (4, 6 and 8). The filter length varied according to the adopted wavelet family (Bašta, 2014). The maximum level of decomposition was defined as six ($J_{max} = 6$), and the length of the wavelet filters ($K$) was 2 for the Haar wavelet, 4 for the d4, and 14 for the la14. Thus, using a maximum level of decomposition equal to six and a $K$ equal to 4, we found that $K_6 = (2^6 - 1)(4 - 1) + 1$; therefore, 190 coefficients were affected by the boundary at level j (this practice was also adopted for j = 4 and 8 and for K = 2 and 14). Therefore, the first 190 records of input data from the stations were removed after the decomposition was completed. Thus, boundary-corrected wavelet and scale coefficients were defined before the input variables were selected for calibration and model validation using ANNs. The smallest errors and the highest Nash value resulting from the validation of the use of ANNs indicated the best MODWT-ANN structure for forecasting daily precipitation (Table 2).

The large MSE values and the Nash values obtained after ANN processing showed that the wavelet decomposition by the Haar filter was not accurate in decomposing the daily precipitation series (Table 2). This result may be related to the fact that the Haar filter has decomposition properties that are aimed at series that show very sudden changes (Mallat, 2009), which does not happen in homogeneous rainfall series because in dry or rainy periods (lack of rain or constant rain), there are no considerable sudden changes.

The Daubechies wavelet (d4) was able to decompose the seasonality element of the time series more efficiently than the Haar filter. The results of the d4 for levels 6 and 8 showed small errors and a high Nash value. The good performance of the Daubechies wavelet (d4) in the decomposition of variables can be explained, according to Maheswaran and Khosa (2012), by the ability of the d4 to smooth the signal and locate the timing and frequency of the data. These abilities are necessary when analysing precipitation series, which present temporal complications.

The la14 wavelet (least asymmetric), when combined with the ANN, also showed good results, with small errors and a high Nash value. The performance of the la14 in relation to the d4, using levels 6 and 8, did not differ considerably. This finding shows that the increase in the filter length, in this case, did not induce considerable improvements in the estimations and that the use of the d4 filter with level 6 is sufficient to obtain good decomposition of the signal. As shown in Figure 4, the decomposition of the historical series of station E1 (Figure 1) by the d4 filter managed to better smooth the precipitation signal (X) than did the other filters, showing more details of the signal with the wavelet coefficients (W), with six levels, in relation to the Haar, which incorporated four levels (W). The choice of the best filter, according to Zhang et al. (2015), must be the one with the most decomposition characteristics of the studied series, that is, the best filter is the one in which the signal is best represented in the decomposition. However, when choosing a filter, other parameters are also associated with it. Thus, according to the tests performed in this study, the factors that most influenced the simulations were the level of decomposition and the length of the wavelet. The best model, with a medium level (6), more smoothness adjustment and consideration of the boundary conditions, performed a moderate and admissible adjustment to the decomposition of the precipitation data. The longer filter length (14) did not present higher quality estimations and could remove a much larger number of wavelet coefficients adjusted by the BCs, compromising the amount of input data in the simulation of the model with the ANN.

Fig. 4. Decomposition by the MODWT, using the Haar, d4 and la14 wavelet filters, in which each W series represents the wavelet coefficients on a specific scale, and V represents the scale coefficients. Panels: a) Haar, j=4; b) d4, j=6.

To avoid errors and bypass the BCs, it is necessary to choose adequate wavelets and sufficient input data for training and forecasting (Du et al., 2017; Ramírez-Hernández et al., 2016). Thus, in the selection of the precipitation series, three wavelet filters and three decomposition levels were tested by removing the values that interfere with the coefficients affected by the boundary at level j and by adjusting the division of the number of data used in the calibration and validation. Of the 3,444 days of the rainy season, 190 were removed, leaving 3,254, which were further divided into 2,440 days used for calibration and 814 days used for validation. In the dry period, there were 3,496 days (of which 190 were removed), divided into 2,480 days used for calibration and 826 days used for validation. This division corresponds to approximately six and a half years for calibration and two years and two months for validation.

The configurations of the networks are quite different for each station. This observation can be verified when comparing the values of the number of neurons in the input and output layers and the delay of each station in the different periods (Table 2). This difference may be explained by the uneven occurrence of rainfall in the rainy and dry periods and by the influences of different biomes on each station. Station E1 is in a region of the Amazon biome, E2 is located in a transition area between the Cerrado and Amazon biomes, and E3 is in the Cerrado (Brazilian biome). The interactions between the mete-
With orological systems that operate in each biome and the altitude these divisions, it was possible to filter the data series, leaving of each station reflect the behaviour of the spatial and temporal them free of uncertainties related to BCs and adjusting adequate precipitation values in the basins, as observed in studies by numbers of input data for the training and validation of the Levy et al. (2017), Oliveira-Junior et al. (2017) and Teodoro et neural networks. al. (2016). Zeri et al. (2019) investigated the interannual variability of Neural network simulations drought-associated rainfall using the wavelet transform and found that the northern region of the state of Tocantins is influ- Two neurons in the input layer were the most favourable enced more by interannual variability in drought events than the and recurrent in the ANN tests, while the most frequent net- surrounding regions are, indicating that some stations are af- work delay was determined to be 4 days. In the series consisting fected by continuous drought events. In fact, the occurrence of of the dry and rainy periods, at the three stations involved (Fig- rainfall is quite heterogeneous among the stations. ure 1), the delays were 2, 4, 5, 6, 10 and 12 days late. Another From the performance coefficients obtained from the data observation concerns the tendency of networks to require a validation, it can be observed that the hybrid model made satis- smaller number of neurons in the input layer (ranging from 2 to factory predictions for both the ANA and CMORPH series, the 12) and a higher number of neurons in the hidden layer (between MSE values were small (less than 0.88) and the Nash coeffi- 10 and 35) (Table 2). This fact shows that the ANN adopted has cients were close to 1 when evaluating the series of the three a better performance than the other ANNs, with 10 or more stations considered in the study, E1, E2 and E3 (Figure 1), with hidden layer neurons, to estimate daily rainfall in the TAHR. 
the exception of the use of the Haar wavelet (Table 2). Table 2. MODWT-ANN model results. Result of the MODWT-ANN model - ANA data Result of the MODWT-ANN model - CMORPH data Season Wavelet Level (j) RNA MSE Nash Season Wavelet Level (j) RNA MSE Nash Haar 4 2 7,2,12,1 0.55 0.5 Haar 4 2 2,7,12,1 0.47 0.52 d4 6 4 7,2,12,1 0.15 0.91 d4 6 4 2,7,12,1 0.16 0.94 E1- Rainy E1- Rainy d4 8 4 7,2,12,1 0.17 0.91 d4 8 4 2,7,12,1 0.26 0.93 la14 6 14 7,2,12,1 0.18 0.91 la14 6 14 2,7,12,1 0.18 0.94 Haar 4 2 2,4,10,1 0.72 0.56 Haar 4 2 2,4,10,1 0.72 0.88 d4 6 4 2,4,10,1 0.14 0.83 d4 6 4 2,4,10,1 0.18 0.83 E1- Dry E1- Dry d4 8 4 2,4,10,1 0.17 0.83 d4 8 4 2,4,10,1 0.21 0.98 la14 6 14 2,4,10,1 0.20 0.82 la14 6 14 2,4,10,1 0.21 0.99 Haar 4 2 2,4,15,1 0.27 0.71 Haar 4 2 6,6,10,1 0.47 0.47 d4 6 4 2,4,15,1 0.14 0.90 d4 6 4 6,6,10,1 0.13 0.95 E2- Rainy E2- Rainy d4 6 6 2,4,15,1 0.15 0.89 d4 8 6 6,6,10,1 0.16 0.95 la14 6 14 2,4,15,1 0.19 0.88 la14 6 14 6,6,10,1 0.18 0.96 Haar 4 2 2,4,15,1 0.72 0.57 Haar 4 2 2,5,12,1 0.81 0.68 d4 6 4 2,4,15,1 0.13 0.90 d4 6 4 2,5,12,1 0.13 0.85 E2- Dry E2- Dry d4 8 4 2,4,15,1 0.17 0.88 d4 8 4 2,5,12,1 0.14 0.85 la14 6 14 2,4,151 0.20 0.89 la14 6 14 2,5,12,1 0.16 0.85 Haar 4 2 3,4,20,1 0.51 0.78 Haar 4 2 2,4,20,1 0.68 0.74 d4 6 4 3,4,20,1 0.17 0.91 d4 6 4 2,4,20,1 0.07 0.95 E3- Rainy E3- Rainy d4 6 4 3,4,20,1 0.32 0.89 d4 8 4 2,4,20,1 0.18 0.95 la14 6 14 3,4,20,1 0.18 0.89 la14 6 14 2,4,20,1 0.14 0.95 Haar 4 2 4,6,35,1 0.89 0.56 Haar 4 2 4,4,25,1 0.88 0.64 d4 6 4 4,6,35,1 0.12 0.81 d4 6 4 4,4,25,1 0.14 0.85 E3- Dry E3- Dry d4 8 4 4,6,35,1 0.13 0.80 d4 8 4 4,4,25,1 0.16 0.84 la14 6 14 4,6,35,1 0.14 0.79 la14 6 14 4,4,25,1 0.18 0.84 Evanice Pinheiro Gomes, Claudio José Cavalcante Blanco Table 3. Observed and simulated series statistics. Regarding the daily rainfall estimates, in general, the rainy period was better estimated by the hybrid model. 
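As a quick arithmetic check of the record counts quoted in the wavelet decomposition subsection, the sketch below recomputes the number of boundary-affected coefficients, (2^j - 1)(K - 1) + 1, and the calibration/validation split for both seasons. It also includes the usual textbook definitions of the MSE and the Nash-Sutcliffe efficiency; the metric equations of the paper are not reproduced in this section, so those two functions are an assumption, and all function names are hypothetical.

```python
"""Arithmetic check of the boundary and split counts; names are illustrative only."""


def boundary_coefficients(level, filter_length):
    """Coefficients affected by the boundary: (2**level - 1) * (filter_length - 1) + 1."""
    return (2 ** level - 1) * (filter_length - 1) + 1


def split_counts(n_days, n_removed, n_calibration):
    """Return (records left after removal, records left for validation)."""
    remaining = n_days - n_removed
    return remaining, remaining - n_calibration


def mse(observed, simulated):
    """Mean squared error (textbook definition, assumed here)."""
    return sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed)


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency (textbook definition, assumed here); 1.0 is a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


if __name__ == "__main__":
    print(boundary_coefficients(6, 4))    # d4 filter at level 6 -> 190
    print(split_counts(3444, 190, 2440))  # rainy season -> (3254, 814)
    print(split_counts(3496, 190, 2480))  # dry season -> (3306, 826)
    # Toy series (made-up numbers, not data from the study).
    obs = [0.0, 2.0, 5.0, 1.0]
    sim = [0.5, 1.5, 4.0, 1.0]
    print(mse(obs, sim), nash_sutcliffe(obs, sim))
```

Under the same formula, the other tested combinations (Haar with j = 4, d4 with j = 8, la14 with j = 6) would give 16, 766 and 820 affected coefficients; only the value 190 is reported in the text, so these figures are derived here rather than taken from the study.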
The weaker dry-period estimates may be related to the greater number of zeroes in the dry period; zeroes indicate scarce information in the samples, which may cause them to be insufficient for the neural network to learn and, consequently, may be insufficient for simulating future data. This issue may also be related to the interannual variation in precipitation, as the sequences of dry days require much more information from previous days than sequences of rainy days; this explanation was suggested in the work of Wilks (1999). In addition, Hellassa and Souag-Gamane (2019) demonstrated that removing noise in the precipitation series through wavelet decomposition induces a significant improvement in the daily rainfall estimations, and, depending on the adopted wavelet function, rainy periods may have a better response than dry periods. Thus, while the wavelet method decomposes the signal and incorporates the seasonal characteristics of the rainfall series, long periods of dry days need more attention in the modelling of daily rainfall. Despite this, and considering the results, the developed hybrid model estimated the daily rainfall in the evaluated locations well, but on some days, the precipitation was underestimated by the model, as evidenced in the graphs comparing the observed and estimated precipitation values (Figures 5, 6 and 7) for both the ANA and CMORPH data.

Other studies that used CMORPH data to forecast rainfall in the TAHR obtained good results with the introduction of satellite data, such as those reported by Falck et al. (2015) and Germano et al. (2017). In the study by Germano et al. (2017), the CMORPH series, when inserted in the forecast model (stochastic - SREM2D), overestimated the daily rainfall in the region. Thus, the input data may respond differently in each model. However, underestimation of daily rainfall was not observed in the literature for the analysed region.

Statistical parameters of the observed and simulated data

The statistical parameters of the observed and simulated data from ANA and CMORPH are shown in Table 3. Good correlation is obtained between the observed and simulated data, with r values ranging between 0.81 and 0.96, with the exception of the data from the E2 station in the dry season (r = 0.64). The total numbers of dry days in the simulated series are all below those observed. This result is due to the approximation of the zero value by the ANN model and to the number of more recurrent rainy days in the simulated series than in the observed series, as there are more records of precipitation (values above zero). The rainfall gauge station with the most rainy days is E1, with 496 and 568 days, and the station with the fewest days, E3, recorded 104 and 153 days, according to the ANA and CMORPH data, respectively.

Table 3 - Rainfall gauge stations with ANA data

| Station - season | Data | Dry days | Rainy days | µ | ẟ | Max | r | Bias |
|---|---|---|---|---|---|---|---|---|
| E1 - Rainy | Obs | 318 | 496 | 8.7 | 14.9 | 100 | 0.96 | 1.7 |
| E1 - Rainy | Sim | 180 | 633 | 8.0 | 13.9 | 99 | | |
| E1 - Dry | Obs | 584 | 242 | 2.9 | 8.3 | 120 | 0.81 | 2.05 |
| E1 - Dry | Sim | 442 | 384 | 2.9 | 6.9 | 100 | | |
| E2 - Rainy | Obs | 368 | 444 | 7.2 | 15.2 | 123 | 0.97 | 0.74 |
| E2 - Rainy | Sim | 58 | 756 | 6.6 | 14.1 | 119 | | |
| E2 - Dry | Obs | 706 | 121 | 1.6 | 6.9 | 82 | 0.92 | 1.82 |
| E2 - Dry | Sim | 554 | 272 | 1.4 | 5.8 | 81 | | |
| E3 - Rainy | Obs | 437 | 377 | 7.4 | 14.9 | 119 | 0.91 | 0.53 |
| E3 - Rainy | Sim | 125 | 689 | 7.2 | 13.2 | 119 | | |
| E3 - Dry | Obs | 722 | 104 | 1.1 | 4.5 | 34 | 0.92 | 3.5 |
| E3 - Dry | Sim | 603 | 223 | 0.9 | 3.3 | 30 | | |

Table 3 - Rainfall gauge stations with CMORPH data

| Station - season | Data | Dry days | Rainy days | µ | ẟ | Max | r | Bias |
|---|---|---|---|---|---|---|---|---|
| E1 - Rainy | Obs | 246 | 568 | 6.8 | 11.2 | 89 | 0.96 | 0.86 |
| E1 - Rainy | Sim | 150 | 664 | 6.0 | 10.5 | 86 | | |
| E1 - Dry | Obs | 534 | 292 | 2.0 | 4.9 | 39 | 0.89 | 1.5 |
| E1 - Dry | Sim | 300 | 526 | 1.6 | 4.1 | 37 | | |
| E2 - Rainy | Obs | 223 | 591 | 6.4 | 10.6 | 74 | 0.93 | 2.0 |
| E2 - Rainy | Sim | 64 | 750 | 5.2 | 9.5 | 72 | | |
| E2 - Dry | Obs | 664 | 162 | 1.3 | 5.1 | 75 | 0.64 | 1.5 |
| E2 - Dry | Sim | 307 | 519 | 1.9 | 4.8 | 73 | | |
| E3 - Rainy | Obs | 296 | 518 | 4.7 | 8.6 | 74 | 0.90 | 0.49 |
| E3 - Rainy | Sim | 73 | 741 | 4.5 | 7.1 | 69 | | |
| E3 - Dry | Obs | 673 | 153 | 0.9 | 4.2 | 56 | 0.85 | 3.2 |
| E3 - Dry | Sim | 570 | 256 | 0.7 | 3.6 | 50 | | |

r - correlation coefficient; ẟ - standard deviation; µ - average; Max - maximum. The r and Bias values apply to each observed/simulated pair.

The means, standard deviations and maximum values of the observed series were higher than those of the simulated series, which is related to underestimation by the model, as confirmed in Figures 5, 6 and 7 and in Table 3. It is important to highlight that, in the dry period, the ANA series, despite having higher averages (indicating more rainy days than did the CMORPH series), presented equal Nash coefficients for rainfall gauge station E1 (0.83) and lower Nash coefficients for station E3 (0.85 and 0.81) compared to the CMORPH data (Table 2). In addition, the simulations using ANA data for the dry period showed greater errors related to Bias of 2.05, 1.82 and 3.5 at stations E1, E2 and E3, respectively. In the dry period, using the CMORPH data, the Bias values were equal to 1.5 (E1), 1.5 (E2) and 3.2 (E3) (Table 3). Therefore, the ANA series did not provide an advantage over the CMORPH data in the dry period.

In the rainfall estimate for station E1, the Nash value was closer to 1.0 for the rainy season than it was for the dry season for both the ANA and CMORPH series (Figure 5). The results obtained for the rainfall gauge station E2 were similar to those obtained for station E1, but the differences in the Nash values were greater between the dry and rainy periods for both data series (ANA and CMORPH). However, the simulations performed well, with Nash values between 0.72 and 0.95, when considering data from ANA and CMORPH (Figure 6). At rainfall gauge station E3, the highest Nash value occurred in the rainy season (ANA series), with an increase of 0.1 between the rainy and dry periods, indicating that the proposed ANN had a better response for the rainy season than it did for the dry season (Figure 7).

The behaviour of precipitation in the TAHR is quite diverse, as demonstrated by Gomes et al. (2018), who identified three homogeneous precipitation regions. The models proposed here were applied to each of these regions to acknowledge the regional differences. The results obtained were well diversified and confirmed mainly by the structure of the ANN, which required a much larger number of neurons (25 and 35) in the hidden layer (c) at the E3 station, which is located in the southern part of the TAHR, and lower numbers of neurons in the central and northern areas (Table 2). Chau and Wu (2010), making daily rainfall forecasts with ANNs, also pointed out that a greater or lesser frequency of precipitation directly affected the performance of the ANNs. This relationship between the behaviour of the daily precipitation and the network structure shows the importance of prior knowledge of the conditions of the studied variable, especially in places with large dimensions such as the TAHR. In addition, although we used only three rainfall gauge stations and one type of ANN, when considering model limitations, our study had the advantage of incorporating the regional rainfall characteristics of homogeneous regions that were represented by each of the three rainfall gauge stations distributed in the area of the TAHR.

Fig. 5. Graph of the observed and estimated precipitation values and the dispersion for the rainfall gauge station E1. Panels: (a, b) rainy period, CMORPH; (c, d) dry period, CMORPH; (e, f) rainy period, ANA; (g, h) dry period, ANA.

Seasonal precipitation indexes

The rainfall indexes of stations E1, E2 and E3 were well-defined in space and time. In relation to this characteristic, we observed that the biggest errors resulting from bias occurred at station E3 (3.5 and 3.2), which had the lowest rainfall index, both for ANA and CMORPH data (Table 3).

Although the rainy period was better simulated, with biases between 0.49 and 1.7, the extreme values in the three homogeneous regions were not as well estimated. This result can be explained by the fact that this type of occurrence is influenced by atmospheric events, which are not considered by the model; the model is based only on the observed historical series. According to Nerantzaki and Papalexiou (2019), the lower predictability of extreme precipitation events is still a challenge in the literature, and specific methods are required for the modelling of extreme events.

Fig. 6. Graph of the observed and estimated precipitation values and the dispersion for the rainfall gauge station E2. Panels: (a, b) rainy period, CMORPH; (c, d) dry period, CMORPH; (e, f) rainy period, ANA; (g, h) dry period, ANA.

Regarding the simulated forecasts for the seasonal periods (each spanning six months), we highlight the possibility of their use in long-term planning, mainly for activities that require advance estimates of precipitation, such as agriculture, water supply and energy generation. There are still adjustments to be considered, such as testing at other points of the TAHR, the use of different types of ANN, and introducing other climatic variables.

The performances of the models previously adjusted for each rainfall gauge station were also evaluated by the simulation of 1,460 daily rainfall data, corresponding to the years 2013 to 2016. With these simulations, the model's ability to estimate the daily seasonal averages and the maximum length of the dry and rainy periods was evaluated. The quantile-quantile graph was also used to verify whether the distribution of the generated synthetic series was adjusted to the observed data series. This approach was intended to verify whether the model could generate daily rainfall data predictions and the maximum number of consecutive days with or without rainfall; these predictions are crucial for the management of water demands. The metrics are plotted for all observations and simulations for each rainfall gauge station. As the model already considers seasonality and evaluates the rainy and dry periods of the studied region separately, the graphs are organized in this way.

Fig. 7. Graph of the observed and estimated precipitation values and the dispersion for the rainfall gauge station E3. Panels: (a, b) rainy period, CMORPH; (c, d) dry period, CMORPH; (e, f) rainy period, ANA; (g, h) dry period, ANA.

Fig. 8. Daily and seasonal averages (µ) observed and simulated for each rainfall gauge station: E1 (a, b), E2 (c, d) and E3 (e, f).

The daily averages of each month of the dry and rainy periods and the seasonal averages of each year are organized in the graphs in Figure 8. Thus, for the average daily precipitation, the reproduced quantities showed percentage errors between the simulated and observed values of 13%, 9% and 9% in the rainy season and 14%, 10% and 11% in the dry season for the rainfall gauge stations E1, E2 and E3, respectively. For the seasonal averages, the differences were 8%, 7% and 4% in the rainy season and 9%, 12% and 10% in the dry season for the rainfall gauge stations E1, E2 and E3, respectively. These results show that the maximum daily and seasonal average rainfall amounts were underestimated for all rainfall gauge stations in the dry season and were better simulated in the rainy season (Figure 8). This result reinforces the idea defended by Wilks (1989) that the prediction of daily models depends on the total amount of precipitation in a month and on the monthly and seasonal transitions. Therefore, higher amounts of monthly rainfall, which result from higher numbers of rainy days, promote better results. This average quantification of rainfall helps in the planning of forest management, as rainfall volumes far above or below a certain value can favour or hinder the development of vegetation (Bonal et al., 2016).

Figure 9 shows the maximum number of rainy and dry days, the maximum amount of precipitation occurring in the dry and rainy periods, and the driest and rainiest months observed and simulated between 2013 and 2016. The model overestimated the number of rainy days and underestimated the number of dry days. However, there was no discrepancy between the maximum rainfall values; thus, the model had good performance when simulating extreme rainfall. Among the wet months (NDJFMA), the months in which the most rainfall was recorded were February (E1 and E3) and January (E2). Among the dry months (MJJASO), July and August presented greater droughts than did the other months, mainly at the rainfall gauge stations E2 and E3, which registered rainfall amounts less than 2 mm, and the maximum number of consecutive dry days was above 80.

The variability of rainfall in the Amazon region is quite dynamic, and according to Wang et al. (2018), who evaluated the behaviour of precipitation in South America in the rainy period (DJFMAM), the total precipitation can vary greatly in the months from December to May. Reichle et al. (2017) indicated that the precipitation bias is strong in many tropical and subtropical areas, especially in South America, during December, January and February (DJF). In addition, in the studies by Gloor et al. (2013), Espinoza et al. (2016) and Latrubesse et al. (2017), there are increasing trends towards the occurrence of extreme seasonal droughts and floods in this region. Therefore, the current work, even if in a simplified way, can assist in future estimates of prolonged droughts and intense rains. These estimates are important for activities such as agribusiness, supply, energy generation, etc.

Fig. 9. Maximum length of dry or rainy periods and maximum rainfall observed and simulated in the rainy and dry seasons for each rainfall gauge station: E1 (a, b), E2 (c, d) and E3 (e, f).

Figure 10 shows the quantile-quantile graphs for each rainfall gauge station in the different periods (dry and rainy). According to Gnanadesikan (2011), the closer the points are to the diagonal line, the closer the generated data is to the theoretical distribution and the better the model performance is. Thus, from the graphical results, it is evident that the model better simulates the rainy periods than the dry periods; in the rainy periods, the simulated and observed distributions fit the line better than in the dry periods (Figure 10a, 10c and 10e). This graphical approach confirms what had already been perceived by the previous analyses in this study regarding the inferior performance of the MODWT-ANN model for the dry period.

Fig. 10. Quantile-quantile graphs of the observations and simulations in the wet and dry periods for the stations E1 (a, b), E2 (c, d) and E3 (e, f).

CONCLUSIONS

The performance of the MODWT-ANN model was evaluated for daily precipitation forecasts using different databases. The model presented better results when the Daubechies filter (d4) was applied in the decomposition of the precipitation series and the model was coupled to the ANN, mainly for the rainy season series. For the application of this technique, the input information is crucial for the model performance, which must be tested numerous times, and the results of the observed and estimated values allow the confirmation of a good simulation. For the daily series, more than nine neurons in the hidden layer are more favourable in building the ANN. The initial input amounts and the network delays are also determining factors in the model responses. Considering the data types, both the ANA and CMORPH series performed well when estimating daily rainfall. However, regarding the CMORPH series, the simulation of the rainy period was better than that of the dry period. The results show the benefits of wavelet decomposition in estimating daily rainfall through the combination of wavelet decomposition with an ANN. Moreover, the good reproduction of the daily rainfall averages and the lengths of consecutive dry and wet days on seasonal scales shows the potential for the model to be used in the control of floods and droughts, in the quantification of favourable rainfall conditions for agriculture and in the supply of water resources. The good results of this hybrid approach stimulate the development of new research on this subject. For instance, one way to improve the results achieved is to test other types of ANNs and to incorporate variables such as temperature and humidity into the developed model. These variables also influence the production of rainfall and were not analysed in the present study.

Acknowledgements. The authors thank the ANA and NOAA for the available precipitation data. The authors would also like to thank the Coordination for the Improvement of Higher Education Personnel of Brazil (CAPES), Finance Code 001. The second author would like to thank CNPq for funding the research with a productivity grant (Process 303542/2018-7). We would also like to thank the office for research (PROPESP) and Foundation for Research Development (FADESP) of the Federal University of Pará through grant no. PAPQ 2019 and 2020.

REFERENCES

Addison, P.S., Murray, K.B., Watson, J.N., 2001. Wavelet transform analysis of open channel wake flows. Journal of Engineering Mechanics, 127, 58–70.

Bašta, M., 2014. Additive decomposition and boundary conditions in wavelet-based forecasting approaches. Acta Oeconomica Pragensia, 2014, 48–70.

Bonal, D., Burban, B., Stahl, C., Wagner, F., Hérault, B., 2016. The response of tropical rainforests to drought - lessons from recent research and future prospects. Annals of Forest Science, 73, 27–44. DOI: 10.1007/s13595-015-0522-5

Chai, T., Draxler, R.R., 2014. Root mean square error (RMSE) or mean absolute error (MAE)? – Arguments against avoiding RMSE in the literature. Geoscientific Model Development, 7, 1247–1250. DOI: 10.5194/gmd-7-1247-2014

Chau, K.W., Wu, C.L., 2010. A hybrid model coupled with singular spectrum analysis for daily rainfall prediction. Journal of Hydroinformatics, 12, 458–473. DOI: 10.2166/hydro.2010.032

Cuo, L., Pagano, T.C., Wang, Q.J., 2011. A review of quantitative precipitation forecasts and their use in short-to-medium streamflow forecasting. Journal of Hydrometeorology, 12, 713–728.

Daubechies, I., 1992. Ten Lectures on Wavelets. Society for Industrial and Applied Mathematics, Philadelphia. https://doi.org/10.1137/1.9781611970104

dos Santos, T.S., Mendes, D., Torres, R.R., 2016. Artificial neural networks and multiple linear regression model using principal components to estimate rainfall over South America. Nonlinear Processes in Geophysics, 23, 13–20. DOI: 10.5194/npg-23-13-2016

Du, K., Zhao, Y., Lei, J., 2017. The incorrect usage of singular spectral analysis and discrete wavelet transform in hybrid models to predict hydrological time series. J. Hydrol., 552, 44–51. DOI: 10.1016/j.jhydrol.2017.06.019

Espinoza, J.C., Segura, H., Ronchail, J., Drapeau, G., Gutierrez‐Cori, O., 2016. Evolution of wet‐day and dry‐day frequency in the western Amazon basin: Relationship with atmospheric circulation and impacts on vegetation. Water Resources Research, 52, 8546–8560. https://doi.org/10.1002/2016WR019305

Fahimi, F., Yaseen, Z.M., El-shafie, A., 2017. Application of soft computing based hybrid models in hydrological variables modeling: a comprehensive review. Theoretical and Applied Climatology, 128, 875–903. https://doi.org/10.1007/s00704-016-1735-8

Falck, A.S., Maggioni, V., Tomasella, J., Vila, D.A., Diniz, F.L., 2015. Propagation of satellite precipitation uncertainties through a distributed hydrologic model: A case study in the Tocantins–Araguaia basin in Brazil. Journal of Hydrology, 527, 943–957. http://dx.doi.org/10.1016/j.jhydrol.2015.05.042

Frumau, K.A., Bruijnzeel, L.A., Tobón, C., 2011. Precipitation measurement and derivation of precipitation inclination in a windy mountainous area in northern Costa Rica. Hydrological Processes, 25, 499–509. https://doi.org/10.1002/hyp.7860

Germano, M.F., Vitorino, M.I., Cohen, J.C.P., Costa, G.B., Souto, J.I.D.O., Rebelo, M.T.C., de Sousa, A.M.L., 2017. Analysis of the breeze circulations in Eastern Amazon: an observational study. Atmospheric Science Letters, 18, 67–75. https://doi.org/10.1002/asl.726

Gloor, M.R.J.W., Brienen, R.J., Galbraith, D., Feldpausch, T.R., Schöngart, J., Guyot, J.L., Phillips, O.L., 2013. Intensification of the Amazon hydrological cycle over the last two decades. Geophysical Research Letters, 40, 1729–1733. https://doi.org/10.1002/grl.50377

Gnanadesikan, R., 2011. Methods for Statistical Data Analysis of Multivariate Observations. John Wiley & Sons. DOI: 10.1002/9781118032671

Golding, B.W., 2014. Regional prediction models. In: North, G., Pyle, J., Zhang, F. (Eds.): Encyclopedia of Atmospheric Sciences. 2nd Edition. Academic Press, p. 2008.

Gomes, E.P., Blanco, C.J.C., Pessoa, F.C.L., 2018. Regionalization of precipitation with determination of homogeneous regions via fuzzy c-means. Revista Brasileira de Recursos Hídricos, 23. https://doi.org/10.1590/2318-0331.231820180079

Guimarães Santos, C.A., Silva, G.B.L.D., 2014. Daily streamflow forecasting using a wavelet transform and artificial neural network hybrid models. Hydrological Sciences Journal, 59, 312–324. http://dx.doi.org/10.1080/02626667.2013.800944

Gupta, A., Kamble, T., Machiwal, D., 2017. Comparison of ordinary and Bayesian kriging techniques in depicting rainfall variability in arid and semi-arid regions of north-west India. Environmental Earth Sciences, 76, 512. https://doi.org/10.1007/s12665-017-6814-3

He, X., Guan, H., Qin, J., 2015. A hybrid wavelet neural network model with mutual information and particle swarm optimization for forecasting monthly rainfall. Journal of Hydrology, 527, 88–100. http://dx.doi.org/10.1016/j.jhydrol.2015.04.047

Hellassa, S., Souag-Gamane, D., 2019. Improving a stochastic multi-site generation model of daily rainfall using discrete wavelet de-noising: a case study to a semi-arid region. Arabian Journal of Geosciences, 12, 53. https://doi.org/10.1007/s12517-018-4168-0

Holdefer, A.E., Severo, D.L., 2015. Análise por ondaletas sobre níveis de rios submetidos à influência de maré. Revista Brasileira de Recursos Hídricos, 20, 192–201. DOI: 10.21168/rbrh.v20n1.p192-201

IBGE – INSTITUTO BRASILEIRO DE GEOGRAFIA E ESTATÍSTICA. Cobertura do uso da terra do Brasil (Land use coverage in Brazil). Rio de Janeiro: IBGE, 2014. Available from: https://www.ibge.gov.br/geocienciasnovoportal/informacoes-ambientais/cobertura-e-uso-da-terra (accessed in 13 Sept. 2017)

Kisi, O., Cimen, M., 2011. A wavelet-support vector machine conjunction model for monthly streamflow forecasting. Journal of Hydrology, 399, 132–140.

Kisi, O., Shiri, J., 2011. Precipitation forecasting using wavelet-genetic programming and wavelet-neuro-fuzzy conjunction models. Water Resources Management, 25, 3135–3152. https://doi.org/10.1007/s11269-011-9849-3

Kuo, C.C., Gan, T.Y., Yu, P.-S., 2010. Wavelet analysis on the variability, teleconnectivity, and predictability of the seasonal rainfall of Taiwan. Monthly Weather Review, 138, 162–175.

Lang, K.J., Hinton, G.E., 1988. The development of the time-delay neural network architecture for speech recognition. Technical Report CMU-CS-88-152.

Latrubesse, E.M., Arima, E.Y., Dunne, T., Park, E., Baker, V.R., d’Horta, F.M., Ribas, C.C., 2017. Damming the rivers of the Amazon basin. Nature, 546, 363–369. https://doi.org/10.1038/nature22333

Levy, M.C., Cohn, A., Lopes, A.V., Thompson, S.E., 2017. Addressing rainfall data selection uncertainty using connections between rainfall and streamflow. Scientific Reports, 7, 219. DOI: 10.1038/s41598-017-00128-5

Maheswaran, R., Khosa, R., 2012. Comparative study of different wavelets for hydrologic forecasting. Computers & Geosciences, 46, 284–295. https://doi.org/10.1016/j.cageo.2011.12.015

Mallat, S., 2009. A Wavelet Tour of Signal Processing. Academic Press, 832 p. https://doi.org/10.1016/B978-0-12-374370-1.X0001-8

Mehr, A.D., Kahya, E., Bagheri, F., Deliktas, E., 2014. Successive-station monthly streamflow prediction using neuro-wavelet technique. Earth Science Informatics, 7, 217–229. DOI: 10.1007/s12145-013-0141-3

Nash, J.E., Sutcliffe, J.V., 1970. River flow forecasting through conceptual models part I – A discussion of principles. Journal of Hydrology, 10, 282–290. http://doi.org/10.1016/0022-1694(70)90255-6

Nerantzaki, S.D., Papalexiou, S.M., 2019. Tails of extremes: Advancing a graphical method and harnessing big data to assess precipitation extremes. Advances in Water Resources, 134, Article Number: 103448.

Nourani, V., Baghanam, A.H., Adamowski, J., Kisi, O., 2014. Applications of hybrid wavelet–artificial intelligence models in hydrology: a review. Journal of Hydrology, 514, 358–377. https://doi.org/10.1016/j.jhydrol.2014.03.057

Nourani, V., Andalib, G., Sadikoglu, F., 2017. Multi-station streamflow forecasting using wavelet denoising and artificial intelligence models. Procedia Computer Science, 120, 617–624. DOI: 10.1016/j.procs.2017.11.287

Oliveira-Junior, J.F.D., Xavier, F.M.G., Teodoro, P.E., Gois, G.D., Delgado, R.C., 2017. Cluster analysis identified rainfall homogeneous regions in Tocantins State, Brazil. Bioscience Journal, 33, 333–340. https://doi.org/10.14393/BJ-v33n2-32739

Osborn, T.J., Wallace, C.J., Harris, I.C., Melvin, T.M., 2016. Pattern scaling using ClimGen: monthly-resolution future climate scenarios including changes in the variability of precipitation. Climatic Change, 134, 353–369. https://doi.org/10.1007/s10584-015-1509-9

Partal, T., Kişi, Ö., 2007. Wavelet and neuro-fuzzy conjunction model for precipitation forecasting. Journal of Hydrology, 342, 199–212. https://doi.org/10.1016/j.jhydrol.2007.05.026

Partal, T., Cigizoglu, H.K., 2009. Prediction of daily precipitation using wavelet–neural networks. Hydrological Sciences Journal, 54:2, 234–246. DOI: 10.1623/hysj.54.2.234

Partal, T., Cigizoglu, H.K., Kahya, E., 2015. Daily precipitation predictions using three different wavelet neural network algorithms by meteorological data. Stochastic Environmental Research and Risk Assessment, 29, 1317–1329. https://doi.org/10.1007/s00477-015-1061-1

Siad, S.M., Iacobellis, V., Zdruli, P., Gioia, A., Stavi, I., Hoogenboom, G., 2019. A review of coupled hydrologic and crop growth models. Agricultural Water Management, 224, Article Number: 105746.

Silva, I.D., Spatti, D.H., Flauzino, R.A., 2010. Redes neurais artificiais para engenharia e ciências aplicadas. Artliber, São Paulo, Brasil, 646 p.

Sulaiman, S.O., Shiri, J., Shiralizadeh, H., Kisi, O., Yaseen, Z.M., 2018. Precipitation pattern modeling using cross-station perception: regional investigation. Environmental Earth Sciences, 77, 709. https://doi.org/10.1007/s12665-018-7898-0

Tealab, A., Hefny, H., Badr, A., 2017. Forecasting of nonlinear time series using ANN. Future Computing and Informatics Journal, 2, 39–47. https://doi.org/10.1016/j.fcij.2017.05.001

Teodoro, P.E., de Oliveira-Júnior, J.F., Da Cunha, E.R., Correa, C.C.G., Torres, F.E., Bacani, V.M., Ribeiro, L.P., 2016. Cluster analysis applied to the spatial and temporal variability of monthly rainfall in Mato Grosso do Sul State, Brazil. Meteorology and Atmospheric Physics, 128, 197–209. DOI: 10.1007/s00703-015-0408-y

namic neural network approaches for runoff prediction. Journal of Hydrology, 535, 211–225. http://dx.doi.org/10.1016/j.jhydrol.2016.01.076

Percival, D.B., Walden, A.T., 2000. Wavelet methods for time series analysis. Cambridge Series in Statistical and Probabilistic Mathematics. 1st ed.
Cambridge University Press, Wang, X.Y., Li, X., Zhu, J., Tanajura, C.A., 2018. The Cambridge. strengthening of Amazonian precipitation during the wet Ramana, R.V., Krishna, B., Kumar, S.R., Pandey, N.G., 2013. season driven by tropical sea surface temperature forcing. Monthly rainfall prediction using wavelet neural network Environmental Research Letters, 13, Article Number: analysis. Water Resources Management, 27, 3697–3711. 094015. https://doi.org/10.1088/1748-9326/aadbb9 https://doi.org/10.1007/s11269-013-0374-4 Wilks, D.S., 1989. Conditioning stochastic daily precipitation Ramírez-Hernández, J., Infante-Prieto, S.O., Villa-Angulo, R., models on total monthly precipitation. Water Resources Re- Hallack-Alegría, M., 2016. La influencia del efecto de borde search, 25, 1429–1439. en el pronóstico de precipitaciones utilizando DWT diádica, https://doi.org/10.1029/WR025i006p01429 MODWT, ANN y ANFIS. Tecnología y ciencias del agua, Wilks, D.S., 1999. Interannual variability and extreme-value 73, 93–113. characteristics of several stochastic daily precipitation mod- Reichle, R.H., Liu, Q., Koster, R.D., Draper, C.S., Mahanama, els. Agricultural and Forest Meteorology, 93, 153–169. S.P., Partyka, G.S., 2017. Land surface precipitation in https://doi.org/10.1016/S0168-1923(98)00125-7 MERRA-2. Journal of Climate, 30, 1643–1664. Yaseen, Z.M., Jaafar, O., Deo, R.C., Kisi, O., Adamowski, J., https://doi.org/10.1175/JCLI-D-16-0570.1 Quilty, J., El-Shafie, A., 2016. Stream-flow forecasting us- Rivera, D., Lillo, M., Uvo, C.B., Billib, M., Arumí, J.L., 2012. ing extreme learning machines: A case study in a semi-arid Forecasting monthly precipitation in Central Chile: a self- region in Iraq. Journal of Hydrology, 542, 603–614. organizing map approach using filtered sea surface tempera- http://dx.doi.org/10.1016/j.jhydrol.2016.09.035 ture. Theoretical and Applied Climatology, 107, 1–13. Zhang, X., Peng, Y., Zhang, C., Wang, B., 2015. 
Are hybrid https://doi.org/10.1007/s00704-011-0453-5 models integrated with data preprocessing techniques suita- Sang, Y.F., 2012. A practical guide to discrete wavelet decom- ble for monthly streamflow forecasting? Some experiment position of hydrologic time series. Water Resources Man- evidences. J. Hydrol., 530, 137–152. agement, 26, 3345–3365. https://doi.org/10.1007/s11269- http://dx.doi.org/10.1016/j.jhydrol. 2015.09.047 012-0075-4 Zeri, M., Cunha-Zeri, G., Gois, G., Lyra, G.B., Oliveira‐ Santos, C.A., Freire, P.K., Silva, R.M.D., Akrami, S.A., 2019. Júnior, J.F., 2019. Exposure assessment of rainfall to inter- Hybrid wavelet neural network approach for daily inflow annual variability using the wavelet transform. International forecasting using Tropical Rainfall Measuring Mission data. Journal of Climatology, 39, 568–578. Journal of Hydrologic Engineering, 24, Article Number: https://doi.org/10.1002/joc.5812 04018062. https://doi.org/10.1061/(ASCE)HE.1943- 5584.0001725 Received 15 January 2020 Shoaib, M., Shamseldin, A.Y., Melville, B.W., Khan, M.M., Accepted 13 November 2020 2016. A comparison between wavelet based static and dy-

Journal: Journal of Hydrology and Hydromechanics (de Gruyter)

Published: Mar 1, 2021
