Linkage Between In-Stream Total Phosphorus and Land Cover in Chugoku District, Japan: An ANN Approach

J. Hydrol. Hydromech., 60, 2012, 1, 33-44 DOI: 10.2478/v10098-012-0003-6

BAHMAN JABBARIAN AMIRI1), K. P. SUDHEER2), NICOLA FOHRER3)

1) Department of Environmental Science, Faculty of Natural Resources, University of Tehran, Karaj, P.O. Box: 4314, Iran; Mailto: jabbarian@ut.ac.ir and j.amiri@yahoo.com; Telephone: +98-261-222-9721
2) Department of Hydrology and Water Resources Management, Ecology Centre, Institute of Nature Protection and Water Resources Management, Christian Albrecht Universität zu Kiel, Olshausenstrasse 75, Geb. I, 24118 Kiel, Germany.
3) Department of Civil Engineering, Indian Institute of Technology Madras, Chennai 600036, India.

Development of any area often leads to more intensive land use and an increase in the generation of pollutants. Modeling these changes is critical to evaluate emerging changes in land use and their effect on stream water quality. The objective of this study was to assess the impact of spatial patterns in land use and population density on the water quality of streams, under data scarcity, in the Chugoku district of Japan. The study employed an artificial neural network (ANN) technique to assess the relationship between the total phosphorus (TP) in river water and the land use in 21 river basins in the district, and the model was able to reasonably estimate the TP in the stream water. Uncertainty analysis of the ANN estimates was performed using the Monte Carlo framework, and the results indicated that the ANN model predictions are statistically similar to the characteristics of the measured TP values. It was observed that any reduction in forested area or increase in agricultural land in the watersheds may increase the TP concentration in the stream. Therefore, appropriate watershed management practices should be followed before making any land use change in the Chugoku district.
KEY WORDS: Water Quality Modeling, Land Use, Total Phosphorus, ANN, Uncertainty Analysis.

Bahman Jabbarian Amiri, K. P. Sudheer, Nicola Fohrer: RELATIONSHIP BETWEEN TOTAL PHOSPHORUS CONTENT IN THE STREAM AND VEGETATION COVER IN THE CHUGOKU DISTRICT, JAPAN: USE OF NEURAL NETWORKS. J. Hydrol. Hydromech., 60, 2012, 1; 51 refs., 6 figs., 5 tabs.

Development of an area is often associated with more intensive land use and the production of pollution. Modelling these changes and their effect on stream water quality is important. The aim of the study is to determine the effect of spatial changes in land use and of changes in population density on stream water quality under water scarcity in the Chugoku region, Japan. The solution uses artificial neural networks (ANN) to determine the relationship between total phosphorus (TP) in the stream and land use in 21 basins of the region; the model is able to calculate TP in streams. Uncertainty analysis of the ANN results was carried out by the Monte Carlo method; the results indicate that the ANN predictions are statistically similar to the measured TP values. It was found that a reduction in forest cover and an increase in the area of agricultural land in a basin can lead to an increase in TP concentration in the stream. Before any change in land use it is therefore necessary to adopt appropriate land management measures that minimize the negative consequences of land use change.

KEY WORDS: water quality modelling, land use, total phosphorus, ANN, uncertainty analysis.

Introduction

Fresh water quality in many countries is deteriorating due to uncontrolled urbanization and improper land management practices. Stream water quality is affected by numerous natural and anthropogenic sources (Ahearn et al., 2005; Schmalz et al., 2008). These sources can be either diffuse (e.g. runoff from urban and agricultural fields, interflow through organically rich soils) or point sources (e.g. industrial effluents).
In addition, watershed characteristics (topography and geology) can influence the water quality (Silva and Williams, 2001; Li et al., 2008). Water quality is generally linked to land use in the watershed (Hem, 1985; Ahearn et al., 2005; Amiri, 2007; Lam et al., 2009; Lam et al., 2010), and consequently many studies have focused on the relationship between land use and water quality in terms of dissolved salts, suspended solids (Kiesel et al., 2009), and nutrients (Hill, 1981; Allan et al., 1997; Turner and Rabalais, 2003; Li et al., 2008; among many others). Most of these studies concluded that land use strongly influences nitrogen (Johnson et al., 1997; Smart et al., 1998), phosphorus (Hill, 1991; Schmalz et al., 2007) and sediment concentrations (Allan et al., 1997; Ahearn et al., 2005; Kiesel et al., 2009) in stream water. Efficient management measures are therefore required to address the deterioration of stream water quality.

Water quality models are practical tools in catchment management because they apply current knowledge to predict water quality in response to various scenarios such as land use change (Liu et al., 2005; Lenhart et al., 2005). While fully distributed process-based water quality models are the most suited for developing catchment management decisions, they are necessarily complex because they attempt to describe all factors and processes so that the relative importance of these may be understood and investigated in response to environmental change (Dean et al., 2009). The key sources and processes controlling nutrient water quality characteristics are well established (Neitsch et al., 2002), but the understanding of how the sources and processes vary in time and space is still limited (Schmalz et al., 2007).
This is due to the heterogeneity of environmental factors, such as land use, soil type, moisture and temperature, and flow routing, which define source areas and control process rates and delivery from the land to the stream network. The data available to develop and apply predictive models are often insufficient, even for small research catchments. Thus, while it is useful to develop models based on process understanding, they will necessarily be simplifications of reality. In this context, data-driven models, which can discover relationships from input-output data without a complete physical understanding of the system, may be preferable.

In recent decades, the advent of increasingly efficient computing technology has provided exciting new tools for the mathematical modelling of dynamic systems. The artificial neural network (ANN) is one such tool; it relates a set of predictor variables to a set of target variables. Artificial neural networks are well-known massively parallel computing models that have exhibited excellent performance in the resolution of complex problems in science and engineering. In recent years, the ANN technique, which is a data-driven modelling tool, has become increasingly popular for water quality modelling among researchers and practicing engineers (e.g. Keiner and Yan, 1998; Gross et al., 1999; Sciller et al., 1999; Tanaka et al., 2000; Baruah et al., 2001; Panda et al., 2004; Gatts et al., 2005; Sudheer et al., 2006). Nonetheless, since the ANN is a data-demanding approach to model development, the uncertainty associated with ANN models developed in data-scarce situations may be very high, and the robustness of any model application will be affected. It is therefore important that the uncertainties in model predictions are well understood.
Estimating prediction uncertainties in water quality modelling is becoming increasingly appreciated (Krueger et al., 2007; Page et al., 2004, 2005; Radwan et al., 2004; Rode et al., 2007; Singh et al., 2007; van Griensven and Meixner, 2006). While various methods are available for quantifying the uncertainty in physical hydrologic models (Christiaens and Feyen, 2002; Beven and Binley, 1992), little discussion of the uncertainty analysis of ANN hydrologic models is found in the literature, apart from a few studies (Kingston et al., 2005; Khan and Coulibali, 2006; Han et al., 2006; Srivastav et al., 2007). The primary objective of the present paper was to investigate the relationship between the total phosphorus in river water and the land use in 21 river basins in the Chugoku district of Japan using an artificial neural network. The study also evaluated, within the Monte Carlo framework, the uncertainties associated with predicting stream water quality from human activities in the drainage basin (agriculture, forestry, industry, and urbanization) when the dataset is insufficient.

Artificial neural network

An ANN attempts to mimic, in a very simplified way, the human mental and neural structure and functions (Hsieh, 1993). It can be characterized as massively parallel interconnections of simple neurons that function as a collective system. The network topology consists of a set of nodes (neurons) connected by links and usually organized in a number of layers. Each node in a layer receives and processes weighted input from the previous layer and transmits its output to nodes in the following layer through links. Each link is assigned a weight, which is a numerical estimate of the connection strength. The weighted summation of inputs to a node is converted to an output according to a transfer function (typically a sigmoid function).
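The node computation described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the layer sizes (6 inputs, 4 hidden nodes, 1 output), the random weights, the bias terms and the input values are all assumptions chosen only to make the example runnable.

```python
import math
import random


def logistic(s):
    """Logistic (sigmoid) transfer function."""
    return 1.0 / (1.0 + math.exp(-s))


def layer_output(inputs, weights, biases):
    """Each node applies the transfer function to the weighted sum
    of its inputs plus a bias term (the bias is an assumption here)."""
    return [
        logistic(sum(x * w for x, w in zip(inputs, node_weights)) + b)
        for node_weights, b in zip(weights, biases)
    ]


def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Pass the inputs through one hidden layer and one output layer."""
    hidden = layer_output(inputs, hidden_w, hidden_b)
    return layer_output(hidden, output_w, output_b)


# Illustrative sizes and random weights: assumptions, not trained values.
rng = random.Random(0)
n_in, n_hidden, n_out = 6, 4, 1
hidden_w = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
hidden_b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
output_w = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
output_b = [rng.uniform(-1, 1) for _ in range(n_out)]

x = [0.02, 0.87, 0.06, 0.06, 0.003, 0.08]  # made-up, already scaled inputs
y = forward(x, hidden_w, hidden_b, output_w, output_b)
print(y)
```

Because the output node also uses a sigmoid, the value returned lies strictly between 0 and 1, so a target such as TP would have to be scaled into that range before training; the scaling itself is not shown here.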
Most ANNs have three or more layers: an input layer, which is used to present data to the network; an output layer, which is used to produce an appropriate response to the given input; and one or more intermediate layers, which act as a collection of feature detectors (Fig. 2). The multilayer perceptron (MLP) is the most popular ANN architecture in use today (Maier et al., 2010). It assumes that the unknown function (between input and output) is represented by a multilayer feed-forward network of sigmoid units. The working of a three-layer ANN can be described mathematically as follows.

Fig. 1. Location map of the Chugoku district, Japan.

Consider an ANN model with n input neurons $(x_1, \ldots, x_n)$, h hidden neurons $(z_1, \ldots, z_h)$, and m output neurons $(y_1, \ldots, y_m)$. Let i, j, and k be the indices representing the input, hidden, and output layers, respectively. Let $\theta_j$ be the bias for neuron $z_j$ and $\phi_k$ the bias for neuron $y_k$. Let $w_{ij}$ be the weight of the connection from neuron $x_i$ to neuron $z_j$ and $\beta_{jk}$ the weight of the connection from neuron $z_j$ to $y_k$. The function that the ANN calculates is:

$$y_k = g_A\left(\sum_{j=1}^{h} z_j \beta_{jk} + \phi_k\right) \qquad (1)$$

$$z_j = f_A\left(\sum_{i=1}^{n} x_i w_{ij} + \theta_j\right), \qquad (2)$$

where $g_A$ and $f_A$ are activation (transfer) functions, which are usually continuous, bounded, and non-decreasing. The most commonly employed transfer function is the logistic function, which is defined for any variable s as:

$$f(s) = \frac{1}{1 + e^{-s}} \qquad (3)$$

The training of an MLP involves finding the optimal weight vector for the network. There are many training techniques available. The aim of training the network is to find a global solution to the weight matrix, which is typically a nonlinear optimization problem (White, 1989). Consequently, the theory of nonlinear optimization is applicable to the training of an MLP. The choice of a particular method is generally a compromise between computational cost and performance; the most popular is the back-propagation algorithm (Rumelhart et al., 1986), which has been employed in the current study.

Fig. 2. General structure of a typical three-layer ANN.

Study area and data

The present study was carried out in the Chugoku district of Japan. The district is in the west of Honshu island, bounded by longitude 130° 55' 16'' and 133° 12' 11'', and latitude 33° 57' 40'' and 35° 23' 34''. The district is composed of five Prefectures (Hiroshima, Yamaguchi, Tottori, Shimane and Okayama), and covers an area of 32,000 km2 (Fig. 1). While there are a large number of small watersheds (51 in total) in this district, the current study was restricted to only 21 of them. The locations of these 21 watersheds are presented in Fig. 1. Watershed boundaries were digitized using the Japan Geological Survey Institute (JGSI) topographic quadrangle maps (scale 1 : 200,000). A county-scale population database was linked with the digital map of counties to generate a human population density map. The land cover maps of all 21 watersheds were derived from Landsat-5 TM imagery for the year 2000, using the Integrated Land and Water Information System (ILWIS, 2004). The study area has a mean annual precipitation of 1738 mm (average over a period of 10 years) and a reported mean temperature of 15.9 °C. The underlying geology is largely volcanic (andesite and rhyolite) in the central and northern parts, with Mesozoic sedimentary formations (sandstones/shale/pudding stone) in the western part and Quaternary sedimentary formations (gravel and clay) in the lowland of the study area. The dominant soil groups in the study area are Dystric Regosols, Gleysols, Humic Cambisols, Ochric Cambisols, and Rhodic Arcsols. The land use characteristics of the 21 watersheds are presented in Tab. 1. Forest covers 79.34% of the study area. Agriculture is the second largest land use, covering 8.33% of the total area.
The other land uses include urban (2.24%), grassland (6.63%) and water bodies (0.46%). Tab. 1 shows a wide variation of land uses across the watersheds. For example, the forest cover varies from 57.61% in Kurose to 91.76% in Nishki, and the urban area ranges from 1.04% in Nishki to 13.99% in Washino. Tab. 1 also presents the population density in each of these watersheds. The population data were based on the 2000 census. The population density ranges from 63 persons/km2 to 819 persons/km2. The water quality data used in this study were secondary data obtained from the five Prefecture offices (Hiroshima, Yamaguchi, Tottori, Shimane and Okayama). The annual mean of the water quality data for the year 2000 was calculated from monthly measurements, which are carried out by the respective Prefecture offices. The monitoring network maintained by the Prefecture offices is very extensive and is well distributed across the district. The current study used Total Phosphorus (TP) as the representative parameter of water quality. Tab. 2 presents the summary statistics of the water quality in the 21 watersheds in the study area. The coefficient of variation of TP in the various watersheds varied from 15 to 94%.

Table 1. Compositional attributes (%) of land cover and human population density in the catchments of the Chugoku district.

Basin   River       Area    Urban   Forest  Agriculture  Grassland  Water body  Population density
number  name        [km2]   [%]     [%]     [%]          [%]        [%]         [person/km2]
1       Awano        182     1.97   86.61    5.62         5.89      0.28         63
2       Kakefuchi     85     4.36   71.94   16.35         5.42      1.04        106
3       Fuka          72     6.39   84.17    5.01         4.27      0.10        150
4       Misumi        67     1.36   88.41    5.05         4.18      0.00         70
5       Hamada       253     8.48   78.59    3.72         6.87      1.01        176
6       Gonoo       2622     1.79   83.89    7.44         6.61      0.27         72
7       Shizuma      174     2.46   84.23    5.39        10.05      0.22        150
8       Kando        495     7.09   78.16   11.84         2.76      0.14        222
9       Numata       627     4.42   65.60   24.74         5.04      0.21        165
10      Kamo          98     3.79   78.92    7.62         9.63      0.00        211
11      Kurose       282    10.8    57.61   20.64        10.1       0.75        819
12      Ota         1700     4.74   85.22    4.29         4.71      0.29        386
13      Oze          354     3.01   86.91    3.31         6.15      0.56        183
14      Nishki       932     1.04   91.76    2.76         3.72      0.71        165
15      Shimada      284     5.91   77.75    8.14         7.72      0.43        267
16      Saba         572     2.62   88.35    2.89         5.61      0.53        225
17      Washino      300    13.99   72.84    6.96         5.89      0.30        434
18      Kotou        416     3.80   75.64    8.83        10.91      0.79        253
19      Ariho         98     9.23   72.76    9.24         8.34      0.44        477
20      Asa          226     6.24   78.09    7.14         7.94      0.52        109
21      Koya         299     4.67   78.67    7.97         7.46      1.07        362

Table 2. Descriptive statistics of the annual mean of TP in stream water.

Catchment No.  TP mean [mg L-1]  Standard deviation [mg L-1]
1              0.0255            (±0.024)
2              0.0412            (±0.017)
3              0.0325            (±0.009)
4              0.0238            (±0.007)
5              0.0618            (±0.021)
6              0.0320            (±0.015)
7              0.0713            (±0.011)
8              0.0398            (±0.022)
9              0.0420            (±0.013)
10             0.0250            (±0.001)
11             0.0460            (±0.008)
12             0.0210            (±0.013)
13             0.0320            (±0.016)
14             0.0129            (±0.002)
15             0.0384            (±0.014)
16             0.0453            (±0.018)
17             0.1644            (±0.062)
18             0.0453            (±0.019)
19             0.0546            (±0.025)
20             0.0540            (±0.027)
21             0.0796            (±0.033)

Methodology

ANN Model Development

One of the most important steps in the ANN model development process is the determination of significant input variables. Usually, not all of the potential input variables will be equally informative, since some may be correlated, noisy or have no significant relationship with the output variable being modelled (Maier and Dandy, 2000). Generally some degree of a priori knowledge is used to specify the initial set of candidate inputs (e.g. Campolo et al., 1999; Thirumalaiah and Deo, 2000). Although a priori identification is widely used in many applications and is necessary to define the candidate set of inputs, it depends on the expert's knowledge, and hence is very subjective and case dependent. Following the whole-watershed land cover approach suggested by Silva and Williams (2001), the percentages of the different land uses in each watershed were taken as input variables to map the in-stream TP concentration in the respective watershed.
While developing any model, the available samples are generally divided into training and validation sets prior to model building, and in some cases a cross-validation set is also used. The available data in the present study are limited, and therefore cross validation was not performed. Based on the early stopping idea (Rech, 2002), a data set is classified into three groups called training (60%), controlling (25%) and testing (15%) data sub-sets. The first subset is used to estimate the parameters. The second subset is called the validation set. The third subset, which is used to monitor the error of the calibration process, is called the cross-validation set. When the network begins to overfit the data, the error on the cross-validation set typically begins to rise. When the cross-validation error increases for a specified number of iterations, the estimation process is discontinued, and the parameters estimated at the minimum of the cross-validation error serve as final estimates (Hsieh, 1993). Given the number of rivers for which water quality data were available (21 basins), very few rivers (three basins) would be kept for testing the trained network, too few to provide reliable information for judging it. Therefore, no separate testing set was used, and those basins were added to the controlling data set. Finally, the twenty-one catchments were classified into two sets in the proportion of 60% and 40%, as training (12 river basins) and controlling (9 river basins) data, respectively. The controlling data set was also used for validation. However, the uncertainty analysis performed on the developed ANN model helps build confidence in the model predictions.

The ANN used here is the three-layer feed-forward network (Fig. 2), trained using the standard back-propagation algorithm, with a sigmoid transfer function in the hidden layer and the output layer; these choices were fixed after various trials. The number of hidden neurons in the network, which is responsible for capturing the dynamic and complex relationship between the input and output variables, was identified by trial and error (Eberhart and Dobbins, 1990; Maier and Dandy, 2000). The trial and error procedure started with two hidden neurons, and the number of hidden neurons was increased up to 13 in steps of 1. For each number of hidden neurons, the network was trained in batch mode to minimize the mean square error at the output layer. The training was stopped when there was no significant improvement in the efficiency, and the model was then tested for its generalization properties.

Uncertainty analysis and impact of land use change on TP

Because the limited number of data points may lead to high uncertainty, an uncertainty analysis of the ANN model predictions based on Monte Carlo simulations was performed to assess the confidence in the model predictions. Accordingly, the probability distribution of each input variable was first identified, and 500 synthetic combinations of land use in the watersheds were generated using the derived probability distributions. While generating the synthetic land uses, the following conditions were imposed: (i) each land use percentage should be within 0 to 100, (ii) the land use percentages for a watershed should sum to 100%, since the land use changes are not independent, and (iii) the generated land uses should follow the probability distribution of the variable concerned. Five hundred such random combinations of land uses were generated in this procedure, and all of them were used as inputs to the trained (and tested) ANN model.
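One possible way to generate such combinations is sketched below. The text does not state how the three conditions were enforced together, so this is only an assumption: each land use share is drawn independently and each combination is then rescaled to sum to 100%. The gamma distributions and their parameters are hypothetical placeholders, not the distributions fitted in the study.

```python
import random

LAND_USES = ["urban", "forest", "agriculture", "grassland", "water"]

# Hypothetical gamma (shape, scale) pairs per land use class.
# These are placeholders, NOT the distributions fitted in the study.
PLACEHOLDER_PARAMS = {
    "urban": (2.0, 2.0),
    "forest": (60.0, 1.2),
    "agriculture": (4.0, 4.0),
    "grassland": (30.0, 0.4),
    "water": (1.5, 0.3),
}


def draw_composition(rng):
    """Draw one synthetic land use combination (percentages)."""
    # Independent positive draws, one per land use class.
    raw = {name: rng.gammavariate(*PLACEHOLDER_PARAMS[name]) for name in LAND_USES}
    total = sum(raw.values())
    # Rescaling enforces conditions (i) and (ii): every share lies
    # within 0 to 100 and the shares sum to 100.
    return {name: 100.0 * value / total for name, value in raw.items()}


def generate(n_samples, seed=0):
    """Generate n_samples synthetic combinations with a fixed seed."""
    rng = random.Random(seed)
    return [draw_composition(rng) for _ in range(n_samples)]


samples = generate(500)
print(samples[0])
```

Rescaling changes the marginal distribution of each share, so condition (iii) holds only approximately under this scheme. A stricter alternative would be to reject and redraw combinations whose raw total falls outside a narrow band around 100.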
The generated synthetic land uses represent any combination of land use that is plausible in the future and can therefore be employed for assessing the impact of land use change on TP concentrations in the study area.

Results and discussions

Performance of ANN Model

Tab. 3 presents the performance of the ANN models (in terms of the R-square between the computed and measured TP values) during the trial and error procedure used for model development. The trials included varying the number of hidden neurons in the network, as well as training the network with different initial weights. Tab. 3 shows that the performance of the ANN model varies with the number of hidden neurons in the model architecture, as expected. It can also be seen from Tab. 3 that ANNs with more than 5 neurons may be overfitting the data, as the standard deviation of performance between trials (different initial weights) is quite high. The maximum performance with minimum coefficient of variation is observed for the model with 4 neurons, which is selected for further analysis, thereby making the architecture of the ANN 6-4-1. The minimum coefficient of variation during the trials implies that the trained network has minimum uncertainty. Fig. 3 depicts the scatter plot of measured and ANN estimated TP values in the watersheds. The plot shows little scatter, and the data points do not deviate significantly from the 1 : 1 line (solid lines in the plot). It can therefore be inferred that the trained ANN has sufficiently learned the relationship between the input and output variables. The model showed an RMSE of 0.0018 mg L-1 on the validation data set.

Statistical variability of input variables

The probability distribution of land use and population density was quantified using the Best Fit program (Palisades Corp. CA, USA), which fitted 28 different distributions to the data and ranked them according to the specified goodness of fit criterion.
The parameters of the probability distributions were estimated using the maximum-likelihood estimator (Haan, 2002). The Chi-Square goodness of fit test was performed to evaluate and rank the distributions that best described the data. The statistical properties of the best fitted distribution for each of the variables are presented in Tab. 4 and the shape of the distribution is presented in Fig. 4. The major land use in the watershed, forest, followed a 3-parameter log-logistic distribution, while agricultural land use followed a log-Pearson type III distribution.

Table 3. R-square value between the ANN output and measured TP value during the model development stage.

Number of hidden         R-square during trials with         Summary statistics of the
neurons in the network   different initial weights           performance during trials
                         1      2      3      4              Mean    S.D.
1                        0.45   0.45   0.45   0.46           0.45    0.00
2                        0.55   0.53   0.47   0.52           0.52    0.04
3                        0.56   0.51   0.51   0.53           0.53    0.02
4                        0.57   0.62   0.53   0.54           0.56    0.03
5                        0.97   0.82   0.57   0.54           0.76    0.17
6                        0.55   0.53   0.51   0.86           0.60    0.18
7                        0.52   0.58   0.53   0.91           0.62    0.15
8                        0.54   0.54   0.54   0.88           0.61    0.15
9                        0.56   0.54   0.87   0.53           0.70    0.18
10                       0.55   0.54   0.54   0.56           0.54    0.16
11                       0.55   0.55   0.89   0.91           0.76    0.17
12                       0.99   0.56   0.57   0.53           0.64    0.19
13                       0.87   0.89   0.93   0.52           0.75    0.20

Fig. 3. Comparison of the measured and ANN estimated TP values in the district during (a) training and (b) validation of the ANN model. [Scatter plots; x-axis: measured TP (mg/L), y-axis: ANN estimated TP (mg/L).]

Table 4. Results of the distribution fitting of the 21 catchments data set.
Variable                  Statistical distribution  Parameters
Urban                     Dagum                     k = 0.607; 3.1893; 5.6559
Forest                    Log-Logistic (3P)         1.9051E+8; 8.5099E+8; -8.5099E+8
Agriculture               Log-Pearson 3             1.8099; 4.1985; 2.258
Grassland                 Johnson                   0.46987; 1.0045; 11.236; 2.0892
Water body                Johnson                   0.5658; 0.76132; 1.397; -0.0481
Human population density  Log-Normal                0.64718; 5.2726

Uncertainty analysis of ANN output

As mentioned earlier, the generated samples of land use combinations and population density in the watersheds, whose statistics are summarized in Tab. 5, are given to the trained ANN model to obtain the output (TP). The ANN estimated TP values (500 in number) are used to identify the probability distribution that they follow. Both the measured and predicted values of TP followed an exponential distribution. The shape and location parameters of the probability distributions followed by the measured and predicted TP were sufficiently close to each other, indicating good confidence in the model predictions. The parameter value of the exponential distribution (Schmidt and Makalic, 2009) for the measured TP was 20.58, while that for the ANN generated data was 21.25. Fig. 5 shows the probability distribution followed by the measured values of TP and by the ANN estimated values of TP obtained from the synthetic replicates of land use. The distribution characteristics of measured TP and ANN estimated TP do not differ much, indicating that the ANN estimated TP is statistically similar to the measured values. The results of the analysis indicate that the ANN model predictions are satisfactory and can be employed for decision-making.

Impact of land use change on TP

Fig. 6 depicts the impact of land use change on TP concentration in the streams of the Chugoku district, Japan.
It can be observed that any decrease in forest cover will increase the TP concentration in the stream flow for this region. This is expected, since forest cover reduces erosion from the watershed, and phosphorus transport is lower in densely forested areas. The results of the analysis indicate that the rate of increase in TP will be high when the forested area is less than 30% of any catchment. These results also reveal that the forest ecosystem could play a significant role in improving in-stream water quality in the rivers of the study area.

Fig. 4. Probability distribution of the land use variation in the Chugoku district.

Table 5. Summary of the statistics of the synthetically generated data set, which was used for uncertainty analysis of the ANN-based TP model.

Statistics                Urban   Forest  Agriculture  Grassland  Water body  Human population density
Mean                       4.10   65.09   18.24        12.18      0.39          242.46
Standard error             0.12    0.35    0.40         0.10      0.01            8.83
Median                     3.52   66.34   16.09        12.07      0.31          181.89
Standard deviation         2.72    7.81    8.94         2.17      0.30          197.36
Sample variance            7.37   61.00   80.00         4.69      0.09        38950.36
Kurtosis                   2.36    1.97    2.60        -0.12      4.12            9.67
Skewness                   1.35   -1.07    1.32         0.13      1.66            2.58
Range                     16.80   52.30   56.35        13.07      1.89         1434.06
Minimum                    0.00   27.93    4.39         5.18      0.00           45.52
Maximum                   16.80   80.23   60.75        18.26      1.90         1479.59
Confidence level (95.0%)   0.24    0.69    0.79         0.19      0.03           17.34

Agricultural land use is of major concern for TP loading in streams, mainly due to fertilizer application on the land. The results presented in Fig. 6 suggest that effective management strategies are required to contain the increase in TP concentration in stream flow if intensive agriculture is planned in the watersheds. Any increase of agricultural use beyond 60% of the total area will significantly increase the TP concentrations.
It may be noted that an increase in agricultural land use would come mostly at the expense of forest, and the changes in both land uses jointly contribute to the increase in TP. Fig. 6 also shows that an increase in grassland would raise the TP concentration in the stream flow, plausibly due to fertilizer and manure application. The rate of increase of TP due to a change in the area of grassland is higher than that for agricultural land use. Fig. 6 also suggests some thresholds in the land use subsystem of the catchments. If the catchment system passes through these thresholds, a drastic change will appear in the response of the catchment in terms of TP concentration.

Fig. 5. Probability distribution of the measured and ANN estimated Total Phosphorus (TP).

Fig. 6. ANN model predicted variation of TP along the variation of land use percentage.

Summary and conclusions

Rapidly growing countries, particularly those in transition from rural to urban, frequently lack the resources to adequately monitor and forecast the impacts of land development on biologic, hydrologic, and social resources. Development often leads to more intensive land use and a related increase in the generation of pollutants. Modeling these changes is critical to evaluate emerging land use and potential water quality problems in the watershed. The objective of this study was to assess the effect of the nature and emergence of spatial patterns in land use and population density on the water quality of streams in the Chugoku district of Japan. The study employed an artificial neural network (ANN) technique to assess the relationship between the total phosphorus in river water and the land use in 21 river basins in the district. The results suggest that the ANN could be a viable tool to relate land use characteristics to water quality. The developed model was able to reasonably estimate the Total Phosphorus (TP) in the stream water.
The study also evaluated the uncertainties associated with water quality predictions from the ANN model using the Monte Carlo framework. The results of the uncertainty analysis indicated that the ANN model predictions are statistically similar to the characteristics of the measured TP values in the 21 watersheds. The study showed that any reduction in forested area or increase in agricultural land in the watersheds may cause an increase in TP concentrations in the stream water, and therefore appropriate watershed management practices should be followed before making any land use change in the Chugoku district.

Acknowledgements. The first author gratefully acknowledges the postdoctoral fellowship from the Alexander von Humboldt Foundation. The second author acknowledges the STAR program of Deutscher Akademischer Austausch Dienst (DAAD), Germany for financial support to visit Kiel University to take part in this work.

Journal of Hydrology and Hydromechanics, de Gruyter

Publisher
de Gruyter
Copyright
Copyright © 2012 by the
ISSN
0042-790X
DOI
10.2478/v10098-012-0003-6

Abstract

J. Hydrol. Hydromech., 60, 2012, 1, 33–44
DOI: 10.2478/v10098-012-0003-6

BAHMAN JABBARIAN AMIRI 1), K. P. SUDHEER 2), NICOLA FOHRER 3)

1) Department of Environmental Science, Faculty of Natural Resources, University of Tehran, Karaj, P.O. Box: 4314, Iran; E-mail: jabbarian@ut.ac.ir and j.amiri@yahoo.com; Telephone: +98-261-222-9721
2) Department of Hydrology and Water Resources Management, Ecology Centre, Institute of Nature Protection and Water Resources Management, Christian Albrecht Universität zu Kiel, Olshausenstrasse 75, Geb. I, 24118 Kiel, Germany.
3) Department of Civil Engineering, Indian Institute of Technology Madras, Chennai 600036, India.

Development of any area often leads to more intensive land use and an increase in the generation of pollutants. Modeling these changes is critical to evaluate emerging changes in land use and their effect on stream water quality. The objective of this study was to assess the impact of spatial patterns in land use and population density on the water quality of streams, in case of data scarcity, in the Chugoku district of Japan. The study employed the artificial neural network (ANN) technique to assess the relationship between the total phosphorus (TP) in river water and the land use in 21 river basins in the district, and the model was able to reasonably estimate the TP in the stream water. Uncertainty analysis of ANN estimates was performed using the Monte Carlo framework, and the results indicated that the ANN model predictions are statistically similar to the characteristics of the measured TP values. It was observed that any reduction in forested area or increase in agricultural land in the watersheds may cause an increase of TP concentration in the stream. Therefore, appropriate watershed management practices should be followed before making any land use change in the Chugoku district.

KEY WORDS: Water Quality Modeling, Land Use, Total Phosphorus, ANN, Uncertainty Analysis.

Bahman Jabbarian Amiri, K. P. Sudheer, Nicola Fohrer: RELATIONSHIP BETWEEN TOTAL PHOSPHORUS CONTENT IN STREAMS AND LAND COVER IN THE CHUGOKU DISTRICT, JAPAN: USE OF NEURAL NETWORKS. J. Hydrol. Hydromech., 60, 2012, 1; 51 refs., 6 figs., 5 tabs. Development of an area is often associated with more intensive land use and the production of pollution. Modelling these changes and their effect on stream water quality is important. The aim of the study is to determine the influence of spatial changes in land use and changes in population density on stream water quality, at a time of water scarcity, in the Chugoku area, Japan. Artificial neural networks (ANN) are used to determine the relationship between the total phosphorus content (TP) in the stream and land use in 21 basins of the area; this model is able to calculate TP in streams. The uncertainty analysis of the ANN results was carried out with the Monte Carlo method; the results indicate that the ANN predictions are statistically similar to the measured TP values. It was found that a reduction in forest cover and an increase in the area of agricultural land in a basin may lead to an increase in TP concentration in the stream. Before changing land use it is therefore necessary to adopt appropriate landscape management measures that minimize the negative consequences of land use change. KEY WORDS: water quality modelling, land use, total phosphorus content, ANN, uncertainty analysis.

Introduction

Fresh water quality in many countries is deteriorating due to uncontrolled urbanization and improper land management practices. Stream water quality is affected by numerous natural and anthropogenic sources (Ahearn et al., 2005; Schmalz et al., 2008). These sources can be either diffuse (e.g. runoff from urban and agricultural fields, interflow through organically rich soils) or point sources (e.g. industrial effluents). In addition, watershed characteristics (topography and geology) can influence the water quality (Silva and Williams, 2001; Li et al., 2008).
Water quality is generally linked to land use in the watershed (Hem, 1985; Ahearn et al., 2005; Amiri, 2007; Lam et al., 2009; Lam et al., 2010), and consequently many studies have focused on the relationship between land use and water quality in terms of dissolved salts, suspended solids (Kiesel et al., 2009), and nutrients (Hill, 1981; Allan et al., 1997; Turner and Rabalais, 2003; Li et al., 2008; among many others). Most of these studies concluded that land use strongly influences nitrogen (Johnson et al., 1997; Smart et al., 1998), phosphorus (Hill, 1991; Schmalz et al., 2007) and sediment concentrations (Allan et al., 1997; Ahearn et al., 2005; Kiesel et al., 2009) in stream water. Efficient management measures are therefore required to address the deterioration of stream water quality.

Water quality models are practical tools in catchment management because they apply current knowledge to predict water quality in response to various scenarios such as land use change (Liu et al., 2005; Lenhart et al., 2005). While fully distributed process-based water quality models are the most suited for developing catchment management decisions, they are necessarily complex because they attempt to describe all factors and processes so that the relative importance of these may be understood and investigated in response to environmental change (Dean et al., 2009). The key sources and processes controlling nutrient water quality characteristics are well established (Neitsch et al., 2002), but the understanding of how the sources and processes vary in time and space is still limited (Schmalz et al., 2007). This is due to the heterogeneity of the environmental factors that define source areas and control process rates and delivery from the land to the stream network, such as land use, soil type, moisture and temperature, and flow routing.
The data available to develop and apply predictive models are often insufficient, even for small research catchments. Thus, while it is useful to develop models based on process understanding, they will always necessarily be simplifications of reality. In this context, data-driven models, which can discover relationships from input-output data without a complete physical understanding of the system, may be preferable. In recent decades, the advent of increasingly efficient computing technology has provided exciting new tools for the mathematical modelling of dynamic systems. The artificial neural network (ANN) is one such tool that relates a set of predictor variables to a set of target variables. Artificial neural networks are well-known massively parallel computing models that have exhibited excellent performance in the resolution of complex problems in science and engineering. In recent years, the ANN technique, which is a data-driven modelling tool, has become increasingly popular for water quality modelling among researchers and practicing engineers (e.g. Keiner and Yan, 1998; Gross et al., 1999; Sciller et al., 1999; Tanaka et al., 2000; Baruah et al., 2001; Panda et al., 2004; Gatts et al., 2005; Sudheer et al., 2006).

Nonetheless, since ANN is a data-demanding approach for model development, the uncertainty associated with ANN models developed in data-scarce situations may be very high, and the robustness of any model application will be affected. It is therefore important that the uncertainties in model predictions are well understood. Estimating prediction uncertainties in water quality modelling is becoming increasingly appreciated (Krueger et al., 2007; Page et al., 2004, 2005; Radwan et al., 2004; Rode et al., 2007; Singh et al., 2007; van Griensven and Meixner, 2006).
While there are various methods available for quantifying the uncertainty in physical hydrologic models (Christiaens and Feyen, 2002; Beven and Binley, 1992), little discussion is found in the literature regarding the uncertainty analysis of ANN hydrologic models, with a few exceptions (Kingston et al., 2005; Khan and Coulibali, 2006; Han et al., 2006; Srivastav et al., 2007). The primary objective of the present paper was to investigate the relationship between the total phosphorus in river water and the land use in 21 river basins in the Chugoku district of Japan using an artificial neural network. The study also evaluated, within the Monte Carlo framework, the uncertainties associated with predicting stream water quality from human activities in the drainage basin (agriculture, forestry, industry, and urbanization) when the dataset is insufficient.

Artificial neural network

An ANN attempts to mimic, in a very simplified way, the human mental and neural structure and functions (Hsieh, 1993). It can be characterized as massively parallel interconnections of simple neurons that function as a collective system. The network topology consists of a set of nodes (neurons) connected by links and usually organized in a number of layers. Each node in a layer receives and processes weighted input from the previous layer and transmits its output to nodes in the following layer through links. Each link is assigned a weight, which is a numerical estimate of the connection strength. The weighted summation of inputs to a node is converted to an output according to a transfer function (typically a sigmoid function). Most ANNs have three layers or more: an input layer, which is used to present data to the network; an output layer, which is used to produce an appropriate response to the given input; and one or more intermediate layers, which act as a collection of feature detectors (Fig. 2).
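The node computation described above (a weighted sum of the inputs passed through a sigmoid transfer function, layer by layer) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the 3-2-1 layout, and all weight values are hypothetical, and the additive bias per node is an assumption not stated in the paragraph above.

```python
import math

def logistic(s):
    """Sigmoid transfer function: maps any real s into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-s))

def layer_forward(inputs, weights, biases):
    """Output of one layer.

    Each node sums its weighted inputs, adds a bias (assumed here),
    and passes the result through the transfer function.
    weights[j][i] is the weight on the link from input i to node j.
    """
    return [
        logistic(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

def three_layer_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer_forward(x, w_hidden, b_hidden)
    return layer_forward(hidden, w_out, b_out)

# Hypothetical 3-2-1 network with made-up weights (not from the study).
x = [0.2, 0.5, 0.3]
w_hidden = [[0.1, -0.4, 0.3], [0.7, 0.2, -0.5]]
b_hidden = [0.0, 0.1]
w_out = [[0.6, -0.3]]
b_out = [0.05]
y = three_layer_forward(x, w_hidden, b_hidden, w_out, b_out)
```

Because the output node also uses a sigmoid in this sketch, `y` always lies strictly between 0 and 1, so a target such as TP would have to be rescaled to that range before training.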
The multilayer perceptron (MLP) is the most popular ANN architecture in use today (Maier et al., 2010). It assumes that the unknown function (between input and output) is represented by a multilayer feed-forward network of sigmoid units. The working of a three-layer ANN can be described mathematically as follows.

Fig. 1. Location map of the Chugoku district, Japan.

Consider an ANN model with n input neurons ($x_1, \ldots, x_n$), h hidden neurons ($z_1, \ldots, z_h$), and m output neurons ($y_1, \ldots, y_m$). Let i, j, and k be the indices representing the input, hidden, and output layers, respectively. Let $\alpha_j$ be the bias for neuron $z_j$ and $\beta_k$ the bias for neuron $y_k$. Let $w_{ij}$ be the weight of the connection from neuron $x_i$ to neuron $z_j$ and $\omega_{jk}$ the weight of the connection from neuron $z_j$ to $y_k$. The function that the ANN calculates is:

$$y_k = g_A\left(\sum_{j=1}^{h} z_j\,\omega_{jk} + \beta_k\right) \qquad (1)$$

$$z_j = f_A\left(\sum_{i=1}^{n} x_i\,w_{ij} + \alpha_j\right) \qquad (2)$$

where $g_A$ and $f_A$ are activation (transfer) functions, which are usually continuous, bounded, and non-decreasing. The most commonly employed transfer function is the logistic function, which is defined for any variable s:

$$f(s) = \frac{1}{1 + e^{-s}} \qquad (3)$$

The training of an MLP involves finding the optimal weight vector for the network. There are many training techniques available. The aim of training the network is to find a global solution to the weight matrix, which is typically a nonlinear optimization problem (White, 1989). Consequently, the theory of nonlinear optimization is applicable to the training of an MLP. The suitability of a particular method is generally a compromise between computation cost and performance. The most popular method is the back propagation algorithm (Rumelhart et al., 1986), which has been employed in the current study.

Fig. 2. General structure of a typical three-layer ANN.

Study area and data

The present study was carried out in the Chugoku district of Japan. The district is in the west of Honshu island, bounded by longitude 130° 55' 16'' and 133° 12' 11'', and latitude 33° 57' 40'' and 35° 23' 34''.
The district is composed of five Prefectures (Hiroshima, Yamaguchi, Tottori, Shimane and Okayama), and covers an area of 32,000 km² (Fig. 1). While there are a large number of small watersheds (51 in total) in this district, the current study was restricted to only 21 of them. The locations of these 21 watersheds are presented in Fig. 1. Watershed boundaries were digitized using the Japan Geological Survey Institute (JGSI) topographic quadrangle maps (scale 1 : 200,000). A county-scale population database was linked with the digital map of counties to generate a human population density map. The land cover maps of all 21 watersheds were derived from Landsat-5 TM imagery for the year 2000, using the Integrated Land and Water Information System (ILWIS, 2004).

The study area experienced a mean annual precipitation of 1738 mm (average over a period of 10 years), and the mean temperature was reported to be 15.9 °C. The underlying geology is largely volcanic (andesite and rhyolite) in the central and northern parts, Mesozoic sedimentary formations (sandstones/shale/pudding stone) in the western part, and Quaternary sedimentary formations (gravel and clay) in the lowland of the study area. The dominant soil groups in the study area are Dystric Regosols, Gleysols, Humic Cambisols, Ochric Cambisols, and Rhodic Acrisols.

The land use characteristics of the 21 watersheds are presented in Tab. 1. It is noted that 79.34% of the study area is under forest cover. Agriculture is the second largest land use, covering 8.33% of the total area. The other land uses include urban (2.24%), grassland (6.63%) and water bodies (0.46%). It is noted from Tab. 1 that there is a wide variation of land uses across the watersheds. For example, the forest cover varies from 57.61% in Kurose to 91.76% in Nishki, and the urban area ranges from 1.04% in Nishki to 13.99% in Washino. Tab. 1 also presents the population density in each of these watersheds.
The population data were based on the 2000 census. The population density ranges from 63 persons/km² to 819 persons/km². The water quality data used in this study were secondary data, obtained from the five Prefecture offices (Hiroshima, Yamaguchi, Tottori, Shimane and Okayama). The annual mean water quality values for the year 2000 were calculated from monthly measurements carried out by the respective Prefecture offices. The monitoring network maintained by the Prefecture offices is very extensive and is well distributed across the district. The current study used Total Phosphorus (TP) as the representative parameter of the water quality. Tab. 2 presents the summary statistics of the water quality in the 21 watersheds in the study area. It is observed that the coefficient of variation of TP in the various watersheds varied from 15 to 94%.

Table 1. Compositional attributes (%) of land cover and human population density in the catchments of the Chugoku district.

| Basin number | River name | Area [km²] | Agriculture [%] | Grassland [%] | Urban [%] | Forest [%] | Water body [%] | Population density [person/km²] |
|---|---|---|---|---|---|---|---|---|
| 1 | Awano | 182 | 5.62 | 5.89 | 1.97 | 86.61 | 0.28 | 63 |
| 2 | Kakefuchi | 85 | 16.35 | 5.42 | 4.36 | 71.94 | 1.04 | 106 |
| 3 | Fuka | 72 | 5.01 | 4.27 | 6.39 | 84.17 | 0.10 | 150 |
| 4 | Misumi | 67 | 5.05 | 4.18 | 1.36 | 88.41 | 0.00 | 70 |
| 5 | Hamada | 253 | 3.72 | 6.87 | 8.48 | 78.59 | 1.01 | 176 |
| 6 | Gonoo | 2622 | 7.44 | 6.61 | 1.79 | 83.89 | 0.27 | 72 |
| 7 | Shizuma | 174 | 5.39 | 10.05 | 2.46 | 84.23 | 0.22 | 150 |
| 8 | Kando | 495 | 11.84 | 2.76 | 7.09 | 78.16 | 0.14 | 222 |
| 9 | Numata | 627 | 24.74 | 5.04 | 4.42 | 65.60 | 0.21 | 165 |
| 10 | Kamo | 98 | 7.62 | 9.63 | 3.79 | 78.92 | 0.00 | 211 |
| 11 | Kurose | 282 | 20.64 | 10.1 | 10.8 | 57.61 | 0.75 | 819 |
| 12 | Ota | 1700 | 4.29 | 4.71 | 4.74 | 85.22 | 0.29 | 386 |
| 13 | Oze | 354 | 3.31 | 6.15 | 3.01 | 86.91 | 0.56 | 183 |
| 14 | Nishki | 932 | 2.76 | 3.72 | 1.04 | 91.76 | 0.71 | 165 |
| 15 | Shimada | 284 | 8.14 | 7.72 | 5.91 | 77.75 | 0.43 | 267 |
| 16 | Saba | 572 | 2.89 | 5.61 | 2.62 | 88.35 | 0.53 | 225 |
| 17 | Washino | 300 | 6.96 | 5.89 | 13.99 | 72.84 | 0.30 | 434 |
| 18 | Kotou | 416 | 8.83 | 10.91 | 3.80 | 75.64 | 0.79 | 253 |
| 19 | Ariho | 98 | 9.24 | 8.34 | 9.23 | 72.76 | 0.44 | 477 |
| 20 | Asa | 226 | 7.14 | 7.94 | 6.24 | 78.09 | 0.52 | 109 |
| 21 | Koya | 299 | 7.97 | 7.46 | 4.67 | 78.67 | 1.07 | 362 |

Table 2. Descriptive statistics of the annual mean of TP in stream water.

| Catchment No. | Mean TP [mg L⁻¹] | Standard deviation [mg L⁻¹] |
|---|---|---|
| 1 | 0.0255 | ±0.024 |
| 2 | 0.0412 | ±0.017 |
| 3 | 0.0325 | ±0.009 |
| 4 | 0.0238 | ±0.007 |
| 5 | 0.0618 | ±0.021 |
| 6 | 0.0320 | ±0.015 |
| 7 | 0.0713 | ±0.011 |
| 8 | 0.0398 | ±0.022 |
| 9 | 0.0420 | ±0.013 |
| 10 | 0.0250 | ±0.001 |
| 11 | 0.0460 | ±0.008 |
| 12 | 0.0210 | ±0.013 |
| 13 | 0.0320 | ±0.016 |
| 14 | 0.0129 | ±0.002 |
| 15 | 0.0384 | ±0.014 |
| 16 | 0.0453 | ±0.018 |
| 17 | 0.1644 | ±0.062 |
| 18 | 0.0453 | ±0.019 |
| 19 | 0.0546 | ±0.025 |
| 20 | 0.0540 | ±0.027 |
| 21 | 0.0796 | ±0.033 |

Methodology

ANN Model Development

One of the most important steps in the ANN model development process is the determination of significant input variables. Usually, not all of the potential input variables will be equally informative, since some may be correlated, noisy or have no significant relationship with the output variable being modelled (Maier and Dandy, 2000). Generally some degree of a priori knowledge is used to specify the initial set of candidate inputs (e.g. Campolo et al., 1999; Thirumalaiah and Deo, 2000). Although a priori identification is widely used in many applications and is necessary to define the candidate set of inputs, it is dependent on the expert's knowledge, and hence is very subjective and case dependent. Relying on the whole-watershed land cover approach suggested by Silva and Williams (2001), the percentages of the different land uses in each watershed have been considered as input variables to map the in-stream TP concentration in the respective watershed.
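As a small illustration of how such whole-watershed inputs might be assembled, the sketch below builds one input vector per watershed from three rows of Tab. 1. The function names are hypothetical, and treating population density as a sixth input alongside the five land-cover percentages is an assumption that appears consistent with the 6-4-1 architecture reported in the results, not something this paragraph states.

```python
# Rows copied from Tab. 1:
# (river, agriculture %, grassland %, urban %, forest %, water body %, persons/km2)
WATERSHEDS = [
    ("Kurose", 20.64, 10.10, 10.80, 57.61, 0.75, 819),
    ("Nishki", 2.76, 3.72, 1.04, 91.76, 0.71, 165),
    ("Washino", 6.96, 5.89, 13.99, 72.84, 0.30, 434),
]

def input_vector(row):
    """Six candidate inputs: five land-cover percentages plus population density."""
    _, agri, grass, urban, forest, water, population = row
    return [agri, grass, urban, forest, water, float(population)]

def land_cover_total(row):
    """Sum of the five land-cover percentages; should be close to 100."""
    return sum(row[1:6])

vectors = {row[0]: input_vector(row) for row in WATERSHEDS}
totals = {row[0]: round(land_cover_total(row), 2) for row in WATERSHEDS}
```

Checking that each row's land-cover percentages add up to roughly 100 is a cheap sanity test before training, since the land uses are compositional and cannot vary independently.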
While developing any model, the total available samples are generally divided into training and validation sets prior to the model building, and in some cases a cross-validation set is also used. It should be noted that the available data in the present study are limited and therefore cross validation was not performed. Based on the early stopping idea (Rech, 2002), a data set is classified into three groups called training (60%), controlling (25%) and testing (15%) data sub-sets. The first subset is used to estimate the parameters. The second subset is called the validation set. The third subset, which is used to monitor the error of the calibration process, is called the cross-validation set. When the network begins to overfit the data, the error on the cross-validation set typically begins to rise. When the cross-validation error increases for a specified number of iterations, the estimation process is discontinued, and the parameters estimated at the minimum of the cross-validation error serve as final estimates (Hsieh, 1993).

Considering the number of rivers for which water quality data were available (21 basins), very few rivers (three basins) could be kept aside for testing the trained network, and these would not provide reliable information for judging it. Testing the trained networks on a separate test set was therefore abandoned, and those basins were added to the controlling data set. Finally, the twenty-one catchments were classified into two sets in proportions of approximately 60% and 40%, as training (12 river basins) and controlling (9 river basins) data, respectively. The controlling dataset was then used for validation as well. However, the uncertainty analysis performed on the developed ANN model would certainly help build confidence in the model predictions.

The ANN used here is the three-layer feed-forward network (Fig. 2), trained using the standard back propagation algorithm, with a sigmoid transfer function in the hidden layer and the output layer; these choices were fixed after various trials. The number of hidden neurons in the network, which is responsible for capturing the dynamic and complex relationship between the various input and output variables, was identified by trial and error (Eberhart and Dobbins, 1990; Maier and Dandy, 2000). The trial and error procedure started with two hidden neurons, and the number of hidden neurons was increased up to 13 with a step size of 1 in each trial. For each number of hidden neurons, the network was trained in batch mode to minimize the mean square error at the output layer. The training was stopped when there was no significant improvement in the efficiency, and the model was then tested for its generalization properties.

Uncertainty analysis and impact of land use change on TP

Because the limited number of data points may give rise to high uncertainty, an uncertainty analysis of the ANN model predictions based on Monte Carlo simulations was performed to assess the confidence in model predictions. Accordingly, the probability distribution of each input variable was first identified, and 500 synthetic combinations of land use in the watersheds were generated using the derived probability distributions. While generating the synthetic land uses, the following conditions were considered: (i) each land use percentage should be within 0 to 100, (ii) the land use percentages for a watershed should sum to 100%, since the land use changes are not independent, and (iii) each generated land use should follow the probability distribution of the variable concerned. Five hundred such random combinations of land uses were generated in this procedure, and all of them were given as inputs to the trained (and tested) ANN model.
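One way to generate synthetic land-use combinations that satisfy conditions (i) and (ii) is sketched below. It is not the authors' procedure: a normal distribution is used here as a stand-in for the fitted distributions, parameterized with the means and standard deviations reported in Tab. 5, and the function names and seed are hypothetical. Clipping and rescaling also distort the marginal distributions somewhat, so condition (iii) holds only approximately in this sketch.

```python
import random

# (mean, standard deviation) of each synthetic land use as reported in Tab. 5.
# Assumption: a normal distribution stands in for the fitted distributions.
ASSUMED_MEAN_SD = {
    "urban": (4.10, 2.72),
    "forest": (65.09, 7.81),
    "agriculture": (18.24, 8.94),
    "grassland": (12.18, 2.17),
    "water_body": (0.39, 0.30),
}

def sample_composition(rng):
    """Draw one synthetic land-use composition (percentages)."""
    while True:
        draws = []
        for mean, sd in ASSUMED_MEAN_SD.values():
            value = rng.gauss(mean, sd)
            # Condition (i): keep every percentage within 0..100.
            draws.append(min(max(value, 0.0), 100.0))
        total = sum(draws)
        if total > 0.0:
            # Condition (ii): rescale so the percentages sum to 100.
            return [100.0 * d / total for d in draws]

def generate_compositions(n=500, seed=42):
    """Generate n synthetic compositions with a reproducible random stream."""
    rng = random.Random(seed)
    return [sample_composition(rng) for _ in range(n)]

samples = generate_compositions()
```

Each element of `samples` could then be passed, together with a sampled population density, to a trained network to obtain one TP estimate per synthetic watershed.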
The generated synthetic land uses represent any combination of land use that is plausible in the future and can therefore be employed for assessing the impact of land use change on TP concentrations in the study area.

Results and discussions

Performance of ANN Model

Tab. 3 presents the performance of the ANN models (in terms of the r-square between the computed and measured TP values) during the trial and error procedure used for model development. The trials included varying the number of hidden neurons in the network, as well as training the network with different initial weights. It can be noted from Tab. 3 that the performance of the ANN model varies with the number of hidden neurons in the model architecture, which was as expected. It can be seen from Tab. 3 that ANNs with more than 5 neurons may be overfitting the data, as the standard deviation of performance between the various trials (different initial weights) is quite high. The maximum performance with minimum coefficient of variation is observed for the model with 4 neurons, which is selected for further analysis, thereby making the architecture of the ANN 6-4-1. The minimum coefficient of variation during the trials implies that the trained network has minimum uncertainty. Fig. 3 depicts the scatter plot of measured and ANN estimated TP values in the watershed. It can be noted from Fig. 3 that the plot shows little scatter and the data points do not deviate significantly from the 1 : 1 line (solid lines in the plot). Therefore it can be inferred that the trained ANN has sufficiently learned the relationship between the input and the output variables. The model showed an RMSE of 0.0018 mg L⁻¹ on the validation data sets.

Statistical variability of input variables

The probability distribution of land use and population density was quantified using the Best Fit program (Palisades Corp. CA, USA), which fitted 28 different distributions to the data and ranked them according to the specified goodness of fit criterion.
The parameters of the probability distributions were estimated using the maximum likelihood estimator (Haan, 2002). The Chi-Square goodness of fit test was performed to evaluate and rank the distributions that best described the data. The statistical properties of the best fitted distribution for each of the variables are presented in Tab. 4 and the shape of the distribution is presented in Fig. 4. The major land use in the watershed, forest, followed a 3-parameter log-logistic distribution, while agriculture land use followed a log-Pearson type III distribution.

Table 3. R-square value between the ANN output and measured TP value during the model development stage (trials 1 to 4 use different initial weights).

| Number of hidden neurons | Trial 1 | Trial 2 | Trial 3 | Trial 4 | Mean | S.D. |
|---|---|---|---|---|---|---|
| 1 | 0.45 | 0.45 | 0.45 | 0.46 | 0.45 | 0.00 |
| 2 | 0.55 | 0.53 | 0.47 | 0.52 | 0.52 | 0.04 |
| 3 | 0.56 | 0.51 | 0.51 | 0.53 | 0.53 | 0.02 |
| 4 | 0.57 | 0.62 | 0.53 | 0.54 | 0.56 | 0.03 |
| 5 | 0.97 | 0.82 | 0.57 | 0.54 | 0.76 | 0.17 |
| 6 | 0.55 | 0.53 | 0.51 | 0.86 | 0.60 | 0.18 |
| 7 | 0.52 | 0.58 | 0.53 | 0.91 | 0.62 | 0.15 |
| 8 | 0.54 | 0.54 | 0.54 | 0.88 | 0.61 | 0.15 |
| 9 | 0.56 | 0.54 | 0.87 | 0.53 | 0.70 | 0.18 |
| 10 | 0.55 | 0.54 | 0.54 | 0.56 | 0.54 | 0.16 |
| 11 | 0.55 | 0.55 | 0.89 | 0.91 | 0.76 | 0.17 |
| 12 | 0.99 | 0.56 | 0.57 | 0.53 | 0.64 | 0.19 |
| 13 | 0.87 | 0.89 | 0.93 | 0.52 | 0.75 | 0.20 |

Fig. 3. Comparison of the measured and ANN estimated TP values in the district during (a) training and (b) validation of the ANN model. Axes: measured TP (mg/L) against ANN estimated TP (mg/L).

Table 4. Results of the distribution fitting of the 21 catchments data set.

| Variable | Statistical distribution | Parameters (in the order listed in the source) |
|---|---|---|
| Urban | Dagum | k = 0.607; 3.1893; 5.6559 |
| Forest | Log-Logistic (3P) | 1.9051E+8; 8.5099E+8; -8.5099E+8 |
| Agriculture | Log-Pearson 3 | 1.8099; 4.1985; 2.258 |
| Grassland | Johnson | 0.46987; 1.0045; 11.236; 2.0892 |
| Water body | Johnson | 0.5658; 0.76132; 1.397; -0.0481 |
| Human population density | Log-Normal | 0.64718; 5.2726 |

Uncertainty analysis of ANN output

As mentioned earlier, the generated samples of land use combinations and population density in the watersheds, whose statistics are summarized in Tab. 5, are given to the trained ANN model to get the output (TP). The 500 ANN estimated TP values are used to identify the probability distribution that they follow. It is noted that both the measured and predicted values of TP followed an exponential distribution. It is observed during the analysis that the shape and location parameters of the probability distribution followed by the measured and predicted TP were sufficiently close to each other, indicating good confidence in the model predictions. The parameter value of the exponential distribution (Schmidt and Makalic, 2009) for the measured TP was 20.58, while that for the ANN generated data was 21.25. Fig. 5 shows the probability distribution followed by the measured values of TP and the ANN estimated values of TP obtained from synthetic replicates of land use. It is noted that the distribution characteristics of measured TP and ANN estimated TP do not vary much, indicating that the ANN estimated TP is statistically similar to the measured values. The results of the analysis suggest that the ANN model predictions are satisfactory and can be employed for decision-making.

Impact of land use change on TP

Fig. 6 depicts the impact of land use change on TP concentration in the streams of the Chugoku district, Japan.
It can be observed that any decrease in forest cover will cause an increase in TP concentration in the stream flow for this region, which is expected since forest cover reduces erosion from the watershed, so phosphorus transport will be lower in densely forested areas. The results of the analysis indicate that the rate of increase in TP will be high when the forested area is less than 30% in any catchment. These results also revealed that the forest ecosystem could play a significant role in the improvement of in-stream water quality in the rivers of the study area.

Fig. 4. Probability distribution of the land use variation in the Chugoku district.

Table 5. Summary of the statistics of the synthetically generated data set, which was used for uncertainty analysis of the ANN-based TP model.

| Statistics | Urban | Forest | Agriculture | Grassland | Water body | Human population density |
|---|---|---|---|---|---|---|
| Mean | 4.10 | 65.09 | 18.24 | 12.18 | 0.39 | 242.46 |
| Standard error | 0.12 | 0.35 | 0.40 | 0.10 | 0.01 | 8.83 |
| Median | 3.52 | 66.34 | 16.09 | 12.07 | 0.31 | 181.89 |
| Standard deviation | 2.72 | 7.81 | 8.94 | 2.17 | 0.30 | 197.36 |
| Sample variance | 7.37 | 61.00 | 80.00 | 4.69 | 0.09 | 38950.36 |
| Kurtosis | 2.36 | 1.97 | 2.60 | -0.12 | 4.12 | 9.67 |
| Skewness | 1.35 | -1.07 | 1.32 | 0.13 | 1.66 | 2.58 |
| Range | 16.80 | 52.30 | 56.35 | 13.07 | 1.89 | 1434.06 |
| Minimum | 0.00 | 27.93 | 4.39 | 5.18 | 0.00 | 45.52 |
| Maximum | 16.80 | 80.23 | 60.75 | 18.26 | 1.90 | 1479.59 |
| Confidence level (95.0%) | 0.24 | 0.69 | 0.79 | 0.19 | 0.03 | 17.34 |

The agriculture land use is of major concern for TP loading in streams, mainly due to fertilizer application on the land. The results presented in Fig. 6 suggest that effective management strategies are required to contain the increase in TP concentration in stream flow if intensive agriculture is being planned in the watersheds. Any increase in agricultural use beyond 60% of the total area will significantly increase the TP concentrations.
It may be noted that an increase in agricultural land use would be mostly at the expense of forest, and the changes in both land uses jointly contribute to the increase in TP. It is noted from Fig. 6 that an increase in grassland would also raise the TP concentration in the stream flow, plausibly due to fertilizer and manure application. The rate of increase of TP due to change in the area of grassland is higher than that due to agriculture land use. Fig. 6 also suggests some thresholds in the land use subsystem of the catchments. If the catchment system passes through these thresholds, a drastic change will appear in the response of the catchment in terms of TP concentration.

Fig. 5. Probability distribution of the measured and ANN estimated Total Phosphorus (TP).

Fig. 6. ANN model predicted variation of TP along the variation of land use percentage.

Summary and conclusions

Rapidly growing countries, particularly those in transition from rural to urban, frequently lack the resources to adequately monitor and forecast the impacts of land development on biologic, hydrologic, and social resources. Development often leads to more intensive land use and a related increase in the generation of pollutants. Modeling these changes is critical to evaluate emerging land use and potential problems in water quality in the watershed. The objective of this study was to assess the impact of spatial patterns in land use and population density on the water quality of streams in the Chugoku district of Japan. The study employed the artificial neural network (ANN) technique to assess the relationship between the total phosphorus in river water and the land use in 21 river basins in the district. The results suggest that the ANN could be a viable tool to relate land use characteristics to water quality. The developed model was able to reasonably estimate the Total Phosphorus (TP) in the stream water.

The study also focused on the evaluation of the uncertainties associated with water quality predictions from the ANN model using the Monte Carlo framework. The results of the uncertainty analysis indicated that the ANN model predictions are statistically similar to the characteristics of the measured TP values in the 21 watersheds. It was observed from the study that any reduction in forested area or increase in agricultural land in the watersheds may cause an increase in TP concentrations in the stream water, and therefore appropriate watershed management practices should be followed before making any land use change in the Chugoku district.

Acknowledgements. The first author gratefully acknowledges the postdoctoral fellowship from the Alexander von Humboldt Foundation. The second author acknowledges the STAR program of Deutscher Akademischer Austausch Dienst (DAAD), Germany, for financial support to visit Kiel University to take part in this work.

Journal

Journal of Hydrology and Hydromechanics, de Gruyter

Published: Mar 1, 2012

References