Evaluating the Efficiency of K-Nearest Neighbor and Fuzzy C-Means Clustering-Based Methods in Averaging the Outputs of Hydrological Models

Document Type: Research

Authors

1 Watershed and Range Management Department, Gonbad Kavous University, Gonbad Kavous, Golestan

2 Watershed and Range Management Department, Gonbad Kavous University, Gonbad Kavous

3 Watershed and Range Management Department, Gonbad Kavous University, Gonbad Kavous, Golestan

Abstract

Because model inputs are incomplete and model structures are imperfect, no single hydrological model performs best under all conditions or produces outputs free of uncertainty. Combining the outputs of individual models draws on the strengths of each to build a new model that performs better than any single one. In this study, the efficiency of the nonparametric K-nearest neighbor and Fuzzy C-Means clustering-based methods was compared with BGA (Bates-Granger Averaging), GRA (Granger-Ramanathan Averaging), AICA (Akaike Information Criterion Averaging), BICA (Bayes Information Criterion Averaging), equal weights averaging, and lasso methods in averaging the outputs of the hydrological models GR5J, SimHyd, SACRAMENTO, and SMAR. First, the daily discharge of the Kasilian Watershed at the Bon Koh Station in Pol Sefid city was simulated by each hydrological model using rainfall, evapotranspiration, and temperature data. Then the different model averaging methods were used to combine the outputs of the individual models. Results indicated that GR5J and SACRAMENTO performed better for the calibration period; the correlation coefficient, Nash-Sutcliffe efficiency, and RMSE were 0.83, 0.69, and 0.24, respectively. SimHyd and GR5J performed better for the validation period; the correlation coefficient, Nash-Sutcliffe efficiency, and RMSE were 0.73, 0.27, and 0.52, respectively. Among the averaging methods, lasso and GRA performed best for the calibration period, whereas equal weights averaging and BGA performed best for the validation data. For the calibration data, K-nearest neighbor performed better than the Fuzzy C-Means clustering-based method, and both methods performed best at 20 neighbors. For the validation data, the Fuzzy C-Means clustering-based method performed better, and model performance improved as the number of neighbors increased.
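
As an illustration of the classical averaging schemes named in the abstract, the following minimal Python sketch computes equal weights, Bates-Granger weights (inversely proportional to each model's error variance), and Granger-Ramanathan weights (ordinary least squares, here in the unconstrained, no-intercept variant), and scores each combination with the correlation coefficient, Nash-Sutcliffe efficiency, and RMSE. The function names, the synthetic data, and the choice of GRA variant are assumptions made for illustration only; this is not the study's code and does not use the Kasilian record.

```python
import numpy as np


def rmse(obs, sim):
    """Root mean square error between observed and simulated flow."""
    return float(np.sqrt(np.mean((sim - obs) ** 2)))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))


def equal_weights(sims):
    """Same weight for every model; sims has shape (n_steps, n_models)."""
    n_models = sims.shape[1]
    return np.full(n_models, 1.0 / n_models)


def bates_granger_weights(obs, sims):
    """Weights proportional to 1 / error variance, normalised to sum to 1."""
    err_var = np.var(sims - obs[:, None], axis=0)
    inv = 1.0 / err_var
    return inv / inv.sum()


def granger_ramanathan_weights(obs, sims):
    """Unconstrained least-squares weights without intercept (one GRA variant)."""
    w, *_ = np.linalg.lstsq(sims, obs, rcond=None)
    return w


# Synthetic stand-in data (hypothetical, NOT the watershed record):
# four "models" that reproduce the observed series with increasing noise.
rng = np.random.default_rng(0)
obs = rng.gamma(shape=2.0, scale=1.0, size=500)
noise_sd = np.array([0.2, 0.4, 0.6, 0.8])
sims = obs[:, None] + rng.normal(0.0, noise_sd, size=(500, 4))

weights = {
    "equal": equal_weights(sims),
    "BGA": bates_granger_weights(obs, sims),
    "GRA": granger_ramanathan_weights(obs, sims),
}

for name, w in weights.items():
    combined = sims @ w  # weighted sum of the model outputs at each time step
    r = np.corrcoef(obs, combined)[0, 1]
    print(f"{name}: r={r:.2f} NSE={nse(obs, combined):.2f} RMSE={rmse(obs, combined):.2f}")
```

In this sketch the weights are fitted and scored on the same series, which corresponds to a calibration-period comparison. Least-squares weights cannot do worse than any other linear combination on the data they were fitted to, so a fair comparison also needs a held-out validation period, as in the study.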
