Comparison of Two Nonparametric Models, K-Nearest Neighbor and M5 Decision Tree, in Forecasting River Discharge in the Karaj Catchment

Document Type: Research

Authors

1 M.Sc. Water Engineering Department, University of Birjand

2 Associate Professor, Water Engineering Department, University of Birjand

3 Assistant Professor, Water Engineering Department, University of Birjand

Abstract

The importance of water resources planning and management, the fast-growing population, and limited surface water resources have made the application of new technology to river flow forecasting a necessity. Various methods have been presented in recent years to forecast river flow, and data-based models are considered the most reliable for this purpose. In this study, river flow in the Karaj Catchment was simulated using two data-based models, K-nearest neighbor (KNN) and the M5 decision tree. Hydroclimatological data (discharge, precipitation, temperature and evaporation) for the period 2002 to 2009 were collected to carry out the simulations. The Gamma test was used to select appropriate input combinations, and the selected combinations were entered into both models as inputs. The performance and accuracy of the models were then examined and compared. Results showed that both models produced reliable flow predictions when discharge was included as an input. The M5 model showed better precision than the KNN model. The coefficient of determination (R²) for the KNN and M5 models was 0.97 and 0.93, respectively. The RMSE was 0.55 and 0.87 for the same two models, respectively, and the KGE was 0.99 and 0.96, respectively.
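The three evaluation criteria named in the abstract (R², RMSE and KGE) can be illustrated with a minimal sketch. It assumes that R² is the squared Pearson correlation between observed and simulated discharge and that KGE follows the common three-component form (correlation, variability ratio, bias ratio); the abstract does not state which definitions were used. The function names and the sample series are hypothetical and are not data from the study.

```python
import math


def _mean(values):
    """Arithmetic mean of a sequence of numbers."""
    return sum(values) / len(values)


def rmse(obs, sim):
    """Root mean squared error between observed and simulated values."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated values."""
    mo, ms = _mean(obs), _mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)


def r_squared(obs, sim):
    """Coefficient of determination, assumed here to be the squared Pearson r."""
    return pearson_r(obs, sim) ** 2


def kge(obs, sim):
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    mo, ms = _mean(obs), _mean(sim)
    r = pearson_r(obs, sim)
    # alpha: ratio of simulated to observed standard deviation
    alpha = math.sqrt(sum((s - ms) ** 2 for s in sim) / sum((o - mo) ** 2 for o in obs))
    # beta: ratio of simulated to observed mean (bias ratio)
    beta = ms / mo
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


if __name__ == "__main__":
    # Hypothetical discharge values, for illustration only.
    observed = [1.0, 2.0, 3.0, 4.0]
    simulated = [2.0, 3.0, 4.0, 5.0]
    print("RMSE:", rmse(observed, simulated))
    print("R2:  ", r_squared(observed, simulated))
    print("KGE: ", kge(observed, simulated))
```

In this sketch a simulation with a constant offset still has a perfect correlation, so R² alone does not reveal the bias, whereas RMSE and the bias term of KGE do. This is one reason such criteria are usually reported together.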

Keywords

