Using Boosted Regression Tree, Logistic Model Tree, and Random Forest Algorithms to Evaluate the Groundwater Potential

Document Type: Research

Authors

1 Master of Science (M.Sc.), Department of Surveying Engineering, Faculty of Surveying Engineering and Geospatial Information, University of Tehran, Tehran, Iran

2 Master of Science (M.Sc.), Civil Engineering, Water and Hydraulic Structures, Young Researchers and Elite Club, Mashhad Branch, Islamic Azad University, Mashhad, Iran

3 Associate Professor, Civil Engineering Department, University of Birjand

Abstract

Groundwater is being exploited uncontrollably in many parts of the world as a result of population growth and industrialization. The purpose of this study is to evaluate groundwater potential with advanced machine learning algorithms using topographical, hydrological, environmental, and geological criteria. Three algorithms were used: Boosted Regression Tree (BRT), Logistic Model Tree (LMT), and Random Forest (RF). Geo-hydrological data from 37 groundwater wells in the Birjand plain of South Khorasan province were collected and randomly divided into training and validation data sets in a 70:30 ratio. Groundwater potential maps were then prepared using the BRT, LMT, and RF algorithms. The algorithms were validated using the area under the curve (AUC) and the statistical criteria of positive predictive rate, negative predictive rate, sensitivity, specificity, and accuracy. The results showed that the LMT model (AUC = 0.865) performed better than the BRT and RF models in predicting the groundwater potential of the study area.
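The validation workflow summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`split_wells`, `confusion_metrics`, `auc`), the random seed, and the rounding of the 70:30 split are assumptions made for illustration. The sketch also assumes that "positive predictive rate" and "negative predictive rate" follow the usual definitions TP/(TP+FP) and TN/(TN+FN), and that the AUC is computed by comparing ranked scores of positive and negative samples.

```python
import random


def split_wells(well_ids, train_fraction=0.7, seed=0):
    """Randomly split well identifiers into training and validation lists.

    The rounding rule and the seed are illustrative assumptions.
    """
    ids = list(well_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


def confusion_metrics(y_true, y_pred):
    """Compute the five threshold-based criteria from binary labels (1/0)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return NaN instead of raising when a class is absent.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "positive_predictive_rate": ratio(tp, tp + fp),
        "negative_predictive_rate": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, len(pairs)),
    }


def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # 37 hypothetical well identifiers; the real well data are not in the abstract.
    train, validation = split_wells(range(37))
    print(len(train), len(validation))

    # Made-up labels and scores, used only to exercise the functions.
    y_true = [1, 1, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]
    print(confusion_metrics(y_true, y_pred))
    print(auc(y_true, scores))
```

With 37 wells and this rounding rule, the split gives 26 training and 11 validation wells. The abstract does not say how the study itself rounded the split, so this figure illustrates the sketch only.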

Keywords

