Evaluation of artificial intelligence models for daily prediction of reference evapotranspiration using temperature, rainfall and relative humidity in a warm sub-humid environment

. Accurate estimation of reference evapotranspiration is essential for agricultural management and water resources engineering applications. In the present study, the ability and precision of three artificial intelligence (AI) models (i.e., Support Vector Machines (SVMs), Adaptive Neuro-Fuzzy Inference System (ANFIS) and Categorical Boosting (CatBoost)) were assessed for estimating daily reference evapotranspiration (ET 0 ) using limited weather data from five locations in a warm sub-humid climate in Mexico. The Penman–Monteith FAO-56 equation was used as a reference target for ET 0 values. Three different input combinations were investigated, namely: temperature-based (minimum and maximum air temperature), rainfall-based (minimum air temperature, maximum air temperature and rainfall), and relative humidity-based (mini-mum air temperature, maximum air temperature and relative humidity). Extraterrestrial radiation values were used in all combinations. The temperature-based AI models were compared with the conventional Hargreaves–Samani (HS) model commonly used to estimate ET 0 when only temperature records are available. The goodness of fit for all models was assessed in terms of the coefficient of determination (R 2 ), Nash–Sutcliffe model efficiency coefficient (NSE), root mean square error (RMSE) and mean absolute error (MAE). The results showed that among the AI models evaluated, the SVM models outperformed ANFIS and CatBoost for modeling ET 0 . Further, the influence of relative humidity and rainfall on the performance of the models was investigated. The analysis indicated that relative humidity significantly improved the accuracy of the models. Finally, the results showed a better response of the temperature-based AI models over the HS method. AI models can be an adequate alternative to conventional models for ET 0 modeling.


INTRODUCTION
Evapotranspiration (ET) refers to the physical processes of transferring water from the soil surface (evaporation) and plants (transpiration) to the atmosphere (Allen et al., 1998). ET is a critical component of the water balance at plot, field, farm, catchment, basin and global levels. The measurement of evapotranspiration under normal conditions is of great importance in the estimation and management of present and future water resources and for solving many theoretical problems in the fields of hydrology, climatology and meteorology. In irrigation planning, evapotranspiration data are used for estimating the acreage of various crops or combinations of crops that can be irrigated with a given water supply or as a basis for estimating the amount of water that will be required to irrigate a given area. A lysimeter can be used to directly and accurately measure crop evapotranspiration (ET c ) from a well-watered agricultural crop. However, its widespread application is restricted by costly and time-consuming operations for data management (Wang et al., 2014). Therefore, in practice, the most common approach used for estimating crop evapotranspiration (ET c ) is the crop coefficient (K c ) approach, which consists of multiplying reference evapotranspiration (ET 0 ) with the crop coefficient (Allen et al., 1998). ET 0 is "the rate at which water, if available, would be removed from the soil and plant surface of a specific crop, arbitrarily called a reference crop" (Jensen et al., 1990). The reference crop is typically grass or alfalfa under well-watered conditions. Numerous equations, classified as temperature-based, radiation-based, pan evaporation-based and combination-type have been developed for estimating ET 0 . They vary in terms of data requirements and accuracy. The adapted FAO-56 Penman-Monteith equation (FAO56-PM) was recommended as the standard equation for estimating ET 0 and calibrating other ET 0 equations (Allen et al., 1998;Jensen et al., 1990;Kisi, 2013;Wang et al., 2014). FAO56-PM can be used in a wide variety of climate conditions, at different time steps and needs no local calibration because of its physical basis.
In the recent years, artificial intelligence (AI) techniques such as Artificial Neural Network (ANN), Generalized Regression Neural Network (GRNN), Adaptive Neuro-Fuzzy Inference System (ANFIS), Support Vector Machines (SVMs), Gene Expression Programming (GEP), Genetic Programming (GP), Extreme Learning Machine (ELM), Random Forest (RF), Categorical Boosting (CatBoost), Multilayer Perceptron (MLP) and Cascade Correlation Neural Network (CCNN) have been accepted as effective tools for estimating reference evapotranspiration (ET 0 ). Artificial intelligence methods are an alternative and emerging method, and can be used as innovative approaches because they offer benefits such as no required knowledge of internal variables, simpler solutions for multi-variable problems and accurate calculation (Gocić et al., 2015). In several studies, the accuracy of these techniques has been improved by using algorithms [e.g., Fire Fly Algorithm (FFA), Genetic Algorithm (GA), Particle Swarm Optimizer (PSO), Grey Wolf Optimizer (GWO), Multi-Verse Optimizer (MVO), Whale Optimization Algorithm (WOA) and Ant Lion Optimizer (ALO)] to tune the model's parameters and by the use of wavelet transform (WT) techniques to transform data series into sub-series. Reis et al. (2019) examined the accuracy of ANN and ELM techniques for estimating ET 0 based on temperature data in a semiarid region of Brazil and compared them with the Hargreaves-Samani (HS) empirical equation. They found that ANN and ELM models provide further accuracy compared to the HS equation. In recent years, the company Yandex has proposed a novel AI algorithm called CatBoost, capable of solving problems with heterogeneous features, noisy data and complex dependencies . Among its main features is the ability to avoid overfitting of trained models (Prokhorenkova et al., 2018). Huang et al. (2019b) employed Cat-Boost, RF and SVMs to predict ET 0 in a humid region of China. The results showed that the SVM technique offered the best accuracy when an incomplete combination of meteorological parameters was used, while Cat-Boost performed best with the complete combination of meteorological parameters. Tikhamarine et al. (2019) used five ANN models optimized by GWO, MVO, PSO, WOA and ALO algorithms to estimate monthly ET 0 . The results of the comparison showed that the ANN-GWO model's performance was superior to that of the other models. Raza et al. (2020) investigated the potential of SVM, MLP, GRNN and CCNN machine learning models to estimate ET 0 in four climatic regions. They found that the accuracy of the SVM model was better than the others for all evaluated climatic zones. Zhang et al. (2020) compared the performance of three artificial intelligence methods (i.e., CatBoost, GRNN and RF) for estimating daily ET 0 using limited meteorological data in an arid and semi-arid region of Northern China. CatBoost was found to perform better than GRNN and RF models. Tikhamarine et al. (2020) developed an optimized SVM model using the WOA algorithm for monthly ET 0 estimation and compared it with other two SVM-hybrid models named SVM-MVO and SVM-ALO. Their results indicated that the SVM-WOA model was more accurate at predicting ET 0 compared to the other models. Furthermore, prior studies have also proved the usability and potential of the ANFIS method for ET 0 modeling (Alizamir et al., 2020;Rezaabad et al., 2020;Roy et al., 2020).
In more recent years, a special class of AI algorithms known as Deep Learning (DL) has been successfully applied for estimating ET 0 , as in the case of Chen et al. (2020), who evaluated three DL models named Deep Neural Network (DNN), Temporal Convolution Neural Network (TCN) and Long Short-Term Memory Neural Network (LSTM) using limited meteorological data. Their results were compared with classic Machine Learning (ML) and empirical models, and the results indicated that both DL and ML techniques performed better than empirical models. In another study, Granata and Di (2021) examined two types of DL models called LSTM and NARX (Nonlinear Autoregressive Network with Exogenous input) for estimating actual evapotranspiration (ETa) in two different climatic conditions of the United States. The results revealed that both models can provide very accurate predictions of ETa, although the performance of the models can be affected by local climatic conditions.
The objectives of this study were to: (1) investigate the capability of three artificial intelligence techniques, i.e., ANFIS, SVMs and CatBoost, for modeling daily ET 0 in a sub-humid region based on limited climate data, (2) to compare the accuracy of the ANFIS, SVM and Cat-Boost techniques with the Hargreaves-Samani model when only temperature data are available, and (3) to examine the effect of using relative humidity and rainfall as input variables on the artificial intelligence model's prediction precision in a warm sub-humid region.

Case study site and data collection
This study was performed using weather data from five meteorological stations located in the Yucatán Peninsula, Mexico (Fig. 1). The climatic data were provided by the Mexican National Meteorological Service (SMN; Servicio Meteorológico Nacional) and Mexico's National Institute for Forestry, Agriculture and Livestock Research (INIFAP; Instituto Nacional de Investigaciones Forestales Agrícolas y Pecuarias). Continuous and longterm series of observed daily weather data including minimum and maximum air temperature (T min , T max , °C), mean wind speed (u 2, ms -1 ), mean relative humidity (RH, %), precipitation (W) and global solar radiation (R s , MJ M -2 day -1 ) were collected from five meteorological stations. The geographic information and annual meteorological conditions of these stations during the period under study are presented in Tab. 1. All stations studied were identified as moist sub-humid (annual P/ET 0 ratio between 0.65 and 1.0) according to the global aridity index (Tab. 1) developed by the United Nations Convention to Combat Desertification (Middleton and Thomas, 1997). The climate in the Yucatán Peninsula can be described as tropical savanna (Aw) according to the Köppen system (Köppen, 1936), with a rainy summer and dry winter. The maximum and minimum daily air temperature is 36 ºC and 16 ºC respectively. The rainfall occurs in summer and autumn with a gradient of minimum rainfall in the northwest (600 mm/year) to higher quantities toward the southeast (1400 mm/year).
The database was checked to find missing or inconsistent records of air temperature, relative humidity and precipitation. In cases where less than five consecutive missing or incorrect records were found within a period of one month, the Piecewise Cubic Hermite Interpolating Polynomials (PCHIPs) interpolation method (Fritsch and Carlson, 1980;Torrente Cantó, 2018) was used to fill in missing values or replace inconsistent values. Otherwise, these values were removed from the database. Overall, deleted and missing data, which accounted for approximately 2% of the database, were replaced with values estimated by interpolation.

Targets used for training and testing
In this paper, the FAO56-PM equation (Allen et al., 1998) was used to provide the daily ET 0 estimates for training and testing the models. This is a very common practice, because lysimetric or other experimental ET measurements are not available in most cases. (1) where ET 0 is reference evapotranspiration (mm day -1 ), Δ is the slope of the saturation vapor pressure (kPa °C -1 ), γ is the psychrometric constant (kPa °C -1 ), R n is net radiation at the crop surface (MJ m -2 day -1 ), G is soil heat flux density (MJ m -2 day -1 ), which may be ignored as the magnitude of the day soil flux is small, T is mean daily air temperature (°C), u 2 is average wind speed at 2 m height (m s -1 ), e s is saturation vapor pressure (kPa), e a is actual vapor pressure (kPa), and e s -e a is saturation vapor pressure deficit (kPa). The computation of all data required for calculating ET 0 followed the method and procedure given in Chapter 3 of FAO-56.
In this study, the ET 0 -FAO56PM equation was used to evaluate multiple linear regression and artificial intelligence methods.

Adaptive neuro fuzzy inference system (ANFIS)
ANFIS is a multilayer model initially proposed by Jang (1993). It is a hybrid model where the nodes in the different layers of a feed-forward network handle fuzzy parameters. This is equivalent to a fuzzy inference system (FIS) with distributed parameters. The technique splits the representation of prior knowledge into subsets, in order to reduce the search space, and uses the backpropagation algorithm to adjust the fuzzy parameters. The resulting system is an adaptive neural network functionally equivalent to a first-order Takagi-Sugeno inference system, where the input-output relationship is linear.
In a first-order Sugeno system, a standard rule set with two fuzzy IF/THEN rules can be expressed as: • Rule 1. If x is A 1 and y is B 1 , then f 1 = p 1 x + q 1 y + r 1 (2) • Rule 2. If x is A 2 and y is B 2 , then f 2 = p 2 x + q 2 y + r 2 (3) where A 1 and B 1 are the fuzzy sets in the antecedent, f i is the output within the fuzzy region specified by the fuzzy rule; and pi, q i and r i are the design parameters that are determined during the training process.
The ANFIS architecture consists of five layers, namely: fuzzy layer, product layer, normalized layer, defuzzy layer and total output layer. The inputs for two of them are shown in Fig. 2. Each layer performs a specific function in the fuzzy inference system. For identification, the adaptive nodes are represented by squares and fixed nodes are portrayed as circles. Layer 1 (Fuzzy layer): each node i in this layer (indicated with a square) represents a node function: where x (or y) is the input to node i, and A i or (B i-2 ) is the linguistic label (small, large, etc.) characterized by appropriate membership functions (MFs) μ Ai (x) and μ Bi- where x is the input to the node and ({a,b,c}) are the premise parameters set that changes the shapes of the MFs with a maximum of 1 and a minimum of 0. Layer 2 (Product layer): The circle nodes represented with Π in Fig. 2 denote a fuzzy operator, e.g., product t-norm, which multiplies the incoming signals, such as: where O 2,i is the output of layer, and the output signal w i indicates the firing strength of the rule. Layer 3 (Normalized layer): The nodes in this layer are represented with N and they calculate the ratio of the i-th rule's firing strength to the sum of firing strengths of all rules by: where the O 3 , i is the output of Layer 3. The quantity is referred to as the normalized firing strength.
Layer 4 (De-fuzzy layer): The nodes in this layer are represented with a square and they calculate the weighted output of each linear function: where is the output of layer 3, and {p i x + q i y + r i } is the parameter set, referred to as the consequent parameters.
Layer 5 (Total output layer): The single node denoted with a Σ computes the overall output as follows: =fout=Estimated overall output (9) A hybrid learning algorithm was used to determine the premise and consequent parameters. The hybrid learning algorithm procedure calculates the consequent parameters in a forward pass and the premise parameters in a backward pass. In the forward phase, the information is transmitted forward to layer 4, where the consequent parameters are calculated using the least squares regression algorithm. In the backward phase, the error signals propagate backwards and the premise parameters are estimated by the gradient descent (GD) algorithm (Jang et al., 1997). This error measure is generally defined by the sum of the squared differences between actual and desired outputs.
In the present study, a bell-shaped function (Eq. 5) was used for the MFs. On the other hand, two MFs were used for each input variable, considering that the best results were obtained with this value, achieved through an iterative process. The clustering method called grid partitioning was used to create the Takagi-Sugeno FIS structure (Cobaner, 2011;Shiri et al., 2012).

Support Vector Machines (SVMs)
Support vector machines (SVMs) is a supervised learning method from the field of machine learning theory and structural risk minimization, applicable to classification or regression analysis, and was presented by Vapnik (2013). In addition to its strong mathematical foundation in statistical learning theory, SVMs have proven to have highly competitive performance in several real-world applications. Originally developed for solving classification problems, the SVM procedure can also be effectively applied to regression problems as follows.
, where x i is the input vector, y i is the output value and N is the total number of data sets, the objective is to establish a function that indicates the degree of independence of y i from the input x i . In SVMs, this regression function is approximated using the following function: where ω is a weight vector, b is a bias (scalar value) and φ(x) is the high-dimensional feature space which is nonlinearly mapped from the input space x. The coefficients b and ω are calculated by minimizing the following regularized risk function: In the regularized risk function, the term is empirical error (risk), and measured by function L ε given below: The term is the regularization term, which represents the Euclidean norm, C is a positive trade-off parameter that regulates the degree of the empirical error in the optimization problem that is selected by the user. is called tube size and it is equivalent to the accuracy of the approximation placed on the training data points.
To obtain the estimations of w and b, Eq. (11) is transformed to the primal function given by Eq. (12) by introducing the positive slack variables ξ and ξ* as follows: Minimize (13) subjected to To address the optimization problem, Lagrange multipliers α and α* are added to the condition equations, and the equation can be written in its dual form: with constraints: where α i and α i* are Lagrange multipliers to be solved, and K(x i ,x j ) is the so-called kernel function and calcu-lated by K(x i ,x j )=φ(x i ).φ(x j ) on the feature space. The kernel allows SVMs to form nonlinear boundaries; in other words, it gives the SVM the capacity to model complicated separating hyperplanes. In this study, after a number of trial-and-error processes, the radial basis function (RBF) was chosen as the kernel function. The RBF is defined as follows: where x i and x j are vectors in the input space and γ is the kernel parameter. After calculating Lagrange multipliers, an optimal desired weights vector of the regression hyperplane is found as follows: and Eq. (10) can be rewritten as follows: where n is the number of support vectors, (α i -α i *) are their Lagrange multipliers, the term K(x i ,x j ) is the kernel function in the input space and the bias b is calculated from training samples. There are three free parameters while using the RBF kernel: C (cost), ε (epsilon) and γ (gamma) should be determined to find the optimal solutions. In this study we have used the genetic algorithm (GA) technique with a ten-fold cross-validation procedure to optimize these parameters (Saud et al., 2020) and varying the parameters ε = 0.002 to ε = 2, C = 0.0001 to C = 10, and γ = 0.0001 to γ = 2. A GA is an intelligent optimization method based on the principles of genetics and natural selection. Further details of this technique can be found in Antonanzas-Torres et al. (2015) and Zhang et al. (2015).

CatBoost
CatBoost is a decision-tree machine learning algorithm based on gradient boosting decision tree (GBDT), developed by researchers from the Russian internet company Yandex (Prokhorenkova et al., 2018). It is capable of solving problems with heterogeneous characteristics, noisy data and complex dependencies compared to other algorithms based on decision trees. Among the advantages of using CatBoost is that it requires the configuration of few hyperparameters, thus avoiding overfitting and obtaining more generalized models. Decision trees are used for regression, and each tree corresponds to a partition of the feature space and the output value.
This model has several advantages compared with traditional GBDT algorithms, which generally cope with categorical features by a method named Greedy Target Statistics (Greedy TS). To deal with categorical features during training rather than the preprocessing phase to estimate the expected and category driven target, Cat-Boost allows the use of a complete data set for training. According to (Prokhorenkova et al., 2018), the Greedy TS strategy manages categorical features with minimum information loss. This is useful for minimizing information loss and overfitting. Given a data set D={X i } i= 1,…,n, where X i =(x i ,1,⋯,x i ,m) is a vector of m characteristics, the category of the k-th training example can be replaced by a numerical characteristic expressed in (Eq. 20) according to the requested TS. The substitution of a given categorical example xσ p,k , k can be obtained by calculating its average value with the same category value placed before in a random permutation of data set σ=(σ 1 ,⋯,σ n ). Also, CatBoost is able to combine various categorical features into a new one in a greedy way by establishing a new tree split.
where y is the objective function, P is a previous value, is the weight of the previous value, and [xσ j,k =xσ p,k ] is equal to one when xσ j,k =xσ p,k , otherwise it is equal to zero, since α>0 represents the P weight. This method contributes to reducing the noise obtained from the low frequency category.
On the other hand, CatBoost combines multiple categorical features, using a greedy way of combining all the categorical characteristics and their combinations in the current tree with all the categorical characteristics in the data set, so that CatBoost overcomes the gradient bias.
The classic GBDT procedure generates a weak learner in each iteration and each learner is trained based on the gradient of the previous learner. The accumulation of the classified results of all learners provides the result (Friedman, 2002). However, it will lead to a punctual and biased gradient estimation, causing the final learned model to overfit. CatBoost uses a new method to change the gradient estimation method in the classic algorithm, called gradient augmentation. This method can overcome the prediction change caused by gradient bias and further improve model generalizability.
To obtain an unbiased gradient estimate, CatBoost trains a separate M i model for each x i sample, and the M i model is trained with a data set that does not contain the X i sample. In this case, M i is used to obtain a gradient estimate of the sample. Also, this gradient will be used to train the base learner for the final model.
In the present study, in the implementation of the CatBoost model, some of its main parameters that affect the precision and model stability were adjusted using the cross-validation (fold = 5) technique (Saud et al., 2020); the number of iterations varied between 200 and 800 at 100 intervals, the maximum depth of tree ranged from 2 to 10 at intervals of 2, and the proportion of subsets varied from 0.5 to 1 at intervals of 0.05.

Hargreaves and Samani equation
The HS model (Hargreaves and Samani, 1985) is considered an alternative for estimating ET 0 when only temperature records are available. The HS model is structured as follows: where ET HS is the reference evapotranspiration estimated (mm day -1 ); K coef is an empirical coefficient, which was initially established at 0.0023 but has been recalibrated according to the place used; T mean , T max , and T min are the average, maximum and minimum daily air temperature respectively; H 0 is the extraterrestrial solar radiation (MJ m -2 day -1 ); and the coefficient 0.048 is for converting MJm -2 day -1 to mm day -1 . In this study, the K coef was obtained by the nonlinear least squares fitting technique.

Inputs considered and model scenarios
In the present study, three input combinations of the daily minimum air temperature (T min ; ºC), maximum air temperature (T max ; ºC), mean relative humidity (RH; %), extraterrestrial radiation (H 0 ; MJ m -2 day -1 ), and rainy days (RT) as binary number [(W), rainfall > 0, W = 1; rainfall = 0, W = 0] were used to estimate the daily reference evapotranspiration. To identify the influence of the parameters considered on the accurate prediction of ET 0 , the Pearson correlation coefficients between the ET 0 and independent parameters (inputs) were calculated. Tab. 2 presents the Pearson correlation coefficients obtained. According to Tab. 2, all inputs considered were observed to have favorable correlations with ET 0 . However, the highest correlation was obtained for T max and RH, while the lowest correlation was obtained for the RT variable. In this study, three scenarios were considered to evaluate the effect of relative humidity, rainfall and temperature, namely: (1) Relative humidity-based: T min , T max , RH and H 0 (SVM1, ANFIS1, CatBoost1), (2) Rainfall-based: T min , T max , RT and H 0 (SVM2, ANFIS2, CatBoost2), (3) Temperature-based: T min , T max and H 0 (SVM3, ANFIS3, CatBoost3), H 0 was calculated as a function of the day of year, site latitude and solar angle, according to the equation proposed by Allen et al. (1998).
As a normal practice for developing predictive models and to avoid problems in the training and testing stages due to the different orders of magnitude and large-scale fluctuations of the inputs that generate numerical issues while producing the forecast, the dataset was normalized to the range from zero to one using the following equation (Feng et al., 2016): (22) where x n, x 0 , x min and x max are the normalized value, real value, minimum value and maximum value respectively.
Moreover, to ensure the representativeness of the dataset, the database was randomly split into two subsets, using 70% for training and the remaining 30% to test the model. The training dataset was used to train all the models, while the testing data was used to verify the accuracy and the performance of the trained model. Despite the fact that a k-fold validation might be desirable, a simple holdout assessment was applied, i.e., a single data set assignment was considered for evaluating the models. Given the size of the data series, the test set sample was considered representative enough. Further, in light of the different modeling approaches considered, the application of the k-fold assessment might have involved too high computational costs.
A script file written in MATLAB 2019b software version was used to carry out the computer simulation of the ANFIS models. For the SVM and CatBoost mod-els, two open-source software packages named 'LIBSVM 3.2' (Chang et al., 2013) and 'Catboost' respectively were implemented using the R computing environment (Meyer and Wien, 2014;RDevelopment, 2009).

Model performance evaluation
The performance and accuracy of the proposed models for estimating daily ET 0 were evaluated using five common statistical tests (Despotovic et al., 2015;Teke et al., 2015): root mean square error (RMSE; Eq. 23), mean absolute error (MAE; Eq. 24), mean bias error (MBE; Eq. 25), coefficient of determination (R 2 ; Eq. 26) and the Nash-Sutcliffe model efficiency coefficient (NSE; Eq. 27). The five criteria adopted are sufficient to fully characterize the efficiency of the models.
(mm day -1 ) where n and avg represent the total number of evaluated data and the average of the variable, and x and y are the ET 0 values predicted by the models and ET 0 -FAO56PM respectively. Smaller RMSE and MAE values imply a closer approximation of the values measured by the models. Larger R 2 values indicate a closer match of measured data trends with the model results. Mean Bias Error (MBE) captures the average bias in the prediction. MBE is primarily used to estimate the average bias in the model and to decide if any steps need to be taken to correct the model bias. Positive values of MBE indicate the overestimation of daily ET 0 by a model and negative values indicate underestimation. The NSE values can range from -∞ to 1, and a perfect fit between the simulated and observed data is reached when NSE value is close to 1 (Krause et al., 2005).  In this study, the capabilities of three artificial intelligence techniques (i.e., CatBoost, ANFIS and SVMs) for predicting ET 0 with the daily meteorological variables T max , T min , RH, R s and u 2 were examined. The type of input parameters plays an important role in the accuracy of AI models in predicting daily ET 0 . Fig. 3 shows the scatter plots of the ET 0 values estimated by the best AI technique and calibrated HS model in each scenario during the testing phase and measured values at the five meteorological stations. According to R 2 values, the plots clearly reveal that relative humidity might be the most influential input for estimating ET 0 , and it is also noted that the SVM method was the best in the three scenarios evaluated. In addition, it is observed that the empirical HS equation commonly used when temperature data are available performed worse compared to the temperaturebased AI models. Tab. 4 gives the R 2 , NSE, RMSE, MAE and MBE values for ANFIS, SVM and CatBoost models during the training and testing phase. In the first scenario, the effect of relative humidity on the model's prediction accuracy was evaluated in a sub-humid tropical region. The SVM1 model had the best performance for all evaluated stations, with an RMSE of 0.37 -0.479 mm day -1 , MAE of 0.285-0.372 mm day -1 , R 2 of 0.937-0.862 and NSE of 0.936-0.861. The CatBoost1 and ANFIS1 models were ranked second and third respectively. For the second scenario, the effect of rainfall on the performance of the models was investigated, since it is usually measured and can be used to improve ET 0 estimation. The SVM2 models outperform the CatBoost2 and ANFIS2 models for all locations with an RMSE of 0.494-0.700 mm day -1 , MAE of 0.379-0.534 mm day -1 , R 2 of 0.834-0.758 and NSE of 0.833-0.752. On the other hand, the ANFIS2 and CatBoost2 models based on rainfall data presented similar performance criteria (RMSE, MAE, R 2 and NSE). In the third scenario, considering that relative humidity and rainfall data are not always readily available, a temperature-based model may be helpful. The SVM3 model provided the best performance at four of the five stations evaluated with RMSE values of 0.548-0.731 mm day -1 , MAE of 0.418-0.563 mm day -1 , R 2 of 0.779-0.738 and NSE of 0.778-0.731, with the CatBoost3 model showing higher accuracy at the Mérida station. Overall, the SVM technique offers better performance for all evaluated scenarios with a RMSE = 0.552 mm day -1 , MAE = 0.425 mm day -1 , R 2 = 0.811 and NSE = 0.806, followed by the CatBoost model with a RMSE = 0.561 mm day -1 , MAE = 0.434 mm day -1 , R 2 = 0.799 and NSE = 0.737. As seen in scenario 1, the inclusion of relative humidity significantly improved all evaluated techniques.
In general, during the test phase, it should be noted that the consideration of relative humidity as an input in the SVM1 model significantly increased the estimation accuracy through a decrease in MAE and RMSE values of 32.73 and 32.71% respectively, and an increase in R 2 of 14.5%. Meanwhile, when W data were added to the SVM2 model, it showed a slight reduction in MAE and RMSE of 9.06 and 8.87% respectively, while R 2 increased by 4.4%. As demonstrated in several studies, the combination of bio-inspired algorithms improves the accuracy of artificial intelligence models (Mohammadi and Mehdizadeh, 2020; Yin et al., 2017). In the present study the parameters of the SVM-based models (C, g, and e) were optimized using the genetic algorithm (Tab. 3).
One of the main drawbacks of AI-based techniques is overfitting (Fathian et al., 2019). However, in the case of the SVM method, one of its main advantages over other AI methods lies in the fact that the non-linear problem always converges to a global minimum. In most studies, model accuracy has been shown to increase in ET 0 prediction as the number of meteorological parameters increase. A study conducted by Chen et al. (2020) showed that the performance of SVM models was better than LSTM and DNN models when relative humidity features were available. Tab. 4 also shows the performance indicators for the data set during the training  Tab. 5 shows the performance results of the HS model, as well as its calibrated empirical coefficient (K coef ). On comparing these results with the AI-based models of scenario 3 (temperature-based), it is clearly observed that in general the AI-based models outperformed the HS method. In fact, many studies have shown the superiority of AI-based methods over the HS equation (Reis et al., 2019;Chen et al., 2020;. In addition, the calibration of the Ks coefficient considerably improved the performance of the HS models compared to the uncalibrated ones. On the other hand, the temperature-and extraterrestrial radiationbased HS equation tended to overestimate ET 0 in humid and sub-humid regions (Allen et al., 1998), with estimates improving when calibrated in local climate conditions (Almorox et al., 2015). However, the uncalibrated HS equation was found to underestimate ET 0 under the warm sub-humid climate of the Yucatán Peninsula, according to the MBE statistic in Tab. 5.
Overall, SVM showed the best performance among the three AI models, and comparing the CatBoost and ANFIS models, the CatBoost model showed better performance than the ANFIS model for all three scenarios evaluated. These results agreed with the result obtained by Raza et al. (2020), who evaluated five AI-based methods, finding that the SVM method improves the accuracy of estimates in both hyper-arid and high-humidity climates. Similar results were obtained by Huang et al. (2019b) in humid regions of China when they evaluated the SVM, CatBoost and RF methods, finding that, of the three methods, SVMs offered the best prediction accuracy and stability with incomplete combinations of meteorological parameters as inputs (T min , T max , RH) with a RMSE value of 0.640 mm day -1 , while CatBoost performed best with the complete combination of parameters (T min , T max , RH, U 2 and R s ) with a RMSE value of 0.220 mm day -1 .
Regarding the bias of the AI-based models, Tab. 4 shows that the SVM model had small positive MBE values during the test phase, suggesting that the SVM model slightly overestimates ET 0 values at the study site, the exception being at the Campeche station when only temperature data were used. Overall, the CatBoost method had a tendency to slightly underestimate the values of ET 0 , according to the negative MBE values. In the case of the ANFIS models, according to the MBE indicator, the model overestimated ET 0 values at the Campeche and Calakmul stations, and underestimated ET 0 values at the Efraín Hernández, Mérida and Tantakín stations.
One of the main advantages of AI-based models over traditional methods is that overfitting problems can be avoided by selecting an appropriate structure, such as in the case of ANFIS method, or by adjusting internal parameters through cross-validation in CatBoost or using algorithms in the case of SVM models.
The main advantage of the use of AI models is their capacity to model large amounts of noisy data from dynamic and non-linear systems. One of the main disadvantages of using AI-based models is that the trained algorithm to be used requires the use of specialized software (i.e., MATLAB, R or Python) or, where appropriate, it must be embedded as a module on an Arduino or Raspberry Pi board.

CONCLUSIONS
In the present study, the performance of SVM, ANFIS and CatBoost models was assessed for estimating daily ET 0 , considering climatic data from five weather stations located in the Yucatán Peninsula, Mexico.
Overall, the SVM approach showed the best performance for all evaluated scenarios for estimating ET 0 , which is helpful for irrigation scheduling in a warm sub-humid region of the Yucatán Peninsula, Mexico and possibly elsewhere with similar climate. In the present study, it has been shown that the use of the GA algorithm combined with SVM-based models improves the accuracy of daily ET 0 estimation.
Most of the previous works carried out in warm sub-humid climates include the relative humidity vari- able in the model. In this study, the precipitation variable was considered because the variability of temperature in the rainy season in the Yucatán region is strongly modulated by precipitation events. This way, adding precipitation data as a binary number in the rainfall-based scenario slightly increased model performance. However, the inclusion of RH values into the relative humiditybased scenario significantly improved estimation accuracy. A major disadvantage of this scenario is the unavailability of RH data in some regions, while temperature and rainfall data are generally measured at all weather stations. This suggests that, if RH data are available, the SVM1 model should be used to obtain better results. Finally, a comparative analysis of model performance showed that the AI models have better ability than the HS model for ET 0 modelling when only temperature records are available, although the HS model presents the advantage of using an algebraic equation, facilitating its application. AI-based models can be a suitable alternative to conventional models for ET 0 modelling; however, prior adjustment of their internal parameters by using a crossvalidation test is necessary to avoid overfitting. In the future, in order to improve the present study, it would be convenient to analyze the performance of the models under a seasonal analysis scheme, as well as the performance of the models in the dry and rainy seasons.
Thus, the results of this study indicate that SVMs increase the prediction accuracies of ET 0 estimates in warm sub-humid tropical climates such as the Yucatán in Mexico, especially when air humidity is included in the model.