Comparing multivariate regression and artificial neural network to predict barley production from soil characteristics in northern Iran
Abstract
In this study, artificial neural network (ANN) models were designed to predict the biomass and grain yield of barley from soil properties, and their performance was compared with that of previously tested statistical models based on multivariate regression. Barley yield data and surface soil samples (0–30 cm depth) were collected from 1 m² plots at 112 selected points in the arid region of northern Iran. The ANN yield models gave a higher coefficient of determination and a lower root mean square error than the multivariate regression models, indicating that ANN is a more powerful tool than multivariate regression. Sensitivity analysis showed that soil electrical conductivity, sodium adsorption ratio, pH, total nitrogen, available phosphorus, and organic matter consistently influenced barley biomass and grain yield. A comparison of the most important factors identified by the two methods showed that soil organic matter (SOM) was among the most important factors in the ANN analysis but was excluded from them in the multivariate analysis. This marked discrepancy between the two methods was apparently a consequence of the non-linear relationships of SOM with other soil properties. Overall, our results indicated that the ANN models could explain 93% and 89% of the total variability in barley biomass and grain yield, respectively. ANN models therefore have a better chance than multivariate regression of predicting yield accurately, especially when complex non-linear relationships exist among the factors. We suggest that, to further improve the prediction of barley yield, factors other than the soil properties considered here, such as soil micronutrient status and the soil and crop management practices followed during the growing season, need to be included in the models.
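The two criteria used to compare the models, the coefficient of determination and the root mean square error, can be illustrated with a minimal sketch. It assumes the standard definitions (R² = 1 - SS_res/SS_tot); the abstract does not state which variant the authors computed. The function names and the numbers below are hypothetical and are not data from the study.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical yields, for illustration only (not measurements from the study).
observed = [2.0, 3.0, 4.0, 5.0]
model_a = [2.1, 2.9, 4.2, 4.8]   # stands in for the better-fitting model
model_b = [2.5, 3.5, 3.5, 4.5]   # stands in for the weaker model

for name, pred in (("model_a", model_a), ("model_b", model_b)):
    print(name, "R2 =", round(r_squared(observed, pred), 3),
          "RMSE =", round(rmse(observed, pred), 3))
```

Under these definitions, the model with the higher R² and the lower RMSE fits the observations more closely, which is the sense in which the two approaches are ranked above.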