Assessment of the Different Machine Learning Models for Prediction of Cluster Bean (Cyamopsis tetragonoloba L. Taub.) Yield


Darshan Jagannath Pangarkar
Rajesh Sharma
Amita Sharma
Madhu Sharma


Prediction of crop yield can help traders, agri-businesses, and government agencies plan their activities, and it can help government agencies manage over- or under-production. Traditionally, statistical and crop simulation methods have been used for this purpose, but machine learning models can also be of great help. The aim of the present study is to assess the predictive ability of various machine learning models for cluster bean (Cyamopsis tetragonoloba L. Taub.) yield prediction. The models were applied and tested on panel data covering 19 years, from 1999-2000 to 2017-18, for the Bikaner district of Rajasthan. Various data mining steps were performed before model building. K-Nearest Neighbors (K-NN), Support Vector Regression (SVR) with various kernels, and random forest regression were applied. Cross-validation was also performed to assess extra-sample (out-of-sample) validity. The best-fitted model was chosen based on cross-validation scores and R2 values. Besides the coefficient of determination (R2), root mean squared error (RMSE), mean absolute error (MAE), and root relative squared error (RRSE) were calculated for the testing set. SVR with a linear kernel had the lowest RMSE (23.19), RRSE (0.14), and MAE (19.27), followed by random forest regression and second-degree polynomial SVR with gamma = auto. The ranking by R2 differed slightly: SVR with a linear kernel came first (98.31%), followed by second-degree polynomial SVR with gamma = auto (89.83%) and second-degree polynomial SVR with gamma = scale (88.83%). On two-fold cross-validation, SVR with a linear kernel had the highest cross-validation score, explaining 71% (+/-0.03), followed by second-degree polynomial SVR with gamma = auto and random forest regression. K-NN and SVR with a radial basis function kernel had negative cross-validation scores. SVR with a linear kernel was found to be the best-fitted model for predicting yield, as it had the highest sample validity (98.31%) and global validity (71%).
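The four test-set metrics named in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions, which the abstract does not spell out; the function name `regression_metrics` and the yield values are hypothetical and are not data from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, RRSE and R2 under their standard definitions (assumed)."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Sum of squared and absolute prediction errors
    sq_err = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    abs_err = sum(abs(t - p) for t, p in zip(y_true, y_pred))
    # Squared error of the naive predictor that always returns the mean
    total_sq = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "rmse": math.sqrt(sq_err / n),
        "mae": abs_err / n,
        "rrse": math.sqrt(sq_err / total_sq),
        "r2": 1.0 - sq_err / total_sq,
    }


# Hypothetical observed and predicted yields, for illustration only
observed = [400.0, 520.0, 610.0, 480.0]
predicted = [420.0, 500.0, 600.0, 490.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Under these definitions R2 = 1 - RRSE^2, so an RRSE of 0.14 corresponds to an R2 of roughly 0.98, which is broadly in line with the 98.31% reported for the linear-kernel model, allowing for rounding of the RRSE.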

Yield, machine learning, K-NN, SVR, random forest


How to Cite
Pangarkar, D. J., Sharma, R., Sharma, A., & Sharma, M. (2020). Assessment of the Different Machine Learning Models for Prediction of Cluster Bean (Cyamopsis tetragonoloba L. Taub.) Yield. Advances in Research, 21(9), 98-105.
Short Research Article

