Prediction of Employee Turnover with Imbalance Dataset Using Machine Learning Methods



  • Çetin KAYA Ostim Technical University
  • Murat ŞİMŞEK Ostim Technical University



Employee Turnover, Machine Learning, Imbalanced Data, Cross Validation, Classification Algorithm


Employee turnover can have a significant impact on an organisation's productivity, culture and profitability. Accurately predicting employee turnover can help organisations identify and address issues before they become major problems. In this paper, an employee turnover and attrition dataset was analysed using traditional machine learning methods. The analyses revealed an imbalanced class distribution in the dataset. To address this problem, oversampling and undersampling methods were used to balance the dataset. After data balancing, k-fold cross-validation was applied to avoid overfitting. The Random Forest classifier was selected and used together with the random oversampling (ROS) method, which showed higher performance. GridSearchCV, a hyperparameter tuning technique, was applied to the selected model to find the best parameters. Both data pre-processing and post-processing steps were also performed. The experiments showed that the dataset balanced with the proposed method improved classification performance compared with the raw dataset and the other sampling methods.
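To make the balancing step in the abstract concrete (sampling the classes up or down), the following is a minimal sketch of random oversampling and random undersampling using only the Python standard library. It is an illustration, not the authors' implementation: the function names `random_oversample` and `random_undersample`, the `seed` parameter, and the toy data are all assumptions introduced here.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Collect rows into a dict keyed by class label."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_oversample(rows, labels, seed=0):
    """Hypothetical helper: duplicate randomly chosen rows of the smaller
    classes until every class is as large as the largest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Hypothetical helper: keep a random subset of the larger classes so
    every class is as small as the smallest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        out_rows.extend(rng.sample(group, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Toy, made-up data: 8 employees who stayed (0) and 2 who left (1).
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2

over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)
print(Counter(over_labels))
print(Counter(under_labels))
```

When a sketch like this is combined with k-fold cross-validation, one common precaution is to resample only the training folds, so that duplicated rows cannot appear in both the training and validation splits. Whether the paper follows that order is not stated in the abstract.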










How to Cite

KAYA, Ç., & ŞİMŞEK, M. (2023). Prediction of Employee Turnover with Imbalance Dataset Using Machine Learning Methods. International Journal of Advanced Natural Sciences and Engineering Researches, 7(9), 12–16.

Conference Proceedings Volume