Abstract

This paper examines the problem of employee turnover, which has considerable influence on organizational productivity and on the health of the working environment. Using a publicly available dataset, key factors capable of predicting employee churn are identified. Six machine learning algorithms, including decision trees, random forests, naïve Bayes, and the multi-layer perceptron, are used to predict which employees are prone to churn. A good level of predictive accuracy is observed, and a comparison is made with previous findings. It is found that while the simplest algorithm, the classification and regression tree (CART), gives the best accuracy and F1-score, the alternating decision tree (ADT) gives the best area under the ROC curve. Rules extracted in if-then form enable the probable causes of churn to be identified.