Please use this identifier to cite or link to this item: http://hdl.handle.net/11189/8952
Title: Experimental analysis of hyperparameters for deep learning-based churn prediction in the Banking sector
Authors: Domingos, Edvaldo 
Ojeme, Blessing 
Daramola, Olawande 
Keywords: Churn prediction;churn modeling;machine learning;deep neural networks;supervised learning;customer relationship management
Issue Date: 2021
Publisher: MDPI
Source: Domingos, E., Ojeme, B. & Daramola, O. 2021. Experimental analysis of hyperparameters for deep learning-based churn prediction in the Banking sector. Computation, 9: 34. [https://doi.org/10.3390/computation9030034]
Journal: Computation 
Abstract: Until recently, traditional machine learning techniques (TMLTs) such as multilayer perceptrons (MLPs) and support vector machines (SVMs) have been used successfully for churn prediction, but with significant effort expended on configuring the training parameters. The right training parameters for supervised learning are almost always determined experimentally, in an ad hoc manner. Deep neural networks (DNNs) have shown significant predictive strength over TMLTs when used for churn prediction. However, the more complex architecture of DNNs and their capacity to process huge amounts of non-linear input data demand more time and effort to configure the training hyperparameters during churn modeling. This makes the process more challenging for inexperienced machine learning practitioners and researchers. So far, limited research has been done to establish the effects of different hyperparameters on the performance of DNNs during churn prediction, and there is a lack of empirically derived heuristic knowledge to guide the selection of hyperparameters when DNNs are used for churn modeling. This paper presents an experimental analysis of the effects of different hyperparameters when DNNs are used for churn prediction in the banking sector. The results from three experiments revealed that the DNN model performed better than the MLP when a rectifier function was used for activation in the hidden layers and a sigmoid function was used in the output layer. The performance of the DNN was better when the batch size was smaller than the size of the test set data, while the RMSProp training algorithm achieved better accuracy than the stochastic gradient descent (SGD), Adam, AdaGrad, Adadelta, and AdaMax algorithms.
The study provides heuristic knowledge that could guide researchers and practitioners in machine learning-based churn prediction from tabular data for customer relationship management in the banking sector when DNNs are used.
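Illustrative sketch (not taken from the article): the configuration the abstract reports as performing best combines rectifier (ReLU) activations in the hidden layer, a sigmoid output, and RMSProp training. The short Python example below shows what those three pieces compute, using only the standard library. All function names, the single hidden layer, and the values for the learning rate, decay, and epsilon are assumptions chosen for illustration; the abstract does not state the authors' architecture or settings.

```python
import math


def relu(x):
    """Rectifier activation, as used in the hidden layers."""
    return max(0.0, x)


def sigmoid(x):
    """Sigmoid activation, mapping the output to a churn probability."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(features, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass through one ReLU hidden layer and a sigmoid output unit.

    The single hidden layer is an assumption made to keep the sketch short.
    """
    hidden = [
        relu(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    logit = sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    return sigmoid(logit)


def rmsprop_step(weights, grads, cache, lr=0.001, decay=0.9, eps=1e-8):
    """One RMSProp update: scale each gradient by a running RMS of past gradients.

    lr, decay, and eps are common illustrative defaults, not values from the article.
    """
    new_cache = [decay * c + (1.0 - decay) * g * g for c, g in zip(cache, grads)]
    new_weights = [
        w - lr * g / (math.sqrt(c) + eps)
        for w, g, c in zip(weights, grads, new_cache)
    ]
    return new_weights, new_cache


# Hypothetical two-feature customer record and hand-picked weights.
probability = forward([1.0, 2.0], [[0.5, -0.5], [1.0, 1.0]], [0.0, 0.0], [1.0, -1.0], 0.0)
updated, cache = rmsprop_step([1.0], [2.0], [0.0], lr=0.1)
```

In practice a deep learning framework would supply these components; the sketch only makes explicit what the hyperparameter choices named in the abstract refer to.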
URI: http://hdl.handle.net/11189/8952
ISSN: 2079-3197
DOI: https://doi.org/10.3390/computation9030034
Appears in Collections: FID - Journal Articles (DHET subsidised)

Files in This Item:
File: Experimental_analysis_of_hyperparameters.pdf
Description: Article
Size: 599.95 kB
Format: Adobe PDF
Items in Digital Knowledge are protected by copyright, with all rights reserved, unless otherwise indicated.