Marketing Segmentation Through Machine Learning Models: An Approach Based on Customer Relationship Management and Customer Profitability Accounting

Introduction and the Research Problem

Raquel Florez-Lopez and Juan Manuel Ramon-Jeronimo conduct a study reported in “Marketing Segmentation Through Machine Learning Models: An Approach Based on Customer Relationship Management and Customer Profitability Accounting.”

The research problem centers on how customer relationship management (CRM) uses various tools and strategies to segment the customers of a target market. However, cost-benefit analyses require statistical techniques capable of extracting the appropriate information from large amounts of data. The authors also emphasize the limitations of the CRM development procedure, which involves multiple phases, including the critical metrics used in the initial segmentation of clients. While traditional approaches rely on clustering techniques, principal component analysis, or logistic regression to build segmentation models, their performance degrades when working with massive volumes of data, reducing the models' fit to real data, interpretability, and robustness. Consequently, enormous data sets undermine the effectiveness of quantitative measures, diminishing the quality of analysis and decision-making.

Research Problem Solution

The authors propose an alternative method based on machine learning models, specifically decision trees, to address these challenges. The advantage of a decision-tree model is its capability to achieve higher performance without requiring a priori hypotheses [1]. In addition, the logical rules behind the model are more straightforward to understand and integrate [2]. The study proposes a three-stage methodology comprising the selection of marketing features, customer segmentation obtained with univariate and oblique decision trees, and a customer profitability accounting (CPA) function based primarily on marketing, data warehousing, and opportunity costs derived from different scenarios. The study also uses marketing datasets collected from large insurance companies to obtain alternative cost and pricing information and to compare the performance of univariate decision trees against statistical techniques.

Furthermore, a CPA model based on cost-benefit functions is developed and integrated to measure the performance of the CRM from a promotional-policy (mailing) perspective. The study includes mailing and data warehousing costs to compute the incremental profit incurred from successful and unsuccessful customer contacts. The authors claim that the proposed method differs from existing approaches in its complex cost functions, its inclusion of feature selection and segmentation models based on both statistical techniques and univariate and oblique decision trees, and a new CRM and CPA measurement technique used to analyze performance.
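The general shape of such a cost-benefit function can be sketched as follows. This is an illustrative approximation, not the study's actual CPA model; all monetary figures, parameter names, and the specific functional form are hypothetical.

```python
# Hypothetical sketch of a CPA-style cost-benefit function for a mailing
# campaign: profit from successful contacts, minus the mailing cost of every
# contact (successful or not), minus a fixed data-warehousing cost.
def incremental_profit(hits, misses, margin_per_hit, mailing_cost,
                       warehousing_cost):
    contacts = hits + misses
    return hits * margin_per_hit - contacts * mailing_cost - warehousing_cost

# 300 positive responses out of 5,000 mailings (illustrative numbers)
profit = incremental_profit(hits=300, misses=4700, margin_per_hit=40.0,
                            mailing_cost=0.8, warehousing_cost=1500.0)
print(profit)  # 300*40 - 5000*0.8 - 1500 = 6500.0
```

A better segmentation model raises the hit rate for a fixed number of contacts, which is exactly what such a function rewards.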

Similarly, the authors review prior literature that emphasizes metrics for measuring the success of CRM. These metrics include financial information such as market share and profit margin. Customer-centric measures, including customer acquisition cost, conversion rate, customer retention, the sales rate to existing customers, and loyalty measures, complete the metric groups. Given the effectiveness of these metrics, the study incorporates them in the development of the CPA model based on a cost-benefit function and in the design of a well-optimized CRM. The market segmentation problem within the CRM model is evaluated using psychographic and socioeconomic attributes with the help of statistical techniques and a decision-tree approach.

Decision Tree Overview

Decision trees are used for classification and regression problems. As the name suggests, they form a hierarchical structure that produces specific predictions through successive feature-based splits of the data. Ideally, the decision-tree structure has three major components: a root node, decision nodes, and leaf nodes. The structure begins with a root node, and the final decision is made at its leaves.

First, the root node at the top of the tree is the primary node responsible for the initial split of the data. Next, a decision node is an internal node derived from the splitting of its parent. In contrast, a node with no further decision nodes below it is a leaf or terminal node that cannot be split further. The resultant output of the tree is a hierarchical, multistage form of decision-making. The figure below shows the decision tree used for the mailing problem in the study.

Figure 1. Decision-Tree Structure Implemented for Mailing Problem
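The root/decision/leaf structure described above can be written out as a small hand-built tree for a mailing decision. The attributes, thresholds, and labels here are hypothetical illustrations, not the splits learned in the study.

```python
# Minimal illustrative univariate decision tree for a mailing problem.
# Attributes and thresholds are hypothetical, not taken from the paper.
def classify(customer):
    # Root node: first split, on past purchase count
    if customer["past_purchases"] >= 2:
        # Decision node: further split, on income (in thousands)
        if customer["income"] >= 50:
            return "mail"        # leaf node
        return "no_mail"         # leaf node
    return "no_mail"             # leaf node

print(classify({"past_purchases": 3, "income": 70}))  # mail
print(classify({"past_purchases": 0, "income": 90}))  # no_mail
```

Each path from the root to a leaf reads directly as a logical rule (e.g., "past_purchases >= 2 and income >= 50 implies mail"), which is the interpretability advantage the authors highlight.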

Findings of the Research

The study provides an essential theoretical approach toward customer segmentation and pinpoints issues related to CRM and CPA. Another critical finding of the research is the lack of overall consensus on segmentation methodology [3]. The primary reason is that customer segmentation can be approached from multiple perspectives. Moreover, market segmentation strategies do not estimate the costs incurred on poorly targeted customer segments or the amounts spent preparing the data for targeting customers through direct mail campaigns [4]. Therefore, the dataset used for a computational model must be carefully evaluated to identify essential attributes among demographics, psychographics, socioeconomic data, interview and survey data, questionnaires, and other samples [5].

The existing business environment is becoming more complex, and limited data warehousing systems increase the computational power required to tackle large data volumes. The excess of data presents three critical problems: (i) selecting the appropriate attributes for identifying potential customers from multiple groups of independent variables; (ii) understanding the prediction model's logic given the many computable attributes; and (iii) verifying the hypotheses of a statistical model with correlated attributes in the customer data.

At the same time, predictive machine learning models can forecast customer behaviors and decisions, and decision trees are an excellent fit for the different scenarios in customer segmentation. Because they generate straightforward logical rules, these models are easily interpretable and effective for business decisions. For market segmentation, principal component analysis (PCA) can reduce data dimensionality and is pivotal in feature selection. However, PCA has significant drawbacks in predictive modeling in terms of costs and interpretability [6]. A vital component lacking in PCA approaches is any relationship between the independent and dependent variables in the analysis. Moreover, when many variables exist, specific PCA components become difficult to interpret. This interpretability problem can hamper its implementation because the success of a market segmentation strategy depends on the ease of interpreting the market features [7].
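The dimensionality-reduction idea behind PCA can be shown on a tiny two-feature example, computed by hand from the 2x2 covariance matrix. The data values are hypothetical; note also how the dominant component is a blend of both features, which is precisely the source of the interpretability concern raised above.

```python
import math

# Illustrative PCA sketch on a tiny two-feature dataset (values hypothetical):
# compute the sample covariance matrix and its eigenvalues, then report how
# much of the total variance the first principal component captures.
data = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2),
        (3.1, 3.0), (2.3, 2.7), (2.0, 1.6), (1.0, 1.1)]

n = len(data)
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n

# Sample covariance matrix [[sxx, sxy], [sxy, syy]]
sxx = sum((x - mean_x) ** 2 for x, _ in data) / (n - 1)
syy = sum((y - mean_y) ** 2 for _, y in data) / (n - 1)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in data) / (n - 1)

# Eigenvalues of a symmetric 2x2 matrix via the quadratic formula
tr, det = sxx + syy, sxx * syy - sxy ** 2
lam1 = tr / 2 + math.sqrt(tr ** 2 / 4 - det)   # largest eigenvalue
lam2 = tr / 2 - math.sqrt(tr ** 2 / 4 - det)

# Share of total variance captured by the first principal component
explained = lam1 / (lam1 + lam2)
print(f"first component explains {explained:.1%} of the variance")
```

One component captures nearly all the variance here, but that component has no guaranteed relationship to the response variable, which is the limitation noted in the text.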

Therefore, it is essential to adopt a market segmentation model that performs effective feature selection and avoids restrictive statistical hypotheses, which can limit the model's performance and produce customer segments that add no value to decision-making. In recent decades, robust feature selection mechanisms have gained tremendous importance, and an embedded approach can produce strong performance [8]. A filter method is relevant for ranking the features of the data without using a classifier; as a result, it offers lower computation time for the model [9].
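A filter method of the kind just described can be sketched as a classifier-free correlation ranking. The feature names, data values, and the choice of Pearson correlation as the filter criterion are hypothetical illustrations.

```python
import math

# Hypothetical filter-method sketch: rank features by absolute Pearson
# correlation with the mailing response, without running any classifier.
def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy customer attributes and a 0/1 responded-to-mailing flag (illustrative)
features = {
    "age":            [25, 40, 35, 50, 23, 60],
    "income":         [30, 80, 55, 60, 28, 95],
    "past_purchases": [0, 3, 1, 4, 0, 5],
}
response = [0, 1, 0, 1, 0, 1]

ranked = sorted(features, key=lambda f: abs(pearson(features[f], response)),
                reverse=True)
print("features ranked by |correlation|:", ranked)
```

Because the ranking never invokes the downstream classifier, it is cheap to compute, but it may keep features that the eventual model cannot exploit.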

On the other hand, wrapper methods can be crucial for achieving reliable classification accuracy, as they evaluate feature subsets with the help of the induction algorithm itself. Ideally, wrapper methods produce better results than other methods. However, they carry a higher computational cost and a greater risk of overfitting, because the same algorithm is used for both feature selection and classification. Embedded methods are computation-friendly and effective because feature selection is integrated with the induction algorithm, and the relevant features emerge from the model structure itself. Although embedded methods are easier to compute, they face significant challenges, such as the inclusion of irrelevant features that can decrease classification accuracy. Decision trees can therefore be used within embedded approaches for better results [10].
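The wrapper idea can be illustrated by exhaustively scoring feature subsets with a small induction algorithm. Everything here is hypothetical: the data, the use of a trivial nearest-class-mean classifier in place of a real learner, and the resubstitution accuracy used as the score. It also shows why wrappers are costly (one model fit per subset) and prone to overfitting (the same data select the features and score them).

```python
from itertools import combinations

# Hypothetical wrapper-method sketch: run a trivial induction algorithm
# (nearest class mean) on every feature subset and keep the most accurate.
def nearest_mean_accuracy(X, y, cols):
    # Class means on the chosen columns
    means = {}
    for label in set(y):
        rows = [x for x, yy in zip(X, y) if yy == label]
        means[label] = [sum(r[c] for r in rows) / len(rows) for c in cols]
    # Resubstitution accuracy: predict the class with the closest mean
    correct = 0
    for x, yy in zip(X, y):
        pred = min(means, key=lambda lab: sum(
            (x[c] - m) ** 2 for c, m in zip(cols, means[lab])))
        correct += (pred == yy)
    return correct / len(y)

# Rows: (age, income, past_purchases); y: responded-to-mailing flag (toy data)
X = [[45, 30, 0], [40, 80, 3], [35, 55, 1],
     [50, 90, 4], [23, 28, 0], [60, 95, 5]]
y = [0, 1, 0, 1, 0, 1]
names = ["age", "income", "past_purchases"]

best = max((s for k in range(1, 4) for s in combinations(range(3), k)),
           key=lambda s: nearest_mean_accuracy(X, y, s))
print("best subset:", [names[c] for c in best])
```

An embedded method, by contrast, would obtain the selected features as a by-product of a single model fit, such as the attributes actually used in a decision tree's splits.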

From the results achieved by the proposed model, the authors conclude that a discriminant analysis statistical approach can improve performance. However, there is an associated risk: a subsequent increase in the price-cost ratio can produce poor results. To address this, it is essential to identify customers who have responded positively to a marketing campaign, maintaining an optimal balance among the growing number of attributes in the dataset used to identify potential customers.

The authors state that Classification And Regression Trees (CART), i.e., univariate decision trees, are efficient and produce superior performance for both high and low price-cost quotients and for data warehousing costs. Moreover, relying on fewer explicative factors makes the model more interpretable and decreases the overall cost of analysis. The oblique decision tree can further increase profitability by handling high price ratios with a lower number of attributes considered for the task. These properties make it a useful algorithm for customer segmentation (see example in Figure 2).

Figure 2. Discriminant Analysis Examples using CART Oblique Decision Tree
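The distinction between the univariate and oblique splits discussed above can be made concrete in a few lines. The attributes, weights, and thresholds are purely illustrative, not values from the study.

```python
# Illustrative contrast between the two split types (hypothetical numbers):
# a univariate (axis-parallel) CART split tests a single attribute, while an
# oblique split tests a linear combination of attributes.
def univariate_split(customer):
    # One attribute against one threshold
    return "high_value" if customer["income"] >= 60 else "low_value"

def oblique_split(customer):
    # Weighted combination of attributes against a threshold
    score = 0.5 * customer["income"] + 10.0 * customer["past_purchases"]
    return "high_value" if score >= 60 else "low_value"

c = {"income": 40, "past_purchases": 4}
print(univariate_split(c))  # low_value  (income alone falls below 60)
print(oblique_split(c))     # high_value (0.5*40 + 10*4 = 60)
```

The oblique split separates such borderline customers with a single node, at the price of a rule that is harder to read than "income >= 60".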

Future Work Suggestions and Implications for Practitioners

The impact of CRM on market segmentation is evident. However, the authors suggest that future developments require a new measurement approach that includes both CRM and CPA. In addition, metric-based results for measuring the success of CRM programs should be integrated with the overall measurement of market segmentation and client profitability. CRM management should integrate global cost functions, including additional marketing costs such as data warehousing, potential profits, and the costs associated with positive responses and unsuccessful customer contacts. Moreover, machine learning performs better than statistical methods, and univariate decision trees allow efficient segmentation.

At the same time, it is critical to ensure straightforward logical decision rules in a computational model so that managers can readily interpret the analytical results. Future developments should therefore be mindful of computational time and costs and of the challenges of overfitting and underfitting with machine learning models. It is also essential to work with a balanced dataset that weighs the positive outcomes against the other cost factors to handle such challenges.


1. Kim, Y., Street, W. N., & Menczer, F. (2001). An evolutionary multi-objective local selection algorithm for customer targeting. Proceedings of the Congress on Evolutionary Computation, 2, 759-766.

2. Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and Regression Trees. New York:
Chapman & Hall.

3. Brijs, T. (2002). Retail market basket analysis: A quantitative modelling approach. Unpublished doctoral dissertation, Department of Applied Economics, Limburg University, the Netherlands.

4. QAS. (2006, September). The hidden costs of poor data management (International Research White Paper).
Dynamic Markets Commissioned by QAS.

5. Tseng, T. L., & Huang, C. C. (2007). Rough set-based approach to feature selection in customer relationship
management. Omega, 35, 365-383.

6. Kim, Y., Street, W. N., Russell, G. J., & Menczer, F. (2005). Customer targeting: A neural network approach
guided by genetic algorithm. Management Science, 51, 264-276.

7. Anderberg, M. R. (1973). Cluster analysis for applications. New York: Academic Press; Bass, F. M., Tigert, D. G., & Lonsdale, R. T. (1968). Market segmentation: Group versus individual behaviour. Journal of Marketing Research, 5, 264-270.

8. John, G. H., Kohavi, R., & Pfleger, K. (1994, July). Irrelevant features and the subset selection problem. In
W. W. Cohen & H. Hirsh (Eds.), Proceedings of the Eleventh International Conference on Machine
Learning (pp. 121-129). San Francisco: Morgan Kaufmann.

9. Blum, A., & Langley, P. (1997). Selection of relevant features and examples in machine learning. Artificial Intelligence, 97, 245-271.

10. Kira, K., & Rendell, L. A. (1992). The feature selection problem: Traditional methods and a new algorithm. Proceedings of the Tenth National Conference on Artificial Intelligence (pp. 129-134). Cambridge, MA: MIT Press.