Soft Computing: Impactful Tools in Customer Segmentation


Knowing your customers is essential to a successful business. It helps establish a rapport between the customers and the business, and makes the customers personally attached to what they are buying. Customer segmentation helps organizations recognize customers with similarities as a group rather than as individuals. It makes it easier to serve consumers efficiently and adds to the value of the business. But effective customer segmentation depends on choosing the proper techniques and using the right tools, which is hard because many factors are at play. One advanced technique for customer segmentation is Soft Computing (SC). A study reviewed in the research paper defines SC as follows:

An evolving collection of methodologies, which aims to exploit tolerance for imprecision, uncertainty, and partial truth to achieve robustness, tractability, and low cost. (REFERENCE: INDUSTRIAL APPLICATIONS OF SOFT COMPUTING: A REVIEW)

SC uses approximations to solve real-life problems and support logical decisions. But is SC an effective technique for customer segmentation? To better understand this, we discuss ‘Soft computing applications in customer segmentation: State-of-art review and critique’ by Abdulkadir Hiziroglu (2013). The paper was published in Expert Systems with Applications. It reviews multiple empirical studies and elaborates on issues that impact customer segmentation. The paper compares different techniques for customer segmentation in terms of how effectively they produce results. It also presents details on SC, its application in customer segmentation, and why and how it is a useful technique for customer segmentation.

Advancement in data and technology

Customer segmentation was first recognized as a separate concept in the 1950s. In the late 1970s, it started to be carried out with traditional statistical techniques. Over the years, technology has progressed substantially, and advances in technology drive advances in segmentation techniques. Data is increasing by the minute, and conventional segmentation techniques may be too inefficient to handle it. Modern techniques such as Machine Learning (ML) and Artificial Intelligence (AI) are put into play to extract information from an enormous amount of data. These tools, combined with Data Mining (DM) techniques, bring SC to the fore. Because SC builds on these modern-day technologies, its techniques keep improving and are applicable to customer segmentation.

Critical Factors and Issues in Customer Segmentation

When structuring a customer segment, organizations encounter many challenges. These challenges affect the efficiency of segmentation as well. The factors that impact customer segmentation are selection criteria, variable classification, selection of a fitting model, data collection and analysis, and implementation and interpretation of results.

Criteria of Selection

Over the years, many researchers have put together sets of criteria to account for in order to perform customer segmentation effectively. Every researcher listed some standards according to their understanding and interpretation of segmentation problems. There is no broad analysis or clear guidance to measure the accuracy of each criterion. However, measurability, accessibility, differentiability, substantiality, actionability, and homogeneity are the most common standards. This means that we must be able to measure the size of segments and serve them smoothly. Each segment must be distinct from the others and have a reasonable size to access, serve, and handle.

Variable Classification

Another challenge that we consider is classifying and choosing the right variables to determine a segment. Every customer has similarities and differences according to their personalities. These personal traits, or variables, must be selected carefully to produce logical and structured segments. However, classifying variables accurately is a critical issue because the academic studies on the topic are unstructured. It is crucial to determine the objective of customer segmentation, as it helps choose the variable(s) accordingly. Among the most valued traits are behavioral characteristics such as previous purchases.

Model of Segmentation

Another factor that plays a crucial part in the segmentation problem is the segmentation model. The segmentation model determines the approach we must use to create customer segments. The research categorizes the models into three classes: (1) single, (2) two-stage, and (3) multistage, depending on segmentation variables. Some prominent models are a priori and post hoc methods. In the a priori method, we choose a variable in advance to classify customers into segments, whereas the post hoc method derives the segments by grouping customers based on the data. These methods can be either descriptive or predictive. In a descriptive method, the variables are not divided into dependent and independent ones. A predictive method, by contrast, distinguishes between dependent and independent variables and uses one set to explain or predict the other.
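To make the a priori versus post hoc distinction concrete, here is a minimal sketch in Python. The customer records, the age cutoff, and the function names are hypothetical and are not taken from the paper. The post hoc part uses a tiny one-dimensional two-means grouping purely for illustration, which is one of many possible grouping procedures.

```python
from collections import defaultdict

# Hypothetical customer records: (customer_id, age, annual_spend)
customers = [
    ("c1", 22, 300.0),
    ("c2", 25, 2800.0),
    ("c3", 47, 350.0),
    ("c4", 52, 3100.0),
]


def a_priori_segments(records, age_cutoff=35):
    """A priori: the segmenting variable (age) and its cutoff are fixed in advance."""
    segments = defaultdict(list)
    for cid, age, _spend in records:
        if age < age_cutoff:
            label = "under_%d" % age_cutoff
        else:
            label = "%d_and_over" % age_cutoff
        segments[label].append(cid)
    return dict(segments)


def post_hoc_segments(records, iterations=10):
    """Post hoc: segments emerge from the data (1-D two-means on annual spend)."""
    spends = [spend for _cid, _age, spend in records]
    low, high = min(spends), max(spends)  # initial centroids
    for _ in range(iterations):
        low_group = [s for s in spends if abs(s - low) <= abs(s - high)]
        high_group = [s for s in spends if abs(s - low) > abs(s - high)]
        if not high_group:  # all values identical, nothing to split
            break
        low = sum(low_group) / len(low_group)
        high = sum(high_group) / len(high_group)
    segments = defaultdict(list)
    for cid, _age, spend in records:
        if abs(spend - low) <= abs(spend - high):
            segments["low_spend"].append(cid)
        else:
            segments["high_spend"].append(cid)
    return dict(segments)


by_age = a_priori_segments(customers)
by_spend = post_hoc_segments(customers)
```

On this toy data the two approaches split the same four customers differently: the a priori rule pairs them by age, while the post hoc grouping pairs them by spending level.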

Data Collection and Analysis

Customer segmentation draws on the analysis of data and the objective for which we are performing it. Whenever a company performs customer segmentation, it considers its marketing strategies and its access to data. A company may perform segmentation for several reasons or objectives, including lead generation, launching a new product, understanding customers, acquiring new customers, profit generation, retaining customers, and others. The segmentation objectives and the unit of analysis are the basis for the segmentation strategy, which can be market-induced or customer-induced. The research condenses the segmentation objectives into five categories and the units of analysis into four, as shown in Table 1.

Table 1: Segmentation Objectives and Unit of Analysis

Regarding sample design, the research paper considers sample size only. Acquiring the data for this purpose is facilitated by advancements in technology. The three data types considered are (1) survey data, generated through questionnaires, (2) secondary data, obtained from industrial databases, and (3) simulation data, generated through simulations.

Implementation and Interpretation of Results

Customer segmentation utilizes many data analysis techniques, including but not limited to cluster analysis, AID (Automatic Interaction Detection), CHAID (Chi-squared Automatic Interaction Detection), latent class structure, and soft computing. Technologies like fuzzy logic (FL), artificial neural networks (ANN), and evolutionary methods (EM) are part of SC. We consider these technologies for analysis as well as preparation of data.

When performing customer segmentation, we must classify these techniques as a priori or post hoc depending on data characteristics, i.e., volume and structure. Hierarchical and partitional methods each have certain drawbacks. However, a mix of both techniques may produce better results.

While performing data analysis with clustering, we must take standardization or normalization into account. In most cases, normalization has little to no effect on segmentation, but some studies suggest that it can be beneficial when a subset of variables dominates the clustering. Similarly, it is useful to identify the number of clusters in the segmentation. We can analyze a dendrogram or compute cluster validity measures for this purpose.
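The case where one variable dominates the clustering can be illustrated with a short sketch. The data and function names below are hypothetical, and z-score standardization is only one common choice of normalization, assumed here for illustration. Income is measured in much larger units than monthly visits, so it dominates the raw Euclidean distance.

```python
import math


def z_score_columns(rows):
    """Standardize each column to mean 0 and population standard deviation 1."""
    scaled_columns = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col))
        # A constant column carries no information; map it to zeros.
        scaled_columns.append([(v - mean) / std if std > 0 else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_columns)]


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical customers: (annual income, store visits per month)
rows = [
    (30000.0, 1.0),   # customer 0
    (30500.0, 12.0),  # customer 1: similar income, very different visits
    (60000.0, 1.0),   # customer 2: same visits, very different income
]
scaled = z_score_columns(rows)

raw_01 = euclidean(rows[0], rows[1])
raw_02 = euclidean(rows[0], rows[2])
scaled_01 = euclidean(scaled[0], scaled[1])
scaled_02 = euclidean(scaled[0], scaled[2])
```

Before scaling, customer 1 looks far closer to customer 0 than customer 2 does, because the large difference in visits is swamped by the income scale. After scaling, the two distances are of similar magnitude, so both variables contribute to the grouping.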

A final issue that we face when performing customer segmentation is assuring reliability and validity. It is essential to test both requirements to confirm that the customer segmentation is working. Validity can be internal or external. Internal validity estimates how well the clustering itself is optimized, whereas external validity shows whether we can use the result more generally.

Likewise, we have two approaches to assure reliability: (1) consistency validation (i.e., that an algorithm produces stable results or that the estimator is mathematically consistent) and (2) cross-validation (i.e., that the model works well across different strata of data). Measuring internal validity based on criteria helps in selecting an optimal schema. For clustering, these measurements are compactness and separation.
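As a rough illustration of compactness and separation, the sketch below scores two ways of grouping the same four points. The definitions used here (mean distance to the own centroid for compactness, smallest centroid-to-centroid distance for separation) are one simple choice among many and are an assumption, not the specific measures used in the reviewed studies. The data and function names are hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def compactness(clusters):
    """Mean distance from each point to its own centroid (lower is tighter)."""
    total, count = 0.0, 0
    for points in clusters:
        c = centroid(points)
        total += sum(math.dist(p, c) for p in points)
        count += len(points)
    return total / count


def separation(clusters):
    """Smallest distance between two cluster centroids (higher is better)."""
    centroids = [centroid(points) for points in clusters]
    return min(
        math.dist(a, b)
        for i, a in enumerate(centroids)
        for b in centroids[i + 1:]
    )


# Two candidate groupings of the same four hypothetical customers.
good = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
poor = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 2.0), (10.0, 2.0)]]

good_scores = (compactness(good), separation(good))
poor_scores = (compactness(poor), separation(poor))
```

The first grouping is both tighter and better separated than the second, which is the pattern an internal validity check looks for when comparing candidate clusterings or candidate numbers of clusters.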

Soft Computing

SC is a set of technologies, specifically algorithms, that use approximation to solve real-life problems. These technologies learn from data gathered through experiments instead of relying on complex mathematics to solve a problem. To handle vagueness and uncertainty, SC incorporates different methods and techniques. It uses these tools to solve the data mining problems involved in customer segmentation. These technologies assist us in deciding on a solution through approximation. Table 2 classifies these methods based on how we utilize them in DM.

Soft computing technologies and their data mining tasks

Fuzzy sets (for imprecision and uncertainty)
· Clustering
· Association rules
· Functional dependencies
· Data summarization
· Time series analysis
· Web mining
· Image retrieval

Artificial neural networks (for predictions)
· Rule extraction
· Rule evaluation
· Clustering
· Regression
· Web mining (information extraction and personalization)

Evolutionary methods (for robustness and efficiency)
· Regression
· Association rules
· Web mining (search and retrieval and query optimization)

Rough sets (for mathematical or computational problems)
· Decision rule induction
· Data filtration (including attribute reduction)
· Rule generation
· Web mining (information retrieval, information fusion, handling multimedia data, document clustering, web usage mining)

Table 2: SC Technologies and their Uses
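To show how fuzzy sets handle imprecision in clustering, the sketch below computes the membership of one customer in several segments at once, using the standard fuzzy c-means membership formula. The article does not name a specific algorithm, so fuzzy c-means is an illustrative assumption, and the centroids, the fuzzifier value m, and the function name are hypothetical. A full fuzzy c-means run would also update the centroids iteratively, which is omitted here.

```python
import math


def fuzzy_memberships(point, centroids, m=2.0):
    """Degree of membership of one point in each cluster (fuzzifier m > 1)."""
    distances = [math.dist(point, c) for c in centroids]
    # A point lying exactly on a centroid belongs fully to that cluster.
    if any(d == 0.0 for d in distances):
        hits = [1.0 if d == 0.0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)); memberships sum to 1.
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


# Hypothetical segment centroids on a single "annual spend" style axis.
centroids = [(0.0, 0.0), (10.0, 0.0)]

near_first = fuzzy_memberships((2.0, 0.0), centroids)
in_between = fuzzy_memberships((5.0, 0.0), centroids)
```

Unlike a hard assignment, a customer halfway between two segments gets equal membership in both, and a customer close to one centroid still keeps a small degree of membership in the other.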

Methods to Solve Research Problem

The method used to solve the research problem draws on a critical analysis of the studies reviewed in the paper. It identifies studies relevant to segmentation efficiency and sets up a coding procedure. The method also takes care to preserve the reliability of the coding procedure as technologies evolve. Only studies that fulfill the selection criteria are included. These studies must be empirical and contain real-life or simulated data. They must address data analysis problems, i.e., classification and clustering. Moreover, the studies must be published in journals (of any kind) and focus on customer segmentation. These criteria narrow the list to 42 studies published between 1986 and 2012.

Each selected article was assigned a code number along with coded factors (criteria). A total of 18 of these factors are treated as variables for this research. The research categorized the studies by the five segmentation objectives discussed previously (in Data Collection and Analysis). Studies without a clear objective were categorized as not available. Similarly, segmentation variables have four categories in this methodology: (1) general observable, (2) product-specific observable, (3) general unobservable, and (4) product-specific unobservable. The research also divides segmentation technologies and techniques based on the major methods of SC.

The research also considers the segmentation criteria (discussed previously) and the critical issues to classify the selected empirical studies, including the validity and reliability measures as core factors for the segmentation process in the selected articles. Finally, to ensure the reliability of the above-mentioned coding technique, two evaluators independently coded the data and eliminated any inconsistency by comparing and re-coding.
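One way to check how well two independent coders agree before they reconcile their differences is sketched below. The article does not say which agreement statistic, if any, the evaluators used, so simple percent agreement and Cohen's kappa are illustrative assumptions. The labels and function names are hypothetical.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Share of items to which both coders assigned the same label."""
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes assigned by two evaluators to the same six studies.
coder_a = ["clustering", "clustering", "classification",
           "clustering", "classification", "clustering"]
coder_b = ["clustering", "clustering", "classification",
           "classification", "classification", "clustering"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
```

Kappa is lower than raw agreement because some matches would occur even if both coders labeled the studies at random with the same label frequencies.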


The results show that the highest number of studies were published in 2004, i.e., 23.8% of all studies. They also show that nearly 40% of the selected studies were published in Expert Systems with Applications. The tourism industry was discussed the most among these studies. Overall, 65% of studies utilized neuro-computing techniques from SC, with applications in every industry these studies discuss. FL and EM technologies were used in nearly 50% of the studies. Among neuro-computing techniques, the most utilized was Self-Organizing Maps, which appears in 45% of all studies. The results also show that more than 80% of studies addressed the clustering problem.

Customer profitability and advancing existing customers are the most frequently employed segmentation objectives, and more than 80% of studies analyzed existing customers as the basic unit. The research also finds that customer privacy and anonymity are a necessary consideration when choosing SC techniques. It suggests that general unobservable variables are used the least in the selected studies, while product-specific observable and unobservable variables are used the most. None of the studies applies a multistage model, and more than 90% use the single-stage model. Overall, 12% of studies state no criteria for improving segmentation efficiency, whereas 17% of studies use homogeneity as a criterion for successful segmentation.

Nearly 50% of the articles under discussion applied normalization to the data to avoid issues with data analysis, whereas 14% did not use normalization. Moreover, most studies used questionnaires or secondary data as the source of information. 65% of studies did not use any reliability measure, whereas 30% of studies used reliability measures to ensure reliability.

Future of Soft Computing in Customer Segmentation

SC is a useful toolkit of techniques, and it is especially beneficial given the enormous amount of data we have at hand today. SC is becoming a fitting part of interdisciplinary studies because of its ability to benefit from data and its applications in DM. We can utilize SC in predictive marketing, accelerated diffusion, economical manufacturing, and predictive optimal supply chain. But SC may not accelerate business management because of its technical requirements in non-technical fields, its complexity, and the research gap between social and applied science. However, a solution to this problem could bring immense advancement, especially in business-related fields, and is a potential research area.

There are some issues that we must consider when using SC techniques. These are (1) scalability problems, (2) feature evaluation, (3) dimensionality reduction, (4) choice of metrics and evaluation techniques, (5) incorporation of domain knowledge and user interaction, and (6) efficient integration of SC tools into real-world systems. Nevertheless, future research on an appropriate combination of different SC techniques can help with these issues. SC is a significant area of study that can extend into nearly every field and industry. Although it has advanced significantly over the years, there is still room for improvement and evolution, in customer segmentation as well as in every social science field.