Understanding Customer Lifetime Value (CLV) in a dynamic non-contractual B2B ecosystem using Ensembled Neural Networks

Jun 13, 2021 · 10 min read

A new perspective on predicting CLV for non-contractual B2B customers using Neural Networks. Photo by Alina Grubnyak on Unsplash

Customer relationships in non-contractual business-to-business (B2B) settings are fundamentally different from those in business-to-consumer (B2C) or contractually bound B2B scenarios. They vary widely in transaction size, frequency, and pattern, and are typically non-deterministic, seasonal, and economy-driven. Developing a robust and scalable modelling framework in such a complex, ever-changing environment has been a difficult proposition for businesses across the globe.

In this article, I aim to nudge the business and data science community towards new ways of addressing the problem through algorithmic innovation, thought experimentation, and solution design, with the goal of developing a general-purpose, scalable and future-proof B2B Customer Lifetime Value (CLV) prediction solution that leverages the power of Neural Networks. While the focus of this article is on CLV estimation, the solution can in principle be extended to the prediction of other KPIs across different business scenarios.

Why is there a need to revisit the CLV frameworks and refine the CRM policies?

CLV is the building block of customer-centric planning and decision-making for any business entity. Photo by Omar Flores on Unsplash

Forecasting CLV is fundamental to customer-centric planning and decision-making for any business entity. It captures how much value can be attributed to the future relationship of a customer with a firm. An in-depth understanding of customers’ potential, i.e. identifying those who are likely to generate value versus those who are not, can govern the business’s marketing/sales strategies along with other customer relationship management (CRM) initiatives. Moreover, as industries across the globe move towards a more client experience-based approach to positioning their product/service offerings, outlooks are changing rapidly. Technological disruption has played a crucial role in broadening the art of the possible: richer and larger volumes of data are now captured across a multitude of interaction touch-points between the customer and various business functions, and Artificial Intelligence, Machine Learning and cloud-native/on-prem ecosystems offer the analytical sophistication and computational scalability needed to use them.

These developments make the case that fine-tuning traditional approaches with new advanced algorithms is the next logical step. Although CLV estimation methodologies in B2C industries have evolved with time, accurate and robust prediction of CLV remains a big challenge in the B2B context. Even with growing information capture rates that ensure the availability of large volumes of data, robust models that effectively capture business clients' underlying behaviour are rare. The problem is even more acute for SMEs (Small & Medium-sized Enterprises), where most of the business is non-contractual, with seasonal fluctuations in demand and uncertain repeat orders. In markets with such ambiguities, identifying the true customer value in order to design optimal marketing/customer experience strategies is often the difference between market leaders and laggards.

For many of these B2B organizations, innovative and accurate estimation of CLV can in effect become a cornerstone of their CRM policy, helping them become more customer-centric and build the loyalty, relevance, financial discipline and pricing strategy needed to gain a strong competitive advantage. Historically, a wide variety of methods, ranging from the simple Recency-Frequency-Monetary (RFM) metric to clustering algorithms, have been used to estimate CLV in both the B2C and B2B space. Parametric statistical models like Pareto/NBD and Survival Analysis have also been popular choices among Data Scientists. But despite their usefulness, these traditional approaches often fail to stand the test of time due to stringent assumptions about probability distributions, over-simplistic formulations and/or evolution in customer behaviour. With the global economic landscape becoming more volatile, data-intensive and competitive, there is a need for a new robust solution to this age-old problem.

Can Neural Networks be the answer to these age-old challenges?

With increased digitalization and effective storage capabilities to harness multi-channel customer interactions, data is no longer scarce, even in a non-contractual B2B environment. In this context, a generic Neural Network architecture can help B2B organizations understand forward-looking CLV, without compromising on the benefits of traditional approaches.

In principle, our proposed framework needs to capture the dynamics of customers’ life-cycle journeys and combine them with other macro/micro-economic indicators to produce an estimate of future value. It needs to account for the thousands of possible interactions across a multitude of touch-points without the need for manual hypothesis testing and/or feature selection. At the same time, it must be able to drive personalized, customer-specific actions and deliver a superior customer experience.

Our proposed framework needs to capture the dynamics of customers’ life-cycle journeys and combine them with other macro/micro-economic indicators to produce an estimate of future value. Photo by Markus Spiske on Unsplash

What are the key considerations for the proposed CLV framework?

1. Data availability

Greater volume, variety and frequency of input information can have an incremental impact on model performance. Photo by Mika Baumeister on Unsplash

The key element in building any robust modelling framework is good-quality granular data. This is especially true for Deep Learning (DL) frameworks, where greater volume, variety and frequency of input information can have an incremental impact on model performance. Listed below is an indicative set of data elements that are well suited to the problem at hand. While the list mostly includes structured data, unstructured inputs in the form of call logs or sentiments can also be incorporated into the framework.

Customer firmographic details: Business type, scale, annual revenue, customer base, geographic distribution
Transactional behaviour of customers: Transaction date, product ID, sales revenue & cost, channel, geography
Campaign details: Marketing channel (email, SMS, call, etc.), campaign type, start date, end date
Customer interactions and feedback: Call details, online queries, complaints/issue logs, net promoter score
Web traffic information: Page visits, timestamps, number of clicks
Product details: Product ID, product category/sub-category, unit cost
Macro-economic factors: Inflation, unemployment rate, GDP growth rate

2. Target definition

The CLV target definition needs to account for the dynamic relationship between the organization and its customer. Photo by Pablò on Unsplash

The relationship between the organization and its customers goes through different phases of the life-cycle, starting from acquisition to termination of the relationship. As with most predictive CLV models, at any point in time t, future value (target) over the next k periods is based on discounted cash flow.
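
As a rough illustration of such a target, the sketch below computes a discounted value over the next k periods. It assumes the common textbook form, CLV at time t = sum over i = 1..k of CF(t+i) / (1 + d)^i, where CF is the net cash flow per period and d is a per-period discount rate. The function name, the example cash flows and the rate are hypothetical and not taken from the article.

```python
from typing import Sequence


def discounted_clv(cash_flows: Sequence[float], discount_rate: float) -> float:
    """Discounted value of the next k periods of net cash flow.

    cash_flows[0] is the (assumed) net cash flow in period t + 1,
    cash_flows[1] the one in period t + 2, and so on. discount_rate
    is the per-period rate d.
    """
    if discount_rate <= -1.0:
        raise ValueError("discount_rate must be greater than -1")
    return sum(
        cf / (1.0 + discount_rate) ** i
        for i, cf in enumerate(cash_flows, start=1)
    )


# Hypothetical customer: a margin of 100 per quarter for the next 4 quarters,
# discounted at 2% per quarter.
target = discounted_clv([100.0, 100.0, 100.0, 100.0], discount_rate=0.02)
print(round(target, 2))
```

In a training set, this value would be computed once per customer and snapshot date t, using only cash flows that fall inside the target window.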

Modifications can be made to this fundamental formula depending on specific use-cases.

3. Input features

Illustrative timeline representation of typical input and target windows for a single customer.

The granularity, structure and frequency of the input features are some of the key differentiators between the Neural Network approach and more traditional models. We conceptualize three layers of inputs to provide insights into the future value of the customers:

A. Time-series of customer journeys, i.e. the sequence of customer-related events like transactions, interactions, feedback, etc., along with macro-economic information like global/national inflation, unemployment rate, etc., provide a granular view of the evolution of each customer’s relationship with the business entity. The look-back window and granularity of these dynamic inputs may be chosen through some basic data exploration and analysis.
B. Features like retail base, geographic distribution, product portfolio, etc. are unlikely to change rapidly with time (i.e. they are more static) but still provide a baseline view of the customer’s market position and future transactions. This is especially relevant for new/prospect customers for whom historical customer journeys may not be available.
C. The current state of a customer may be summarized via an enhanced version of the traditional Recency-Frequency-Monetary framework. The traditional RFM framework is complemented by two additional metrics: frequency and recency of past interactions. The underlying hypothesis is that the recency, frequency and volume of past transactions, coupled with recent campaigns/marketing actions, provide a strong indication of a customer’s future behaviour.
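
To make the enhanced RFM summary more concrete, here is a minimal sketch that derives the five metrics (recency, frequency and monetary value of transactions, plus recency and frequency of interactions) for one customer at a snapshot date. The `Event` record, the "transaction"/"interaction" labels, the 365-day look-back window and the fallback recency for customers with no events are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass
class Event:
    day: date
    kind: str            # "transaction" or "interaction" (hypothetical labels)
    amount: float = 0.0  # revenue for transactions, 0 for interactions


def enhanced_rfm(events: List[Event], snapshot: date,
                 lookback_days: int = 365) -> Dict[str, float]:
    """Summarize one customer's state at `snapshot` from events in the look-back window."""
    window = [e for e in events if 0 <= (snapshot - e.day).days <= lookback_days]
    tx = [e for e in window if e.kind == "transaction"]
    ix = [e for e in window if e.kind == "interaction"]

    def recency(evts: List[Event]) -> float:
        # Days since the most recent event; the window length if there is none.
        if not evts:
            return float(lookback_days)
        return float(min((snapshot - e.day).days for e in evts))

    return {
        "tx_recency_days": recency(tx),
        "tx_frequency": float(len(tx)),
        "tx_monetary": sum(e.amount for e in tx),
        "ix_recency_days": recency(ix),
        "ix_frequency": float(len(ix)),
    }


# Hypothetical event history for a single customer.
history = [
    Event(date(2021, 1, 10), "transaction", 500.0),
    Event(date(2021, 3, 5), "transaction", 250.0),
    Event(date(2021, 4, 20), "interaction"),
    Event(date(2019, 6, 1), "transaction", 900.0),  # outside the look-back window
]
features = enhanced_rfm(history, snapshot=date(2021, 5, 1))
print(features)
```

Events dated after the snapshot are excluded by the window filter, which keeps information from the target window out of the inputs.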

What Neural Network architecture can account for all these considerations?

1. Model sketch

Illustrative view of how input signals for a particular customer can pass separately through the 2 Neural Networks before being merged and transformed into a CLV estimate

To leverage the signals described above (A, B and C) and transform them into a CLV forecast for each customer, the modelling algorithm needs to be a hybrid framework, capable of handling both sequential information (time-series + cross-sectional data from A) and static information (cross-sectional data from B and C) simultaneously.

To achieve this, we propose a merged-DL model, which is a combination of Recurrent Neural Network (RNN) and Multi-Layer Perceptron (MLP) frameworks.

Neural Network 1 (RNN, for sequential signals): RNNs (more specifically LSTMs or GRUs) are typically used for time-series forecasting and real-time text/video analysis. The ability of this class of algorithms to retain useful information from historical data makes it well suited to analyzing sequential and contextual relationships between past events in a customer’s life-cycle and inferring future value.
Neural Network 2 (MLP, for static signals): MLPs are the classical feedforward Neural Networks that help analyze non-linear relationships between cross-sectional static input features to predict future events/values.

The two independent Neural Networks, the RNN (e.g. an LSTM) and the MLP, can be combined using a merge layer, which enables the two models to act as one single network and share information, improving the performance of the overall model. When tuned properly, such an ensemble of sequential and static neural networks can outperform other powerful Machine Learning algorithms like Random Forest and XGBoost on this kind of mixed sequential and static data.
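
The data flow through the merged network can be sketched without any deep learning library. The example below is only an untrained forward pass, and it swaps in a plain (Elman-style) recurrent cell where a real build would use an LSTM or GRU layer from a framework such as Keras or PyTorch. The class name, layer sizes, random weights and input values are all hypothetical.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[Vector]


def rand_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(w * x for w, x in zip(row, v)) for row in m]


class MergedCLVNet:
    """Untrained forward pass: recurrent branch + MLP branch -> merge -> CLV."""

    def __init__(self, seq_dim: int, static_dim: int, hidden: int = 4, seed: int = 0):
        rng = random.Random(seed)
        self.hidden = hidden
        self.w_in = rand_matrix(hidden, seq_dim, rng)      # recurrent branch: input weights
        self.w_rec = rand_matrix(hidden, hidden, rng)      # recurrent branch: state weights
        self.w_mlp = rand_matrix(hidden, static_dim, rng)  # MLP branch: hidden layer
        self.w_out = rand_matrix(1, 2 * hidden, rng)       # linear head on the merged vector

    def rnn_branch(self, sequence: List[Vector]) -> Vector:
        # Carry a hidden state across time steps (sequential signals).
        h = [0.0] * self.hidden
        for x_t in sequence:
            pre = [a + b for a, b in zip(matvec(self.w_in, x_t), matvec(self.w_rec, h))]
            h = [math.tanh(p) for p in pre]
        return h

    def mlp_branch(self, static: Vector) -> Vector:
        # One ReLU hidden layer over the static features.
        return [max(0.0, z) for z in matvec(self.w_mlp, static)]

    def predict(self, sequence: List[Vector], static: Vector) -> float:
        # Merge layer: concatenate both branch outputs, then map to a single value.
        merged = self.rnn_branch(sequence) + self.mlp_branch(static)
        return matvec(self.w_out, merged)[0]


# Hypothetical customer: 3 time steps of 3 journey features, plus 5 static features.
net = MergedCLVNet(seq_dim=3, static_dim=5)
journey = [[0.2, 0.0, 1.0], [0.0, 0.5, 1.0], [0.9, 0.1, 0.0]]
static_features = [1.0, 0.3, 0.0, 0.7, 0.2]
print(net.predict(journey, static_features))
```

In a trained version, the weights of both branches and the head would be learned jointly against the CLV target, which is what lets the two branches share information.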

2. Model interpretation

Shapley values can provide answers to the ‘black-box’ approach of Neural Networks. Photo by Alex Perez on Unsplash

While the Neural Network architecture offers a lot in terms of solution flexibility and model accuracy, its lack of interpretability often poses a challenge for business application. This is where Shapley values can help. SHAP (and ‘Shparkley’, a distributed PySpark implementation open-sourced by Affirm) is an open-source, algorithm-agnostic approach to explaining the output of models. It connects game theory with local explanations and provides a consistent and locally accurate additive feature attribution method. In the present context, for every CLV prediction made for a customer at any point in time, SHAP helps us understand the underlying drivers, relative to the average CLV at portfolio level.
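
The additive attribution idea can be illustrated with a brute-force Shapley computation on a toy model. This is not the SHAP library's API, which relies on approximations that scale to real models. It is an exact enumeration that is only practical for a handful of features. The toy CLV function, the feature order and the use of a single baseline vector as a stand-in for the portfolio average are assumptions for illustration.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(
    predict: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley attributions of predict(x) relative to predict(baseline).

    Features outside a coalition are set to their baseline value. The cost
    grows exponentially with the number of features.
    """
    n = len(x)

    def value(coalition: frozenset) -> float:
        mixed = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical CLV model over [recency_days, frequency, monetary],
# including a frequency x monetary interaction term.
def toy_clv_model(f: Sequence[float]) -> float:
    recency, frequency, monetary = f
    return 50.0 * frequency + 0.2 * monetary - 1.5 * recency + 0.01 * frequency * monetary


customer = [10.0, 6.0, 2000.0]
portfolio_baseline = [45.0, 3.0, 1200.0]  # stand-in for the portfolio-level average
contributions = shapley_values(toy_clv_model, customer, portfolio_baseline)
print(contributions)
```

The contributions add up to the gap between this customer's prediction and the baseline prediction, which is what makes a per-customer breakdown of CLV drivers possible.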

What are the final thoughts and how can the solution be extended to other problem statements?

The scope of the solution is not limited to CLV forecasting alone. Photo by Mikael Kristenson on Unsplash

The generic, domain-agnostic framework can help business leaders across organizations make informed business decisions in the uncertain world of non-contractual B2B (rather than relying on old-school hand-drawn estimation techniques). The scope of the solution is not limited to CLV forecasting alone. It can be extended further to traverse the state-action-reward space (as in reinforcement learning) and re-define business strategies for customer retention and pricing. It empowers the CXO of a commercial bank, or the Head of a retail distribution chain managing a vast pool of independent clients, to focus their business strategy and investment decisions on building, retaining and nurturing relationships with clients that are likely to add value to their businesses in the future.

From a design standpoint, the combination of Artificial Intelligence and feature engineering allows the proposed framework to combine granular customer-level journeys with broader macro/micro-economic information to produce robust forecasts. We are excited about, and eagerly look forward to, such innovations in informed decisioning across the B2B domain in the near future.

Chandramauli Chaudhuri is a Principal Data Scientist and leader in the field of Artificial Intelligence and Machine Learning. His area of interest lies in solving some of the most critical business and customer-specific problems across industries, using state of the art Deep Learning, Statistical Modelling and Time-series Forecasting frameworks, in an Ethical and Responsible manner.

Follow on LinkedIn: https://www.linkedin.com/in/chandramaulic/
