Marketing departments spend millions on campaigns, yet the ROI is difficult to predict.
Digital Marketing now allows experts to collect large amounts of data, which can be analyzed to make actionable decisions and optimize the ROI of Marketing campaigns.
This case study is about building a predictive model to identify which prospects will subscribe to a bank term deposit after a direct Marketing campaign.
Problems to solve
How can the performance of a Marketing campaign be assessed?
How can we predict whether a prospect engaged by a direct Marketing campaign is going to buy a product or subscribe to a service?
How can lead qualification be improved to help salespeople focus on the right targets?
Can machine learning help with these matters, and how accurately can predictive models detect future customers and optimize Marketing ROI?
Like humans, machines can learn to make predictions by analyzing past information (historical data). Machines can quickly identify patterns in a set of data and produce a mathematical formula (a model) using the variables from this historical data.
To demonstrate the performance of the MyDataModels solution on this type of problem, we chose a specific case study.
This case study is based on real data from a public UCI dataset *.
The objective is to predict whether a prospect will subscribe to a bank term deposit after a direct Marketing campaign.
* See the Dataset information section for details on this dataset.
The graph below shows an extract of the public dataset.
Each of the 2883 rows is a prospect, and each of the 28 columns (features) is a variable which can be used in the model.
The 28 features used:
- Output variable (target): has the client subscribed to a term deposit? (binary: 1/yes, 0/no)
- age (numeric)
- duration: last contact duration, in seconds (numeric)
- campaign: number of contacts performed during this campaign (numeric)
- pdays: number of days that passed by after the client was last contacted from a previous campaign (numeric; 999 means client was not previously contacted)
- previous: number of contacts performed before this campaign and for this client (numeric)
- emp.var.rate: employment variation rate - quarterly indicator (numeric)
- cons.price.idx: consumer price index - monthly indicator (numeric)
- cons.conf.idx: consumer confidence index - monthly indicator (numeric)
- euribor3m: euribor 3 month rate - daily indicator (numeric)
- nr.employed: number of employees - quarterly indicator (numeric)
- job: type of job (categorical)
- education (categorical)
- month: last contact month of year ('mar', ..., 'nov', 'dec') (one-hot encoded)
- day_of_week: last contact day of the week ('mon', 'tue', 'wed', 'thu', 'fri') (one-hot encoded)
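As an illustration of what "one-hot encoded" means for the day_of_week field, here is a minimal Python sketch. The output column names (such as day_of_week_wed) and the sample record are assumptions made for this example; the published dataset may name its encoded columns differently.

```python
# Minimal one-hot encoding sketch for the day_of_week field.
# The five day values come from the feature list above; the
# "day_of_week_<day>" column names are assumed for illustration.

DAYS = ["mon", "tue", "wed", "thu", "fri"]


def one_hot(value, categories, prefix):
    """Return a dict with one 0/1 column per category."""
    if value not in categories:
        raise ValueError(f"unknown {prefix} value: {value!r}")
    return {f"{prefix}_{c}": int(c == value) for c in categories}


def encode_row(row):
    """Replace 'day_of_week' in a prospect record with its one-hot columns."""
    encoded = {k: v for k, v in row.items() if k != "day_of_week"}
    encoded.update(one_hot(row["day_of_week"], DAYS, "day_of_week"))
    return encoded


# Hypothetical prospect record, not taken from the dataset.
prospect = {"age": 41, "campaign": 2, "day_of_week": "wed"}
print(encode_row(prospect))
```

Each categorical value becomes a set of 0/1 columns with exactly one column set to 1, which is why a single field such as day_of_week contributes several columns to the feature count.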
To create predictive models, MyDataModels offers a high-performance solution called TADA.
To build a model, experts need to:
1) upload the historical dataset into TADA,
2) set what they want to predict (here “target”),
3) select the other variables to use (all the other columns).
The statistics below are those of a model obtained with TADA in 5 minutes.
ACC = Accuracy
TPR = True Positive Rate
TNR = True Negative Rate
MCC = Matthews Correlation Coefficient
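For readers unfamiliar with these four statistics, the following Python sketch shows how each is commonly computed from the counts of a binary confusion matrix (true/false positives and negatives). The counts used are hypothetical and are not the results of the model in this case study; the function name is also chosen only for this example.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute ACC, TPR, TNR and MCC from confusion-matrix counts."""
    total = tp + fp + tn + fn
    acc = (tp + tn) / total                      # share of correct predictions
    tpr = tp / (tp + fn) if (tp + fn) else 0.0   # share of actual positives found
    tnr = tn / (tn + fp) if (tn + fp) else 0.0   # share of actual negatives found
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is set to 0 by convention when the denominator is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "TPR": tpr, "TNR": tnr, "MCC": mcc}


# Hypothetical counts for illustration only.
print(classification_metrics(tp=30, fp=10, tn=50, fn=10))
```

ACC, TPR and TNR range from 0 to 1, while MCC ranges from -1 to 1 and takes all four counts into account, which makes it informative when the two classes have very different sizes.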
How to use this model?
Once a model has been built from historical data, professionals can easily make predictions using information about their current prospects.
To use a model, domain experts need to create a new dataset with one row for each prospect they want a prediction for and one column for each variable used in the model.
They then import this file into TADA (see screen below) and click on “Generate score”.
A new file is generated with an additional column (“prediction”) giving the prediction for each current prospect: yes or no.
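The scoring file described above can be prepared with any tool that writes CSV. As a minimal sketch, the Python code below builds such a file with one row per prospect and one column per model variable, and rejects prospects with missing variables. It does not call TADA; the column subset, the function name and the sample prospects are assumptions made for illustration.

```python
import csv
import io

# Assumed subset of the model's input variables, taken from the feature list;
# a real scoring file would include every column used to build the model.
MODEL_COLUMNS = ["age", "duration", "campaign", "pdays", "previous"]


def build_scoring_csv(prospects, columns=MODEL_COLUMNS):
    """Return CSV text with one row per prospect and one column per model variable."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for i, prospect in enumerate(prospects):
        missing = [c for c in columns if c not in prospect]
        if missing:
            raise ValueError(f"prospect {i} is missing columns: {missing}")
        # Keep only the model's columns, in a fixed order.
        writer.writerow({c: prospect[c] for c in columns})
    return buffer.getvalue()


# Hypothetical prospects, not taken from the dataset.
current_prospects = [
    {"age": 35, "duration": 210, "campaign": 1, "pdays": 999, "previous": 0},
    {"age": 52, "duration": 95, "campaign": 3, "pdays": 6, "previous": 2},
]
print(build_scoring_csv(current_prospects))
```

Checking the columns before upload catches the most common preparation mistake, a variable that was used to build the model but is absent from the new file.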
Benefits of TADA
Marketing and Communication experts are not data scientists. They may not have the machine learning or coding skills required to build predictive models. Moreover, most data handled by these professionals is Small Data: their historical data often contains only a limited number of campaigns and thousands of customers, rarely millions (as in Big Data). Traditional machine learning tools work well with Big Data but do not perform as well with Small Data.
MyDataModels allows domain experts to automatically build predictive models from Small Data. No training is required. Domain experts can use their raw data: there is no need to normalize it or handle outliers, and no feature engineering is required. Thanks to this limited data preparation, the above results for this specific dataset were obtained in a few clicks and in less than 5 minutes on a standard laptop.
MyDataModels brings a self-service solution to those who have Small Data and no data scientists.
Marketing experts spend millions on customer acquisition campaigns. Targeting and ROI optimization are key to generating qualified leads and turning them into customers. More and more campaigns are now digital, and as a consequence Marketing experts collect large amounts of data which, unfortunately, are not “mined” to discover hidden information for effective decision making.
In this specific use case, the MyDataModels predictive model reached an 85% accuracy rate, which is a satisfying score for most professionals.
Marketing & Communications departments could use more machine learning to assess the quality of campaigns in general and lead conversion in particular. Even better, TADA can help Marketers identify the key aspects of a campaign that are most likely to drive its performance.
By predicting conversion rates for their clients and providing more actionable reporting with meaningful KPIs, Marketing agencies using this solution gain a competitive edge.
With this automated technology, Marketing experts spend less time on data and deliver more qualified leads to Sales.
Dataset information
The dataset comes from the direct marketing campaigns of a Portuguese banking institution. The marketing campaigns were based on phone calls. Often, more than one contact with the same client was required. The classification goal is to predict whether the client will subscribe ('yes') or not ('no') to a bank term deposit.
- Task: Binary Classification
- Number of features: 28
- Size of data: 2883 samples
- Weight: Positive class: 89.1%, Negative class: 10.9%
Data description: This dataset is used to predict the subscription of a prospect.
- Target variable: (target) will the client subscribe to a bank term deposit? (yes or no).
- Score: Accuracy
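Because the score is accuracy and the two classes have very different weights, it can be useful to translate the percentages above into approximate row counts and to know what a trivial rule that always predicts the larger class would score. The Python sketch below does this arithmetic from the figures quoted in this section; the counts are rounded approximations, not values read from the dataset, and the function names are chosen only for this example.

```python
N_SAMPLES = 2883
MAJORITY_SHARE = 0.891  # larger of the two class weights quoted above


def approximate_counts(n, majority_share):
    """Turn the quoted majority percentage into approximate row counts."""
    majority = round(n * majority_share)
    minority = n - majority
    return majority, minority


def majority_baseline_accuracy(majority, minority):
    """Accuracy of a trivial rule that always predicts the larger class."""
    return majority / (majority + minority)


majority, minority = approximate_counts(N_SAMPLES, MAJORITY_SHARE)
print(majority, minority)
print(round(majority_baseline_accuracy(majority, minority), 3))
```

A reference point of this kind is one reason statistics such as TPR, TNR and MCC are usually read alongside accuracy on unbalanced data.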