Equipment failure models are built from data on past machine runs and failures. Machine learning approaches can apply such models to current conditions to predict and anticipate machine failures, so that maintenance can be scheduled pre-emptively.
Problems to solve
How to detect when a machine is going to break?
How to anticipate maintenance to prevent downtime?
How to move from preventive to predictive maintenance?
Can machine learning help with these questions, and how accurately can predictive models forecast failures?
Like humans, machines can learn to make predictions by analyzing past information (historical data). Machines can quickly identify patterns in a set of data and produce a mathematical formula (model) using the variables from this historical data.
To demonstrate how the MyDataModels solution performs on this type of problem, we chose a specific case study.
This dataset comes from a company that uses many machines to build final products. Because production stops every time a machine fails, management would like to build a predictive model that finds which machine is going to fail next.
Exploring the data, we found that the company uses 1000 machines. The average time between failures is 55 weeks; some machines are brand new and others have been running for almost two years. In our dataset, almost 40% of the machines had a failure in the past two years.
* see detailed information on this dataset in the Dataset information section
The graph below shows an extract of the dataset.
Each line is a machine and each column (feature) is a variable.
For each machine, which corresponds to one row in our dataset (1000 rows), we have 9 variables:
- Machine nbr: from 1 to 1000
- “lifetime”, the number of weeks the machine has been in use
- “broken”, which is our target variable (Yes or No)
then we have 3 numeric variables related to
and 2 variables related to
- The team using the machine
- The machine’s provider.
To create predictive models, MyDataModels developed a high-performance solution: TADA.
To build a model, experts need to:
- upload the historical dataset into TADA
- set what they want to predict (here “broken”)
- select the other variables to use (all the other columns)
Below are the statistics of a model obtained with TADA within two minutes.
ACC = Accuracy
TPR = True Positive Rate
TNR = True Negative Rate
MCC = Matthews Correlation Coefficient
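For reference, all four metrics can be derived from the confusion matrix of a binary classifier. The Python sketch below shows one way to compute them. The function name and the example counts are illustrative assumptions, not the results reported by TADA.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Compute ACC, TPR, TNR and MCC from confusion-matrix counts.

    tp: broken machines predicted broken
    fn: broken machines predicted working
    tn: working machines predicted working
    fp: working machines predicted broken
    """
    total = tp + fn + tn + fp
    acc = (tp + tn) / total
    # Guard against empty classes to avoid division by zero.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "TPR": tpr, "TNR": tnr, "MCC": mcc}


# Hypothetical counts, for illustration only (not taken from the case study).
metrics = classification_metrics(tp=90, fn=10, tn=95, fp=5)
for name, value in metrics.items():
    print(f"{name} = {value:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which makes it a useful complement when the two classes are not the same size.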
How to use this model?
Once a model has been built from historical data, professionals can easily make predictions using up-to-date information on the machines currently in operation.
To use a model, domain experts need to create a new dataset with one line for each machine they want a forecast for, and one column for each variable used in the model.
Then, they need to import this file into TADA (see screen below) and click on “Generate score”.
A new file is generated with an additional column giving the predicted operating state of each machine: Yes = broken, No = working.
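As an illustration of the file preparation described above, the Python sketch below writes one row per machine and one column per model variable, then reads back a scored file to list the machines predicted as broken. It assumes a plain CSV layout, which may differ from what TADA actually expects. The column names `numeric_1` to `numeric_3`, the sample values, and both helper functions are hypothetical.

```python
import csv
import io

# Placeholder column names (assumption): the three numeric variables are not
# named in this article, and TADA's exact file layout is not documented here.
FEATURE_COLUMNS = [
    "machine_nbr", "lifetime",
    "numeric_1", "numeric_2", "numeric_3",
    "team", "provider",
]


def build_scoring_csv(machines):
    """Serialize machine records to CSV text: one row per machine, no target column."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FEATURE_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for machine in machines:
        writer.writerow(machine)
    return buffer.getvalue()


def machines_predicted_broken(scored_csv_text, prediction_column="broken"):
    """Return the machine numbers whose predicted state is 'Yes' (broken)."""
    reader = csv.DictReader(io.StringIO(scored_csv_text))
    return [row["machine_nbr"] for row in reader if row[prediction_column] == "Yes"]


# Made-up machines, for illustration only.
machines = [
    {"machine_nbr": 1, "lifetime": 56, "numeric_1": 92.1, "numeric_2": 104.2,
     "numeric_3": 96.5, "team": "TeamA", "provider": "Provider4"},
    {"machine_nbr": 2, "lifetime": 12, "numeric_1": 72.0, "numeric_2": 87.3,
     "numeric_3": 95.6, "team": "TeamC", "provider": "Provider1"},
]
scoring_text = build_scoring_csv(machines)

# Hypothetical scored file: the same rows plus a prediction column.
example_scored = (
    "machine_nbr,lifetime,numeric_1,numeric_2,numeric_3,team,provider,broken\n"
    "1,56,92.1,104.2,96.5,TeamA,Provider4,Yes\n"
    "2,12,72.0,87.3,95.6,TeamC,Provider1,No\n"
)
to_inspect = machines_predicted_broken(example_scored)
```

The resulting list of machine numbers is what a maintenance planner would act on first.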
Benefits of TADA
Manufacturing, maintenance and operations managers could benefit from predictive models, but they are not data scientists and may lack the machine learning skills and coding experience needed to build them. Even if the data these professionals handle could be considered Big Data (sensor data, for example), the events they want to predict are Small Data: failures, in this specific use case.
Here, historical data contains at most a few hundred failures, rarely thousands or millions (as in Big Data). Traditional machine learning tools work well with Big Data but perform poorly when predicting Small Data within Big Data (unbalanced datasets).
MyDataModels allows domain experts to build predictive models from Small Data automatically. No training is required, and they can use their collected data directly without normalizing it or handling outliers. No feature engineering is required. Thanks to this limited data preparation, the results above were obtained from this dataset in a few clicks and in less than 2 minutes on a regular laptop.
MyDataModels brings a self-service solution for those who have Small Data and no data scientists.
Manufacturers are constantly under pressure to stay competitive by optimizing processes, improving efficiency of aging infrastructure, reducing unplanned downtime, sudden failures and thus maintenance costs.
A CXP Group study found that 95% of companies describe their current maintenance processes as not very efficient. Until now, production managers and machine operators have relied on scheduled maintenance and regularly repaired machine parts to prevent downtime. Unfortunately, 50% of these preventive maintenance activities are ineffective.
In this failure detection use case, the predictive models built with MyDataModels reach a 96% accuracy rate.
By using an automated machine learning solution like TADA, companies can now proactively identify problems by running a root cause analysis and push fixes including spare-parts, software, hardware and firmware to eliminate possible points of failure or degraded performance that end-users could experience – ultimately increasing customer satisfaction and competitive advantage.
- Task: Binary Classification
- Number of features: 9
- Size of data: 1,000 samples
- Weight: Positive class (broken) 40%, Negative class: 60%
- Target variable: (broken) Machine actually broken? yes/no.
- Score: Accuracy
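To put an accuracy score in context, it helps to compare it with a trivial baseline that always predicts the majority class. The short Python sketch below works this out from the class weights listed above. The variable names are illustrative, and the counts are derived from the rounded 40% / 60% split rather than read from the dataset itself.

```python
# Figures taken from the dataset information above.
n_samples = 1000
positive_share = 0.40  # share of machines labelled "broken"

n_broken = round(n_samples * positive_share)
n_working = n_samples - n_broken

# A model that always answers "No" (not broken) is right for every working
# machine and wrong for every broken one.
baseline_accuracy = n_working / n_samples
baseline_tpr = 0 / n_broken  # it never detects a failure

print(f"broken: {n_broken}, working: {n_working}")
print(f"majority-class baseline accuracy: {baseline_accuracy:.2f}")
print(f"majority-class baseline TPR: {baseline_tpr:.2f}")
```

With this class balance, always predicting "No" would score 60% accuracy while detecting no failures at all, which is why TPR, TNR and MCC are worth reading alongside accuracy.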