# Data Modeling

In Artificial Intelligence, a model is an abstract representation of a decision process. Its primary goal is to automate that decision process, often for business purposes. The model can also help in understanding the process being modeled. Machine Learning models are mathematical algorithms that are “trained” using data. Ideally, a model should also explain the reasons behind its decisions, making the decision process easier to understand.

## Model categories

### Predictive models

Predictive models can provide meaningful analytics and insights in fields such as healthcare, trading, or customer relationship management, and, thanks to their ability to anticipate, help gain a competitive edge. They also prove valuable in Defense & Space use cases.

### Descriptive models

Descriptive models are abstract representations of a system. They help us better understand how internal and external events or behaviors interact within the system. They can be used to optimize a workflow and improve active customers’ ROI.

### Decision models

Fed with qualitative and quantitative data, decision models help us perceive, organize, and manage business rules to improve operational performance. Planning, pricing, and logistics can all benefit from decision models.

## Predictive modeling

In predictive modeling, the nature of the outcome defines the task to be performed.

If the outcome consists of continuous values, it is a regression task. The model then returns a numerical value.

If the outcome consists of two or more categories, it is a classification task. The model then delivers a class. With two classes, we speak of binary classification; with more, of multi-class classification.
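The distinction can be illustrated with two toy predictors. This is a sketch only: the rules, names, and thresholds below are made up for illustration, not trained models.

```python
# Hypothetical toy examples distinguishing regression from classification.

def predict_price(surface_m2: float) -> float:
    """Regression: return a continuous numerical value (e.g. a price)."""
    # Illustrative linear rule, not a model learned from data.
    return 3000.0 * surface_m2 + 10_000.0

def predict_churn(monthly_logins: int) -> str:
    """Binary classification: return one of two classes."""
    return "churn" if monthly_logins < 2 else "stay"

print(predict_price(50))   # a number → regression
print(predict_churn(1))    # a class label → binary classification
```

A real model would learn the coefficient and the threshold from training data rather than having them hard-coded.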

Several algorithms can perform regression and classification tasks to build predictive models: regression algorithms, Bayesian algorithms, kernel algorithms, decision trees, neural networks, and evolutionary algorithms such as ZGP (the core engine of MyDataModels products).

## Algorithms & engine

### Regression

Regression algorithms are supervised machine learning techniques: the algorithm is trained on known data before being applied to new data to produce a prediction. They are useful for assessing the effect of one or more variables upon another.
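The train-then-predict cycle can be sketched with the simplest regression algorithm, ordinary least squares on one variable. The numbers below are invented sample data.

```python
# A minimal sketch of supervised regression: fit y = a*x + b by least
# squares on a small training set, then apply the fitted model to new x.

def fit_linear(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The slope estimates the effect of x on y; the intercept shifts the line.
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Training phase on known (x, y) pairs (made-up data).
a, b = fit_linear([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8])

# Prediction phase on an unseen input.
def predict(x):
    return a * x + b
```

Real regression libraries add multiple variables, regularization, and diagnostics, but the two phases (fit, then predict) are the same.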

### Decision tree

Also a supervised machine learning technique, decision trees predict a goal or target by asking a series of questions. They can operate through classification (categories) or regression (numbers).
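The "series of questions" idea can be shown with a hand-written tree. In practice the questions and thresholds are learned from data; everything here is invented for illustration.

```python
# A hand-written sketch of how a decision tree classifies by asking a
# series of questions. A real tree learns these splits from training data.

def classify_fruit(weight_g: float, is_red: bool) -> str:
    if weight_g > 120:        # question 1: is the fruit heavy?
        return "grapefruit"
    elif is_red:              # question 2: is it red?
        return "apple"
    else:
        return "lime"
```

Each `if` is one internal node of the tree; each `return` is a leaf carrying the predicted class.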

### Time series

Time series analysis is used to understand the behavior of a given asset over time and thereby build accurate predictions about its future. This is done by indexing a series of data points in time order, whether they are listed or represented graphically.
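A minimal sketch of the idea, with invented data: observations are kept in time order, and a simple moving average (a common baseline forecaster) predicts the next value from the most recent window.

```python
# Time-ordered (timestamp, value) pairs -- made-up sample data.
from statistics import mean

series = [(1, 10.0), (2, 12.0), (3, 11.0), (4, 13.0), (5, 12.5)]

def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` values."""
    recent = [value for _, value in series[-window:]]
    return mean(recent)

forecast = moving_average_forecast(series)
```

Real time-series models (ARIMA, exponential smoothing, ZGP-style expressions) capture trend and seasonality, but they rely on the same time-ordered indexing.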

### Small Data

A new frontier for AI, Small Data represents up to 85% of all the data collected. Small Data algorithms can work on datasets with little history and yet provide meaningful insights through efficient predictive modeling.

### ZGP Engine

ZGP is a unique mathematical expression engine inspired by evolutionary algorithms. It is able to create simple mathematical expressions that are particularly good at predicting or classifying based on small datasets.

## Data modeling features

Data modeling tools have to perform specific tasks to meet challenging business goals.

### Sensitivity analysis

Sensitivity analysis assesses how one or more input variables affect a model’s output. It helps test the robustness of a model and optimize it, by quantifying the uncertainty in the output that a given variable contributes.
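One simple form is one-at-a-time sensitivity: nudge each input and measure how much the output moves. The model function below is a made-up illustration, not a trained model.

```python
# One-at-a-time sensitivity sketch: perturb each input by `delta` and
# record how far the output moves from the baseline.

def model(price, advertising):
    # Hypothetical demand function, for illustration only.
    return 100 - 2.0 * price + 0.5 * advertising

def sensitivity(f, baseline, delta=1.0):
    base_out = f(**baseline)
    effects = {}
    for name in baseline:
        perturbed = dict(baseline, **{name: baseline[name] + delta})
        effects[name] = f(**perturbed) - base_out
    return effects

effects = sensitivity(model, {"price": 10.0, "advertising": 20.0})
# Here a unit change in price moves the output more than advertising does.
```

Ranking the inputs by the magnitude of their effect shows which variables the model is most sensitive to.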

### Data visualization

Data visualization displays raw information through visual representations. It takes reporting to another level and can be a means of spotting weak signals, thus generating a competitive edge.

### Live predict

By instantaneously extrapolating machine learning results from live databases, it is possible to provide a dynamic sensitivity analysis. This creates an opportunity to extract more business value from data modeling and predictive analytics.

### Correlation insight

Statistical and machine learning correlation measures evaluate whether there is a relationship between variables or assets. Combined with an adequate understanding of probabilities, they can help forecast, target, or improve sales.
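The standard statistical measure is Pearson’s correlation coefficient: values near +1 or −1 signal a strong linear relationship, values near 0 signal none. A self-contained sketch on invented data:

```python
# Pearson correlation coefficient, computed from scratch.
from math import sqrt

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson([1, 2, 3, 4], [2, 4, 6, 8])   # perfectly correlated pair
```

Note that correlation captures only linear relationships and does not by itself imply causation.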

## Modeling business processes

Data-driven decisions are crucial for success. Modeling techniques applied to business processes contribute to the growth of any business.

### Data mining

Data mining identifies significant patterns within datasets. It can also make use of supervised algorithms, although this approach differs from data modeling.

### Business decision

Data modeling emphasizes the importance of data in business decisions. Reducing the scope of possible decisions helps make quicker, better decisions.

### Pre-Processing

It’s the first step of data cleaning. It consists of transforming the data so that its structure becomes uniform and usable by algorithms.
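One common pre-processing transformation is min-max scaling, which rescales a numeric column to the uniform [0, 1] range so that features of different magnitudes become comparable for algorithms. A sketch on invented values:

```python
# Min-max scaling: map a column of numbers onto [0, 1].

def min_max_scale(values):
    lo, hi = min(values), max(values)
    # Assumes the column is not constant (hi > lo).
    return [(v - lo) / (hi - lo) for v in values]

scaled = min_max_scale([10, 20, 40])
```

Other typical pre-processing steps include encoding categories as numbers and standardizing to zero mean and unit variance.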

### Data preparation

Preparing a dataset for processing can be hard for non-data scientists. Formats, outliers, and missing values are common setbacks. Sometimes feature engineering is also required. Quaartz takes care of this for you.

### Testing

When generating a predictive model, it is necessary to validate the accuracy of its outcomes. This is the testing phase.

### Validating

A portion of the data is set aside before the model is generated. This held-out dataset is used to compare the model’s predictions with actual values. This is the model’s validation.
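Setting data aside is usually done with a holdout split. A minimal sketch, assuming a simple row list and a 25% validation share:

```python
# Holdout split: shuffle the rows, keep one part for training and
# set the rest aside to compare predictions with actual values later.
import random

def holdout_split(rows, test_ratio=0.25, seed=0):
    rows = rows[:]                      # copy, so the original stays intact
    random.Random(seed).shuffle(rows)   # fixed seed → reproducible split
    cut = int(len(rows) * (1 - test_ratio))
    return rows[:cut], rows[cut:]       # (train, validation)

train, validation = holdout_split(list(range(100)))
```

The model is trained on `train` only; `validation` is never seen during training, so comparing predictions on it gives an honest estimate of performance.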

### Evaluating

Of course, there is no such thing as a perfect model with 100% accuracy. Therefore, there is an evaluation phase that serves to rank the model against its peers. Evaluation provides metrics such as the confusion matrix, F1 score, or Cohen’s kappa score, amongst others, to assess the model’s accuracy.
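For a binary classifier, the confusion-matrix counts and the metrics derived from them can be computed by hand. A sketch on a tiny invented label set:

```python
# Confusion-matrix counts and derived metrics for a binary classifier.

def binary_metrics(y_true, y_pred, positive="yes"):
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / len(y_true)
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "accuracy": accuracy, "f1": f1}

m = binary_metrics(["yes", "yes", "no", "no"],
                   ["yes", "no", "no", "yes"])
```

F1 balances precision and recall, which makes it more informative than raw accuracy when the classes are imbalanced.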

### Deploy

In the data modeling world, deploying a model means applying it to new data in a real business environment, where its predictions can inform business decisions. It is data science applied in real life.