In 5 Steps to a Predictive Model - Acting Instead of Reacting, Part 3
19 Feb 2020 - Industry 4.0, Artificial Intelligence, Production, Technology
Many steel producers are eager to invest in predictive analytics in order to avoid defects in their production and thus save costs. Yet many are unsure which algorithms to use for which task, and whether a particular type of predictive model performs better than others. Here is an approach to creating your own predictive model in 5 steps.
Data are the oil of the 21st century and the raw material that feeds machine learning (ML) algorithms. The PSImetals Factory Model contains most of the raw data needed to train a predictive quality model. The true value, however, lies not in the raw data themselves but in the ability to give them meaning within their individual process context and to connect them holistically with all relevant players in the production orchestra, such as machinery or customer orders – in real time and historically.
Machine learning technologies can exhaustively explore all possible combinations of how production factors affect quality metrics and defect types. Based on historical defects and all their related process and production data, a defect prediction model can be developed to predict future quality defects as early as possible.
To predict a future defect, a model is built using many process parameters from all parts of the hot rolling mill, such as the furnace, scale breaker, roughing mill, coil box, tandem mill stands, cooling and coiler, as well as order data, quality measurements and other data points. All this information is linked together thanks to the material genealogy and the process timings, which allow information coming from different systems to be connected to a specific material and product. For the training data set, the actual defects that have been detected at the finishing lines in the past are also provided. This allows a supervised machine learning approach to be applied, in which a model is explicitly given examples of the events it is meant to predict.
Where to Start
The exact problem statement for the predictive model needs to be defined based on the expected ways of how the model output (“prediction”) can be used by existing automated decision-making systems or human operators. This requires careful consideration of the existing business processes and possible actions that can be taken to check for, and if necessary correct, the predicted defect.
Defining the right use case is a challenging task, as it should account both for business needs (“can we use the model output in a meaningful way?”) and technical feasibility (“can we build a reliable and robust prediction model with the data we have?”). Indeed, it could be that the collected data do not contain the information you want to model. For example, if defects occur very rarely, it might be hard to predict each of them independently. Instead, an approach that predicts specific material properties, which are measured for each and every coil, might be chosen.
The Process of Creating a Predictive Model
As the graphic above shows, the process of creating a predictive model involves 5 steps. Learn more about the entire process and the relevant steps in the following part, starting with step 1.
Step 1: The process starts with historical data acquisition
All digital information about production orders and production steps, process data, quality information, defect details etc. is collected in the first step. The PSImetals Factory Model contains most of the required data: it has access to production orders, produced finished and semi-finished units, production quality data including defective material units, Level 2 process data, etc.
Tips: Data Acquisition
- Depending on the business goal, 1 year of historical data could be a good time period
- Avoid too long time periods since the processes evolve over time
- Pay attention to route diversity in the historical data: The challenge of “defect per route” prediction lies in providing enough route diversity in the historical data for the model to be able to assign different probabilities to different routes. If there are not enough “what if” variations in the routes, because some routes were never selected in the past, then the sample is not fully representative.
- Ensure diverse examples in the dataset: Since machine learning models are “empirical” in their nature, they cannot learn from cases that rarely or never happen. The model needs examples to learn from. The model might grasp the less obvious patterns when similar coils take different routes with different outcomes. But in order to achieve this, such occurrences need to be represented in the data.
- The obvious resolution to the lack of examples in the dataset is the provision of more data, both in length (historical period) and density (more data sources, higher granularity). However, in some cases this might not be enough, since the challenge lies in an existing “bias” in the data which affects its representativeness. In this example, we would need more diverse data not affected by existing patterns of route choice. Such a dataset cannot simply be generated in production: we would need to send similar coils to different routes on purpose. But this does not make sense from a business standpoint, since it may lead to production losses in the process of “training data generation”.
- Give the model the ability to grasp the similarities between different materials (by describing them with a comparable set of parameters) and different routes (by describing the lines in a similar way). In this case the model would not treat each route (sequence of processing lines) as a single entity, but would rather be able to assign probabilities to different routes based on known parameters of the lines and comparable processing characteristics (e.g. pickling speed).
Step 2: Data preparation is required to format and align all of these raw data into a unified model
A number of techniques are applied to correctly merge the data, and to clean or exclude potentially corrupted data sources, such as those with missing values.
The material genealogy, production quality and production process data are linked into a consistent historical production and defect data set (“Quality Data Matrix along the whole Production Chain”) to train the defect prediction model. In order to do so, the data set is divided into two parts: typically, 75% of the data are used for training and 25% for validating the prediction model.
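The 75/25 split can be sketched in a few lines, assuming scikit-learn is available. The data here are purely synthetic stand-ins for the quality data matrix; in practice the rows would come from the linked production history.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the historical quality data matrix:
# each row is a coil, each column a process parameter, y marks a defect.
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 20))           # 1000 coils, 20 process parameters
y = (rng.random(1000) < 0.1).astype(int)  # roughly 10% defective coils

# 75% of the coils train the model, 25% validate it.
# stratify=y keeps the defect rate comparable in both parts.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
```

Stratifying by the defect label matters here because defects are rare: a purely random split could otherwise leave the validation set with too few defective coils to evaluate the model on.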
Tips: Data preparation
- Anomalies should be detected and excluded from the dataset
- Some features with no added value can be removed
- Standardize non-numerical values (e.g. categories) and numerical variables so they can be used by some algorithms
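The preparation tips above can be combined into a single preprocessing pipeline, sketched here with scikit-learn. The column names (`furnace_temp`, `steel_grade`) and the four example records are hypothetical illustrations, not actual PSImetals fields.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Illustrative coil records: one numerical and one categorical feature.
# None marks a missing measurement that would otherwise corrupt training.
df = pd.DataFrame({
    "furnace_temp": [1210.0, 1198.5, None, 1225.0],
    "steel_grade": ["A", "B", "A", "C"],
})

preprocess = ColumnTransformer([
    # Numerical: fill gaps with the median, then standardize.
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), ["furnace_temp"]),
    # Categorical: one-hot encode so any algorithm can consume it.
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["steel_grade"]),
])

X = preprocess.fit_transform(df)  # 4 rows, 1 scaled + 3 one-hot columns
```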
Hundreds of factors (parameters) can be used in model training. It should be confirmed that all factors used as model input are available in real time at the moment the prediction is generated (to mimic the later real-world application).
Step 3: Now the training of one or more prediction models with different ML methods begins
Several algorithms exist side by side; they behave differently, and the resulting prediction models will also predict defects differently. The most commonly used algorithms for prediction are gradient boosting, regression and random forest. Before deciding which method is best, it is important to validate and compare the resulting prediction models.
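Training the three algorithm families side by side might look like the following sketch, again assuming scikit-learn and a synthetic dataset in place of the real quality data matrix.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic imbalanced defect data (about 10% positives) stands in
# for the prepared historical quality data matrix.
X, y = make_classification(n_samples=1000, n_features=20,
                           weights=[0.9], random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# One candidate per algorithm family mentioned in the text.
candidates = {
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(random_state=0),
}

# Fit each candidate on the training part and score it on the
# held-out validation part for a first comparison.
scores = {name: model.fit(X_train, y_train).score(X_val, y_val)
          for name, model in candidates.items()}
```

Plain accuracy is only a first sanity check; the actual comparison is done with ranking-aware metrics, as described next.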
Model quality is estimated using the so-called ROC AUC metric, which is typically used for probabilistic classification and ranking problems. The core feature of this metric, compared to other popular classification metrics (accuracy, precision, recall, F-measure), is its ability to focus on the overall quality of the relative ranking performed by the model instead of evaluating the correctness of individual labels.
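A tiny worked example makes the ranking idea concrete. The labels and probabilities below are hypothetical; `roc_auc_score` from scikit-learn computes the metric.

```python
from sklearn.metrics import roc_auc_score

# Hypothetical validation labels (1 = defect) and the defect
# probabilities the model predicted for eight coils.
y_val  = [0,   0,   1,   0,   1,   0,    0,   1]
p_pred = [0.1, 0.2, 0.8, 0.7, 0.6, 0.05, 0.4, 0.9]

# ROC AUC scores the *ranking*: of all (defective, defect-free) coil
# pairs, what fraction has the defective coil ranked higher?
# Here 14 of 15 pairs are ordered correctly, so AUC = 14/15.
auc = roc_auc_score(y_val, p_pred)
```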
Step 4: The prediction models are further evaluated and compared by using Key Performance Indicators (KPIs)
Very important for the evaluation of classification models are the predictions that were correct (the true positives), but also the false defect predictions (the false positives). The validation data are used to analyze the quality of a prediction model.
Once we have found the most powerful predictive model, we can also use it to analyze the relative importance of the data features - we can easily derive the relative predictive power of many data features.
Tips: 3 Main Model Validation KPIs
- The Precision KPI is the ratio of the number of true positives to the sum of true positives and false positives. It says how accurate the prediction is: if we predict a defect, will it be correct?
- The Recall KPI is calculated as the ratio of the number of true positives to the sum of true positives and false negatives. It indicates the sensitivity of the prediction: which defects have we missed?
- The Receiver Operating Characteristic curve (ROC curve) plots the false positive rate (x-axis) against the true positive rate (y-axis). The larger the area under the curve, the better the prediction model.
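The two ratio KPIs can be checked against scikit-learn directly. The validation outcome below is a hypothetical example constructed so the counts are easy to follow.

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical validation outcome: 1 = defect, 0 = defect-free.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]  # 3 TP, 1 FN, 1 FP, 5 TN

precision = precision_score(y_true, y_pred)  # TP / (TP + FP) = 3/4
recall = recall_score(y_true, y_pred)        # TP / (TP + FN) = 3/4
```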
The predictive machine learning model returns probabilities of having a defect. In order to translate this defect probability into a Boolean decision to perform a certain action or not, we need to define a threshold. Any defect probability higher than the defined threshold will trigger the chosen defect handling approach for the suspicious material.
Step 5: The most powerful model is then used for the online prediction system, which feeds it with actual production data to predict defects
Such a prediction model should be integrated so that it obtains the input data automatically and in real time, and predicts the expected defect for a given coil almost in real time. It should also be integrated with the defect handling logic to trigger an action based on the prediction, in order to shorten the time between the occurrence of a problem and the actual action.
Predictive models usually have a short life span: as processes and data drift, their accuracy degrades over time.
Therefore, it is important to systematically monitor the quality of the prediction model and retrain it from time to time with newer historical data.
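One simple way to operationalize this monitoring is to compare the model's ROC AUC on recent production data against its validation baseline and flag it for retraining when the gap grows too large. The function, tolerance value and example data below are illustrative assumptions, not part of any specific product.

```python
from sklearn.metrics import roc_auc_score

def needs_retraining(y_recent, p_recent, baseline_auc, tolerance=0.05):
    """Flag the model for retraining when its ROC AUC on the most
    recent production data drops below the validation baseline."""
    current_auc = roc_auc_score(y_recent, p_recent)
    return current_auc < baseline_auc - tolerance

# Hypothetical recent coils: the model now ranks the defective
# coils poorly, so its AUC has collapsed well below the baseline.
y_recent = [1, 0, 1, 0, 0, 1]
p_recent = [0.3, 0.6, 0.4, 0.2, 0.7, 0.5]
retrain = needs_retraining(y_recent, p_recent, baseline_auc=0.85)
```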
What questions come to your mind when you think about how to avoid defects in metals production?
Acting Instead of Reacting Series
- Part 1: How to Score with Machine Learning
- Part 2: How to Predict Defects with Machine Learning
- Part 3: In 5 Steps to a Predictive Model
Director Marketing PSI Metals GmbH
After taking over the marketing department of PSI Metals in 2015, Raffael Binder immediately positioned the company within the frame of Industry 4.0. So it is no wonder that in our blog he covers such topics as digitalization, KPIs and Artificial Intelligence (AI). Raffael’s interests range from science (fiction) and history to sports and all facets of communication.
+43 732 670 670-61