For complex production-control applications, the automobile industry relies on Artificial Intelligence in the form of Qualicision technology on over 120 production lines worldwide to calculate production sequences.
Due to the enormous combinatorial complexity of the problems to be solved here, it is indispensable that these solutions handle the process-related combinatorial uncertainty. Here the relation to Deep Learning is strong: like Qualicision, Deep Learning also requires controlled handling of combinatorial complexity, and hence of combinatorial uncertainty.
Such methods have solved, in an impressive way, the very complex combinatorial problem of algorithmically mastering the Asian game of Go. Obviously, the complexity of the game, with a combinatorial dimension of two to the power of one thousand two hundred, rules out an exhaustive search. For comparison, the problem of machine chess, solved earlier, has a complexity of two to the power of four hundred.
How does AlphaGo manage to solve this? Roughly speaking, Go is understood and modelled as a sequence of moves in the sense of a Markov decision process. The enormously increased speed of computing systems in recent years was combined with a very intelligent architecture merging Monte Carlo Tree Search (MCTS) with two deep neural networks, trained over several weeks and supported by massive parallel computing.
In the end, AlphaGo ran with forty so-called search threads on 1,202 CPUs and 176 GPUs in a parallel architecture.
Algorithmically, the breadth of the MCTS search, i.e. the selection of the next possible move, was limited by the so-called policy network. Starting from a given position, the policy network outputs a learnt probability distribution over the best possible moves.
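As a minimal illustration (not AlphaGo's actual network), such a policy output can be sketched as a softmax over hypothetical raw scores for the legal moves:

```python
import math

def policy_distribution(move_scores):
    """Softmax over raw scores: turns the network's move scores into a
    probability distribution (higher score -> more probable move)."""
    m = max(move_scores.values())
    exps = {move: math.exp(s - m) for move, s in move_scores.items()}
    total = sum(exps.values())
    return {move: e / total for move, e in exps.items()}

# Hypothetical raw scores for three legal moves from some position
probs = policy_distribution({"D4": 2.0, "Q16": 1.0, "K10": 0.0})
```

The probabilities sum to one, and the highest-scored move receives the largest share, which is exactly what lets the tree search concentrate on few promising moves.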
To limit the search space in depth, the so-called value network is used: it provides estimates that evaluate a position without having to search the subtree beneath it explicitly. In an initialization phase, AlphaGo's policy network was trained with approximately thirty million human-played positions from the KGS Go server available on the Internet.
The value network learned to evaluate position nodes by playing against itself over and over again. In this manner AlphaGo distinguished more promising nodes from less promising ones, refining that distinction over millions of self-play games.
In this way, a so-called Reinforcement Learning architecture emerged, with MCTS as the search method and two deep neural networks: the move-selection network (policy network) and the depth-limiting network (value network).
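The interplay of the search method with the two networks can be sketched as follows. This is a simplified, single-player skeleton with hypothetical `policy_fn`/`value_fn` callbacks; it omits the sign alternation between players, terminal-position handling and the massive parallelism of the real AlphaGo system:

```python
import math

class Node:
    def __init__(self, prior):
        self.prior = prior        # P(s, a) from the policy network
        self.visits = 0
        self.value_sum = 0.0
        self.children = {}        # move -> Node

    def value(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.0):
    """PUCT selection: prefer high value, high prior and few visits."""
    def score(child):
        explore = c_puct * child.prior * math.sqrt(node.visits) / (1 + child.visits)
        return child.value() + explore
    return max(node.children.items(), key=lambda kv: score(kv[1]))

def mcts(root_state, policy_fn, value_fn, legal_moves_fn, play_fn, n_sims=100):
    root = Node(prior=1.0)
    for _ in range(n_sims):
        node, state, path = root, root_state, [root]
        # Selection: descend the tree, guided by policy priors and values
        while node.children:
            move, node = select_child(node)
            state = play_fn(state, move)
            path.append(node)
        # Expansion: the policy network proposes priors for the legal moves
        for move, p in policy_fn(state, legal_moves_fn(state)).items():
            node.children[move] = Node(prior=p)
        # Evaluation: the value network replaces a deep rollout
        v = value_fn(state)
        # Backup: propagate the estimate along the visited path
        for n in path:
            n.visits += 1
            n.value_sum += v
    # Play the most visited move at the root
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]

# Toy demonstration (not Go): states are integers, moves add 0 or 1,
# and the "value network" simply favours larger numbers
best = mcts(
    root_state=0,
    policy_fn=lambda s, moves: {m: 1 / len(moves) for m in moves},
    value_fn=float,
    legal_moves_fn=lambda s: [0, 1],
    play_fn=lambda s, m: s + m,
    n_sims=50,
)
```

The policy priors limit the search in breadth, and the value estimate cuts off the search in depth, which is the division of labour described above.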
This fascinating process converges and works brilliantly. In the later version AlphaGo Zero, the system played only against itself, with a single interconnected neural network covering both purposes. Thus, the input of human-played games was no longer required.
However, AlphaGo and AlphaGo Zero take advantage of the fact that the rules of the game Go are clearly defined and the set of rules is fixed by definition. Thus, the search space of game positions is enormous but finite (the Japanese Rules of Go consist of fourteen articles and about twenty rules with related interpretive comments; see Excerpt of the Japanese Go Rules). In particular, the number of stone types is manageable: there are only two of them, white and black. The resulting combinatorics is therefore the uncertainty-giving element of the game, since its size makes enumeration de facto intractable.
The main object of the game Go is to use your stones to form territories by surrounding vacant areas of the board, thus occupying larger territories than your opponent. The Go board consists of a grid of 19 horizontal and 19 vertical lines, with 361 intersections. The empty points which are horizontally and vertically adjacent to a stone are known as liberties (green squares).
In the upper diagrams, the white stone has four liberties and the black string has six liberties.
No isolated stone or solidly connected string of stones without liberties (horizontally or vertically adjacent empty points) can persist on the board. All stones without liberties (called prisoners) are removed from the board.
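The liberty counting and capture rule described above can be made concrete with a small flood-fill sketch; the board representation below is an assumption for illustration:

```python
def liberties(board, row, col):
    """Count the liberties of the string containing the stone at
    (row, col). `board` maps (row, col) -> "black" or "white"; empty
    intersections are simply absent. Strings are flood-filled over
    horizontal/vertical neighbours of the same colour."""
    colour = board[(row, col)]
    string, frontier, libs = {(row, col)}, [(row, col)], set()
    while frontier:
        r, c = frontier.pop()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < 19 and 0 <= nc < 19):
                continue                      # off the 19x19 board
            if (nr, nc) not in board:
                libs.add((nr, nc))            # adjacent empty point
            elif board[(nr, nc)] == colour and (nr, nc) not in string:
                string.add((nr, nc))
                frontier.append((nr, nc))
    return len(libs)                          # 0 means the string is captured

# A lone stone in the centre and a two-stone string away from the edge
board = {(9, 9): "white", (3, 3): "black", (3, 4): "black"}
```

On this board the lone white stone has four liberties and the two-stone black string has six, matching the diagrams; a string whose count drops to zero is removed as prisoners.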
The lower diagram shows that White can place a stone on the liberty (green dot), thus capturing the black string (see More Go rules and strategies).
The so-called game tree complexity of Go can be estimated at about two to the power of one thousand two hundred (2^1,200).
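This estimate can be reproduced from commonly cited rough figures for the branching factor and game length, working in log space:

```python
import math

def log2_game_tree(branching, depth):
    """log2 of the game-tree size b^d, computed as d * log2(b) so the
    astronomically large number itself is never materialised."""
    return depth * math.log2(branching)

# Commonly cited rough figures: ~250 legal moves over ~150 moves for Go,
# ~35 legal moves over ~80 moves for chess
go = log2_game_tree(250, 150)     # ~1,195, i.e. on the order of 2^1,200
chess = log2_game_tree(35, 80)    # ~410, i.e. on the order of 2^400
```

Both results are consistent with the exponents quoted in the text.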
Now let us compare the game of Go with the task of sequencing orders on an assembly line. Not infrequently, a car factory produces two thousand vehicles a day from an astronomically large number of order variants, which must be sequenced so as to meet the technical restrictions of the assembly lines. Compared to the game of Go, these restrictions are the rules of the game in this context (see Dimensions in sequencing of assembly orders in car factories). One example of such a rule is that after two orders with a rear-view camera, at least two orders without a rear-view camera have to follow. A further example is that a white car in the sequence has to be followed by at least five more white cars. If a number of the white cars have rear-view cameras, the combination with the mentioned distance rule results in a puzzle problem similar to that of a Go game.
Certainly, the difference is that in car assembly a great many such rules have to be considered, in most cases sixty to seventy, sometimes even over a hundred, with interpretation tolerances. In addition, the number of orders and their composition (number of rear-view cameras, leather seats yes/no, different colours, etc.) dramatically increase the complexity of sequencing.
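A distance rule of the kind described above can be sketched as a simple predicate over a candidate sequence; the function and the order attributes below are illustrative, not Qualicision's actual API:

```python
def first_violation(sequence, has_feature, max_run=2, min_gap=2):
    """Check a distance rule of the kind used in assembly sequencing:
    after `max_run` consecutive orders with a feature (e.g. rear-view
    camera), at least `min_gap` orders without it must follow.
    Returns the index of the first violating order, or None."""
    run, gap_needed = 0, 0
    for i, order in enumerate(sequence):
        if has_feature(order):
            if gap_needed > 0:
                return i          # feature reappears inside the required gap
            run += 1
            if run == max_run:
                run, gap_needed = 0, min_gap
        else:
            run = 0
            if gap_needed > 0:
                gap_needed -= 1
    return None

# Two camera orders, then two without, then one with: rule satisfied
ok = [{"camera": True}] * 2 + [{"camera": False}] * 2 + [{"camera": True}]
# Only one camera-free order after a run of two: order 3 violates the rule
bad = [{"camera": True}] * 2 + [{"camera": False}, {"camera": True}]
```

With sixty to a hundred such predicates active at once, finding a sequence that satisfies them all (or violates them as little as possible) becomes the combinatorial puzzle described in the text.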
Applied to the comparison with the game of Go, the number of stone types here is not fixed at two as in Go but open. It may differ from day to day, because the composition of the orders varies daily, which means that there is a different number of order types on a daily basis.
Instead of probability distributions, the Qualicision sequencing solutions work with the so-called Qualicision goal-conflict analysis as the method for dealing with uncertainty. It estimates the search space both in breadth and in depth for positive and negative relevance while the sequence is being calculated. Theoretically, this may also be considered a Markov process. The underlying estimation methodology is based on fuzzy logic and so-called fuzzy goal functions (impact curves).
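As an illustration of a fuzzy goal function (impact curve), the following sketch assumes a simple piecewise-linear shape; the actual Qualicision functions are, of course, more elaborate, and the goal used here is hypothetical:

```python
def impact_curve(value, ideal_low, ideal_high, tol):
    """Piecewise-linear fuzzy goal function: degree of fulfilment 1.0
    inside the ideal range [ideal_low, ideal_high], falling linearly
    to 0.0 over a tolerance band of width `tol` on either side."""
    if ideal_low <= value <= ideal_high:
        return 1.0
    dist = ideal_low - value if value < ideal_low else value - ideal_high
    return max(0.0, 1.0 - dist / tol)

# Hypothetical goal: 3 to 5 orders with a given feature per window of 10
# is ideal, tolerated up to 2 beyond the ideal range
degree = impact_curve(6, ideal_low=3, ideal_high=5, tol=2)   # 0.5
```

Graded degrees of fulfilment like this, rather than hard true/false constraints, are what allow conflicting goals to be traded off against each other during sequencing.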
This approach works even if the rules of the game and the types of stones vary from game to game. As stated earlier, the types of orders used in sequencing vary from day to day. The law of large numbers therefore cannot be exploited for learning in daily sequencing: what was learnt yesterday may already be invalid today, or at least differ.
Considering the combinatorial complexity of sequencing: for a sequence of two thousand order positions it can be estimated at approximately two to the power of twenty thousand, compared, as stated, with two to the power of one thousand two hundred for the game Go and two to the power of four hundred for chess.
The exponent of the combinatorial complexity of sequencing in an automobile factory is thus more than sixteen times greater than that of the game Go, and in absolute terms the complexity is larger by an astronomical factor of about two to the power of eighteen thousand eight hundred.
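The arithmetic behind these comparisons is easy to verify in log space; the exact number of ways to order two thousand positions gives an exponent of roughly nineteen thousand, consistent with the rough figure of two to the power of twenty thousand used above:

```python
import math

go_exp, chess_exp = 1_200, 400          # exponents cited for Go and chess

# log2(2000!) -- the number of ways to order 2,000 assembly positions --
# computed via lgamma so the astronomical number is never materialised
log2_perms = math.lgamma(2001) / math.log(2)

exponent_ratio = log2_perms / go_exp    # ~16: how much larger the exponent is
factor_exp = log2_perms - go_exp        # sequencing is ~2^17,850 times larger
```

Working with logarithms of the counts, rather than the counts themselves, is the only practical way to compare numbers of this magnitude.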
In comparison, the famous number of atoms in the universe, roughly two to the power of two hundred and seventy (about 10^80), seems very small. Even though the "sequencing moves" actually permitted in sequencing reduce the combinatorics due to the restrictions of the assembly line, the uncertainty to be treated during sequencing is handled not with probabilities and Monte Carlo methods, as in the game Go, but via fuzziness. Nevertheless, the intelligence of both methods ultimately lies in the intelligent management of uncertainty to limit the search space.
Autonomously controlled AGV transports with Qualicision optimization for modular production in Industry 4.0 projects with PSI swarm production (from BMWi Project SMART FACE).
Photos (top to bottom): Sergey Tarasov/Shutterstock.com | Saran Poroong/Shutterstock.com | Suphaksorn Thongwongboot/Shutterstock.com (composing PSI), Klug, F. Research Project SMART FACE