Machine learning demystified
Machine Learning (ML) can seem impenetrable to anyone unfamiliar with it. A lack of understanding of what ML is, and what it means for the manufacturing industry, sometimes gives rise to outlandish ideas about intelligent machines poised to take over mankind. But ML is, at its core, a major advance in the development of Information Technology (IT). The way it works, and its limitations, must be fully understood by those who want to use it to the full benefit of their organisations.
ML, admittedly, requires specific statistical and IT skills that few people yet have, or have so far needed in manufacturing. But its principles are quite simple – and even intuitive to grasp. For me, it was what I once considered to be a rather mundane online language translation service – namely Google Translate – that helped me realise the transformative potential of ML.
To put it simply, language translation software has long been based on programming dictionaries, grammatical rules and their numerous exceptions. This approach involves considerable effort.
From ‘rule-based’ to ‘data-driven’ processes
· The new methodology stemmed from a simpler idea: don’t try to define rules and lexical tables from scratch, let the software ‘discover’ them. Millions of already translated pages are collected from international organisations.
· When a user submits a text for translation, the software slices it into basic elements and then searches for identical or similar ones in the same language within the translated pages.
· The most likely translation is extracted to be suggested to the user.
Relevant statistical patterns found in the data, therefore, replace translation rules. Instead of having to be painstakingly programmed, they are automatically ‘learned’ by the software. It is easy to see the cost-saving value of this approach compared to the traditional one, especially since the quality of the resulting translation is usually on a par with it.
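To make the idea concrete, here is a deliberately tiny sketch in Python. It is an illustration only, not how any real translation service is built: the corpus, the segment pairs and the function names are all hypothetical, and real systems work on millions of pages with far richer statistics.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of already-translated segment pairs (English, French).
PARALLEL_CORPUS = [
    ("good morning", "bonjour"),
    ("good morning", "bonjour"),
    ("good morning", "bon matin"),
    ("thank you", "merci"),
    ("thank you", "merci"),
    ("the meeting", "la réunion"),
]

def learn_translations(pairs):
    """Count how often each source segment was translated each way."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table

def translate(segments, table):
    """Suggest the most frequently observed translation for each segment."""
    result = []
    for segment in segments:
        candidates = table.get(segment)
        if candidates:
            result.append(candidates.most_common(1)[0][0])
        else:
            # Segment never seen in the corpus: leave it marked, untranslated.
            result.append(f"[{segment}]")
    return result

table = learn_translations(PARALLEL_CORPUS)
print(translate(["good morning", "thank you"], table))
```

No dictionary or grammar rule is written by hand here: the only ‘knowledge’ is the count of how human translators rendered each segment in the past.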
In manufacturing, the productivity gains are compounded by a substantial quality improvement. Anyone who has ever specified automation processes knows just how complex it can be to anticipate all the possible situations the software will have to face once it is in production, even when functional domain experts are involved. The software’s functional rules are based on assumptions that themselves rely on a limited number of observations. But reality often proves to be far more complex than expected, meaning that automation ends up being suboptimal or that the software requires expensive corrections.
Machine Learning, on the other hand, learns from all the available data, regardless of the volume. That limits the risk of a pattern or a use case being left out of the picture.
Humans must remain in charge
The machine also avoids human intelligence’s ‘cognitive biases’ that translate into imperfect selections of available data and inappropriate decision making.
A good example is the automated processing of loan requests received by banks. An algorithm looks through past borrowers’ key information along with their repayment records. It then highlights the likely relationship between a borrower profile and the risk of default. Applied to a new loan request, the algorithm will predict, with a level of accuracy considered sufficient, whether the borrower will repay. This means the risk of a bad decision, triggered by prejudice or the bank operative’s mood, is removed.
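The principle can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a real credit-scoring model: the repayment history, the two profile attributes and the 0.5 cut-off are all invented for the example, and a production system would use many more variables and a proper statistical model.

```python
from collections import defaultdict

# Hypothetical repayment history: (income_band, had_late_payments, defaulted)
HISTORY = [
    ("low", True, True),
    ("low", True, True),
    ("low", True, False),
    ("low", False, False),
    ("low", False, True),
    ("high", True, False),
    ("high", True, True),
    ("high", False, False),
    ("high", False, False),
    ("high", False, False),
]

def learn_default_rates(history):
    """Estimate the observed default rate for each borrower profile."""
    counts = defaultdict(lambda: [0, 0])  # profile -> [defaults, total loans]
    for income_band, had_late_payments, defaulted in history:
        entry = counts[(income_band, had_late_payments)]
        entry[0] += int(defaulted)
        entry[1] += 1
    return {profile: d / n for profile, (d, n) in counts.items()}

def predict_default(profile, rates, threshold=0.5):
    """Flag a new application when its profile's past default rate
    reaches the (assumed) threshold. Unseen profiles raise KeyError."""
    return rates[profile] >= threshold

rates = learn_default_rates(HISTORY)
print(predict_default(("low", True), rates))
print(predict_default(("high", False), rates))
```

The prediction depends only on what the historical data shows for similar profiles, which is also why the quality and representativeness of that history matter so much.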
Nonetheless, it is crucial that humans remain the ultimate decision-makers.
First, the software is obviously not perfect: it is governed by settings made by humans. For instance, it may have been optimised to avoid ‘false positives’ (where a loan is granted to a borrower who goes on to default) and so will lean towards rejecting certain loan applications. A user must therefore check that the system’s recommendations are legitimate and, if necessary, overrule them. Feeding these corrections back allows the system to learn new criteria, so that the algorithm accepts applications from similar profiles next time.
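The effect of such a setting can be illustrated with a short Python sketch. The scores, outcomes and thresholds below are hypothetical, chosen only to show the trade-off: a stricter acceptance threshold removes false positives but turns away more borrowers who would have repaid.

```python
# Hypothetical scored applications:
# (predicted probability of repayment, whether the borrower actually repaid)
SCORED = [
    (0.95, True),
    (0.90, True),
    (0.80, False),
    (0.75, True),
    (0.60, True),
    (0.55, False),
    (0.40, True),
    (0.30, False),
]

def confusion_at(threshold, scored):
    """Grant the loan when score >= threshold and count both error types.

    False positive: loan granted to a borrower who defaulted.
    False negative: loan refused to a borrower who would have repaid.
    """
    false_positives = sum(
        1 for score, repaid in scored if score >= threshold and not repaid
    )
    false_negatives = sum(
        1 for score, repaid in scored if score < threshold and repaid
    )
    return false_positives, false_negatives

lenient = confusion_at(0.5, SCORED)   # (2, 1)
strict = confusion_at(0.85, SCORED)   # (0, 3)
print(lenient, strict)
```

Neither setting is ‘right’ in itself. Choosing between them is a business decision made by people, which is one reason a human needs to review what the system recommends.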
Another key reason is that only humans should make sure ethical standards are met, especially when a decision concerns an individual’s rights.
Data über alles
It is essential to choose and set up an algorithmic model that fits the manufacturing process in question and the type of data that feeds it. The performance of the automation will depend on meeting two imperatives: data quality, and training set representativeness. In other words, the automation will be more efficient when ML is carried out on accurate, unbiased observations.
Access to data is crucial for ML success because, ultimately, no level of algorithmic sophistication will ever make up for a poor data set.
With the growing power of computers and digitisation, it has become possible and probably essential to leverage a data-driven approach to design more efficient automated manufacturing processes. Beyond the required scientific skills, the success of these solutions lies in the collection of relevant data and the monitoring of their operations by humans. Machine Learning tends to dismiss arbitrary behaviours. It is up to us to make sure it does not replace these with inappropriate over-generalisations.
By Jean-Cyril Schütterlé, VP Product & Data Science at Sidetrade