The warranty claim process for equipment manufacturers requires dealers to provide certain details about the machine part, e.g., part number and date of manufacture. Key details include the dealer's description of the problem observed, its potential root cause, and the possible solution.
These descriptions map to codes prescribed by the manufacturer, and the dealer must pick the right codes before submitting the warranty claim. The list is usually long, running into many hundreds of codes, and dealers often enter the wrong code or select a default code like ‘Others’ or ‘Miscellaneous’. Once the claim reaches the manufacturer, its subject matter experts (SMEs) must validate the codes as part of the Quality Assurance process and end up spending a lot of time correcting the erroneous entries made by the dealers. This results in a longer claim cycle and higher costs for the manufacturer.
Bringing AI into Quality Assurance
One of Tavant’s customers faced this challenge and was exploring options to solve it using Artificial Intelligence (AI). The idea was to predict the right code in real time from the description entered by the dealer, so that the data is correct at the point of entry and the SMEs no longer have to spend a lot of time validating and fixing codes. The result is better data quality and significant savings for the customer.
The Challenge
At first glance, it looks like the problem can be solved as a classification problem using machine learning. However, several challenges had to be handled:
- Unreliable historical data – as mentioned above, a large proportion of the claims data in the warranty system carries incorrect codes, so it cannot be relied upon as training data. For instance, we found the data to be heavily imbalanced, with a large share of claims filed under the ‘Miscellaneous’ category code
- Descriptions entered by the dealers are free-form text and can vary in the style of language used
- The customer QA team recently formulated a set of new codes making the current codes in the data obsolete
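As an illustration of the first challenge, the share of claims filed under catch-all codes can be measured before deciding whether historical data is usable for training. The sketch below is a minimal example; the sample codes (such as HYD-LEAK) and the helper names are invented for illustration and are not taken from the customer's system.

```python
from collections import Counter

# Catch-all codes named in the article; treating exactly these two as
# "default" codes is an assumption for this sketch.
CATCH_ALL_CODES = {"Others", "Miscellaneous"}


def code_distribution(codes):
    """Return each code's share of all claims, most frequent first."""
    counts = Counter(codes)
    total = sum(counts.values())
    return {code: n / total for code, n in counts.most_common()}


def catch_all_share(codes, catch_all=CATCH_ALL_CODES):
    """Return the fraction of claims filed under a catch-all code."""
    if not codes:
        return 0.0
    return sum(1 for code in codes if code in catch_all) / len(codes)


# Invented toy data, not real claims.
sample_codes = [
    "Miscellaneous", "Miscellaneous", "Others", "HYD-LEAK",
    "Miscellaneous", "ELEC-BATT", "Miscellaneous", "Others",
]

if __name__ == "__main__":
    print(code_distribution(sample_codes))
    print(catch_all_share(sample_codes))
```

A high catch-all share is one signal that the labels reflect dealer convenience more than the actual failure, which is why such data makes poor training material.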
The Solution
Natural Language Processing (NLP) has progressed by leaps and bounds, with the state of the art changing every few months. This has mainly been driven by the rise of Deep Learning and specifically the use of word embeddings, starting with Word2Vec and leading to the current state-of-the-art models like BERT and ELMo. In word embeddings, words with similar meanings have similar representations. Contextual models such as BERT and ELMo also capture the context in which a word is used, and can thus differentiate between Apple, the company, and apple, the fruit.
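To make "similar representation" concrete: embeddings are vectors, and closeness is commonly measured with cosine similarity. The sketch below uses hand-made three-dimensional toy vectors that are purely illustrative and do not come from Word2Vec, BERT, or any other real model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented toy vectors; real embeddings have hundreds of dimensions.
toy_embeddings = {
    "leak": [0.9, 0.1, 0.0],
    "seepage": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    leak = toy_embeddings["leak"]
    for word in ("seepage", "invoice"):
        print(word, round(cosine_similarity(leak, toy_embeddings[word]), 3))
```

With these toy values, "leak" scores much closer to "seepage" than to "invoice", which is the property a similarity-based approach relies on.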
We also approached the problem using Word Embeddings. However, to handle the mentioned challenges and to build a production-grade solution, a lot more was required.
Below are some of the salient features of our solution:
- Data augmentation – the customer's experts provided very little data (about 2 descriptions per code), since they had only just formulated a set of brand-new codes. We used NLP-based data augmentation techniques to generate realistic descriptions
- Semantic similarity – we used word embeddings to find semantically similar descriptions and associated codes
- Continual Learning – the solution presents the dealers with top-3 predictions, and the dealer can select the right one from them. This allows the model to learn and evolve with more data
- Low latency – the models predict the code with sub-second latency, thus ensuring a good user experience
- High scalability – the models are containerized using Docker and orchestrated using Azure Kubernetes Service (AKS), so the solution can scale as the workload increases
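The semantic similarity and top-3 ideas above can be sketched roughly as follows. This is a simplified illustration and not the production implementation: a bag-of-words count vector stands in for a real embedding model, and the codes, descriptions, and function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def predict_top_k(description, labeled_examples, k=3):
    """Rank codes by their best-matching labeled description; return top k."""
    query = embed(description)
    best = {}
    for text, code in labeled_examples:
        score = cosine(query, embed(text))
        if score > best.get(code, -1.0):
            best[code] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


def record_selection(labeled_examples, description, chosen_code):
    """Feed the dealer's confirmed choice back in as a new labeled example."""
    labeled_examples.append((description, chosen_code))


# Hypothetical codes and descriptions, invented for illustration.
EXAMPLES = [
    ("hydraulic hose leaking oil near pump", "HYD-LEAK"),
    ("oil leak at hydraulic cylinder seal", "HYD-LEAK"),
    ("engine will not start battery dead", "ELEC-BATT"),
    ("battery drains overnight", "ELEC-BATT"),
    ("cracked weld on loader arm", "STRUCT-WELD"),
    ("cab door latch broken", "CAB-LATCH"),
]

if __name__ == "__main__":
    for code, score in predict_top_k("oil leaking from hydraulic hose", EXAMPLES):
        print(code, round(score, 2))
```

Appending the dealer's confirmed choice through record_selection is one simple way to let the example set grow over time. A real system would swap embed for a trained embedding model and would likely use an approximate nearest-neighbour index to keep latency low.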
The whole exercise would have been meaningless without reasonably high accuracy. We beat expectations and achieved accuracy levels of almost 90%. This is expected to improve further as more data comes into the system.
The Benefits
The ROI of the solution is significant with respect to the improvements in business metrics, as mentioned below:
- Reduction in time for the dealers to select the QA codes
- Better data quality as dealers can no longer assign a “default” code for the descriptions entered
- Reduction in time spent by the manufacturer SMEs in correcting the codes entered by dealers – from weeks to minutes
- Reduced claim processing time