Natural Language Understanding of Ingredient Labels

Problem

Our client SGS helps processed food manufacturers ensure their products are safe for human consumption. They wanted machines to understand the language written on the “Ingredient label” of millions of processed food products. These labels were written in French. Once machines understood the labels, SGS could give AI-driven recommendations to food companies, so that they produce safer products and comply with the regulatory requirements of the local market.

Solution

We trained our Natural Language Pipeline on 100,000 ingredients. The training data was sourced from public datasets such as Wikipedia, the WHO, and the like. We built it as linked data to identify the purpose behind each ingredient. These purposes follow a taxonomy of 21 categories, such as Acidity Regulation, Anti Caking, Anti Foaming, Sweetening, and Food Coloring. Our system also identifies the E number and INS number of each ingredient.
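To make the idea concrete, here is a minimal sketch of a symbolic lookup over a French ingredient label. It is not our production pipeline: the three-entry knowledge base, the `Ingredient` record, and the helper names `normalize`, `lookup`, `parse_label`, and `coverage` are all assumptions made for illustration. The category names come from the taxonomy above, and the E/INS numbers shown are the standard ones for these three additives.

```python
import re
import unicodedata
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Ingredient:
    name_fr: str      # French name as it may appear on a label
    purpose: str      # one of the taxonomy categories
    e_number: str     # European additive code
    ins_number: str   # International Numbering System code


# Hypothetical, hand-written knowledge base (illustration only).
KNOWLEDGE_BASE = [
    Ingredient("acide citrique", "Acidity Regulation", "E330", "330"),
    Ingredient("dioxyde de silicium", "Anti Caking", "E551", "551"),
    Ingredient("aspartame", "Sweetening", "E951", "951"),
]


def normalize(text: str) -> str:
    """Lowercase, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"\s+", " ", no_accents).strip()


# Indexes for exact-match lookup by name or by E number.
BY_NAME = {normalize(i.name_fr): i for i in KNOWLEDGE_BASE}
BY_E_NUMBER = {i.e_number: i for i in KNOWLEDGE_BASE}


def lookup(token: str) -> Optional[Ingredient]:
    """Resolve a label token by French name first, then by E number."""
    key = normalize(token)
    if key in BY_NAME:
        return BY_NAME[key]
    # Accept "E330" as well as "E 330".
    return BY_E_NUMBER.get(key.replace(" ", "").upper())


def parse_label(label: str) -> List[Tuple[str, Optional[Ingredient]]]:
    """Split a label into tokens and try to resolve each one."""
    body = label.split(":", 1)[-1]  # drop a leading "Ingrédients :" header
    tokens = [t.strip() for t in re.split(r"[,;()]", body) if t.strip()]
    return [(t, lookup(t)) for t in tokens]


def coverage(parsed: List[Tuple[str, Optional[Ingredient]]]) -> float:
    """Fraction of tokens that were resolved to a known ingredient."""
    if not parsed:
        return 0.0
    return sum(1 for _, hit in parsed if hit is not None) / len(parsed)


if __name__ == "__main__":
    label = "Ingrédients : sucre, acide citrique, E551, édulcorant (aspartame)"
    for token, hit in parse_label(label):
        print(token, "->", (hit.purpose, hit.e_number, hit.ins_number) if hit else None)
```

With this toy knowledge base, the sample label resolves three of its five tokens; "sucre" and "édulcorant" are not in the table and come back unresolved. Because every match traces back to an explicit entry, a missed ingredient can be debugged by inspecting the table rather than a trained model.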

Result

In several tests, our system understood 75% of the ingredients mentioned on product labels. This was far better than the NLP offered by the Big 5 tech vendors, which understood 15% of the ingredients on product labels. While we continue to improve the accuracy of our system, SGS is experimenting with our NLP AI to make AI recommendations on millions of food products for their clients.

Approach

At Smarter.Codes we are making “Symbolic AI” great again. In the AI space, Symbolic AI is a contrasting approach to conventional Machine Learning. Taking the symbolic approach enabled our AI Engineers to train the AI pipeline, and to debug that training, faster and with greater transparency.