Klevu - NLP in Site Search: Catalog Processing

We asked Nilay Oza, Co-Founder & CEO of partner Klevu, to take us through the application of NLP in their site search solution for Magento Commerce.

Natural Language Processing (NLP) is a subfield of Artificial Intelligence (AI) and focuses on the interactions between humans and computers. NLP looks at how to program computers to process and interpret large amounts of data around human language.

If you’re an ecommerce manager, a solutions specialist, a product manager or an AI retail tech enthusiast, you’ll love this mini-series of articles! They aim to demonstrate how Klevu applies NLP to provide a relevant site search experience for online stores. NLP powers Klevu at multiple levels; in this first article of the series, we focus on Catalog Processing.

Level 1: Catalog Processing

As a first step, Klevu receives the catalog from the store. The catalog contains information about products and their respective attributes. Klevu’s algorithm reads the catalog and adds relevant data to make the catalog’s search wider and deeper. This process happens automatically. The picture below shows the overall Klevu NLP and ML (Machine Learning) process.

Catalog processing in Klevu enriches the catalog by adding linguistically and semantically associated data, which expands search coverage by at least 2x.

For example, if a product title states ‘red curtain’, Klevu will automatically understand ‘red’ as a colour and ‘curtain’ as a noun. It will then add relevant metadata, such as ‘drape’ as another noun association. This means that if a user searches for ‘drape’, Klevu will be able to show ‘curtains’ in the results, as it has already mapped ‘curtain’ and ‘drape’ behind the scenes. The figure below demonstrates the example with a real-world search.
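As a rough illustration of the enrichment described above, the sketch below adds associated terms to product titles at indexing time, so that a search for ‘drape’ finds a product titled ‘red curtain’. It is not Klevu’s implementation: the synonym table, the function names and the sample data are all hypothetical, and a real system would learn or curate its associations rather than hard-code them.

```python
# Hypothetical association table (an assumption for this sketch).
SYNONYMS = {
    "curtain": {"drape"},
    "drape": {"curtain"},
}


def enrich(title):
    """Return the title's own terms plus any associated terms."""
    terms = set(title.lower().split())
    for term in list(terms):
        terms |= SYNONYMS.get(term, set())
    return terms


def build_index(catalog):
    """Map every (original or added) term to the product ids it describes."""
    index = {}
    for product_id, title in catalog.items():
        for term in enrich(title):
            index.setdefault(term, set()).add(product_id)
    return index


def search(index, query):
    """Return the ids of products that match every term in the query."""
    results = None
    for term in query.lower().split():
        ids = index.get(term, set())
        results = ids if results is None else results & ids
    return results or set()
```

With a catalog such as {"p1": "Red Curtain", "p2": "Blue Rug"}, searching the enriched index for "drape" returns p1, even though the word never appears in the product title.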

The process of adding relevant words is executed on the entire catalog, making it fundamental to extending search coverage. Klevu’s NLP works at multiple levels, essentially making the catalog ready for search queries that may not have a direct match with the content in the catalog.  

Klevu’s NLP injects both linguistic and semantic information (variations in the language and logic of individual shoppers) into its extended catalog, so that it matches the patterns of customer searches through historical learning and language understanding.

Examples of catalog processing

Individual Alternative Logic

NLP in Klevu accounts for individual shoppers entering different search queries who essentially want to find the same product. Why? Because user search queries are often written quickly and submitted without review.

NLP handles this by applying different forms of logic and language that relate to the same meaning. By adding alternative forms of terms to the catalog, it returns relevant search results no matter how the shopper phrases the search.

For example, shoppers can search for a very specific model or product number in a variety of ways, such as “HP45”, “HP 45” or “HP-45”. In this case, Klevu will show the HP45 product irrespective of how shoppers have searched for it, even if the query does not match how the number is written in the catalog. Looking at another real-world example, Bathroom Takeaway has a product called Radiator Valve, with the SKU code written as 100-1001. In this case, Klevu would also store a variation of the SKU without the dash (-), which allows wider search tolerance. The picture below shows that when a customer searches for 1001001 (without a dash), Klevu returns the relevant SKU.
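One simple way to get this kind of tolerance is to store a separator-free form of each code next to the original and to normalise the query the same way. The sketch below assumes exactly that; the function names are hypothetical and it is not a description of how Klevu stores SKUs internally.

```python
import re

# Spaces and dashes are treated as optional separators (an assumption).
_SEPARATORS = re.compile(r"[\s\-]+")


def normalise_code(text):
    """Strip separators and ignore case: 'hp-45' -> 'HP45'."""
    return _SEPARATORS.sub("", text).upper()


def index_skus(skus):
    """Store each SKU under its original form and its normalised form."""
    index = {}
    for sku in skus:
        index[sku] = sku
        index[normalise_code(sku)] = sku
    return index


def lookup(index, query):
    """Try the query as typed first, then its normalised form."""
    return index.get(query) or index.get(normalise_code(query))
```

With the SKUs "100-1001" and "HP45" indexed, the queries "1001001", "HP 45" and "hp-45" all resolve to the SKU as written in the catalog.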

Decompounding relevant independent words

Klevu uses NLP to decompound specific words in many European languages, breaking them up into relevant independent words.

For example, NLP decomposes the Swedish word "rödvinglas" into three individual words, "röd", "vin" and "glas", allowing queries such as "vinglas", "glas" or "rödvin" to succeed when the shopper enters the search query.
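A dictionary-based splitter is one common way to decompound words, and the sketch below uses it to reproduce the Swedish example. The lexicon, the function names and the choice of algorithm are assumptions made for illustration; the article does not say which technique Klevu uses.

```python
def decompound(word, lexicon):
    """Split word into lexicon entries, or return None if no split exists.

    Shorter prefixes are tried first, backtracking when the remainder
    cannot be split.
    """
    if not word:
        return []
    for end in range(1, len(word) + 1):
        head = word[:end]
        if head in lexicon:
            rest = decompound(word[end:], lexicon)
            if rest is not None:
                return [head] + rest
    return None


def searchable_terms(parts):
    """Return every run of adjacent parts, joined back into one term."""
    terms = set()
    for start in range(len(parts)):
        for stop in range(start + 1, len(parts) + 1):
            terms.add("".join(parts[start:stop]))
    return terms
```

Given a lexicon containing "röd", "vin" and "glas", decompound("rödvinglas", lexicon) yields the three parts, and searchable_terms then produces "röd", "vin", "glas", "rödvin", "vinglas" and "rödvinglas" as terms to index.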

In the English language it may seem that there’s less need to decompound words within a catalog, but it does in fact help with search. For example, Skinny Dip London sells pop sockets for phones and stores ‘popsockets’ as one word in its catalog. Decompounding allows a more liberal search for ‘pop sockets’ to succeed, even though the term does not technically exist in the catalog as two words, as shown in the example below.

Synonyms, Nouns and Adjectives

NLP in Klevu automatically injects synonyms, nouns and adjectives to match each product category or product name. For example, a ‘wardrobe’ can also be called a ‘cupboard’, and a ‘dress’ can also be called a ‘frock’ (both noun synonyms). Klevu also automatically adds semantic categories, such as ‘outdoor area’ for the term ‘garden’, or ‘furniture’ for the mention of ‘chair’.

Summary

The application of NLP makes Klevu ultra language-focused, removing the limitations of the catalog and allowing shoppers to search in a natural way, as humans like to do :).

So much so that Klevu’s advanced character normalisations account for languages such as Arabic and Japanese. Klevu can even handle the different writing systems used across the world, using special parsers to resolve and recognise word boundaries.
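To give a flavour of what character normalisation can involve, the sketch below uses Python’s standard unicodedata module to fold full-width and half-width Japanese forms into their standard equivalents and to drop optional Arabic vowel marks, so that visually equivalent queries compare as equal. The specific rules chosen here are assumptions for illustration, not Klevu’s actual normalisation, and word-boundary parsing is not covered.

```python
import re
import unicodedata

# Arabic short-vowel marks (U+064B to U+0652) and the tatweel
# elongation character (U+0640); which marks to drop is an assumption.
_ARABIC_MARKS = re.compile("[\u064B-\u0652\u0640]")


def normalise_text(text):
    """Fold compatibility forms, drop Arabic vowel marks, ignore case."""
    # NFKC maps full-width Latin and half-width katakana to standard forms.
    text = unicodedata.normalize("NFKC", text)
    text = _ARABIC_MARKS.sub("", text)
    return text.casefold()
```

Applying the same function to both catalog terms and queries means that a full-width "ＨＰ４５" typed on a Japanese keyboard compares equal to "hp45", and an Arabic word typed with vowel marks compares equal to the unmarked spelling.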

This allows Klevu to enhance and enrich search for shoppers and, as a result, increase ecommerce sales. By identifying the key subject of a query, Klevu’s smart search addresses the shopper’s use case and shows relevant products instead of accessories related to the product. NLP powers Klevu, and Klevu empowers ecommerce brands.

Without these enhancements, it doesn't matter how good a keyword-based solution is: it will fail to return any results for wider-context queries.

Watch out for the next article in the Klevu series: NLP in Site Search: Query Processing.