A quick introduction to Wit
Wit.ai is an amazing platform for developing a wide range of products based on NLP. In fact, it joined Facebook about 5 years ago. One trait of that acquisition I found in Facebook products is the seamless integration with Messenger. 📧
You can read about it and even consult their Medium blog there.
Wit.ai makes it easy for developers to build applications and devices that you can talk or text to. Their vision is to empower developers with an open and extensible natural language platform.
And it’s because it is extremely easy and free to create AI-based chatbots with this platform that we have chosen it for our problem. 💯
First, before worrying about how to consume their service in our application, we need to train an agent on a specific topic with the help of their smart platform. 🎛️
Training an agent on the Restaurant topic
Okay, so now is the part where we get to train our Chatbot with NLP. To stay consistent with our application, we will train our agent on the restaurant & food topic. 🥗
If you want more information about how Wit works, or more advanced tutorials, I suggest you read their docs there. 📔
The first thing to do is to log in and create a Wit app. Let’s do that.
You will then be presented with an interface like this:
Before going on, let me explain the core principle of Wit. 🗣️
The Chatbot you are creating will consist of several entities, each with its own set of values.
For our case, we might define an entity called intent (created by default) and train the bot on the several values it might take. This lets us extract what the user wants from their queries. Let’s see this below.
Here we demonstrated how to create the intent entity and add a new value, “greeting”, to it. This action is the essential part of training your bot. 🤖
Now let’s create a more complex intent, specific to our Restaurant app.
Here we showcase two main facts:
- You can create as many entities as you want in a sentence, as long as they make sense for training the bot.
- There are ready-made entities provided by Wit; wit/location is one example. These entities are robust, so I suggest using them when appropriate.
Now that you understand the base principle, you can train your bot on any particular subject. Naturally, the more you train it on a specific entity, the more accurate it will become at analyzing natural human language. 👱♀️
Here is the same entity, Number, but trained with more sentences. Notice how Wit identifies all the entities on its own. 💪
This is the power of their system: the agent identified the entities in a sentence it hadn’t seen before. And notice the probability it displays for each intent on the right! 👀
As an indicator, here are the entities we used for Loa.
Let’s see the details for the intent entity.
We can see detailed analytics on the precision and recall for that entity, as well as all the sentences we have input to train it, filtered by value.
Now you have all the tools in hand to train your own Chatbot! 🎉
If you still have questions, you can check out the recipes part of the wit.ai docs, which covers the most frequent ones.
Now that we know that Wit can extract entities from user input, let’s take a closer look at how to interact with the Wit platform. 🔦
Interactions with WIT’s API
There are several ways to interact with Wit, such as developing a bot with Facebook Messenger and registering with the Wit service through webhooks. But since we have chosen to implement our own dialog system from scratch, we will interact with it via their API. 💻
After all, we can interact with Wit as we would with any other REST API, but we have to grab an app token first, as shown below:
This way, we can test our chatbot with cURL to see what Wit outputs:
The python command is only used for formatting. The important part is that Wit’s API returns a JSON object whose keys are the entities it detected, along with their probabilities.
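The same call can also be made from Python. Here is a minimal sketch: the helper name `build_wit_request` and the token value are hypothetical, and the `v` parameter follows Wit's API versioning convention.

```python
# Sketch of assembling a GET /message call to Wit's REST API.
# `build_wit_request` is our own hypothetical helper, not part of any library.
def build_wit_request(token, query, api_version="20200513"):
    """Return the URL, headers and query parameters for a Wit message call."""
    url = "https://api.wit.ai/message"
    headers = {"Authorization": f"Bearer {token}"}  # app token goes here
    params = {"v": api_version, "q": query}
    return url, headers, params

url, headers, params = build_wit_request("MY_APP_TOKEN", "I want Italian food")
# A live call (requires the `requests` package and a real token) would be:
#   import requests
#   response = requests.get(url, headers=headers, params=params).json()
```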
So now we can interact with the result as with any other API. For example, we can develop a function that keeps entities only if they exceed a certain probability reported by the bot:
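A minimal sketch of such a filter, assuming a response shape where entity names map to candidate values carrying a `confidence` score; the function name and threshold are ours:

```python
def filter_entities(entities, threshold=0.8):
    """Keep only the entity values whose confidence exceeds the threshold.

    `entities` is a dict mapping entity names to lists of candidate
    values, each with a `confidence` score, as in Wit's responses.
    """
    return {
        name: [v for v in values if v.get("confidence", 0) > threshold]
        for name, values in entities.items()
    }

# Hypothetical response fragment for illustration
entities = {
    "intent": [{"value": "greeting", "confidence": 0.97}],
    "wit/location": [{"value": "Paris", "confidence": 0.42}],
}
filter_entities(entities)
# -> {'intent': [{'value': 'greeting', 'confidence': 0.97}], 'wit/location': []}
```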
Here, the purpose was just to understand Chatbots and demonstrate how to interact with Wit; it’s in Part 2 that we will construct a Node API, centralizing our behaviors, from scratch. 🧱
Now, let’s look at how to build the foundations for the Recommender System.
In this part, we will focus on building the other big component of our framework, the Recommendation System. Recall that this subject is extremely broad and in constant evolution, so we will present one solution that works well for our problem, but it is certainly not the only one. 🙅
Which method will we use?
As with any software problem, we tried to find the solution that best fits our needs for Loa. Here, since the user will be interacting with a Chatbot, content-based filtering is not the best option, because it’s not really practical to construct a user profile from a single conversation. 📢
So what we wanted was to recommend restaurants to the user quite quickly and, if possible, with as little information as possible. After some research, the method we ended up using is called Latent Factor Collaborative Filtering.
Yeah I know it doesn’t sound pretty, but it’s not that hard to understand. 🤔
The big picture: say we have a matrix, or table, of users and their ratings, of size n*m. We will factorize it into two smaller matrices of size n*k and k*m, reducing the m items (and n users) to k factors.
These k factors are called latent because they do not exist before we factorize the matrix. So instead of comparing restaurant reviews to restaurant reviews, we can compare linear combinations of these items, which might represent things like restaurants with a special type of food or restaurants with given characteristics. 📊
And this configuration is perfect because it fits the constraints we wanted! The recommendations will be driven by factors that are not predefined, but learned by the matrix factorization algorithm. 📚
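To make the idea concrete, here is a small sketch using NumPy's SVD, one standard way to obtain a rank-k factorization; the toy ratings matrix is made up:

```python
import numpy as np

# Toy ratings matrix: 4 users x 5 restaurants (two clear taste groups)
R = np.array([
    [5., 4., 1., 1., 2.],
    [4., 5., 1., 2., 1.],
    [1., 1., 5., 4., 5.],
    [2., 1., 4., 5., 4.],
])

k = 2  # number of latent factors
# Truncated SVD yields the best rank-k approximation of R
U_full, s, Vt = np.linalg.svd(R, full_matrices=False)
U = U_full[:, :k] * s[:k]   # user factors, shape (n, k)
V = Vt[:k, :]               # item factors, shape (k, m)

R_approx = U @ V            # rank-k reconstruction of the ratings
```

The two factor matrices are much smaller than R, yet their product recovers most of the rating structure, which is exactly the property the recommender exploits.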
Enough Theory, let’s put it into practice by working with the dataset. 🧑🔬
The Yelp Dataset and how it can solve our problem
In this part I will give snippets of code covering the whole pipeline, from the raw datasets to our two latent matrices, U and I. Note that you will need Python, pandas and scikit-learn. All imports will be stated when needed.
We have chosen the Yelp Dataset because of the variety of features it offered. Here, we will work with both the review and business dataset.
Here we took a subset of the whole review dataset. The reviews are organized by state (US reviews only), so the first step was to extract reviews and businesses into two separate DataFrames.
Next, we need to clean the textual data that we have, meaning we have to remove stopwords and punctuation, and return the processed text as a list. 🧹
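A minimal sketch of such a cleaning step; the tiny stopword list here is a stand-in for a real one such as NLTK's:

```python
import string

# Hypothetical minimal stopword list -- in practice you would use
# nltk.corpus.stopwords or scikit-learn's built-in English stopwords.
STOPWORDS = {"the", "a", "an", "and", "or", "is", "was", "to", "of", "in", "it"}

def clean_text(text):
    """Lowercase, strip punctuation and remove stopwords from a review."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in text.split() if tok not in STOPWORDS]

clean_text("The pasta was amazing, and the service is great!")
# -> ['pasta', 'amazing', 'service', 'great']
```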
Now comes one of the most important parts of the algorithm. We will split our yelp_df into user reviews and business reviews, combine each group’s reviews into one single paragraph, and apply a TF-IDF transform to them. 🤯
- The first step separates the data into two important matrices: one with all the reviews of one user, and one with all the reviews for one restaurant.
- The second merges all the text into one huge block, which will be easier to process and extract features from.
- The last step is common in all ML text processing: the TF-IDF transformation will learn important features across each document, and will let us build a system based on numeric values afterwards.
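The three steps above can be sketched as follows, on a hypothetical four-row `yelp_df`:

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical subset of yelp_df, one row per review
yelp_df = pd.DataFrame({
    "user_id":     ["u1", "u1", "u2", "u2"],
    "business_id": ["b1", "b2", "b1", "b2"],
    "text": ["great pasta", "good sushi", "pasta was cold", "best sushi in town"],
})

# Steps 1 & 2: group reviews per user / per business, merged into one paragraph
user_docs = yelp_df.groupby("user_id")["text"].apply(" ".join)
biz_docs = yelp_df.groupby("business_id")["text"].apply(" ".join)

# Step 3: fit one shared TF-IDF vocabulary, then transform both document sets
tfidf = TfidfVectorizer()
tfidf.fit(pd.concat([user_docs, biz_docs]))
user_profiles = tfidf.transform(user_docs)  # shape: (n_users, n_features)
biz_profiles = tfidf.transform(biz_docs)    # shape: (n_businesses, n_features)
```

Fitting a single vocabulary keeps user profiles and business profiles in the same feature space, which matters later when we compare them.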
The last and crucial step for our model is to create the base review data matrix, and apply a simple gradient descent algorithm to “learn” both the U and I matrices (as shown in the picture above with latent factors). ↪️
- We build our base review matrix with user ids as rows, business ids as columns, and stars as values, because that is what our constraints call for: factorizing the base review matrix, going from text features to ratings.
- We iterate over all the epochs to update our U and I matrices, or stop if we meet a particular error threshold.
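Here is a small self-contained sketch of that gradient descent loop on a toy ratings matrix; the learning rate, regularization and error threshold are illustrative values, not the ones we used:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy ratings matrix (0 means "not rated")
R = np.array([
    [5., 3., 0., 1.],
    [4., 0., 0., 1.],
    [1., 1., 0., 5.],
    [0., 1., 5., 4.],
])
mask = R > 0                             # train only on observed ratings
n, m, k = R.shape[0], R.shape[1], 2

U = rng.normal(scale=0.1, size=(n, k))   # user latent factors
I = rng.normal(scale=0.1, size=(k, m))   # item latent factors

lr, reg, epochs, tol = 0.01, 0.01, 5000, 1e-3
for epoch in range(epochs):
    E = mask * (R - U @ I)               # error on observed entries only
    U += lr * (E @ I.T - reg * U)        # gradient step for user factors
    I += lr * (U.T @ E - reg * I)        # gradient step for item factors
    if np.sqrt((E ** 2).sum() / mask.sum()) < tol:
        break                            # stop once the error threshold is met
```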
The last two things needed are saving the model as a pickle file, and then the most exciting part: implementing a way to predict restaurants from a user query. 🔮
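Saving and reloading with pickle is a one-liner each way; the factor matrices below are placeholders:

```python
import pickle
import numpy as np

U = np.ones((4, 2))   # placeholder user factors
I = np.ones((2, 4))   # placeholder item factors

# Persist the learned factor matrices to disk
with open("model.pkl", "wb") as f:
    pickle.dump({"U": U, "I": I}, f)

# Reload them later, e.g. at API startup
with open("model.pkl", "rb") as f:
    model = pickle.load(f)
```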
To recommend N restaurants from a user query, we pass the query through the same process as the user profiles, take its dot product with the item profiles, and keep the top N ratings.
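A sketch of that recommendation function on a toy corpus; the business names and reviews are made up:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical businesses and their merged review text
biz_names = ["Luigi's Trattoria", "Sushi Zen", "Taco Loco"]
biz_docs = [
    "amazing italian food great pasta and pizza",
    "fresh sushi and sashimi best japanese food",
    "tacos burritos great mexican food",
]

tfidf = TfidfVectorizer()
item_profiles = tfidf.fit_transform(biz_docs)   # (n_businesses, n_features)

def recommend(query, n=2):
    """Project the query into the same TF-IDF space and rank businesses
    by dot-product similarity with the item profiles."""
    q = tfidf.transform([query])                # (1, n_features)
    scores = (item_profiles @ q.T).toarray().ravel()
    top = np.argsort(scores)[::-1][:n]
    return [biz_names[i] for i in top]

recommend("I want to eat Italian food!")
```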
Now if we pass the string “I want to eat Italian food!” to our function, here are 3 sample businesses associated with the recommendations:
Awesome! Now we can bootstrap a simple API so that our model will be available from anywhere 🚀
Serving the Engine as a REST API
- We packaged the recommendation system into its own module, and we will use it in one of our routes to recommend restaurants 🍣
- We only have two routes: ‘/recommender/’, to test that our API is running, and the main route, ‘/recommender/v1/recommend/’, for the recommendation system 🛣️
- We use utilities from flask_restful here, but you are not required to do so; it is just a habit we have 📐
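A minimal sketch of such an API using plain Flask (flask_restful works just as well); the `recommend` function here is a placeholder standing in for the packaged recommendation engine:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def recommend(query, n=3):
    # Placeholder for the real recommendation engine module
    return ["Luigi's Trattoria", "Sushi Zen", "Taco Loco"][:n]

@app.route("/recommender/")
def health():
    """Sanity route: check that the API is running."""
    return jsonify({"status": "running"})

@app.route("/recommender/v1/recommend/", methods=["POST"])
def recommend_route():
    """Main route: read the user query and return recommendations."""
    body = request.get_json(force=True)
    return jsonify({"recommendations": recommend(body.get("query", ""))})
```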
Now, if we fire up the Flask server, we get recommendations the same way as with our script, but with one major difference: we are getting them from an API, meaning we will be able to integrate it into our whole system in the next article. 🔩