Tell me if this sounds familiar. A chatbot software vendor gave a kick ass sales pitch at your company. You will get all the necessary tools to build a virtual assistant that will solve all your problems. It will be able to answer all (common) questions from employees and clients, 24/7, effectively freeing up time for teams to work on more complex stuff. Sounds great, so let’s go!
You start the project with a lot of ambition and energy. You know the first weeks/months will require a lot of work & time to train the bot. People will ask different questions about multiple topics. You just have to cluster similar questions into intents and train an NLP model with it. Easy!
Fast forward a couple of months. Your NLP model has grown tremendously, a lot of new intents and even more expressions. But instead of becoming smarter, your chatbot seems to become more confused. A lot of times it answers slightly (or totally) off-topic. How is this possible? It’s like eagerly starting a fitness program, but in your enthusiasm you overtrain and/or develop bad training techniques, resulting in discomfort or even injury.
So what can you do to fix this? Go over the following checklist:
What does your NLP model look like?
- How many intents do you have?
- How are your intent names defined? Is there a naming convention?
- What is the quality of your training expressions?
Let’s dive deeper into these 3 questions and look at some strategies to remedy problems that they can bring.
How many intents do you have?
Typically, the number of intents grows rapidly in the first few weeks or months after deployment. User expressions come in through your chatbot, and the bot trainers are eager to handle each and every one of them. Ending up with 100–300 intents, or more, is quite common by then.
- Check with your bot provider what the recommended number of intents per bot is. If you go over that number, try creating multiple bots and handing over between them.
- Only focus on creating intents that bring value. The ideal chatbot intents cover questions that are high volume and low complexity. Don’t create an intent for every question that you receive. It’s ok for your bot to not understand everything. Just make sure there is a proper handover to a human agent.
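The "high volume, low complexity" filter can be sketched in a few lines of Python. This is only an illustration, assuming you can export clusters of similar questions with a volume count and that someone on the team scores complexity by hand. The `QuestionCluster` fields, the thresholds, and the sample data are all hypothetical and not taken from any bot platform.

```python
from dataclasses import dataclass


@dataclass
class QuestionCluster:
    """A group of similar user questions (hypothetical structure)."""
    topic: str
    monthly_volume: int  # how often users ask this
    complexity: int      # 1 = simple lookup, 5 = needs human judgement


def select_intents(clusters, min_volume=50, max_complexity=2, max_intents=100):
    """Split clusters into intents worth building and topics for human handover.

    All thresholds are illustrative assumptions; for max_intents, use the
    limit your bot provider recommends.
    """
    worthwhile = [
        c for c in clusters
        if c.monthly_volume >= min_volume and c.complexity <= max_complexity
    ]
    # Highest volume first, so the cap drops the least valuable candidates.
    worthwhile.sort(key=lambda c: c.monthly_volume, reverse=True)
    build = worthwhile[:max_intents]
    build_topics = {c.topic for c in build}
    handover = [c for c in clusters if c.topic not in build_topics]
    return build, handover


# Made-up sample data.
clusters = [
    QuestionCluster("reset password", 400, 1),
    QuestionCluster("opening hours", 120, 1),
    QuestionCluster("dispute an invoice", 90, 4),
    QuestionCluster("office plant care", 3, 1),
]
build, handover = select_intents(clusters)
print([c.topic for c in build])     # ['reset password', 'opening hours']
print([c.topic for c in handover])  # ['dispute an invoice', 'office plant care']
```

Everything that lands in `handover` is a candidate for the human agent rather than a new intent.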
How are your intent names defined?
Often, there is no formal naming convention when people start training your chatbot’s NLP. NLP trainers can choose their own intent names and in the beginning it all feels manageable. But there comes a certain point, mostly after a couple of hundred intents have been created, that overlapping or closely related intents start to pop up. This is when ‘intent blending’ starts and this has some negative effects:
- Your bot becomes confused, meaning for certain incoming expressions, it retrieves multiple intents with very close matching scores. So at some point, it does not know which one to choose anymore.
- Your AI trainers don’t know to which intent they need to assign a new training candidate expression.
- Your users get replies that are wrong, or a fallback message saying the bot did not understand them.
To counter this:
- Create a naming convention for intent names and structure them with consistent logic. For example, distinguish between action and information requests, and put the verb in the intent name (request, ask, block, buy, …).
- Make someone on your team responsible for the intent catalog: the “NLP gatekeeper”. This person’s main focus is the creation, update, and deletion of intents, and communicating those changes to the team of AI trainers, so that the catalog stays consistent and clear.
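Two of the points above lend themselves to a small automated check: whether an intent name follows the convention, and whether the bot is torn between two intents. Here is a minimal sketch in Python. The `<type>.<verb>.<object>` pattern, the allowed types and verbs, and the 0.05 margin are assumptions made up for this example. Your own convention and your NLP engine's score format may differ.

```python
import re

# Hypothetical convention: <type>.<verb>.<object>, e.g. "action.block.card".
ALLOWED_TYPES = {"action", "info"}
ALLOWED_VERBS = {"request", "ask", "block", "buy"}
NAME_PATTERN = re.compile(r"^([a-z]+)\.([a-z]+)\.([a-z_]+)$")


def check_intent_name(name):
    """Return a list of problems with an intent name (empty list means OK)."""
    match = NAME_PATTERN.match(name)
    if not match:
        return ["expected <type>.<verb>.<object> in lowercase"]
    intent_type, verb, _obj = match.groups()
    problems = []
    if intent_type not in ALLOWED_TYPES:
        problems.append(f"unknown type '{intent_type}'")
    if verb not in ALLOWED_VERBS:
        problems.append(f"unknown verb '{verb}'")
    return problems


def is_ambiguous(scores, margin=0.05):
    """True when the two best-scoring intents are within `margin` of each other.

    `scores` maps intent name -> match score, assumed here to lie in [0, 1].
    The margin is an assumption to tune on your own data.
    """
    top = sorted(scores.values(), reverse=True)[:2]
    return len(top) == 2 and (top[0] - top[1]) < margin


print(check_intent_name("action.block.card"))  # []
print(check_intent_name("BlockMyCard"))        # one problem: wrong format
print(is_ambiguous({"action.block.card": 0.81, "info.ask.card_status": 0.79}))  # True
print(is_ambiguous({"action.block.card": 0.92, "info.ask.card_status": 0.40}))  # False
```

The gatekeeper could run the name check whenever an intent is created, and review logged expressions that trigger the ambiguity check to spot intents that have started to blend.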
What is the quality of your training expressions?
The expressions that you use to train your intents are a prime example of “garbage in, garbage out”. This conversational data builds the NLP model and determines how users’ inbound expressions are matched to intents. So it’s very important to verify that the expressions used for training are 100% relevant to the intent they are assigned to. When you assign real users’ expressions to intents without scrutiny, words or sentence parts end up being used with different meanings across multiple intents, spelling errors become commonplace, and so on.
- Make sure words or sentence-parts are specific and relevant for an intent. Avoid using them for multiple intents. A common thing to track is word density and/or frequency, where the NLP system can highlight when certain words are used too often across multiple intents.
- Implement a four-eyes principle for assigning expressions to intents. Let at least two AI trainers assign the same set of expressions to intents, independently of each other. Discuss the expressions that got assigned to different intents and either come to an agreement or omit them from the training set.
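Both checks can be sketched in plain Python if your NLP system does not offer them out of the box. The code below is a rough illustration, not a feature of any particular platform. The intent names, training expressions, stopword list, and trainer labels are all made up for the example.

```python
from collections import defaultdict

# Hypothetical training data: intent name -> training expressions.
training = {
    "action.block.card": ["block my card", "my card was stolen please block it"],
    "info.ask.card_status": ["where is my new card", "has my card been shipped"],
    "info.ask.opening_hours": ["when are you open", "what are your opening hours"],
}

# Small illustrative stopword list; a real one would be longer.
STOPWORDS = {"my", "is", "it", "are", "you", "your", "was", "what",
             "when", "where", "has", "been", "please", "new"}


def shared_words(training, min_intents=2):
    """Map each non-stopword to the intents whose expressions contain it,
    keeping only words that occur in at least `min_intents` intents."""
    intents_by_word = defaultdict(set)
    for intent, expressions in training.items():
        for expression in expressions:
            for word in expression.lower().split():
                if word not in STOPWORDS:
                    intents_by_word[word].add(intent)
    return {w: sorted(i) for w, i in intents_by_word.items() if len(i) >= min_intents}


def four_eyes(labels_a, labels_b):
    """Compare two trainers' independent assignments (expression -> intent).

    Returns (agreed, disputed): agreed expressions can go into the training
    set, disputed ones need discussion or are left out.
    """
    agreed, disputed = {}, {}
    for expression in labels_a.keys() & labels_b.keys():
        if labels_a[expression] == labels_b[expression]:
            agreed[expression] = labels_a[expression]
        else:
            disputed[expression] = (labels_a[expression], labels_b[expression])
    return agreed, disputed


print(shared_words(training))
# {'card': ['action.block.card', 'info.ask.card_status']}

trainer_a = {"i lost my card": "action.block.card",
             "card not arrived": "info.ask.card_status"}
trainer_b = {"i lost my card": "action.block.card",
             "card not arrived": "action.block.card"}
agreed, disputed = four_eyes(trainer_a, trainer_b)
print(agreed)    # {'i lost my card': 'action.block.card'}
print(disputed)  # {'card not arrived': ('info.ask.card_status', 'action.block.card')}
```

A word like "card" showing up in two intents is not automatically a problem, but it tells you where to look when the bot starts confusing them.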