As the founder of a Chat technology company, I’ve been in the lucky position to experience first-hand the ups, downs and shifts happening around Conversational Systems. At Humanise.AI, we closed 2019 by selling £1.5m of accommodation reservations through a chatbot in the space of just a month. It feels as though both we and others are cracking the “what’s the business value of this stuff” question. My thoughts below distil many of the lessons we learnt throughout 2019, which culminated in the success we had at the end of the year.
In 2018 Google demonstrated Duplex, perhaps the first example of a narrow domain-specific AI that blew my mind. Duplex is able to book a restaurant table or an appointment for a haircut, communicating naturally with humans over a phone call. It actually seemed to work — I remember watching at my desk and saying to colleagues at the time, excitedly, “you’ve got to see this!”
But, really, booking a restaurant table is a very narrow use case. What Duplex really demonstrated was that if you put enough effort into a narrow scenario, you can do something pretty awesome. I expect more narrow use cases to impress in 2020, as the lessons from Duplex begin to emerge.
Last year the technology behind Duplex started rolling out to our smartphones. There was some cynicism when it emerged that a quarter of Duplex calls actually use humans, but that still leaves an impressive 75% fully automated with AI. Make no mistake, this is a significant moment for AI.
Although we’re starting to see real success in narrow domains, so-called “General AI” remains a long way off.
Virtual assistants will continue to improve, but in a slow and incremental fashion. 2020 won’t be the year in which Siri, Alexa, Google Assistant, or indeed any chatbot, becomes that mythical AI assistant we all wish they were. Progress will remain frustratingly slow, but progress will be occurring. It’s just going to be a long, hard slog over many years to get there. The smart money will go into narrow domain use cases for the foreseeable future.
After a number of years of stagnation in the NLP technology market, 2019 was the year in which we began to see really exciting progress.
New algorithms and frameworks using transformer-based techniques emerged throughout last year. These include the famous GPT-2 from OpenAI, which made a name for itself by writing plausible fake news, as well as Google’s BERT, XLNet (from Carnegie Mellon and Google Brain) and other derivatives that excel at many NLP tasks.
But big challenges with these new techniques remain — things like BERT can be far too slow and require far too much expensive compute capacity for many applications.
Slimming the models and improving performance was a focus in late 2019 and will continue to be one in 2020. A very small but specialist community is leading here, notably the tech startup Hugging Face, which has just raised its Series A investment round. I expect more investment and the emergence of more niche NLP outfits in 2020.
Right now, using the latest techniques requires quite deep expertise. These are not point-and-click tools like Chatfuel.
Do I think these new NLP techniques will be transformational and usher in a new era? Let me put it this way: I expect them to bring meaningful improvements. Predicting a massive leap is unrealistic. Building really good conversational systems takes a lot of work and some artistic flourish. New technologies will help, but this won’t change.
At Humanise.AI we’re working with BERT and see real potential for its use in some very interesting cases. Turning it from a cool algorithm into something that can be used at scale in a production system is the main challenge right now.
How and where things like BERT fit into the broader chatbot technology landscape is interesting to speculate about. Frameworks like Rasa might begin to include such new technologies. But writing personally, I wonder if we might see something new emerge instead. There is still a lot of work to do to get to what I would consider my “perfect” NLP stack.
Interestingly, virtually all of the work in this space appears to be happening as open source. Are the days of closed-source, proprietary NLP technologies numbered, I wonder?
In 2019 we started to see a shift of emphasis away from chatbots as customer support tools and towards being marketing and sales tools. Drift leads in positioning chat technology as a marketing tool — capturing leads and making your marketing website more responsive. Quite a different message to the early days of chatbots.
In late 2019 we at Humanise.AI built a solution for an accommodation client that sold £1.5m of room reservations in the space of just a month. We were blown away by the results, as was our client!
The key we’ve discovered to chatbot sales success is a focus on the marketing aspect of the solution. In our case, we run email marketing campaigns to a target customer base. Each email includes a clear call-to-action button that links to our chatbot. Because everyone who comes to our chatbot is arriving with a clear purpose in mind, we can make it very efficient at processing that one use (see my first trend on narrow use cases). No more “hi, how can I help you?” chatbot startup messages — we get straight down to business.
It feels like we’re in the midst of a big shift from Chatbots for customer service, to Chatbots that pay for themselves by acting as an effective sales/marketing tool.
In the early days of chatbots we saw an emphasis on finding answers from a static set of text — IBM Watson being an example of this, impressing with its ability to answer general knowledge questions.
However, in practice such solutions have ended up looking a bit too much like fancy FAQ engines and have failed to set the world on fire.
What seems to be working much better is a Conversational System that is empowered by integrations with other systems. When Chat makes an API call to another system, that FAQ engine can suddenly execute transactions, sell things and run a business process.
This means that chatbot projects are starting to look a bit more like traditional projects, with all the complexity and dependencies that implies — for good, or bad.
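To make the integration point concrete, here is a minimal sketch of a dialog step that calls out to another system. Everything in it is an assumption for illustration: `ReservationBackend`, `FakeBackend` and `handle_booking` are hypothetical names, and the fake backend stands in for whatever real API a project would integrate with.

```python
from dataclasses import dataclass
from typing import Protocol


class ReservationBackend(Protocol):
    """Hypothetical interface to the external system the chatbot integrates with."""

    def reserve(self, room_id: str, nights: int) -> str:
        """Create a reservation and return a booking reference."""
        ...


@dataclass
class FakeBackend:
    """Stand-in for a real booking API, so the sketch runs without a network."""

    prefix: str = "RES"

    def reserve(self, room_id: str, nights: int) -> str:
        return f"{self.prefix}-{room_id}-{nights}"


def handle_booking(backend: ReservationBackend, room_id: str, nights: int) -> str:
    """Turn a completed booking dialog into a transaction and a reply."""
    if nights < 1:
        # Missing or invalid slot: ask again rather than calling the backend.
        return "How many nights would you like to stay?"
    try:
        reference = backend.reserve(room_id, nights)
    except Exception:
        # The dependency failed: keep the user informed and offer a way out.
        return "Sorry, I couldn't complete that booking. Let me pass you to a colleague."
    return f"You're booked. Your reference is {reference}."
```

Calling `handle_booking(FakeBackend(), "A12", 3)` returns a confirmation containing the reference `RES-A12-3`. Keeping the backend behind a small interface is one way to contain the project dependencies mentioned above, since the dialog logic can be tested without the real system.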
It turns out that when you focus on making the user’s life efficient, buttons win. Pressing a “yes” button is one button press, whereas typing “y-e-s-[return]” is four button presses. As soon as you ditch a religious zeal around natural language processing and focus on what actually works for users, button input starts to look attractive.
At Humanise.AI we made adjustments to our Web Chat system to better support button-driven dialogs. We’re now able to selectively turn the text input field on and off. So, for example, if we present a Yes/No button combination we will often remove the option for the user to type a response. There are a lot of considerations as to when we do this and how we provide escape routes, but it has proven a very effective way to simplify dialogs and better shepherd users through them. Meanwhile, Facebook continues to only allow the text input field to be turned on or off for the whole bot, which is frustrating.
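As a rough illustration of per-message control over the text field, the sketch below shows one way such a rule could be modelled. It is not our production code: `BotMessage`, `allow_text_input`, `escape_label` and `render_plan` are hypothetical names, and the default rule (hide typing whenever buttons are shown) is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BotMessage:
    text: str
    buttons: List[str] = field(default_factory=list)
    # Hypothetical per-message flag. None means "decide from the buttons".
    allow_text_input: Optional[bool] = None
    # Label of the escape-route button added when typing is disabled.
    escape_label: str = "Something else"


def render_plan(message: BotMessage) -> dict:
    """Decide what the chat window should show for a single message."""
    if message.allow_text_input is None:
        # Assumed default: offering buttons hides the text field.
        show_text = not message.buttons
    else:
        show_text = message.allow_text_input

    buttons = list(message.buttons)
    if not show_text and message.escape_label not in buttons:
        # Never trap the user: always leave a way out of a button-only step.
        buttons.append(message.escape_label)

    return {"text": message.text, "buttons": buttons, "show_text_input": show_text}
```

For a message such as `BotMessage("Confirm your booking?", ["Yes", "No"])`, the plan hides the text field and shows three buttons, the third being the escape route. Setting `allow_text_input=True` on an individual message keeps the text field alongside the buttons.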
I’ve been watching Cleo, a natural language interface for your banking, throughout 2019. They’ve adopted a tone of voice that’s unashamedly appealing to a young, hip audience. No bank would ever adopt this style of language, or indeed the swear words they’re not afraid to use. But maybe this is the point and why they are succeeding.
Cleo works because it’s the antidote to boring conventional banking. It talks using street language and speaks “as one of us” to an audience segment that’s turned off by how banks traditionally communicate. My mother wouldn’t approve, but others love it.
This idea that a Chat system shouldn’t target everyone, but instead focus on one specific group, is powerful. It means we can use Chat to reach audiences we might otherwise struggle to reach. We might even have multiple chatbots to reach different user groups.
However, there are some really interesting conflicts here. I’ve already seen very traditional corporates struggle with even the most innocuous of humorous chatbot responses. I remain unconvinced, for example, that the response “If I was a dog, I’d wag my tail” was so terrible an answer to the question “How are you today?”
It feels like there’s a collision here between super conventional businesses and a consumer shift towards informality. Cleo’s approach takes no prisoners, would make my grandmother blush and blows apart the idea that business communication should never offend. It’s unclear how conventional businesses will react to this opportunity.
Personal opinion: smart businesses will learn from what Cleo is doing, even if they turn the volume knob down a little.
Facebook Messenger started the bot craze, but Facebook doesn’t seem to have made any significant updates in the past year — it feels like they have stalled at a time when they most need to be filling out their platform. And Cleo, perhaps the biggest poster-child for an ambitious and widely used service on Facebook Messenger, has been quietly moving away from Messenger and to its own app.
Meanwhile, Apple has been slow to launch Apple Business Chat and we’re yet to see any meaningful adoption. Several years after its announcement, WhatsApp’s Business API is still on limited release, where you need to be in a secret club to get access. And Snapchat and Instagram have no APIs at all. What happened to the promised explosion of bots integrated with all these messaging apps?
Without singling out any names, my experience is that messaging app providers have obscure “certification” requirements that aren’t openly documented and which can make them tough to work with. The most mature, Facebook, has experienced security scares that led it to suspend the certification of new bots for an extended period — leaving developers in the lurch. We’re a long way from these being “app store” like experiences that 3rd party developers can build businesses on top of.
These challenges led us at Humanise.AI to invest in our own Web Chat system. It didn’t feel viable for us to build a business that was completely dependent on others, as we would have been without our own chat interface. As it turns out, in 2019 our clients have been expressing a clear preference for Web Chat and seem increasingly nervous of association with social media brands — probably with good justification.
It feels like the jury is out on the use of messaging apps as a channel for hosting 3rd party conversational solutions. This is a real shame, because it should be a big opportunity. I hope my concerns prove wrong, but I fear they will not.
Today, Web Chat is almost always confined to a small pop-up window. We have the website, and we have the chat window — they’re entirely separate things.
But we can envisage very different models. Why shouldn’t something in the Chat window cause something to happen on the webpage, or vice-versa?
At Humanise.AI we worked last year with a client on the concept for a chatbot “coach” that worked alongside the client’s main website. Things on the webpage caused the chatbot to do things and vice-versa. Maybe we will see more of this “chat breaking out of the window” stuff in 2020.
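One way to picture chat and page reacting to each other is a small shared event bridge that both sides publish to and subscribe from. The sketch below is only that, a sketch: `EventBridge` and the event names `page.section_viewed` and `chat.suggestion_accepted` are made up for illustration, and a real web implementation would live in the browser rather than in Python.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List, Tuple

Handler = Callable[[Dict[str, str]], None]


class EventBridge:
    """Tiny publish/subscribe hub shared by the chat window and the web page."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def on(self, event: str, handler: Handler) -> None:
        """Register a handler for a named event."""
        self._handlers[event].append(handler)

    def emit(self, event: str, payload: Dict[str, str]) -> int:
        """Deliver the payload to every handler and return how many ran."""
        handlers = list(self._handlers.get(event, []))
        for handler in handlers:
            handler(payload)
        return len(handlers)


bridge = EventBridge()
chat_log: List[str] = []                   # what the chatbot would say
page_actions: List[Tuple[str, str]] = []   # what the page would do

# Page -> chat: viewing a section prompts the chatbot "coach" to speak.
bridge.on(
    "page.section_viewed",
    lambda p: chat_log.append(f"Want a tip about {p['section']}?"),
)
# Chat -> page: accepting a suggestion makes the page highlight an element.
bridge.on(
    "chat.suggestion_accepted",
    lambda p: page_actions.append(("highlight", p["element_id"])),
)

bridge.emit("page.section_viewed", {"section": "pricing"})
bridge.emit("chat.suggestion_accepted", {"element_id": "pricing-table"})
```

After the two `emit` calls, `chat_log` holds one chatbot prompt about pricing and `page_actions` holds one highlight instruction. Neither side needs to know how the other is built, only which event names they agree on.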
This is a bit of a personal obsession, so I’m not sure if it’ll catch on, or not. It feels to me that the phrase “Chatbots” is a bit too 2017 and doesn’t really capture the exciting potential of this type of solution. I’ve started to use the phrase “Conversational System”, as it raises fewer preconceived ideas in people’s heads. But maybe I’m deluded and we’re stuck with Chatbots, no matter how much I dislike the phrase!