How to Build a Chatbot from Zero


We will divide this guide into four steps:

  • Understanding the basics of how a Chatbot works
  • Building a model with TF, Keras, and Jupyter
  • Deploying the model as a Python AWS Lambda function with Serverless
  • Building a Preact frontend to interact with the bot

For a Chatbot to work, it has to understand the intention of the user. This can be modeled by mapping words to an intent; the trigger can be a single word or a combination of words (a sentence).

Example of a greeting intent, which would be triggered by the phrases Hi, Hello or Good Morning.

After detecting the intent, the Chatbot can either respond directly with a preprogrammed response or fetch more data from an external source (like an API or a database) for a user-specific response. The data format that maps the input phrases to intents and responses is given as a JSON file:

[{
  "tag": "greeting",
  "patterns": ["hello", "hi", "hey", "hi joonka"],
  "responses": ["Yo!", "Howdy!", "G'day mate!", "Hiya!"]
}]

Once the intent tag is detected, we randomly select one of the possible responses.
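As a minimal sketch of that lookup (assuming the intents above live in a file called intents.json):

import json
import random

# Load the intent definitions (the file name is an assumption for this sketch).
with open("intents.json") as f:
    intents = json.load(f)

def respond(detected_tag):
    # Find the intent matching the detected tag and pick one of its responses at random.
    for intent in intents:
        if intent["tag"] == detected_tag:
            return random.choice(intent["responses"])
    return "Sorry, I did not get that."  # illustrative fallback

print(respond("greeting"))  # e.g. "Howdy!"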

To detect these intents, we need to apply some Natural Language Processing (NLP) techniques such as tokenization and word embeddings. The source code can be followed on GitHub in the form of a Jupyter notebook.

Feature Tokenization and One-Hot Labels

The inputs of Machine Learning (ML) models have to be numerical. Consequently, we need to represent our input phrases as sequences of numbers. The process of mapping every word to a unique id is called Tokenization. In TensorFlow we can make use of a text preprocessor: the Tokenizer. If we have an intent_list = ["Hi", "Hello", "Good Morning"], containing all input phrases, the tokenizer will then identify every word with a number. For the above list the word index would be the following: {'hi': 1, 'hello': 2, 'good': 3, 'morning': 4}. For words that our model has not seen previously, we will use an out-of-vocabulary (OOV) token.
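As a small illustration of the OOV behaviour (the token name "<OOV>" is just an example):

from tensorflow.keras.preprocessing.text import Tokenizer

tokenizer = Tokenizer(oov_token="<OOV>")
tokenizer.fit_on_texts(["Hi", "Hello", "Good Morning"])
print(tokenizer.word_index)                            # every known word gets a unique id
print(tokenizer.texts_to_sequences(["Good evening"]))  # "evening" is unseen -> mapped to the OOV id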

We have created a numerical representation of words, but we want to train our model on sequences of words. For the input phrase “Good Morning” we would use our word index to create the sequence [3, 4] (good -> 3, morning -> 4). This sequence has a length of two. In general, our sequences can have different lengths (e.g. hi -> len([1]) = 1), but for our ML model to work, we need all sequences to have the same length. Here we can use padding: we fill the sequences that are shorter than the longest sequence with zeros until all sequences have the same size (one could also cut off part of the longer sequences):

Hi           -toSequence-> [1]    -padding-> [0, 1]
Good Morning -toSequence-> [3, 4] -padding-> [3, 4]

Apart from the features, we also need a numerical representation of the intents (labels). We will do so by using a “One-Hot” representation. If we want to train on two intents, e.g. “greeting” and “farewell”, we could represent them as

greeting -> [1, 0]
farewell -> [0, 1]

In TF we can use some preprocessing utilities: the Tokenizer, pad_sequences and to_categorical. The corresponding code is then given by

import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Tokenization (intent_list and label_list are built from the intents JSON)
oov_tok = "<OOV>"
tokenizer = Tokenizer(oov_token=oov_tok)
tokenizer.fit_on_texts(intent_list)
word_index = tokenizer.word_index
# Input phrases to sequences
sequences = tokenizer.texts_to_sequences(intent_list)
# Padding (pad every sequence to the length of the longest one)
padded_length = len(max(sequences, key=len))
vocab_size = len(word_index)
padded = pad_sequences(sequences, maxlen=padded_length)
# One-Hot labels
labels = tf.keras.utils.to_categorical(label_list)
num_intents = len(labels[0])

Word Embedding

We can further improve on the tokenization of words by representing every word in an n-dimensional vector space. For example, if we choose a 2-dimensional space, we would map every word to a vector of two real numbers:

Word embedding of Hello and Goodbye, which point in opposite directions:
hello   -> [1.0, 1.0]
goodbye -> [-1.0, -1.0]

Words with similar semantics would then point in a similar direction and be close to each other in the vector space; words with different meanings would tend to be far apart.
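To make this geometric intuition concrete (using the made-up 2-dimensional vectors from above), cosine similarity measures how closely two word vectors point in the same direction:

import numpy as np

def cosine_similarity(a, b):
    # +1 means "same direction", -1 means "opposite direction"
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

hello   = np.array([1.0, 1.0])
goodbye = np.array([-1.0, -1.0])

print(cosine_similarity(hello, goodbye))  # -1.0: the vectors point in opposite directions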

The initial word embedding vectors are initialized randomly, but the model will train the final vector numbers. TF has an Embedding layer to do all of this for us.

The final model is then given by

import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Embedding(vocab_size + 1, 16, input_length=padded_length),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(num_intents, activation='softmax')
])

where we added two Dense layers with Dropout for regularization. The final layer needs to have the same number of neurons as the number of intents (num_intents) we want to train on.
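The training call itself is not shown in the article; a minimal sketch, assuming a standard categorical cross-entropy setup and an arbitrary epoch count, could look like this:

# Compile and train on the padded sequences and one-hot labels prepared above.
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(padded, labels, epochs=200, verbose=0)  # the epoch count is an assumption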

Making Predictions

To make predictions we also have to tokenize, convert to sequences, and pad our input phrases before passing the numerical representation to our model. The softmax activation of the last layer gives us a list of probabilities, one per intent. The entry with the highest probability determines the detected intent.

import numpy as np

sentence = 'Good morning'
sequence = tokenizer.texts_to_sequences([sentence])
padded_sequence = pad_sequences(sequence, maxlen=padded_length)
prediction = model.predict(padded_sequence)[0]
# intents is the list of intent objects loaded from the JSON file
intent = intents[np.argmax(prediction)]
tag = intent['tag']

To deploy the model and be able to make predictions, we will use Serverless, a useful tool to automate cloud infrastructure. The source code can be seen here.

Serverless

The core of Serverless is the serverless.yml file

service: chatbot
provider:
  name: aws
  runtime: python3.6
  region: eu-west-1
  stage: dev
functions:
  infer:
    handler: infer.inferHandler
    timeout: 30
    events:
      - http:
          path: infer
          method: post
          cors: true

It describes a function “infer”, which can be called by an HTTP POST request. To avoid problems with Cross-Origin Resource Sharing (CORS) when making requests from a different origin, we have added cors: true.
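The handler itself is not listed in the article; judging from the response format shown further down, a rough sketch of infer.inferHandler could look like this (predict_intent is a hypothetical stand-in for the tokenize/pad/predict steps from the previous section):

# infer.py -- a rough sketch, not the article's actual handler
import json

def predict_intent(msg):
    # Hypothetical stand-in for the tokenize / pad / model.predict steps shown earlier.
    return "greeting", "Yo!"

def inferHandler(event, context):
    body = json.loads(event["body"])            # the POST payload, e.g. {"msg": "hello"}
    tag, response = predict_intent(body["msg"])
    return {
        "statusCode": 200,
        "headers": {"Access-Control-Allow-Origin": "*"},  # needed for the CORS setup above
        "body": json.dumps({"msg": body["msg"], "intent": tag, "response": response})
    }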

Deploying the function is then as simple as executing sls deploy -v (-v for verbose). It will then create an S3 bucket, a Lambda function, and an API Gateway and return the endpoint of the microservice. To test the function we can make use of cURL

curl -d '{"msg": "hello"}' -X POST https://{id}.execute-api.{region}.amazonaws.com/dev/infer
... wait ...
=> {"msg": "hello", "intent": "greeting", "response": "Yo!"}

It could be that the first call times out: the Lambda function has to fire up and train the model, which can take some time. Later calls will return instantly, as we do not train the model each time the Lambda function is called.
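A common way to get this behaviour (a sketch of the caching pattern, not necessarily the exact code used here) is to train the model once per container and reuse it on warm invocations:

# Sketch of the warm-container caching pattern (illustrative only).
_model = None  # module-level cache; survives as long as the Lambda container stays warm

def train_model():
    # Hypothetical stand-in for the slow Keras training shown earlier.
    return {"trained": True}

def get_model():
    global _model
    if _model is None:          # only true on the first (cold) invocation
        _model = train_model()
    return _model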

The function can also be tested offline with the command

sls invoke local --path event.json

where the query data is given in the event.json file.
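For reference, a hypothetical event.json mimicking the API Gateway POST event could look like this (the exact keys depend on the handler):

{
  "body": "{\"msg\": \"hello\"}"
}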

Size (Optional)

One limitation of Lambda functions is the maximum size of 250 MB² for dependencies like TF. To not exceed this limit we zip and exclude some dependencies with the serverless-python-requirements plugin. Just add the following to your serverless YAML:

custom:
  pythonRequirements:
    dockerizePip: true
    zip: true
    slim: true
    noDeploy:
      - boto3
      - botocore
      - docutils
      - jmespath
      - pip
      - python-dateutil
      - s3transfer
      - setuptools
      - tensorboard
      - six
plugins:
  - serverless-python-requirements

To visualize the bot interactions we can build a small chat interface. We will use a slim React alternative called Preact; the source code is given here. Execute yarn install && yarn start and the app should be running.

The main component Home keeps an array of messages as state, which is appended to every time a user sends a message or the backend returns a response. A message can be represented as

{
  isUser: false,
  msg: "Hi! This is Joonka, a demo of Natural Language Processing."
}

Since February of 2019³, we can use a fancy API for that: Hooks!

Hooks

Hooks can be used to move the state management from classes to functional components. In our case, we will initialize the messages state with useState(initMessages). Then we can update the state with the setMessages function, which is returned by useState. When a user sends a message we call the sendMsg function and update the state with the user message and the response:

const Home = () => {
  const [messages, setMessages] = useState(initMessages);
  const [input, setInput] = useState("");   // text of the input field

  const sendMsg = () => {
    axios.post(
      "{endpoint}",
      { msg: input }
    )
      .then(({ data: { intent, response } }) => {
        setMessages([
          ...messages,
          { isUser: true, msg: input },
          { isUser: false, msg: response }
        ]);
      })
      .catch(err => console.error(err));
  };

  // ... render the messages and the input field ...
};

Testing

Even though this is a small demo project, it is always preferable to do some unit testing. Preact comes preinstalled with the JavaScript testing framework JEST, and to deal with React components we will use Enzyme. The tests can be found here and are executed with yarn test.

JEST and Enzyme

JEST is a JavaScript testing framework that lets us run all our unit tests with one command, jest, which looks for *.test.js files. In principle, we define matchers in a test file like

expect(1+2).toBe(3);

and JEST complains when a result does not match the matcher.

When testing React components, we have to take care of component rendering and child-component dependencies. Here Airbnb has given us a handy library called Enzyme, which lets us prepare the components we want to test. Enzyme can render a component and offers an API to easily extract DOM elements, e.g. if we want to simulate a click on the send button:

test("Sends msg on send button click", done => {
const wrapper = mount(<Home />);
wrapper.find("input").simulate("keydown", { key: "a" });
wrapper.find("button").simulate("click");
setImmediate(() => {
expect(
wrapper
.find("main")
.childAt(4)
.text()
).toBe("a");
done();
});
});

We mount the <Home /> component, enter some text in the input field, simulate a click, and check whether the component rendered the sent message.


