Chatbots are becoming increasingly popular in the tech industry. With advances in Natural Language Processing (NLP), a chatbot can understand and interpret the language used in a conversation with a human. Python is a great fit for building chatbots because of its simplicity, readability, and mature NLP libraries.
In this tutorial, we will build a simple chatbot using Python. We will train our chatbot on a small dataset and use a pre-built NLP library called NLTK (Natural Language Toolkit) to process the text input and generate responses. We will divide the tutorial into the following sections:
- Setting up the Environment
- Installing Required Libraries
- Defining Functions for Preprocessing and Training Data
- Writing the Code for our Chatbot
## Setting up the Environment
Before we start building the chatbot, we need to set up our environment. We will be using Python 3.6+ for this tutorial. If you do not have Python installed on your system already, you can download the latest version from the official website.
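You can verify which Python version is on your `PATH` with:

```bash
python --version
```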
Create a new project directory where the chatbot files will reside. Open a command prompt (Windows) or a terminal (macOS/Linux), navigate to the project directory, and create a virtual environment called `venv` with the command `python -m venv venv`. Once the virtual environment has been created, activate it using the appropriate command for your platform:
```bash
# On Windows
venv\Scripts\activate.bat

# On macOS/Linux
source venv/bin/activate
```
## Installing Required Libraries
Our chatbot depends on several libraries: NLTK, NumPy, and TensorFlow. Install them using the following command:
```bash
pip install nltk numpy tensorflow
```
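NLTK's tokenizer and lemmatizer rely on data packages that are downloaded separately from the library itself. Before running the code below, fetch them once from a Python shell (`omw-1.4` is only required on some NLTK versions):

```python
import nltk

# One-time downloads used by word_tokenize and WordNetLemmatizer
nltk.download("punkt")    # tokenizer models
nltk.download("wordnet")  # lemmatizer dictionary
nltk.download("omw-1.4")  # extra wordnet data needed on some NLTK versions
```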
## Defining Functions for Preprocessing and Training Data
In this step, we will define some helper functions to preprocess our data and prepare it for training. Save the following code in a file called `preprocessing.py`; our chatbot script will import from it later.
```python
import nltk
from nltk.stem import WordNetLemmatizer
import numpy as np

lemmatizer = WordNetLemmatizer()


def preprocess(sentence):
    """
    Tokenize and lemmatize the sentence.
    """
    words = nltk.word_tokenize(sentence)
    words = [lemmatizer.lemmatize(word.lower()) for word in words]
    return words


def bag_of_words(sentence, words):
    """
    Create a bag of words vector for the sentence.
    """
    sentence_words = preprocess(sentence)
    bag = np.zeros(len(words), dtype=np.float32)
    for idx, word in enumerate(words):
        if word in sentence_words:
            bag[idx] = 1.0
    return bag
```
The `preprocess(sentence)` function tokenizes and lemmatizes a sentence. Tokenization splits the sentence into individual words. Lemmatization converts words to their base forms, e.g., "running" becomes "run" and "mice" becomes "mouse". This reduces the number of unique words in our dataset and smooths over variations of the same word across tenses, plurals, and so on.

The `bag_of_words(sentence, words)` function takes a sentence and a list of words. It creates a vector of length `len(words)`, where each element is set to `1.0` if the corresponding word is present in the sentence and `0.0` otherwise. This converts a sentence into a numerical input that our model can consume.
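To make these concrete, here is a small illustrative run; the vocabulary below is hypothetical and stands in for the word list our training step will build:

```python
# Hypothetical vocabulary, as might be produced from a dataset
vocabulary = ["bye", "can", "hello", "help", "how", "i", "today", "you"]

print(preprocess("Hello, how can I help you today?"))
# ['hello', ',', 'how', 'can', 'i', 'help', 'you', 'today', '?']

print(bag_of_words("Hello, how can I help you today?", vocabulary))
# [0. 1. 1. 1. 1. 1. 1. 1.]  -- every vocabulary word except "bye" appears
```

Note that punctuation survives tokenization; for a small intent classifier this is usually harmless, as punctuation tokens simply become extra vocabulary entries.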
```python
def create_training_data(data):
    """
    Create training data from the dataset.
    """
    corpus = []
    classes = []
    words = set()
    for intent in data["intents"]:
        # Record each tag once so the one-hot labels stay compact
        if intent["tag"] not in classes:
            classes.append(intent["tag"])
        for pattern in intent["patterns"]:
            words.update(preprocess(pattern))
            corpus.append((pattern, intent["tag"]))

    # Create a sorted list of unique words
    words = sorted(words)

    x_train = []
    y_train = []
    for pattern, tag in corpus:
        # Create a bag of words vector for each pattern
        bag = bag_of_words(pattern, words)
        x_train.append(bag)

        # Create a one-hot encoded vector for each tag
        label = classes.index(tag)
        label_vec = np.zeros(len(classes), dtype=np.float32)
        label_vec[label] = 1.0
        y_train.append(label_vec)

    return x_train, y_train, words, classes
```
The `create_training_data(data)` function takes a dataset containing tags and their associated patterns. It preprocesses the patterns into a sorted list of unique words, creates a bag-of-words vector for each pattern, and builds a one-hot encoded vector for each tag. It returns the training data (inputs and labels), the list of unique words, and the list of classes (i.e., tags).
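The code above expects the dataset to be a JSON object with a top-level `"intents"` list whose entries each carry a `"tag"`, a list of `"patterns"`, and a list of `"responses"`. Here is a minimal illustrative `intents.json`; the specific tags, patterns, and responses are made up for this example:

```json
{
  "intents": [
    {
      "tag": "greeting",
      "patterns": ["Hi", "Hello", "How are you?"],
      "responses": ["Hello!", "Hi there, how can I help?"]
    },
    {
      "tag": "goodbye",
      "patterns": ["Bye", "See you later"],
      "responses": ["Goodbye!", "Talk to you soon."]
    }
  ]
}
```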
## Writing the Code for our Chatbot
We can now start writing the code for our chatbot. Here is the complete `chatbot.py` code:
```python
import random
import json

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import SGD

from preprocessing import bag_of_words, create_training_data

with open("intents.json", "r") as f:
    data = json.load(f)

x_train, y_train, words, classes = create_training_data(data)

# Create a neural network
model = Sequential()
model.add(Dense(128, input_shape=(len(words),), activation="relu"))
model.add(Dropout(0.5))
model.add(Dense(64, activation="relu"))
model.add(Dropout(0.5))
model.add(Dense(len(classes), activation="softmax"))

# Compile the model (newer Keras versions use learning_rate rather than lr)
sgd = SGD(learning_rate=0.01, momentum=0.9, nesterov=True)
model.compile(loss="categorical_crossentropy", optimizer=sgd, metrics=["accuracy"])

# Train the model
model.fit(np.array(x_train), np.array(y_train), epochs=200, batch_size=5, verbose=1)

# Save the model
model.save("chatbot_model.h5")
print("Model saved to chatbot_model.h5")


def predict(sentence):
    """
    Predict the tag associated with a sentence.
    """
    bag = bag_of_words(sentence, words)
    res = model.predict(np.array([bag]))[0]
    idx = np.argmax(res)
    tag = classes[idx]
    if res[idx] < 0.7:
        return "I'm sorry, I don't understand."
    else:
        for intent in data["intents"]:
            if intent["tag"] == tag:
                return random.choice(intent["responses"])
```
The first step is to import the required libraries and load the training data using the `create_training_data(data)` function from the previous section. We then create and compile a neural network using the Keras API from TensorFlow. The network consists of an input layer of size `len(words)`, two hidden layers with 128 and 64 neurons (each followed by a dropout layer with a rate of 0.5 to prevent overfitting), and an output layer of size `len(classes)`.
We then train the model using the `fit()` method, passing in our training data with `epochs=200` and `batch_size=5`, and save the trained model to a file called `chatbot_model.h5`.
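Saving the model means it does not need to be retrained every time; it can be reloaded later with Keras's `load_model` function, as in this short sketch:

```python
from tensorflow.keras.models import load_model

# Reload the trained model from disk instead of retraining
model = load_model("chatbot_model.h5")
```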
Finally, we define a `predict(sentence)` function that takes a sentence and predicts its tag using our trained model. If the confidence of the prediction is below 0.7, the function returns an "I'm sorry, I don't understand." message. Otherwise, it returns a random response from the list of responses associated with the predicted tag.
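To actually chat with the bot, you can wrap `predict()` in a simple read-eval loop. Here is a minimal sketch, assuming it is appended to `chatbot.py` after the code above (the `quit` command is just a convention chosen for this example):

```python
if __name__ == "__main__":
    print("Chatbot is ready! Type 'quit' to exit.")
    while True:
        message = input("You: ")
        if message.lower() == "quit":
            break
        print("Bot:", predict(message))
```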
## Conclusion
In this tutorial, we built a simple chatbot using Python, NLTK, and TensorFlow. We used a neural network to classify the input message and selected an appropriate response based on the predicted intent. The chatbot can be further improved by adding more training data and fine-tuning the hyperparameters of the neural network. Chatbots have the potential to transform customer service and support in many industries, and Python is an excellent tool for building them.