Generative-Based Human Conversation Chatbot

Github Link

Delhi Technological University  Jan 2017 - Jun 2017

A chatbot (short for "chat robot") is a computer program that simulates human conversation, or chat, through artificial intelligence. Typically, a chatbot communicates with a real person, but applications are being developed in which two chatbots communicate with each other. Chatbots are used in applications such as e-commerce customer service, call centers, and Internet gaming. Retrieval-based bots use a repository of predefined responses and some kind of heuristic to pick an appropriate response based on the input and context. Generative models don't rely on predefined responses; they generate new responses from scratch. In this project, we implemented a generative chatbot trained on a dataset of subtitles from The Big Bang Theory.

Data Collection

We used data from the transcripts of the famous TV series The Big Bang Theory. The files were:

  • text.txt: the training data, containing the pairs in number-token format
  • dict.json: the dictionary used to translate number tokens back to English word tokens at test time
  • actors.json: used for signal indication at test time
  • summary.json: contains the length information used to select the right bucket options for training
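
As a rough illustration of how these files could fit together, the sketch below decodes number tokens back to words and picks a bucket by length. The dictionary contents, the bucket sizes, and the helper names (decode, pick_bucket, pad) are all assumptions made for this example; the real layouts of dict.json and summary.json are not described here.

```python
import json

# Hypothetical stand-in for dict.json: maps number tokens (as strings,
# because JSON keys are strings) to English word tokens.
DICT_JSON = '{"0": "<pad>", "1": "<eos>", "2": "hello", "3": "sheldon", "4": "bazinga"}'

# Hypothetical bucket sizes as (max input length, max reply length).
BUCKETS = [(5, 10), (10, 15), (20, 25)]


def decode(token_ids, id_to_word):
    """Translate a list of number tokens back into an English string."""
    return " ".join(id_to_word.get(str(t), "<unk>") for t in token_ids)


def pick_bucket(input_len, reply_len, buckets=BUCKETS):
    """Return the index of the smallest bucket that fits the pair, or None."""
    for i, (max_in, max_out) in enumerate(buckets):
        if input_len <= max_in and reply_len <= max_out:
            return i
    return None


def pad(token_ids, length, pad_id=0):
    """Right-pad a token list with pad_id up to the bucket length."""
    return token_ids + [pad_id] * (length - len(token_ids))


id_to_word = json.loads(DICT_JSON)
line, reply = [2, 3], [4, 1]                 # one (input, reply) pair
bucket = pick_bucket(len(line), len(reply))  # smallest bucket that fits
padded_line = pad(line, BUCKETS[bucket][0])
```

Bucketing of this kind groups pairs of similar length so that each training batch needs only a small amount of padding.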

Experiments

We used Word2Vec for the word embeddings and an encoder-decoder sequence-to-sequence model to train on the data. Both the encoder and the decoder were LSTMs.
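
To make the encoder-decoder idea concrete, here is a minimal, untrained forward-pass sketch in plain Python. It is not the project's code: the sizes, the special token ids, the random weights standing in for Word2Vec vectors, and the greedy decoding loop are all assumptions chosen to keep the example small and self-contained.

```python
import math
import random

random.seed(0)  # fixed seed so the untrained sketch is reproducible

VOCAB, EMB, HID = 6, 4, 5  # hypothetical sizes, far smaller than a real model
GO, EOS = 0, 1             # hypothetical special token ids


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """One LSTM step; gates are computed from the concatenated [input, hidden]."""

    def __init__(self, in_size, hid_size):
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.w = [rand_matrix(hid_size, in_size + hid_size) for _ in range(4)]
        self.b = [[0.0] * hid_size for _ in range(4)]

    def step(self, x, h, c):
        z = x + h  # list concatenation of input and previous hidden state
        pre = [[a + b for a, b in zip(matvec(w, z), bias)]
               for w, bias in zip(self.w, self.b)]
        i = [sigmoid(v) for v in pre[0]]
        f = [sigmoid(v) for v in pre[1]]
        o = [sigmoid(v) for v in pre[2]]
        g = [math.tanh(v) for v in pre[3]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new


embedding = rand_matrix(VOCAB, EMB)  # stand-in for pretrained Word2Vec vectors
encoder = LSTMCell(EMB, HID)
decoder = LSTMCell(EMB, HID)
out_proj = rand_matrix(VOCAB, HID)   # maps decoder state to vocabulary scores


def encode(token_ids):
    """Run the encoder over the input and return its final (h, c) state."""
    h, c = [0.0] * HID, [0.0] * HID
    for t in token_ids:
        h, c = encoder.step(embedding[t], h, c)
    return h, c


def greedy_decode(h, c, max_len=8):
    """Generate a reply token by token, starting from the encoder state."""
    token, reply = GO, []
    for _ in range(max_len):
        h, c = decoder.step(embedding[token], h, c)
        scores = matvec(out_proj, h)
        token = max(range(VOCAB), key=scores.__getitem__)
        if token == EOS:
            break
        reply.append(token)
    return reply


state = encode([2, 3, 4])
reply = greedy_decode(*state)
```

The key point is that the decoder starts from the encoder's final state, so the generated reply is conditioned on the whole input sentence. With random weights the output tokens are meaningless; training would adjust the weights so that the decoder's scores favor plausible replies.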

Result and Conclusion

The resulting chatbot was able to mimic everyday human conversations.