Intelligent Conversational Bot 2017

TVBot C.-S. Chen, K.-L. Lo, F.-Y. Sun, H.-T. Yeh & Y.-T. Yeh https://facebook.com/theTVBot


Overview


We are able to:

Ontology and Database

Source

Trakt.tv: provides a unified interface to popular movie and TV show sources (IMDb, etc.).

Intent List

We prefer fine-grained intents over vague intents. Why? They reduce the chance of wrong slot filling.

e.g. Who House of Cards? vs. How many are there in Game of Thrones?

Value Mapping

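The slot value extracted from the user's natural-language input might not match the actual slot value stored in the database, so it has to be mapped to one.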
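One way to do such a mapping is fuzzy string matching against the known database values. The sketch below is only an illustration under that assumption, not necessarily how TVBot does it; `DB_GENRES`, `map_slot_value` and the cutoff are hypothetical.

```python
import difflib

# Hypothetical canonical values for one slot, as they might appear in the database.
DB_GENRES = ["comedy", "drama", "science fiction", "documentary"]

def map_slot_value(raw_value, candidates, cutoff=0.6):
    """Return the closest canonical value for raw_value, or None if nothing is close."""
    # Compare case-insensitively, but return the value exactly as stored.
    lowered = {c.lower(): c for c in candidates}
    matches = difflib.get_close_matches(
        raw_value.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else None

print(map_slot_value("Comedies", DB_GENRES))
print(map_slot_value("xyz", DB_GENRES))
```

Returning `None` when nothing is close enough lets the dialogue manager confirm the slot with the user instead of guessing.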

Language Understanding

RNN-NLU

RNN-NLU classifies the intent and fills the slots at the same time; the two tasks are jointly optimized.

image

Tends to over-fill. e.g. I would like to watch a show that airs Friday.

request_show_title(show.air_day: ’Friday’, show.country: ’to’)

api.ai

Provided by an external service. Model unknown.

Tends to under-fill. e.g. I would like to find a show produced by .

request_show_title()

Comparison

      RNN-NLU                              api.ai
Good  New model, powerful if fine-tuned.   Online training.
Bad   Hard to train (days every time).     Black box, can’t tune or apply tricks.

Data

Mostly generated by hand. We wrote almost 1000 templates and filled in values from the database to produce the training sentences.
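A minimal sketch of this kind of template expansion follows. The templates, the `[slot]` placeholder syntax and the database values are all hypothetical stand-ins; the actual templates are not shown in the slides.

```python
import re
from itertools import product

# Hypothetical templates: (intent, sentence with [slot] placeholders).
TEMPLATES = [
    ("request_show_title", "I would like to watch a show that airs [show.air_day]."),
    ("request_show_title", "Find me a [show.genre] show from [show.country]."),
]

# Hypothetical database values per slot.
DB_VALUES = {
    "show.air_day": ["Friday", "Sunday"],
    "show.genre": ["comedy"],
    "show.country": ["us", "uk"],
}

SLOT = re.compile(r"\[([\w.]+)\]")

def expand(intent, template, db_values):
    """Yield one labelled training example per combination of slot values."""
    slots = SLOT.findall(template)
    for combo in product(*(db_values[s] for s in slots)):
        labels = dict(zip(slots, combo))
        sentence = template
        for slot, value in labels.items():
            sentence = sentence.replace(f"[{slot}]", value)
        yield {"intent": intent, "text": sentence, "slots": labels}

def generate(templates, db_values):
    return [ex for intent, t in templates for ex in expand(intent, t, db_values)]

examples = generate(TEMPLATES, DB_VALUES)
print(len(examples), examples[0]["text"])
```

Because every sentence is produced from a template, the intent and slot labels come for free, which is what makes this approach practical for training a joint intent/slot model.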

Dialogue Management

State Tracker

For both user and agent, the state tracker tracks the following information in vector form <a_1, a_2, …, a_n>, where a_i ∈ {0, 1}
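As an illustration of such a binary encoding, the sketch below marks which slots the user has informed and which the agent has requested. The slot list and the choice of what each bit means are assumptions; the slides do not list the tracked information.

```python
# Hypothetical slot inventory; the real ontology is larger.
SLOTS = ["show.title", "show.air_day", "show.country", "show.genre"]

def encode_slots(active):
    """One bit per slot: 1 if the slot is in `active`, else 0."""
    return [1 if s in active else 0 for s in SLOTS]

def encode_state(user_informed, agent_requested):
    """Concatenate the user part and the agent part into one binary vector."""
    return encode_slots(user_informed) + encode_slots(agent_requested)

print(encode_state({"show.air_day"}, {"show.title"}))
```

A fixed-length 0/1 vector like this can be fed directly to a neural policy such as a DQN.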

RL Agent (DQN, Mnih et al. 2013)

image
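For readers unfamiliar with DQN, the two core pieces are epsilon-greedy action selection and the one-step temporal-difference target that the Q-network is regressed toward. The sketch below shows only these two pieces in plain Python; the network, replay memory and the value of `gamma` are left out or assumed, and none of this is taken from TVBot's code.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)

def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r if the dialogue ended, else r + gamma * max Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)

# Q-values for three hypothetical dialogue actions in the current state.
action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
target = td_target(1.0, [0.5, 2.0], done=False, gamma=0.9)
print(action, target)
```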

Rule-based Agent

  1. If the user confirms a slot, make sure it is the slot we confirmed previously. If not, confirm again.
  2. If a slot value the user informs does not match our state, confirm that slot.
  3. If the user informs but the NLU returns no slots, confirm what the user actually wants.
  4. If the show candidates are few enough, inform the user or offer a multiple choice directly.
  5. If there are too many candidates, heuristics (cast and crew first) and database information (how many possible values a slot has) help decide which slot to request.
  6. If all slots have been requested or there are no candidates at all, confirm a slot chosen from the previously confirmed/requested slots stored in the state.

Rules 1 to 3 handle NLU errors. If none of them is triggered, the NLU output is most likely correct.
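The six rules can be read as a priority-ordered policy. The sketch below is one possible reading, not the authors' implementation: the action names, the state layout, `MAX_TO_SHOW` and `SLOT_PRIORITY` are all hypothetical, and the database-statistics part of rule 5 is omitted.

```python
MAX_TO_SHOW = 3  # hypothetical threshold for "few enough" candidates
SLOT_PRIORITY = ["show.cast", "show.crew", "show.genre", "show.air_day"]  # cast, crew first

def choose_action(user_act, state, candidates):
    """Return (action, argument) following rules 1 to 6 in order."""
    kind = user_act["type"]
    slots = user_act.get("slots", {})
    last = state["last_confirmed"]

    # Rule 1: a confirmation must refer to the slot the agent last confirmed.
    if kind == "confirm" and last is not None and last not in slots:
        return ("confirm", last)
    if kind == "inform":
        # Rule 2: an informed value that contradicts the state gets confirmed.
        for slot, value in slots.items():
            known = state["slots"].get(slot)
            if known is not None and known != value:
                return ("confirm", slot)
        # Rule 3: an inform act without slots means the NLU missed something.
        if not slots:
            return ("confirm_intent", None)
    # Rule 4: few candidates, so answer directly.
    if 1 <= len(candidates) <= MAX_TO_SHOW:
        if len(candidates) == 1:
            return ("inform", candidates[0])
        return ("multiple_choice", candidates)
    # Rule 5: too many candidates, so request the next slot by priority.
    if candidates:
        for slot in SLOT_PRIORITY:
            if slot not in state["requested"]:
                return ("request", slot)
    # Rule 6: nothing left to request, or no candidates; fall back to a confirm.
    history = state["requested"] or [last]
    return ("confirm", history[-1])

state = {"last_confirmed": "show.genre", "slots": {"show.genre": "comedy"}, "requested": []}
print(choose_action({"type": "inform", "slots": {"show.genre": "drama"}}, state, ["A", "B"]))
```

Keeping the NLU-error rules (1 to 3) ahead of the database-driven rules (4 to 6) means a suspicious NLU result is always resolved before the agent acts on it.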

User Simulator

           Rule-based  DQN
w/ error   0.66        0.53
w/o error  1.0         0.93

Reward function

Our reward function definition is

Performance

image

image

image

Natural Language Generation

We separated the sentences into 7 categories:

          BLEU
Training  1.0
Testing   0.4425

Rule-based NLG

Fill the slot values into predefined templates.
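A minimal sketch of template-based generation is shown below. The three category names and their wording are hypothetical; the slides do not list the seven categories or the templates.

```python
# Hypothetical templates, one per sentence category.
TEMPLATES = {
    "inform": "{title} airs on {air_day}.",
    "request": "Which {slot} are you looking for?",
    "confirm": "Did you mean {value} for {slot}?",
}

def generate(category, **slots):
    """Fill the slot values into the template of the given category."""
    return TEMPLATES[category].format(**slots)

print(generate("inform", title="House of Cards", air_day="Friday"))
print(generate("confirm", value="Friday", slot="air day"))
```

A missing slot value raises `KeyError` here, which surfaces incomplete dialogue acts early rather than producing a sentence with a hole in it.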

NN-NLG

The model is a seq2seq model, trained on several templates for each category.

Miscellaneous

Speech API

Integrated with the Bing Speech-to-Text API. When an audio message is received, we call the API to transcribe it.

Translation API

The database is built in English. To support other languages, the input is sent to the Bing Translation API when a non-English language is detected.
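The routing logic can be sketched independently of any real service. In the sketch below, `detect_language` and `translate` are hypothetical callables standing in for whatever client wraps the translation service; no actual Bing API call or signature is shown or implied.

```python
def to_english(text, detect_language, translate):
    """Return (english_text, detected_language); translate only when needed."""
    lang = detect_language(text)
    if lang == "en":
        return text, lang
    return translate(text, source=lang, target="en"), lang

# Stub implementations for illustration only.
def fake_detect(text):
    return "fr" if text.startswith("Je") else "en"

def fake_translate(text, source, target):
    return f"[{source}->{target}] {text}"

print(to_english("I want a comedy show.", fake_detect, fake_translate))
print(to_english("Je veux une série.", fake_detect, fake_translate))
```

Passing the two functions in as arguments keeps the rest of the pipeline English-only and makes the routing easy to test without network access.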