(黃意堯 許義宏 王浩恩 盧慶原 侯信丞)


We build a bot for music.

What's cool

Play Music: Search and play music from Spotify
Recommendation System: Recommend music based on the user’s preferences
Singer Information: Look up the singer of a song or album
Lyrics Inquiry: Look up the lyrics of a song
Emotion Detection: Detect the user’s facial expression through the Azure API
Slot Value Correction: Automatically correct slot values that were mistyped or misrecognized from speech. Ex: “敬騰” → “蕭敬騰”
LU Error Handling: Recover from language understanding (LU) errors when a slot is mis-filled. Ex: song = 張宇 → confirm again → singer = 張宇


Music Data
• Collected from Spotify
• 678 Singers, 2674 Albums, 95079 Tracks
Connect system to Spotify API
• Support playlist recommendation
Search lyrics with the PyLyrics library
• Open-source Python library for looking up lyrics


• Bidirectional seq2seq model trained jointly on intent detection and slot filling.
Intent and Slot
• 19 intents and 4 slots
Template Data and Real Data
• We designed over 70 templates and generated more than 20k training examples from them.
• We collected 10k training examples from real users.
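Template-based data generation of the kind described above can be sketched as follows. This is a minimal illustration, not the project's actual generator: the template strings, intent names, and slot values are hypothetical placeholders (the real system used over 70 templates, 19 intents, and 4 slots).

```python
import itertools

# Hypothetical templates, each paired with a hypothetical intent label.
TEMPLATES = [
    ("play_song", "play {song} by {singer}"),
    ("query_lyrics", "show me the lyrics of {song}"),
    ("query_singer", "who sings {song}"),
]

# Hypothetical slot values; the real values would come from the music data.
SLOT_VALUES = {
    "singer": ["蕭敬騰", "張宇"],
    "song": ["Song A", "Song B"],
}


def generate_examples(templates, slot_values):
    """Fill every template with every combination of the slots it uses."""
    examples = []
    for intent, template in templates:
        # Slots that actually appear in this template, in a stable order.
        slots = [name for name in slot_values if "{" + name + "}" in template]
        for combo in itertools.product(*(slot_values[s] for s in slots)):
            filled = dict(zip(slots, combo))
            examples.append({
                "text": template.format(**filled),
                "intent": intent,
                "slots": filled,
            })
    return examples


examples = generate_examples(TEMPLATES, SLOT_VALUES)
for example in examples[:3]:
    print(example)
```

Because each example carries its intent and slot labels, the generated sentences can be used directly as supervised training data for a joint intent and slot model.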
LU Performance


Model: Seq2seq model with attention
Training Size: 312400 examples
Testing Size: 30000 examples
BLEU Score: 0.362


1. Slot Value Correction: “敬騰” → “蕭敬騰”
2. LU Error Handling:
song = 張宇 → confirm again → singer = 張宇
1. Input:
• Semantic frames of the last 4 turns
• Bot actions and slots of the last 4 turns
2. Prediction: Classification on goal/slots
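One way to turn the last 4 turns into a fixed-length classifier input is sketched below. The intent, slot, and action vocabularies are hypothetical subsets, and the one-hot / multi-hot layout is an assumption; the source does not describe the actual feature encoding.

```python
HISTORY = 4  # number of past turns used as input (from the description above)

# Hypothetical vocabularies for illustration only.
INTENTS = ["play_song", "query_lyrics", "query_singer"]
SLOTS = ["singer", "song", "album", "playlist"]
ACTIONS = ["request_song", "confirm_singer", "play"]

TURN_SIZE = len(INTENTS) + len(SLOTS) + len(ACTIONS) + len(SLOTS)


def one_hot(item, vocabulary):
    return [1 if item == v else 0 for v in vocabulary]


def multi_hot(items, vocabulary):
    return [1 if v in items else 0 for v in vocabulary]


def encode_turn(turn):
    """Encode one turn: user semantic frame plus the bot's action and slots."""
    return (one_hot(turn["intent"], INTENTS)
            + multi_hot(turn["slots"], SLOTS)
            + one_hot(turn["action"], ACTIONS)
            + multi_hot(turn["action_slots"], SLOTS))


def encode_history(turns):
    """Concatenate the last HISTORY turns, zero-padding short dialogues."""
    recent = turns[-HISTORY:]
    features = [0] * (TURN_SIZE * (HISTORY - len(recent)))
    for turn in recent:
        features += encode_turn(turn)
    return features


turn = {
    "intent": "play_song",
    "slots": {"singer": "張宇"},
    "action": "confirm_singer",
    "action_slots": ["singer"],
}
features = encode_history([turn])
print(len(features))
```

The resulting vector has the same length for every dialogue, so it can be passed to a standard classifier such as an SVM or XGBoost.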
Model Comparison:
1. PCA + SVM: SVM with dimensionality reduction
2. XGBoost: faster and more precise


• Supports 21 different actions
1. Handles “Don’t care” term on
2. Average turns: 5.93 (tested on the user simulator)
3. Average reward: -12.12 (tested on the user simulator)
DQN RL Agent
1. Uses one-hot embedding and a three-layer fully connected network
2. Success rate: 5.93
3. Average reward: -12.12
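The forward pass of such an agent can be sketched in plain Python as below. Everything except the three fully connected layers and the one-hot input is an assumption: the state and hidden sizes are hypothetical, ReLU activations and epsilon-greedy selection are common DQN choices rather than confirmed details, and the 21 outputs assume the agent chooses among the 21 actions listed earlier. Training (replay buffer, target network, loss) is omitted.

```python
import random

NUM_ACTIONS = 21   # assumes the 21 actions mentioned earlier
STATE_SIZE = 32    # hypothetical size of the one-hot state encoding
HIDDEN_SIZE = 16   # hypothetical hidden layer width


def init_layer(n_in, n_out, rng):
    """Create a fully connected layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, relu):
    """Apply one fully connected layer, optionally followed by ReLU."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [max(0.0, v) for v in out] if relu else out


def q_values(state, layers):
    """Three-layer network: two hidden ReLU layers, one linear Q-value output."""
    hidden = dense(state, layers[0], relu=True)
    hidden = dense(hidden, layers[1], relu=True)
    return dense(hidden, layers[2], relu=False)


def select_action(state, layers, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, else take the best action."""
    if rng.random() < epsilon:
        return rng.randrange(NUM_ACTIONS)
    q = q_values(state, layers)
    return max(range(NUM_ACTIONS), key=q.__getitem__)


rng = random.Random(0)
layers = [
    init_layer(STATE_SIZE, HIDDEN_SIZE, rng),
    init_layer(HIDDEN_SIZE, HIDDEN_SIZE, rng),
    init_layer(HIDDEN_SIZE, NUM_ACTIONS, rng),
]
state = [0] * STATE_SIZE
state[3] = 1  # one-hot state with a single active feature
action = select_action(state, layers, epsilon=0.1, rng=rng)
print(action)
```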
User Simulator:
• Reward:
1. -1 on each turn
2. +30 if goal is correctly completed
3. -20 if goal is incorrectly completed
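The reward scheme above can be written as a small function. This sketch assumes that the per-turn penalty also applies on the final turn and that every finished dialogue ends either correctly or incorrectly; the function and constant names are illustrative.

```python
TURN_PENALTY = -1      # applied on every turn
SUCCESS_REWARD = 30    # goal correctly completed
FAILURE_PENALTY = -20  # goal incorrectly completed


def turn_reward(done, success):
    """Reward for a single turn; the terminal bonus or penalty is added when done."""
    reward = TURN_PENALTY
    if done:
        reward += SUCCESS_REWARD if success else FAILURE_PENALTY
    return reward


def episode_reward(num_turns, success):
    """Total reward of a dialogue that ends after num_turns turns."""
    return sum(
        turn_reward(done=(t == num_turns - 1), success=success)
        for t in range(num_turns)
    )


print(episode_reward(6, success=True))
print(episode_reward(6, success=False))
```

Under these assumptions a successful 6-turn dialogue scores 24 and a failed one scores -26, so the agent is pushed toward finishing correctly in as few turns as possible.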