webdialog / Get Started
Getting the code
Either run:
git clone https://bitbucket.org/matthen/webdialog.git
or download the code in a zip file.
SSL
First you need to create a private key and a self-signed certificate for SSL. (Using SSL means the browser doesn't need to keep asking for permission to use the microphone.)
mkdir ssl
openssl genrsa -des3 -out ssl/server.key 1024
openssl req -new -key ssl/server.key -out ssl/server.csr
openssl x509 -req -days 365 -in ssl/server.csr -signkey ssl/server.key -out ssl/server.crt
Requirements
Make sure you have web.py installed:
sudo pip install web.py
You should now be able to run:
python server.py
After you enter the passphrase for your private key, the server will wait to serve on port 8080. Visit https://localhost:8080/dialog to see the simple demo webdialog ships with. Dialog logs will start being saved in the logs directory.
Extending the demo
In writing the code, I have tried not to force too much upon developers who decide to use it. Here is a small step-by-step tutorial which should make it clear how you can go on to extend the simple demo to a real system. The aim is to write a system which tracks the word that appears most often in the ASR hypotheses, and outputs that word to the browser.
First make a copy of DialogState.py
cp DialogState.py DialogStateTutorial.py
At the top of DialogStateTutorial.py, add:
from collections import defaultdict
import operator
And under def __init__( in the dialogState class, add:
# word counts
self.word_counts = defaultdict(int)
This creates a defaultdict to store the word counts in the dialogState. Note that one of these objects gets initialised for each dialog. The update(self, asr_result) function is what gets passed new ASR results at each turn. The asr_result argument it is passed is an object like this:
{
    "confidence": 0.8165613412857056,
    "hyps": [
        "can you lend me a 12000 dollars",
        "can you let me a 12000 dollars",
        "can you read me a 12000 dollars",
        "can you lend me a twelve thousand dollars",
        "can you let the twelve thousand dollars",
        "can you lead me a 12000 dollars",
        "can you let me 12000 dollars",
        "can you lend me 12000 dollars",
        "can you buy me a 12000 dollars"
    ]
}
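To make the shape of this object concrete, here is a minimal sketch of walking through it in plain Python. The dictionary is a shortened, hand-written copy of the example above, and words_per_hypothesis is a hypothetical helper used only for illustration; it is not part of webdialog.

```python
# Hand-written ASR result with the same shape as the example above
# (shortened to three hypotheses for brevity).
asr_result = {
    "confidence": 0.8165613412857056,
    "hyps": [
        "can you lend me a 12000 dollars",
        "can you let me a 12000 dollars",
        "can you read me a 12000 dollars",
    ],
}


def words_per_hypothesis(result):
    """Return one list of whitespace-separated words per hypothesis."""
    return [hyp.split() for hyp in result["hyps"]]


tokens = words_per_hypothesis(asr_result)
print(len(tokens))  # number of hypotheses in this result
print(tokens[0])    # words of the first hypothesis in the list
```

Splitting on whitespace is the same tokenisation the tutorial's update function relies on, so numbers such as 12000 count as single words.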
Here is an update function which will track the top recognised word:
def update(self, asr_result):
    self.asr_results.append(asr_result)
    # count every word in every hypothesis of this turn
    for hyp in asr_result["hyps"]:
        for word in hyp.split():
            self.word_counts[word] += 1
    # pick the word with the highest count so far
    # (iteritems() is Python 2 only; on Python 3 use items())
    most_frequent_word = max(self.word_counts.iteritems(), key=operator.itemgetter(1))[0]
    response = {
        "tts": "You have said '" + most_frequent_word + "' the most.",
        "ended": False,
        "most_frequent_word": most_frequent_word
    }
    self.responses.append(response)
    return response
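To try the counting logic outside the server, the sketch below wraps the same update method in a small stand-in class. WordCountState and its __init__ are assumptions made for illustration (the real dialogState class in DialogState.py may set up more than this), and items() is used in place of iteritems() so the example runs on Python 3.

```python
from collections import defaultdict
import operator


class WordCountState(object):
    """Hypothetical stand-in for the tutorial's dialogState class."""

    def __init__(self):
        self.asr_results = []
        self.responses = []
        self.word_counts = defaultdict(int)

    def update(self, asr_result):
        self.asr_results.append(asr_result)
        # Count every word in every hypothesis of this turn.
        for hyp in asr_result["hyps"]:
            for word in hyp.split():
                self.word_counts[word] += 1
        # Counts accumulate across turns; max() returns the first
        # maximal item it encounters when there is a tie.
        most_frequent_word = max(self.word_counts.items(),
                                 key=operator.itemgetter(1))[0]
        response = {
            "tts": "You have said '" + most_frequent_word + "' the most.",
            "ended": False,
            "most_frequent_word": most_frequent_word,
        }
        self.responses.append(response)
        return response


state = WordCountState()
first = state.update({"confidence": 0.9, "hyps": ["hello there", "hello where"]})
second = state.update({"confidence": 0.7, "hyps": ["there there there"]})
print(first["most_frequent_word"])
print(second["tts"])
```

Because the counts live on the object, the answer can change from turn to turn: the second call overtakes the word that led after the first.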
Note the response now includes a new property, most_frequent_word, which we will access in the browser. In order to use this new class, create a file called config.cfg in the top directory containing:
[webdialog]
; your python class for updating dialog state
dialog_state_class = DialogStateTutorial.dialogState
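The value of dialog_state_class has the form module.ClassName. As a rough sketch of how such a setting could be turned into a class object, the example below reads an INI string with Python 3's configparser and imports the named class with importlib. This is an assumption about the general technique, not a copy of what server.py does; load_class is a hypothetical helper, and collections.OrderedDict stands in for DialogStateTutorial.dialogState so the snippet runs on its own.

```python
import importlib
from configparser import ConfigParser


def load_class(dotted_path):
    """Import 'package.module.ClassName' and return the class object."""
    module_name, class_name = dotted_path.rsplit(".", 1)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Same layout as config.cfg, with a standard-library class substituted
# so that no webdialog files are needed to run this sketch.
example_cfg = """
[webdialog]
; your python class for updating dialog state
dialog_state_class = collections.OrderedDict
"""

parser = ConfigParser()
parser.read_string(example_cfg)
dotted = parser.get("webdialog", "dialog_state_class")
cls = load_class(dotted)
print(dotted)
print(cls.__name__)
```

With a real config.cfg you would call parser.read("config.cfg") instead of read_string, and the module named in the setting would need to be importable from the directory the server runs in.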
Now run the server (python server.py) and visit https://localhost:8080/dialog to try the new system.
Next, I want to demonstrate how to access the other response data that is sent from the DialogState object. In real applications this could be a new list of coordinates for a map, a list of search results, a URL to an image, etc.
In templates/display.html add the following:
<p>Most frequent word: <span id="most_frequent_word"></span></p>
This template gets included into a div with id display on the main page, under the control box which by default shows the live top ASR hypothesis and the system response.
In static/js/views.js we will add some JavaScript which listens for events on the window.dialog object and updates the web page. Add the following before the final }); in views.js:
// Most Frequent Word
if ($('#most_frequent_word').length != 0) {
    window.dialog.on("response", function(event, response) {
        var $most_frequent_word = $('#most_frequent_word');
        if (response.hasOwnProperty("most_frequent_word")) {
            $most_frequent_word.text(response.most_frequent_word);
        }
    });
    $('#error_text').hide();
}
This code (written using jQuery) checks that the most_frequent_word span exists in the document, and then attaches a listener to the "response" event. The listener updates the text of the span with the most_frequent_word attribute of the response. Note this corresponds exactly with the JSON returned by the DialogState object's update function.
Lastly, add the following CSS to static/css/display.css:
#display span#most_frequent_word {
    font-weight: bold;
}
Now rerun the server and check it all works.
This has shown how webdialog implements a dialog system as a python object, and how to get started visualising the responses in the browser.
Note that this tutorial required editing only the files display.html, display.css, views.js and config.cfg. These are not tracked in the git repository, so by editing these files alone you should be able to do git pull without getting any conflicts.