Use g2p to create an extended dictionary to cover each project's use-case
On a per-project basis, while the JSGF grammar is being generated, use a g2p library to generate dictionary entries for words that do not exist in the default dictionary provided by CMU. We should then merge the CMU dictionary and the generated entries into one dictionary for use in pocketsphinx (SpeakPythonRecognizer).
This will allow speech utterances of any word desired by the user. The main risk is errors introduced by the g2p model, depending on how well it is trained: pronunciations of some words may be slightly off.
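The merge step could look roughly like the sketch below. It is not SpeakPython's code: the function names are hypothetical, and it assumes both inputs use the plain `word PH1 PH2 ...` line format of the CMU dictionary, with `;;;` comment lines. CMU entries win when a word appears in both sources, since they are hand-checked.

```python
def parse_dict(lines):
    """Parse 'word PH1 PH2 ...' lines into a {word: 'PH1 PH2 ...'} mapping."""
    entries = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and ';;;' comment lines (assumed CMU-style comments).
        if not line or line.startswith(";;;"):
            continue
        parts = line.split(None, 1)
        if len(parts) < 2:
            continue  # a word with no phonemes is not a usable entry
        word, phones = parts
        entries[word] = " ".join(phones.split())
    return entries


def merge_dicts(base, generated):
    """Combine two mappings; entries from `base` take precedence."""
    merged = dict(generated)
    merged.update(base)
    return merged


def format_dict(entries):
    """Render the mapping back into sorted dictionary lines."""
    return [f"{word} {phones}" for word, phones in sorted(entries.items())]
```

Writing the lines from `format_dict` to a file, one per line, would give a single dictionary file to pass to the recognizer.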
Comments (5)
-
-
reporter Thank you for the suggestion. That will be good for end users, but for the project's use case (hiding the generation from users of SpeakPython), I may need to go with another dependency installation, either in Python or on Linux. I'll have to look into some options. I'd hate to make installation even more complex :(
-
That makes sense.
"I'd hate to make installation even more complex :("
And thanks for thinking that way :-) +1
-
reporter np! I'd hate having to do all that just to try it out. Maybe I'll create a shell script that does the whole installation in a single run. It would have to incorporate a lot to work across various Linux distros, but I think it would really help people get started using SpeakPython.
-
reporter - changed status to resolved
Utilized Sequitur-g2p to turn unrecognized words in the generated JSGF into phoneme mappings and add them to the dictionary.
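For illustration only, here is one way the "unrecognized words" could be collected from a generated JSGF grammar before they are handed to a g2p tool. This is a simplified sketch, not the code used in SpeakPython: the function names are hypothetical, the regex-based stripping is not a full JSGF parser, and the g2p call itself is left out.

```python
import re


def jsgf_words(jsgf_text):
    """Return the set of lowercase word tokens used in a JSGF grammar (rough heuristic)."""
    body = re.sub(r"/\*.*?\*/", " ", jsgf_text, flags=re.DOTALL)  # block comments
    body = re.sub(r"//.*", " ", body)                             # line comments
    # Header and grammar-name declarations, e.g. "#JSGF V1.0;" and "grammar demo;".
    body = re.sub(r"^\s*(#JSGF[^;]*;|grammar\s+[^;]*;)", " ", body, flags=re.MULTILINE)
    body = re.sub(r"\bpublic\s*(?=<)", " ", body)  # 'public' keyword before a rule name
    body = re.sub(r"<[^>]*>", " ", body)           # rule names and rule references
    body = re.sub(r"\{[^}]*\}", " ", body)         # tags
    body = re.sub(r"/[^/]*/", " ", body)           # weights such as /0.5/
    return {token.lower() for token in re.findall(r"[A-Za-z']+", body)}


def missing_words(words, known_words):
    """Words used by the grammar that have no dictionary entry yet."""
    return sorted(set(words) - set(known_words))
```

The sorted list from `missing_words` is what would be passed to the g2p tool, and its output appended to the dictionary.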
Here's the tool that I used to extend my dictionary: http://www.speech.cs.cmu.edu/tools/lextool.html