small tweaks to the utterances #90
Was mistakenly removed.
Target for WSGI deployments.
…with party mode playing with large libraries. Daemonized the updating of audio/video libraries.
…man utterances. Should have fixed error on WhatNewAlbums intent where there was a unicode issue.
So we easily know which method was executed.
IMHO it does more good than harm to have a large number of variations in the utterances. I don't personally want to have to remember exactly how to say something to get it to work.
I like having the options too. But I'm curious if there is any kind of relation the sample utterances size has with running out of room on the custom slots.
It appears to be more the Intents referencing the slots than the utterances. I can make it fail reliably just by adding a slot reference into an Intent.
The forums are (as usual) not very conclusive. There is a hazy limit of 200kb; performance doesn't get hit, but accuracy can suffer as you end up making a "generic slot": https://forums.developer.amazon.com/questions/28018/uploading-more-than-2500-utterances.html
So I started doing some testing to satisfy my curiosity. I removed all the utterances for one intent and replaced them with the following six utterances, aware that they were not enough examples, but I wanted to see what would happen.
Then tested several variations
I haven't finished my poking around but thought I would share the results so far.
Hmm, the bit about adding "enough" into a slot makes it a "generic slot" explains why when I leave a slot empty (edit: well, not empty, since you can't do that, but with 1 or 2 items), I get "none" returned in the slot for everything I say. It'd be awfully nice if Amazon disclosed what the magic number is for that behavior. Would be even nicer if they'd just provide a way to define a slot as "generic" manually. As for the utterance limit, we're not there yet.. but will be soon if we add much more. I'd be cautious with the trimming though.
I think some trimming is ok, perhaps "balancing" would have been a better word as there are intents with 800+ utterance variations and others with only one or two :)
I know it needs some trimming, but to do so you'd need to ensure each example utterance removed still works without the example in there. Just means time/tedium.. you're welcome to give it a go, I just haven't had the time or energy for such a thing lately.
I too had to put loads of work into "germanification" of utterances. I think it's really important to have a user talk naturally to Alexa and not have to remember the exact commands.
I agree about natural language, but I think testing and human input helps achieve this more than the "guessing" that comes from only using the utterance generators. Though Alexa does do some thinking, she isn't directly matching a string she hears to a single utterance and failing anything that isn't a direct match (but a match certainly does help). How much thinking is what I am trying to work out by these little tests. Given these two example utterances:
She successfully matched these spoken phrases:
I think that anything with "how much" and either "time" or "remaining" would match So I tested it and these all passed (well, they matched the intent CurrentPlayItemTimeRemaining)
What was curious was:
So yes, we should be providing genuine examples of utterances, but perhaps not always using utterance generators, or at least giving the results a good prune when we do. In the 819 sample utterances for WhatNewAlbums/Shows/Movies, which are pretty much all the same, all she has to go on to identify which of the three intents she needs to fire is the word for the media type.
Whilst there is a small chance someone with a PVR in Kodi might choose to say:
"Alexa, ask Kodi do we have some new shows recorded tonight"
it does seem unlikely someone would say:
"Alexa, ask Kodi do we have some new albums recorded tonight"
It would be better to have the word "recorded" associated with PVR-related intents, as it seems more likely that someone saying:
"Alexa, ask Kodi do we have anything recorded?"
would mean a TV show on the PVR. (Sidenote: does anyone know if Kodi puts PVR recorded content into the library as Shows and Movies?)
We're not just using an utterance generator and guessing as you're suggesting. The utterance generator is just so we don't have to manually write out every utterance we want. It's not magically figuring what to put in there on its own -- a great deal of thought goes into figuring out how the user might say things. For me, I just didn't want to waste the time trying to optimize it right then and there because that involves saying all of the various utterances to Alexa repeatedly to make sure they both work and are reliable. There are/were other things that are more important to me to work on. If you want to do that and figure out the minimum number of utterances to make all of the phrases work, you're welcome to. But please do make sure all of the utterances that you currently see in there are functional and reliable. Just because you may not say something in there doesn't mean others don't. A lot of them are in there purely because I've heard my wife and kids say them, for instance. Otherwise, I'll get to it when I have the time to.
@ausweider reminds me of a good point here too.. whatever we do for the English utterances, we'll need to do the same work/testing for German (even though support for this isn't in place yet). They can remain out of sync for a little bit, but he would need time to make the same changes and test. I'd worry about leaving them out of sync for too long, particularly after the addition of new Intents that partially replace existing ones (like what will happen when I push up the generic search/play functionality).
A simple example: previously, to play a specific episode you had to say: "play season 3 episode 13 of SuperShow". Now you can say: "play episode 13 season 3 of SuperShow".
Well, I can't argue with that. The PR was not actually trimming the intents; this conversation started from my closing comment about trimming. It got a little sidetracked, as it was intended to just raise the issue about the balance and how there was an intent with 800+ variations about how to ask if there were any new movies, but only 4 for playing a specific movie. I have added more changes in this PR after some more tweaking.
AddonExecute execute the addon {Addon}
AddonExecute execute the plugin {Addon}
AddonExecute execute the script {Addon}
AddonExecute execute the {Addon}
It's been my experience that you can safely omit articles. Is it not working for you?
It is advised to use them if they are valid phrases people might use.
Sample Utterance Contents
Given the flexibility and variation of spoken language in the real world, there will often be many different ways to express the same request. For example, to ask for a horoscope a user might say:
- what is the horoscope
- get me my horoscope
- tell me the horoscope
- how’s my horoscope today
Or any other variations on the above forms:
- “what’s” and “what is”
- “get”, “tell”, and “give”
- “my” and “the”
The lack of them does cause problems.
Current sample utterance
WatchMovie play movie {Movie}
(followed by: (unrelated but might be useful))
The latter issue you mention is a known one. It's the reason I want to add generic search/play functionality. Currently, you cannot say "play ghostbusters." It can work, but it can also just pick some other somewhat similar intent too.
It's a bug only in the sense that we don't outright fail on it at the moment. I haven't bothered trying to catch it, because it will be fixed entirely after I merge in the generic play/search branch I have.
I haven't personally had any issues speaking articles even if they're not present in the sample utterances. If you're having problems, that's a good enough reason to add them. I only commented on it because we were just discussing trimming the utterances, not expanding them ;)
But I don't care one way or the other provided it all works.
"balancing" :)
Just making absolutely sure, since I definitely don't see the problem you're having with articles here..
You didn't happen to add the utterance, "play {Movie}," to the WatchMovie intent on a local branch or something, did you? If you had, I could see that particular issue with ghostbusters that you noted.
If you do have any local changes you're testing on top of what's in the repo here, you might try removing them first.
I'm pressing this because I seriously use the WatchMovie intent on a near-daily basis and I always use articles in my speech when speaking to Alexa, because I find I stumble more if I try to omit them.
In fact, I actually had it on my own TODO list to "balance" the intents in this way, but the opposite of what you did.. I had planned to remove all articles from all utterances :P
Nope it is not in my MOVIES slot, though (as we have discussed before) I am less convinced the contents of the slots matter that much.
Maybe it would have worked if it was in the slot, but it didn't need to be, as it played fine when I said "play movie ghostbusters". Then when I added the articles to the sample utterances (still without ghostbusters in the slot), it played fine when I said "play the movie ghostbusters".
You could test at your end with a movie you know is not in your MOVIES slot, or any movie even if it is not in your library. Just to see what Alexa hears.
Re the learning. I agree, giving feedback on those cards is very important in helping Alexa understand you better. I think that is the main way she learns, especially in UK/DE as we don't have the "voice training" feature that the US does.
> Nope it is not in my MOVIES slot, though (as we have discussed before) I am less convinced the contents of the slots matter that much.
They do matter in that it will have to fuzzy match less often. Whether or not that works out better is another discussion, but it will in fact get an exact match earlier if it's in the slot.
> Maybe it would have worked if it was in the slot, but didn't need to be as it played fine when I said "play movie ghostbusters". Then when I added the articles to the sample utterances (still without ghostbusters in the slot) it then played fine when I said "play the movie ghostbusters"
I'm not trying to argue that you should populate your slots to get it to work (edit: though you do need to get past the threshold to convert it to a "generic slot;" something we should probably document in the README). I was just curious what the difference was between our deployments. Obviously, if adding articles to the sample utterances makes it work for you, then that's the ultimate answer. But for the sake of satisfying my own curiosity, I wouldn't mind pinpointing the exact difference.
> You could test at your end with a movie you know is not in your MOVIES slot, or any movie even if it is not in your library. Just to see what Alexa hears.
I'll do this tonight and report back.
Yup, just tested it:
"Alexa, ask Kodi to play the movie Deepwater Horizon"
Response:
Trying to match: the movie deep water horizon
So, just to document it, it does appear that the lack of the article in the sample utterance when the requested movie isn't in the Slot is the problem.
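To illustrate why a slot value of "the movie deep water horizon" is harder to resolve than the bare title, here is a minimal sketch of a "simple match, then fuzzy match" lookup. It is not the skill's actual matcher, which is not shown in this thread: `find_movie`, the `cutoff` value and the use of the standard library's `difflib` are all assumptions made for illustration.

```python
import difflib


def find_movie(heard, library, cutoff=0.6):
    """Return (title, how), where how is 'simple', 'fuzzy' or 'none'.

    Hypothetical helper: 'heard' is the text Alexa put in the slot,
    'library' is a list of movie titles known to the media center.
    """
    needle = heard.lower().strip()
    by_lower = {title.lower(): title for title in library}

    # Simple match: the heard text is exactly a known title.
    if needle in by_lower:
        return by_lower[needle], "simple"

    # Fuzzy match: fall back to the closest title above the cutoff.
    # Extra words such as "the movie" lower the similarity score.
    close = difflib.get_close_matches(needle, list(by_lower), n=1, cutoff=cutoff)
    if close:
        return by_lower[close[0]], "fuzzy"
    return None, "none"


if __name__ == "__main__":
    library = ["Ghostbusters", "Deepwater Horizon"]
    print(find_movie("ghostbusters", library))
    print(find_movie("the movie ghostbusters", library))
```

With the article and the word "movie" left inside the slot value, the exact lookup can no longer succeed and everything depends on the fuzzy step, which is consistent with the behaviour reported above.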
It's cool, and typically when I am reporting back on things I have tested I don't use my local branch.
> I'm not trying to argue that you should populate your slots to get it to work
Me neither :) and I do acknowledge that if it was in my slot chances are it would have worked fine.....
.. though to also satisfy my own curiosity I have just removed the articles so the samples were back to
WatchMovie play film {Movie}
WatchMovie play movie {Movie}
WatchMovie watch film {Movie}
WatchMovie watch movie {Movie}
and added ghostbusters to my MOVIES slot.
The results are more confusing: it didn't work, and it is still looking for "the movie ghostbusters":
Trying to match: the movie ghostbusters
Simple match failed, trying fuzzy match...
I thought there may have been a cache issue with the Skill but realised if there was then it would have still worked as it was working on the last "build".
So that is very interesting. It must be due to the learning or the UK/US differences. I would rather not reset my Echo and erase the voice history to find out though :)
Yeah, I wouldn't worry too much more about it. Adding the articles is obviously the right thing to do since we both were able to make it fail without them.
speech_assets/SampleUtterances.txt
@@ -1,18 +1,36 @@
AMAZON.StopIntent cancel
AMAZON.StopIntent shut up
AMAZON.StopIntent stop
Why did you remove these? This is necessary to support streaming audio.
I didn't remove the intent, just the samples as they are not required
To implement a standard built-in intent, include the intent in your intent schema and then add handling for the intent in your code. You do not need to provide any sample utterances for these intents, although you can if you want to extend the intent.
But I thought we do extend the intent? It functions as both the 'standard' Amazon stop intent as well as the stop command for Kodi playback.
edit: To be honest, I'm not really sure what Amazon actually means there. Did they mean to say "extend the utterances?" As in, add more utterances that can be used for Stop?
The invocation name will decide where the stop command goes
"Alexa, stop"
vs
"Alexa, tell Kodi to stop" (though that could be confused with playback)
I was just confused about what Amazon means by "extending the intent." I still don't really know, but if it works.. it's fine.
> As in, add more utterances that can be used for Stop?
That is how I interpret it.
I don't think the articles are required, unless there is some difference between UK and US. On my deployment anyway, I can add articles and she still figures it out just fine. If you're seeing something different, it's likely a UK vs US thing.
speech_assets/SampleUtterances.txt
PlayPause play playback
PlayPause play song
PlayPause play track
PlayPause play video
I'd be worried these might conflict with the play media commands.. I seem to remember I tried this before.
You'll want to test this pretty thoroughly if you want these utterances.
I see what you mean, but as they don't have slots then it should be possible to identify them (?)
It should be, but since we don't know a whole lot about how Alexa actually chooses which one to execute, I was erring on the safe side..
Like I said, I think I tried adding these and ran into intermittent problems. If you want to leave them in, just make sure you test very thoroughly.
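One way to narrow down what needs thorough testing is to flag slotless utterances that a slotted utterance of another intent could also cover. The sketch below is only a rough heuristic under stated assumptions: it treats every `{Slot}` as a wildcard, it says nothing about how Alexa actually chooses an intent (which nobody in this thread knows), and `find_overlaps` is a hypothetical helper. The `play {Movie}` sample is the hypothetical local change mentioned earlier, not something in the repo.

```python
import re


def slot_template_to_regex(template):
    """Turn 'play movie {Movie}' into a regex where each slot matches anything."""
    parts = re.split(r"(\{\w+\})", template)
    pattern = "".join(".+" if p.startswith("{") else re.escape(p) for p in parts)
    return re.compile("^" + pattern + "$")


def find_overlaps(utterances):
    """utterances is a list of (intent, phrase) pairs.

    Returns (intent, phrase, other_intent, template) tuples for every
    slotless phrase that a slotted template of another intent could match.
    """
    slotted = [(i, p, slot_template_to_regex(p)) for i, p in utterances if "{" in p]
    plain = [(i, p) for i, p in utterances if "{" not in p]
    overlaps = []
    for intent, phrase in plain:
        for other_intent, template, regex in slotted:
            if other_intent != intent and regex.match(phrase):
                overlaps.append((intent, phrase, other_intent, template))
    return overlaps


if __name__ == "__main__":
    samples = [
        ("PlayPause", "play video"),
        ("PlayPause", "play song"),
        ("WatchMovie", "play movie {Movie}"),
        ("WatchMovie", "play {Movie}"),  # hypothetical, not in the repo
    ]
    for hit in find_overlaps(samples):
        print(hit)
```

Anything this flags is only a candidate for manual testing with a real device; an empty result does not prove the utterances are safe.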
speech_assets/SampleUtterances.txt
Prev listen to previous one
Prev listen to previous one again
Prev listen to previous song
Prev play previous
I personally use "listen to/play previous". Why did you remove these?
This is what I do too
the joys of discussion
speech_assets/SampleUtterances.txt
Prev play previous one again
Prev play previous song
Prev previous
Prev previous song
I use this as well.
speech_assets/SampleUtterances.txt
Skip listen to next song
Skip next
Skip next song
Skip play next
I use these too..
Me too
Fair enough
"Alexa, ask kodi to next"
"Alexa, tell kodi previous"
Testing all the prev and skip intents does highlight some clashes with the audio stream controls.
"to previous" = AudioStreamPrevious
"to next" = Skip
"previous" = AMAZON.PreviousIntent
"next" = AMAZON.NextIntent
"next track" = Skip
Should post that over on the Flask-Ask PR. That's likely a bug.
Prev (listen to/play/watch) (/the/that) (previous/last) (one/item) (/again)

Menu open menu
StartOver (replay (/this) (/song/video/episode/track/movie)/start over/play again/play (/song/video/episode/track/movie) again/go back to the beginning)
Should add audio, film, show, and TV show if we're going to allow the user to qualify the media to start over.
PageUp (/navigate/go) page up
PageDown (/navigate/go) page down
Prev (listen to/play) (/the/that) (previous/last) (song/track) (/again)
Prev (watch/play) (/the/that) (previous/last) (video/episode/movie) (/again)
Should add film, show, and TV show if we're going to allow the user to qualify here.
Left (/navigate/go) left
PageUp (/navigate/go) page up
PageDown (/navigate/go) page down
Prev (listen to/play) (/the/that) (previous/last) (song/track) (/again)
Should add audio if we're going to allow the user to qualify here.
I went to add it, but then saw that there was a similar example for changing the audio track on a video file.
AudioStreamNext next audio
AudioStreamPrevious previous audio
Ah, good catch. "next/previous audio" makes more sense in the context of the audio stream, so I'd leave it as it is.
I haven't tested every addition here, but just via inspection it looks OK to merge to me with the exception of the removal of utterances I personally use for the Skip and Prev intents. May want to wait until the flask-ask branch is merged though, since some of the changes here may mean updates to the German translation too.
Cool, the PR was originally intended for the flask-ask branch as part of the bigger flask change. It would be good to merge it in even if it is not perfect, as new issues keep popping up with all the changes that are going on, e.g. conflicts with the "utterance trimming" you worked on for the NewShowInquiry samples 2ae625a fun fun fun
Respectfully, I disagree, since your changes here can have an effect on the translations. I'd personally suggest we wait until the Flask-Ask branch is merged into master, rebase this onto master, and revisit then. It's not up to me, though I'd appreciate if we could at least sort out the skip/prev utterances should m0ngr31 decide to include this in 2.5.
@ausweider, if you get a chance, could you look over the changes here and review from the perspective of the German translation so we have an idea of its impact?
Whatever works. Though the changes in the utterance samples are always going to be different for each language, they are never going to be like for like. At the moment there are 36 DE samples for StreamArtist and 12 samples for Prev. In the EN list there are 2 for StreamArtist and 6 for Prev. With total samples for each at: DE 1434 and EN 1903.
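Per-intent counts like the ones above can be produced with a few lines of Python. This is a minimal sketch, assuming one sample per line in the "IntentName phrase" form shown in the diff snippets in this thread; `count_samples` is a hypothetical helper, and any line still written in generator syntax counts as a single sample unless it is expanded first.

```python
from collections import Counter


def count_samples(lines):
    """Count sample utterances per intent.

    Each non-empty line is assumed to look like 'IntentName some phrase'.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        intent = line.split(None, 1)[0]  # first whitespace-delimited token
        counts[intent] += 1
    return counts


if __name__ == "__main__":
    sample_lines = [
        "WatchMovie play film {Movie}",
        "WatchMovie play movie {Movie}",
        "WatchMovie watch film {Movie}",
        "WatchMovie watch movie {Movie}",
        "",
        "Prev previous",
        "Prev previous song",
    ]
    for intent, n in count_samples(sample_lines).most_common():
        print(intent, n)
```

Running the same function over the EN and DE files side by side would make the imbalance between intents, and between languages, easy to see before deciding what to trim.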
Aye, I understand. I just wanted to give everyone a chance to review all of this without holding back the flask-ask branch. The changes you've made here aren't complex, but Alexa itself is finicky so I'd want some peer-review personally before merging it. I just haven't had the chance to really go over it all, since I'm more focused on rebasing all of my stuff onto the flask-ask branch at the moment, among other things.. :) But if @m0ngr31 wants to include this as part of 2.5, we should set aside some time to test it before merging into master.
I'm going to go through these again soon.. but.. honestly, you'd have better success with multiple PRs that are actually 'small'; e.g., one for the articles, one for the skip/prev changes, etc.
The necessary article additions were added in 5330ecc. Skip and previous utterances were expanded a little too, and I don't see any clashes here with the music streaming stuff. If you'd still like to change OpenRemote you should submit another PR, preferably with only that change. #139 is a good place to discuss how to trim down the utterances if you'd like as well.
I think this can be closed now?
Yeah, probably. It was so long ago now I don't know what's what anymore :) There was some useful experimenting in the conversation, but I can always refer back to it.
open for review of course
I changed the "open" remote to:
OpenRemote (/start/activate/enable) (navigation mode/navigation/button mode/buttons/remote/remote control(/s)/controls/controller)
I think we can trim down some of the "new shows/movies/tracks" and "time remaining" utterances as in the case of WhatNewMovies there are 816 utterances :)
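To see how quickly templates such as the OpenRemote line above multiply, here is a minimal sketch of an expander for the `(a/b)` and `(/optional)` syntax used in this PR. It is not the generator the project uses, which is not shown in this thread; the function names and the parsing rules (a `/` separates alternatives only inside parentheses, and groups may nest) are assumptions inferred from the templates quoted here.

```python
def _parse_group(text, pos, top_level):
    """Parse '/'-separated alternatives until ')' (or end of text at top level).

    Returns (list_of_expansions, next_position).
    """
    alternatives = []
    current = [""]  # expansions of the alternative currently being read
    while pos < len(text):
        ch = text[pos]
        if ch == "(":
            # Nested group: combine each partial phrase with each inner option.
            inner, pos = _parse_group(text, pos + 1, False)
            current = [c + i for c in current for i in inner]
        elif ch == ")" and not top_level:
            pos += 1
            break
        elif ch == "/" and not top_level:
            alternatives.extend(current)
            current = [""]
            pos += 1
        else:
            current = [c + ch for c in current]
            pos += 1
    alternatives.extend(current)
    return alternatives, pos


def expand(template):
    """Expand a template into a de-duplicated list of concrete phrases."""
    phrases, _ = _parse_group(template, 0, top_level=True)
    cleaned = (" ".join(p.split()) for p in phrases)  # collapse extra spaces
    return list(dict.fromkeys(p for p in cleaned if p))


if __name__ == "__main__":
    print(expand("(/navigate/go) page up"))
    # -> ['page up', 'navigate page up', 'go page up']
```

Pass only the template part of a line (without the leading intent name). Counting `len(expand(...))` per intent before and after a change is one way to check whether a "small tweak" has quietly added hundreds of samples.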