better papagayo lipsync

Issue #507 resolved

Alessandro Padovani created an issue 2021-05-08

blender 2.92, commit 59c302c

This is related to ~~#504~~. Automatic lipsync is a fast way to get our figures to talk, but the result is often "robotic". This is because lipsync translates every single phoneme into a complete mouth action for that phoneme, while we as humans tend to "relax" a lot between mouth poses.

The following "rules" are what I extrapolated when editing a lipsync to get a more human behaviour. We may add a tool to "relax" a lipsync track by following these rules. They're simple but effective. Essentially they remove most "etc" poses to blend in the other phonemes. The principle is that phonemes tend to maintain their shape in human speech. We may find better rules for relaxing later, but I believe this is a good start.

1. Only keep the first initial "rest" pose; remove all the "rest" and "etc" poses until the first phoneme.
2. Keep an "etc" pose only between two identical open vowels "AI E O" and delete all other "etc", ex. "AI etc AI" is good, while "AI etc O" becomes "AI O" and "AI etc FV" becomes "AI FV".
3. Use 50% of the next shape for "etc", ex. "AI etc AI" becomes "AI AI50% AI".
4. Set open vowels "AI E O" to 50% between consonants "FV MBP W", ex. "FV AI50% FV".
5. Only keep the first final "rest" pose; remove all the "rest" and "etc" poses after the last phoneme.
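
As an illustration only, the five rules could be sketched in Python roughly as follows. This is not the add-on's actual code: the function name `relax`, the `(frame, phoneme)` input format, and the `(frame, phoneme, strength)` output format are hypothetical. It also assumes that "WQ" in papagayo listings is the "W" of rule 4, that rule 4 is checked after the "etc" poses have been removed, and that a "rest" in the middle of the speech is left untouched.

```python
OPEN_VOWELS = {"AI", "E", "O"}
CONSONANTS = {"FV", "MBP", "W", "WQ"}  # assumption: "WQ" is the rule's "W"
SILENT = {"rest", "etc"}


def relax(track):
    """Relax a list of (frame, phoneme) keys into (frame, phoneme, strength)."""
    spoken = [i for i, (_, ph) in enumerate(track) if ph not in SILENT]
    if not spoken:
        # Nothing is spoken: keep only the first key, if any.
        return [(frame, ph, 1.0) for frame, ph in track[:1]]
    first, last = spoken[0], spoken[-1]

    keys = []
    # Rule 1: keep only the first "rest" before the first phoneme.
    leading = [frame for frame, ph in track[:first] if ph == "rest"]
    if leading:
        keys.append((leading[0], "rest", 1.0))

    for i in range(first, last + 1):
        frame, ph = track[i]
        if ph == "etc":
            before, after = track[i - 1][1], track[i + 1][1]
            # Rule 2: keep "etc" only between two identical open vowels.
            if before == after and after in OPEN_VOWELS:
                # Rule 3: the kept "etc" becomes 50% of the next shape.
                keys.append((frame, after, 0.5))
            continue
        keys.append((frame, ph, 1.0))

    # Rule 5: keep only the first "rest" after the last phoneme.
    trailing = [frame for frame, ph in track[last + 1:] if ph == "rest"]
    if trailing:
        keys.append((trailing[0], "rest", 1.0))

    # Rule 4: an open vowel between two consonants is set to 50%.
    relaxed = []
    for i, (frame, ph, strength) in enumerate(keys):
        between = (
            0 < i < len(keys) - 1
            and keys[i - 1][1] in CONSONANTS
            and keys[i + 1][1] in CONSONANTS
        )
        if ph in OPEN_VOWELS and between:
            strength = 0.5
        relaxed.append((frame, ph, strength))
    return relaxed
```

For instance, `relax([(1, "AI"), (2, "etc"), (3, "AI")])` keeps the middle key as a half-strength "AI", while `relax([(1, "AI"), (2, "etc"), (3, "O")])` drops it.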

Also, an option to "emphasize" the open vowels "AI E O" may be effective to control the speech. For example, we may set a 150% or 200% factor to emphasize the speech, or 75% or 50% to diminish it. Please note that "etc" must be affected too, being the middle pose: with rule 3, for example, 150% of "AI etc AI" becomes "AI150% AI75% AI150%".
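
A minimal sketch of the emphasis factor, assuming relaxed keys are stored as hypothetical `(frame, phoneme, strength)` tuples and that the factor simply multiplies the strength of every open vowel, including the half-strength key that replaces a kept "etc":

```python
OPEN_VOWELS = {"AI", "E", "O"}


def emphasize(keys, factor):
    """Scale the strength of open vowels by `factor`; other keys are unchanged."""
    return [
        (frame, ph, strength * factor if ph in OPEN_VOWELS else strength)
        for frame, ph, strength in keys
    ]


# "AI etc AI" after rule 3, then emphasized at 150%.
keys = [(4, "AI", 1.0), (6, "AI", 0.5), (9, "AI", 1.0)]
print(emphasize(keys, 1.5))
```

With a factor of 1.5 the three strengths become 1.5, 0.75 and 1.5, which is the "AI150% AI75% AI150%" case described above.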

Attached is an example with a dat file from lipsync-o-tizer, an automatic lipsync tool derived from papagayo. First the automatic lipsync, papagayo-test.mp4, then the relaxed version with a 150% emphasis, papagayo-fix.mp4. We see that the relaxed version is much more human-like, while in the automatic version there are "robotic glitches".

https://morevnaproject.org/papagayo-ng/
https://www.autolipsync-o-tizer.com/

Comments (13)

Alessandro Padovani reporter
As a side note, to allow “emphasize” we need to unset the slider limits in the global settings. Also, it would be useful to implement ~~#480~~ for general animation.
- 2021-05-08T08:18:56+00:00
Alessandro Padovani reporter
Commit 66b1ada doesn’t work fine. It seems to remove all etc poses, which is not ideal, since this way two successive identical phonemes don’t get spoken separately, ex. “AI etc AI etc AI“ in papagayo becomes “AA AA AA“ in daz. Thomas, feel free to improve the rules if you find a better way, but just removing etc doesn’t work.

Also it would be better to have “relax” as an option so the user can load the original papagayo animation if he wants to.

I’m attaching papagayo-fix.blend for reference. That’s the daz animation for papagayo-fix.mp4, relaxed with 150% emphasis, as it should look when following the relax rules above.

- 2021-05-08T18:30:54+00:00
Alessandro Padovani reporter
- attached papagayo-fix.blend
reference animation for the relax rules

edit. reuploaded to fix some errors
- 2021-05-08T19:27:09+00:00
Alessandro Padovani reporter
update. If possible it would be nice to improve rule 4. The idea is that the mouth doesn’t have time to stretch to full emphasis when the vowel is between two consonants. But this is only true if the consonants are close enough. More precisely, it’s only the second consonant that matters, the one closing the mouth after the vowel. So we may take this factor into account.

This adds complexity because we have to take into account the distance between phonemes too. So if it is too complex then the original rule 4 may be fine enough.

4 bis. set open vowels "AI E O" to 50% if followed by a consonant "FV MBP W" within a 3-frame range, ex. "AI50% FV"
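
Rule 4 bis could be sketched like this, again as an illustration rather than the add-on's implementation. The function name `apply_rule_4bis` and the `(frame, phoneme, strength)` tuples are hypothetical, the keys are assumed to be already relaxed with rules 1, 2, 3 and 5, "WQ" is assumed to be the "W" of the rule, and "in a 3 frames range" is read as a distance of 3 frames or less.

```python
OPEN_VOWELS = {"AI", "E", "O"}
CONSONANTS = {"FV", "MBP", "W", "WQ"}  # assumption: "WQ" is the rule's "W"


def apply_rule_4bis(keys, max_distance=3):
    """Halve an open vowel when a consonant follows within `max_distance` frames."""
    result = []
    for i, (frame, ph, strength) in enumerate(keys):
        if ph in OPEN_VOWELS and i + 1 < len(keys):
            next_frame, next_ph, _ = keys[i + 1]
            if next_ph in CONSONANTS and next_frame - frame <= max_distance:
                strength *= 0.5
        result.append((frame, ph, strength))
    return result


# Frames 45 to 61 of the test speech, after the "etc" poses were removed.
keys = [(45, "FV", 1.0), (47, "O", 1.0), (49, "MBP", 1.0), (52, "AI", 1.0),
        (54, "MBP", 1.0), (57, "AI", 1.0), (61, "O", 1.0)]
for frame, ph, strength in apply_rule_4bis(keys):
    print(frame, ph, strength)
```

Only the vowels at frames 47 and 52 are halved here, since each is followed by "MBP" two frames later; the "AI" at frame 57 is followed by another vowel and keeps full strength.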
- 2021-05-08T19:51:02+00:00

Alessandro Padovani reporter

Since the test speech is quite short I’m posting it here together with the relaxed conversion. So it may be easier to check it for improvements or errors and discuss it in general.

ORIGINAL PAPAGAYO
01 rest
02 rest
02 etc
04 AI
06 etc
09 AI
10 etc
14 AI
16 etc
20 E
24 etc
35 MBP
37 E
40 etc
45 FV
47 O
48 etc
49 MBP
52 AI
54 MBP
57 AI
58 etc
61 O
68 U
69 etc
72 U
73 etc
77 E
81 AI
83 FV
86 AI
88 WQ
91 E
93 etc
100 FV
103 AI
108 etc
111 AI
113 etc
115 AI
119 FV
123 WQ
126 AI
127 etc
142 U
149 rest
151 rest

BLENDER RELAXED WITH 150% EMPHASIS (RULE 4 BIS)
01 rest
04 AA150
06 AA75
09 AA150
10 AA75
14 AA150
20 EH150
35 M
37 EH150
45 F
47 OW75
49 M
52 AA75
54 M
57 AA150
61 OW150
68 UW
72 UW
77 EH150
81 AA75
83 F
86 AA75
88 W
91 EH150
100 F
103 AA150
108 AA75
111 AA150
113 AA75
115 AA150
119 F
123 W
126 AA150
142 UW
149 rest

- 2021-05-08T20:24:17+00:00

Thomas Larsson repo owner
OK, I think I got it now. The latest commit should work as desired. There is also an option to update the limits of open vowels, so you don’t have to do that manually.

The strength of an open vowel followed by a silent vowel (vowel? F and M are not vowels) is reduced to half if the time distance is 3 frames or less. One could perhaps replace this by a sliding scale depending on the distance, so the factor would be 25% for 1 frame, 50% for 2 frames, and 75% for 3 frames.
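
The sliding scale could look like the sketch below. The linear formula and the name `sliding_factor` are assumptions: the formula is simply the straight line through the three quoted values, reaching full strength at 4 frames.

```python
def sliding_factor(distance, full_at=4):
    """Strength factor for an open vowel `distance` frames before a consonant.

    Grows linearly with the distance and is capped at 1.0 from `full_at` frames on.
    """
    return min(1.0, max(distance, 0) / full_at)


for distance in (1, 2, 3, 4, 10):
    print(distance, sliding_factor(distance))
```

Compared with the fixed 50% of rule 4 bis, this closes the mouth more for very tight vowel-consonant pairs and avoids a jump in strength between a distance of 3 and 4 frames.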
- 2021-05-09T06:45:37+00:00
Alessandro Padovani reporter
- edited description
- 2021-05-09T08:18:45+00:00
Alessandro Padovani reporter
Commit df754fe seems to work fine here. Though I can’t understand the difference with “update limits”, it seems the same with and without, at least with the provided test speech.

As for “silent vowel” I didn’t know how to name a “silent” phoneme so I invented it. But “consonant” is probably better, though it refers to a letter rather than a phoneme. So I edited the comments above with “consonant”.

edit. Thomas, I agree on the “sliding scale” concept and there are a lot of things that can probably be improved. I just didn’t want to add too much complexity as a first step, but if you feel confident about improving the relax rules please do it.

I believe it is important to have the relaxed version as an option, so the user can load the original papagayo track if he wants to. This is for various reasons. First, this is experimental: I’m not an expert at all and it’s not extensively tested, so relaxing may not work fine in some cases. Then, in papagayo you can edit the phonemes, so if the user already edited the track to his liking he may want to load the original, non-relaxed track in blender. Finally, some advanced users may simply want to edit the original papagayo track themselves in blender.

So Thomas, may we please have “relax” as an option? Or if not, please let me know why.
- 2021-05-09T08:35:40+00:00
Thomas Larsson repo owner
Relaxing is made optional in the last commit. The emphasis and update limits options are only displayed when relaxing is active, because they don’t affect the raw papagayo data.

Update limits is useful if you load the file with emphasis > 1, because by default the visemes are limited between 0 and 1. The picture shows the difference at frame 4 when the test animation was loaded with emphasis = 1.5. Of course, once the limits have been changed you can load new animations and not run into limits.
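
The effect of the limits can be illustrated with plain Python, independent of Blender. The helper names `clamp` and `updated_high_limit` are hypothetical and do not refer to the add-on's or Blender's API; the sketch only assumes that a slider value is clamped to its limits.

```python
def clamp(value, low=0.0, high=1.0):
    """Restrict `value` to the range [low, high], as a limited slider would."""
    return max(low, min(high, value))


def updated_high_limit(strengths, current_high=1.0):
    """Raise the upper limit just enough to fit every strength in the track."""
    return max(current_high, max(strengths, default=current_high))


strengths = [1.5, 0.75, 1.5]          # "AI etc AI" loaded with emphasis = 1.5
print(clamp(1.5))                      # default limits: the emphasis is lost
high = updated_high_limit(strengths)
print(clamp(1.5, high=high))           # updated limits: the emphasis survives
```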

- 2021-05-09T15:27:13+00:00
Alessandro Padovani reporter
Commit 8c9d7cd works fantastic. Thank you Thomas for the fast fix.

As for the limits, that’s covered in my second post: I keep the slider limits off in the global settings to allow for emphasis. That’s why I didn’t see any difference. A better popup would be “Update sliders limits to allow for emphasis.“, since I myself didn’t understand what it’s for with the current popup.

It would also be nice if you could implement ~~#480~~, as requested in the second post, to allow a general “emphasis” or “plussing” for animation.
- 2021-05-09T17:20:52+00:00
bouich jules
wow thank you so much guys for working on the lips sync!

i have already noticed some HUGE changes on the last build AMAZING!

thank you
- 2021-05-10T00:42:36+00:00
Alessandro Padovani reporter
You’re welcome. Luckily Thomas seems interested in new features for animation so we get them implemented. I can’t really do anything good myself without him.
- 2021-05-10T06:22:21+00:00
Alessandro Padovani reporter
- changed status to resolved
- 2021-05-11T05:26:08+00:00

Assignee: –

Type: enhancement

Priority: minor

Status: resolved

Votes: 0

Watchers: 1