Renamed to Rhubarb Lip Sync (no hyphen)

Daniel Wolf 2017-07-09 11:26:28 +02:00
parent 77d8f2b866
commit 7330003f24
5 changed files with 44 additions and 43 deletions

@ -4,12 +4,12 @@
This summary is not legally binding. The actual license terms are defined by the license texts below.
* Rhubarb Lip-Sync is released under the MIT license. All its third-party dependencies (libraries, resources, etc.) are released under the MIT license, a BSD license, or a similar permissive license. This means that you can use Rhubarb Lip-Sync in almost any way you want, including the creation of commercial software based on it.
* When you run Rhubarb Lip-Sync on an audio file, the resulting lip-sync data belongs to you alone. This means that if you use Rhubarb Lip-Sync in the production process of a video game, an animated cartoon, or a similar product *that doesn't ship with lip-sync functionality*, you don't even have to care about the MIT license.
* Rhubarb Lip Sync is released under the MIT license. All its third-party dependencies (libraries, resources, etc.) are released under the MIT license, a BSD license, or a similar permissive license. This means that you can use Rhubarb Lip Sync in almost any way you want, including the creation of commercial software based on it.
* When you run Rhubarb Lip Sync on an audio file, the resulting lip sync data belongs to you alone. This means that if you use Rhubarb Lip Sync in the production process of a video game, an animated cartoon, or a similar product *that doesn't ship with lip sync functionality*, you don't even have to care about the MIT license.
## Rhubarb Lip-Sync
## Rhubarb Lip Sync
[Rhubarb Lip-Sync](https://github.com/DanielSWolf/rhubarb-lip-sync) is released under the **MIT License (MIT)**.
[Rhubarb Lip Sync](https://github.com/DanielSWolf/rhubarb-lip-sync) is released under the **MIT License (MIT)**.
Copyright (c) 2015-2016 Daniel Wolf

@ -1,4 +1,4 @@
= Rhubarb Lip-Sync
= Rhubarb Lip Sync
:A: Ⓐ
:B: Ⓑ
:C: Ⓒ
@ -12,20 +12,20 @@
image:https://img.shields.io/travis/DanielSWolf/rhubarb-lip-sync/master.svg["Build Status", link="https://travis-ci.org/DanielSWolf/rhubarb-lip-sync"]
image:https://img.shields.io/twitter/follow/RhubarbLipSync.svg?style=social&label=Follow["Twitter", link="https://twitter.com/RhubarbLipSync"]
https://github.com/DanielSWolf/rhubarb-lip-sync[Rhubarb Lip-Sync] is a command-line tool that automatically creates 2D mouth animation from voice recordings. You can use it for animating speech in computer games, animated cartoons, or any similar project.
https://github.com/DanielSWolf/rhubarb-lip-sync[Rhubarb Lip Sync] is a command-line tool that automatically creates 2D mouth animation from voice recordings. You can use it for animating speech in computer games, animated cartoons, or any similar project.
Rhubarb Lip-Sync produces output files in various text formats (TSV/XML/JSON). If you're a programmer, this makes it easy for you to use the output in whatever way you like. If you're not a programmer, there is currently no direct way to import the result into your favorite animation tool. If this is what you need, feel free to https://github.com/DanielSWolf/rhubarb-lip-sync/issues[create an issue] telling me what tool you're using. I might add support for a few popular animation tools in the future.
Rhubarb Lip Sync produces output files in various text formats (TSV/XML/JSON). If you're a programmer, this makes it easy for you to use the output in whatever way you like. If you're not a programmer, there is currently no direct way to import the result into your favorite animation tool. If this is what you need, feel free to https://github.com/DanielSWolf/rhubarb-lip-sync/issues[create an issue] telling me what tool you're using. I might add support for a few popular animation tools in the future.
== Demo video
Click the image for a demo video. This was generated using Rhubarb Lip-Sync 1.0.0; newer versions create even better animation!
Click the image for a demo video. This was generated using Rhubarb Lip Sync 1.0.0; newer versions create even better animation!
https://www.youtube.com/watch?v=OX_K387EKoI[image:http://img.youtube.com/vi/OX_K387EKoI/0.jpg[]]
[[mouth-shapes]]
== Mouth shapes
Rhubarb Lip-Sync can use between six and nine different mouth positions. The first six mouth shapes ({A}-{F}) are the _basic mouth shapes_ and the absolute minimum you have to draw for your character. These six mouth shapes were invented at the Hanna-Barbera studios for shows such as Scooby-Doo and The Flintstones. Since then, they have evolved into a _de-facto_ standard for 2D animation, and have been widely used by studios like Disney and Warner Bros.
Rhubarb Lip Sync can use between six and nine different mouth positions. The first six mouth shapes ({A}-{F}) are the _basic mouth shapes_ and the absolute minimum you have to draw for your character. These six mouth shapes were invented at the Hanna-Barbera studios for shows such as Scooby-Doo and The Flintstones. Since then, they have evolved into a _de-facto_ standard for 2D animation, and have been widely used by studios like Disney and Warner Bros.
In addition to the six basic mouth shapes, there are three _extended mouth shapes_: {G}, {H}, and {X}. These are optional. You may choose to draw all three of them, pick just one or two, or leave them out entirely.
@ -70,15 +70,15 @@ This shape is also used as an in-between when animating from {C} or {D} to {F}.
*This extended mouth shape is optional.* Whether there should be any visible difference between the rest position {X} and the closed talking mouth {A} depends on your art style and personal taste. If you decide not to use it, you can specify so using the <<extendedShapes,`extendedShapes`>> option.
|===
== How to run Rhubarb Lip-Sync
== How to run Rhubarb Lip Sync
=== General usage ===
Rhubarb Lip-Sync is a command-line tool that is currently available for Windows and OS X.
Rhubarb Lip Sync is a command-line tool that is currently available for Windows and OS X.
* Download the https://github.com/DanielSWolf/rhubarb-lip-sync/releases[latest release] and unzip the file anywhere on your computer.
* Call `rhubarb`, passing it a WAVE file as argument and telling it where to create the output file. In its simplest form, this might look like this: `rhubarb -o output.txt my-recording.wav`. There are additional <<options,command-line options>> you can specify in order to get better results.
* Rhubarb Lip-Sync will analyze the sound file, animate it, and create an output file containing the animation. If an error occurs, Rhubarb Lip-Sync will instead print an error message to `stderr` and exit with a non-zero exit code.
* Rhubarb Lip Sync will analyze the sound file, animate it, and create an output file containing the animation. If an error occurs, Rhubarb Lip Sync will instead print an error message to `stderr` and exit with a non-zero exit code.
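
If you want to call Rhubarb Lip Sync from your own program, the steps above can be sketched roughly as follows. This is only an illustration, not part of Rhubarb itself: the helper names `buildRhubarbCommand` and `runRhubarb` are made up for this example, and the sketch assumes that `rhubarb` is on your `PATH` and that the file paths contain no double quotes.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: builds the simplest form of the command line,
// i.e. `rhubarb -o <output file> <WAVE file>`.
// Assumption: neither path contains a double quote character.
std::string buildRhubarbCommand(const std::string& inputWav, const std::string& outputFile) {
    return "rhubarb -o \"" + outputFile + "\" \"" + inputWav + "\"";
}

// Hypothetical helper: runs the command and reports success.
// Rhubarb exits with a non-zero exit code on error, so a status of
// zero from std::system is treated as success here.
bool runRhubarb(const std::string& inputWav, const std::string& outputFile) {
    const int status = std::system(buildRhubarbCommand(inputWav, outputFile).c_str());
    return status == 0;
}
```

Calling `runRhubarb("my-recording.wav", "output.txt")` should return `true` once the output file has been written. If it returns `false`, the error message printed to `stderr` will tell you what went wrong.
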
[[options]]
=== Command-line options ===
@ -95,27 +95,27 @@ The following is a complete list of available command-line options.
_Default value: ``tsv``_
| `-d` _<path>_, `--dialogFile` _<path>_
| This option is meant for situations where you know the dialog text in advance. Specify a plain-text file (in ASCII or UTF-8 format) containing just the dialog of the audio file. Rhubarb Lip-Sync will still perform word recognition internally, but it will prefer words and phrases that occur in the dialog file. This leads to better recognition results and thus more reliable animation.
| This option is meant for situations where you know the dialog text in advance. Specify a plain-text file (in ASCII or UTF-8 format) containing just the dialog of the audio file. Rhubarb Lip Sync will still perform word recognition internally, but it will prefer words and phrases that occur in the dialog file. This leads to better recognition results and thus more reliable animation.
For instance, let's say you're recording dialog for a computer game. The script says: "`That's all gobbledygook to me.`" But actually, the voice artist ends up saying "`That's _just_ gobbledygook to me,`" slightly changing the dialog. If you specify a dialog file with the original line ("`That's all gobbledygook to me`"), this will still allow Rhubarb Lip-Sync to produce better results. Rhubarb Lip-Sync will ignore the dialog file where it audibly differs from the recording, and benefit from it where it matches.
For instance, let's say you're recording dialog for a computer game. The script says: "`That's all gobbledygook to me.`" But actually, the voice artist ends up saying "`That's _just_ gobbledygook to me,`" slightly changing the dialog. If you specify a dialog file with the original line ("`That's all gobbledygook to me`"), this will still allow Rhubarb Lip Sync to produce better results. Rhubarb Lip Sync will ignore the dialog file where it audibly differs from the recording, and benefit from it where it matches.
_It is always a good idea to specify the dialog text. This will usually lead to more reliable mouth animation, even if the text is not completely accurate._
[[extendedShapes]]
| `--extendedShapes` _<string>_
| As described in <<mouth-shapes>>, Rhubarb Lip-Sync uses six basic mouth shapes and up to three _extended mouth shapes_, which are optional. Use this option to specify which extended mouth shapes should be used. For example, to use only the {G} and {X} extended mouth shapes, specify `GX`; to use only the six basic mouth shapes, specify an empty string: `""`.
| As described in <<mouth-shapes>>, Rhubarb Lip Sync uses six basic mouth shapes and up to three _extended mouth shapes_, which are optional. Use this option to specify which extended mouth shapes should be used. For example, to use only the {G} and {X} extended mouth shapes, specify `GX`; to use only the six basic mouth shapes, specify an empty string: `""`.
_Default value: ``GHX``_
| `--threads` _<number>_
| Rhubarb Lip-Sync uses multithreading to speed up processing. By default, it creates as many worker threads as there are cores on your CPU, which results in optimal processing speed. You may choose to specify a lower number if you feel that Rhubarb Lip-Sync is slowing down other applications. Specifying a higher number is not recommended, as it won't result in any additional speed-up.
| Rhubarb Lip Sync uses multithreading to speed up processing. By default, it creates as many worker threads as there are cores on your CPU, which results in optimal processing speed. You may choose to specify a lower number if you feel that Rhubarb Lip Sync is slowing down other applications. Specifying a higher number is not recommended, as it won't result in any additional speed-up.
Note that for short audio files, Rhubarb Lip-Sync may choose to use fewer threads than specified.
Note that for short audio files, Rhubarb Lip Sync may choose to use fewer threads than specified.
_Default value: as many threads as your CPU has cores_
| `-q`, `--quiet`
| By default, Rhubarb Lip-Sync writes a number of progress messages to `stderr`. If you're using it as part of a batch process, this may clutter your console. If you specify the `--quiet` flag, there won't be any output to `stderr` unless an error occurred.
| By default, Rhubarb Lip Sync writes a number of progress messages to `stderr`. If you're using it as part of a batch process, this may clutter your console. If you specify the `--quiet` flag, there won't be any output to `stderr` unless an error occurred.
| `--logFile` _<path>_
| Creates a log file with diagnostic information at the specified path.
@ -140,12 +140,12 @@ _Default value: ``debug``_
== How to use the output
The output of Rhubarb Lip-Sync is a file that tells you which mouth shape to display at what time within the recording. You can choose between three file formats -- TSV, XML, and JSON. The following paragraphs show you what each of these formats looks like.
The output of Rhubarb Lip Sync is a file that tells you which mouth shape to display at what time within the recording. You can choose between three file formats -- TSV, XML, and JSON. The following paragraphs show you what each of these formats looks like.
[[tsv]]
=== Tab-separated values (`tsv`)
TSV is the simplest and most compact export format supported by Rhubarb Lip-Sync. Each line starts with a timestamp (in seconds), followed by a tab, followed by the name of the mouth shape. The following is the output for a recording of a person saying 'Hi.'
TSV is the simplest and most compact export format supported by Rhubarb Lip Sync. Each line starts with a timestamp (in seconds), followed by a tab, followed by the name of the mouth shape. The following is the output for a recording of a person saying 'Hi.'
[source]
----
@ -216,19 +216,19 @@ There is nothing surprising here; everything said about XML format applies to JS
== Limitations
Rhubarb Lip-Sync has some limitations you should be aware of.
Rhubarb Lip Sync has some limitations you should be aware of.
=== English only
Rhubarb Lip-Sync only produces good results when you give it recordings in English. You'll get best results with American English.
Rhubarb Lip Sync only produces good results when you give it recordings in English. You'll get best results with American English.
=== 2D animation only
Rhubarb Lip-Sync tries to imitate the animation style used in classic 2D animated cartoons. The results look stylized, and that's intentional. If you're working on a realistic 3D game or movie, Rhubarb Lip-Sync may not be the best choice.
Rhubarb Lip Sync tries to imitate the animation style used in classic 2D animated cartoons. The results look stylized, and that's intentional. If you're working on a realistic 3D game or movie, Rhubarb Lip Sync may not be the best choice.
== Tell me what you think!
I'd love to hear from you!
* Have you created something great using Rhubarb Lip-Sync? *https://twitter.com/RhubarbLipSync[Let me know on Twitter!]*
* Have you created something great using Rhubarb Lip Sync? *https://twitter.com/RhubarbLipSync[Let me know on Twitter!]*
* Do you need help? Have you spotted a bug? Do you have a suggestion? *https://github.com/DanielSWolf/rhubarb-lip-sync/issues[Create an issue!]*

@ -2,7 +2,8 @@
## Unreleased
* Added --output command-line option
* Added `--output` command-line option
* Dropped the hyphen: Rhubarb Lip-Sync is now Rhubarb Lip Sync.
## Version 1.5.0
@ -22,7 +23,7 @@
* **Preventing long static segments**
Watch yourself in a mirror saying "He seized his keys." Your lips barely moved, right? That's exactly what would happen in previous versions of Rhubarb Lip-Sync. Only worse: Because there is only one "clenched teeth" mouth shape, the mouth would stay completely static during phrases like this. Rhubarb Lip-Sync 1.4.0 now does what [a professional animator would do](http://animateducated.blogspot.de/2016/10/lip-sync-animation-2.html?showComment=1478861729702#c2940729096183546458): It opens the mouth a bit wider for some syllables, keeping the lips moving. This may be cheating, but it looks much better!
Watch yourself in a mirror saying "He seized his keys." Your lips barely moved, right? That's exactly what would happen in previous versions of Rhubarb Lip Sync. Only worse: Because there is only one "clenched teeth" mouth shape, the mouth would stay completely static during phrases like this. Rhubarb Lip Sync 1.4.0 now does what [a professional animator would do](http://animateducated.blogspot.de/2016/10/lip-sync-animation-2.html?showComment=1478861729702#c2940729096183546458): It opens the mouth a bit wider for some syllables, keeping the lips moving. This may be cheating, but it looks much better!
* **Using wide-open mouth shape more often**
@ -32,15 +33,15 @@
* **New, bidirectional animation algorithm**
Since version 1.0.0, Rhubarb Lip-Sync has used a predictive animation algorithm. That means that in many situations (usually before a vowel), the mouth *anticipates* the upcoming sound. It moves *ahead of time*, resulting in more natural animation.
Since version 1.0.0, Rhubarb Lip Sync has used a predictive animation algorithm. That means that in many situations (usually before a vowel), the mouth *anticipates* the upcoming sound. It moves *ahead of time*, resulting in more natural animation.
For version 1.3.0, this core animation algorithm has been re-written from scratch. The new algorithm still anticipates the *next* vowel, but now also considers the *previous* vowel. The resulting animation is even closer to human speech.
* **Artistic timing**
Previous versions of Rhubarb Lip-Sync have tried to reproduce the timing of the recording as precisely as possible. For rapid speech, this often resulted in jittery animation that didn't look good: It tried to fit too much information into the available time. Traditional animators have known this problem since the 1930s. Instead of slavishly following the timing of the recording, they focus on important sounds and mouth shapes, showing them earlier (and thus longer) than would be realistic. On the other hand, they often skip unimportant sounds and mouth shapes altogether.
Previous versions of Rhubarb Lip Sync have tried to reproduce the timing of the recording as precisely as possible. For rapid speech, this often resulted in jittery animation that didn't look good: It tried to fit too much information into the available time. Traditional animators have known this problem since the 1930s. Instead of slavishly following the timing of the recording, they focus on important sounds and mouth shapes, showing them earlier (and thus longer) than would be realistic. On the other hand, they often skip unimportant sounds and mouth shapes altogether.
Rhubarb Lip-Sync 1.3.0 adds a new step in the animation pipeline that emulates this artistic approach. The resulting animation looks much cleaner and smoother. Ironically, it also looks more in-sync than the precise animation created by earlier versions.
Rhubarb Lip Sync 1.3.0 adds a new step in the animation pipeline that emulates this artistic approach. The resulting animation looks much cleaner and smoother. Ironically, it also looks more in-sync than the precise animation created by earlier versions.
* **Tweaks to the animation rules and tweening**
@ -48,41 +49,41 @@
* **Improved pause animations**
Pauses in speech are tricky to animate. Early versions of Rhubarb Lip-Sync always closed the mouth, which looks strange for very short pauses. Later versions kept the mouth open for short pauses, which can also look weird if the first mouth shape *after* the pause is identical to the mouth shape *during* the pause: It looks as if somebody just forgot to animate that part.
Pauses in speech are tricky to animate. Early versions of Rhubarb Lip Sync always closed the mouth, which looks strange for very short pauses. Later versions kept the mouth open for short pauses, which can also look weird if the first mouth shape *after* the pause is identical to the mouth shape *during* the pause: It looks as if somebody just forgot to animate that part.
This version of Rhubarb Lip-Sync uses three different strategies for animating pauses, depending on the duration of the pause and the mouth shapes before and after it.
This version of Rhubarb Lip Sync uses three different strategies for animating pauses, depending on the duration of the pause and the mouth shapes before and after it.
* **`--extendedShapes` command-line option**
Previous versions of Rhubarb Lip-Sync used a fixed set of eight or nine mouth shapes for animation. If users wanted to use fewer mouth shapes, they had to modify the output, for instance by replacing every "X" shape with an "A". This version of Rhubarb Lip-Sync introduces the `--extendedShapes` command-line option that allows the user to specify which mouth shapes should be used. This is not only more convenient; knowing which mouth shapes are actually available also allows Rhubarb Lip-Sync to create better animation.
Previous versions of Rhubarb Lip Sync used a fixed set of eight or nine mouth shapes for animation. If users wanted to use fewer mouth shapes, they had to modify the output, for instance by replacing every "X" shape with an "A". This version of Rhubarb Lip Sync introduces the `--extendedShapes` command-line option that allows the user to specify which mouth shapes should be used. This is not only more convenient; knowing which mouth shapes are actually available also allows Rhubarb Lip Sync to create better animation.
* **`--quiet` mode**
A "quiet" mode has been added. In that mode, Rhubarb Lip-Sync doesn't create any output except for animation data and error messages. This is helpful when using Rhubarb Lip-Sync as part of an automated process.
A "quiet" mode has been added. In that mode, Rhubarb Lip Sync doesn't create any output except for animation data and error messages. This is helpful when using Rhubarb Lip Sync as part of an automated process.
* **Fixes to the grapheme-to-phoneme algorithm**
Rhubarb Lip-Sync comes with a huge dictionary containing pronunciations for more than 100,000 English words. If the dialog text contains words not found in this dictionary, Rhubarb Lip-Sync will try to guess the correct pronunciation. I've fixed several bugs in the G2P algorithm that does this. As a result, using the `--dialogFile` option now results in even better animation.
Rhubarb Lip Sync comes with a huge dictionary containing pronunciations for more than 100,000 English words. If the dialog text contains words not found in this dictionary, Rhubarb Lip Sync will try to guess the correct pronunciation. I've fixed several bugs in the G2P algorithm that does this. As a result, using the `--dialogFile` option now results in even better animation.
## Version 1.2.0
* **Dialog file needn't be exact**
Since version 1.0.0, Rhubarb Lip-Sync can handle situations where the dialog text is specified (using the `--dialogFile` option), but the actual recording omits some words. For instance, the specified dialog text can be "That's all gobbledygook to me," but the recording only says "That's gobbledygook to me," dropping the word "all."
Since version 1.0.0, Rhubarb Lip Sync can handle situations where the dialog text is specified (using the `--dialogFile` option), but the actual recording omits some words. For instance, the specified dialog text can be "That's all gobbledygook to me," but the recording only says "That's gobbledygook to me," dropping the word "all."
Until now, however, Rhubarb Lip-Sync couldn't handle *changed* or *inserted* words, such as a recording saying "That's *just* gobbledygook to me." This restriction has been removed. As of version 1.2.0, the actual recording may freely deviate from the specified dialog text. Rhubarb Lip-Sync will ignore the dialog file where it audibly differs from the recording, and benefit from it where it matches.
Until now, however, Rhubarb Lip Sync couldn't handle *changed* or *inserted* words, such as a recording saying "That's *just* gobbledygook to me." This restriction has been removed. As of version 1.2.0, the actual recording may freely deviate from the specified dialog text. Rhubarb Lip Sync will ignore the dialog file where it audibly differs from the recording, and benefit from it where it matches.
## Version 1.1.0
* **More reliable speech recognition**
The first step in automatic lip-sync is speech recognition.
Rhubarb Lip-Sync 1.1.0 recognizes spoken dialog more accurately, especially at the beginning of recordings.
The first step in automatic lip sync is speech recognition.
Rhubarb Lip Sync 1.1.0 recognizes spoken dialog more accurately, especially at the beginning of recordings.
This improves the overall quality of the resulting animation.
* **More accurate breath detection**
Rhubarb Lip-Sync animates not only dialog, but also noises such as taking a breath.
Rhubarb Lip Sync animates not only dialog, but also noises such as taking a breath.
For this version, the accuracy of breath detection has been improved.
You shouldn't see actors opening their mouth for no reason any more.
@ -94,7 +95,7 @@
* **Builds on Linux**
In addition to Windows and OS X, Rhubarb Lip-Sync can now be built on Linux systems.
In addition to Windows and OS X, Rhubarb Lip Sync can now be built on Linux systems.
I'm not offering binary distributions for Linux at this time.
To build the application yourself, you need CMake, Boost, and a C++14-compatible compiler.

@ -4,5 +4,5 @@ If you own a copy of [Sony Vegas](http://www.sonycreativesoftware.com/vegassoftw
Copy (or symlink) the files in this directory to `<Vegas installation directory>\Script Menu`. When you restart Vegas, you'll find two new menu items:
* *Tools > Scripting > Import Rhubarb:* This will create a new Vegas project and add two tracks: a video track with a visualization of Rhubarb Lip-Sync's output and an audio track with the original recording.
* *Tools > Scripting > Debug Rhubarb:* This will create markers or regions on the timeline visualizing Rhubarb Lip-Sync's internal data from a log file. You can obtain a log file by redirecting `stdout`. I've written this script mainly as a debugging aid for myself; feel free to contact me if you're interested and need a more thorough explanation.
* *Tools > Scripting > Import Rhubarb:* This will create a new Vegas project and add two tracks: a video track with a visualization of Rhubarb Lip Sync's output and an audio track with the original recording.
* *Tools > Scripting > Debug Rhubarb:* This will create markers or regions on the timeline visualizing Rhubarb Lip Sync's internal data from a log file. You can obtain a log file by redirecting `stdout`. I've written this script mainly as a debugging aid for myself; feel free to contact me if you're interested and need a more thorough explanation.

@ -149,7 +149,7 @@ int main(int argc, char *argv[]) {
vector<char*>(argv, argv + argc) | transformed([](char* arg) { return fmt::format("\"{}\"", arg); }), " "));
try {
*infoStream << fmt::format("Generating lip-sync data for {}.", inputFilePath) << std::endl;
*infoStream << fmt::format("Generating lip sync data for {}.", inputFilePath) << std::endl;
*infoStream << "Processing. ";
JoiningContinuousTimeline<Shape> animation(TimeRange::zero(), Shape::X);
{