rhubarb-lip-sync

Commit Graph

Author	SHA1	Message	Date
Daniel Wolf	289b7ba56e	Restructured rhubarb-exporters	2016-11-16 11:35:27 +01:00
Daniel Wolf	3e34425c11	Refactoring: Split code into multiple projects	2016-11-16 11:01:01 +01:00
Daniel Wolf	c19ad1c8d0	Using biased language model to handle dialog more forgivingly. Using a fixed 0.1-0.9 ratio between default and dialog language model.	2016-10-21 21:41:50 +02:00
Daniel Wolf	9cfe577612	Fixed bad config when creating language model from dialog	2016-10-21 21:17:17 +02:00
Daniel Wolf	529a32e1b2	Better animation of short pauses	2016-10-14 20:25:30 +02:00
Daniel Wolf	503ba9104a	Treating schwa as a separate phone	2016-09-30 17:12:10 +02:00
Daniel Wolf	1f6f6d6175	Added convenience function Timed<T>.getDuration()	2016-09-29 12:06:47 +02:00
Daniel Wolf	f5b7971f52	Refactoring: Replaced audio "length" with "duration"	2016-09-29 12:06:28 +02:00
Daniel Wolf	f44baaa05f	Improve noise detection heuristic	2016-09-29 12:06:06 +02:00
Daniel Wolf	760f6c2ce6	Refactoring and better logging	2016-09-29 10:44:34 +02:00
Daniel Wolf	750078618c	Sharing audio buffer between operations	2016-09-26 13:11:01 +02:00
Daniel Wolf	de05f69507	Fixed compiler warning	2016-09-23 21:15:55 +02:00
Daniel Wolf	2fdd98f5b3	Removed potentially unsafe conversion	2016-09-23 21:15:34 +02:00
Daniel Wolf	938079a75f	Renamed phoneExtraction to phoneRecognition	2016-09-21 10:32:26 +02:00
Daniel Wolf	600b3429a7	No longer discarding "burnt" decoders (see https://sourceforge.net/p/cmusphinx/discussion/help/thread/f1dd91c5/#1d89/0491/7f0c/60fc)	2016-09-21 10:28:31 +02:00
Daniel Wolf	eea1eb381c	Refactored ObjectPool to correctly handle custom deleters	2016-09-21 10:25:08 +02:00
Daniel Wolf	d97c880754	Performing per-utterance cepstral mean normalization (see discussion in https://sourceforge.net/p/cmusphinx/discussion/help/thread/51e2979b/)	2016-09-18 22:02:02 +02:00
Daniel Wolf	f4f9ffe883	Logging bin path, hoping to crack that elusive segfault	2016-09-18 22:00:55 +02:00
Daniel Wolf	cf13499158	Caching bin path	2016-09-18 22:00:08 +02:00
Daniel Wolf	0ab009e17a	Workaround for off-by-one error in whereami library	2016-09-11 13:17:52 +02:00
Daniel Wolf	2607b9a12b	Fixed Boost version check	2016-09-11 12:59:09 +02:00
Daniel Wolf	c679b8fb71	Using different xml_writer_settings signature for old Boost versions	2016-09-11 11:40:17 +02:00
Daniel Wolf	261a768e0d	Removed Boost.Predef since it's not available in Boost 1.54	2016-09-11 11:40:17 +02:00
Daniel Wolf	d4b86357cf	Using boost::optional<T>.get_value_or() instead of value_or() for old Boost versions	2016-09-11 11:40:16 +02:00
Daniel Wolf	d98de34b98	Replaced calls to boost::optional<T>.value() with operator*. Boost 1.54 doesn't support value() yet, plus * is cleaner.	2016-09-11 11:40:16 +02:00
Daniel Wolf	2aef178eb0	Better error messages for incompatible WAVE files	2016-09-10 21:19:12 +02:00
Daniel Wolf	b95a3f621c	Fixed Linux build	2016-08-31 22:21:53 +02:00
Daniel Wolf	8fd78d63cf	Animating pauses only between words, not at start or end of recording	2016-08-11 16:28:04 +02:00
Daniel Wolf	a632e7a3b3	Fixed TSV export. Exporter now terminates with shape X rather than A.	2016-08-11 15:49:51 +02:00
Daniel Wolf	81111ef96a	Fixed infinite loop with short recordings	2016-08-11 15:45:16 +02:00
Daniel Wolf	78027ea63c	Thread count can be limited via command-line argument	2016-08-11 10:29:01 +02:00
Daniel Wolf	206cde4658	Supporting noises (breathing, smacking, etc.)	2016-08-11 10:18:03 +02:00
Daniel Wolf	bd1f8226ec	Added TimeRange.trim() method	2016-08-11 10:16:50 +02:00
Daniel Wolf	734d06ad38	Disabling PocketSphinx's VAD. We're performing VAD ourselves.	2016-08-10 20:46:32 +02:00
Daniel Wolf	a851a76ce5	Minor improvements to animation rules	2016-08-10 20:13:05 +02:00
Daniel Wolf	8b025a3522	Fixed predictive mouth animation	2016-08-10 18:53:01 +02:00
Daniel Wolf	16892ae991	Fixed OS X build	2016-08-10 18:24:24 +02:00
Daniel Wolf	b22378221f	Better AH animation	2016-08-07 20:38:02 +02:00
Daniel Wolf	c65c8b4eb3	Better animation of pauses in speech	2016-08-05 19:34:57 +02:00
Daniel Wolf	1c50ece142	Refactoring	2016-08-05 17:17:25 +02:00
Daniel Wolf	b62fe8af98	Improved timing of bilabial stops ("B", "P")	2016-08-04 22:21:48 +02:00
Daniel Wolf	c566ac56cc	Suppressing log messages in console for non-debug builds	2016-08-04 21:02:40 +02:00
Daniel Wolf	229105a965	Fixed erratic progress display	2016-08-04 20:39:40 +02:00
Daniel Wolf	6888dadd04	Speedup through better multithreading: fixed excessive locking; using more threads for voice recognition	2016-08-04 19:39:43 +02:00
Daniel Wolf	1cb41b8309	Workaround for another kind of decoder corruption	2016-08-03 21:33:13 +02:00
Daniel Wolf	0a577d1947	Fixed audio resampling. Audio was cut off due to incorrect length calculation.	2016-08-03 20:55:45 +02:00
Daniel Wolf	f356855bbd	Implemented tweening for smoother animation	2016-08-02 22:02:59 +02:00
Daniel Wolf	95d46ef0b7	Re-written animation code: still uses (almost) the same rules, but more powerful underlying concept; re-introduced shape H for "L" sounds; introduced shape X for idle position	2016-07-31 21:42:37 +02:00
Daniel Wolf	26cae93478	Refactored audio handling. Now audio clips can be passed around as const references and don't carry state any more.	2016-07-27 21:58:37 +02:00
Daniel Wolf	799f334fa7	Using unique_ptr instead of raw pointers in object pool	2016-07-27 21:44:39 +02:00
Daniel Wolf	b3b2366468	Re-written library code for parallel execution. The new implementation correctly re-throws exceptions on the calling thread instead of terminating the application.	2016-07-27 21:44:39 +02:00
Daniel Wolf	5198ee9230	Made Lazy<T> copyable	2016-07-20 20:16:23 +02:00
Daniel Wolf	17b43ad205	Added class Lazy<T>	2016-07-19 21:33:07 +02:00
Daniel Wolf	ddcadad710	Introduced user-defined literal "cs" for centiseconds, now that ReSharper supports it (see https://youtrack.jetbrains.com/issue/RSCPP-14653)	2016-07-05 21:17:51 +02:00
Daniel Wolf	0447cbb4ff	Refactored VAD multithreading	2016-06-30 20:52:29 +02:00
Daniel Wolf	8fa494fb77	Improved VAD quality via dry run	2016-06-30 20:42:36 +02:00
Daniel Wolf	6de7ba020a	Fixed VAD error handling	2016-06-30 20:17:28 +02:00
Daniel Wolf	ed27b8470c	Workaround for PocketSphinx bug (see https://sourceforge.net/p/cmusphinx/discussion/help/thread/f1dd91c5/#7529). Also minor refactoring.	2016-06-30 20:06:38 +02:00
Daniel Wolf	2c0471e79f	Improved lip animation for B/P and L sounds	2016-06-29 22:35:14 +02:00
Daniel Wolf	2d314f4bc7	Multithreaded recognition: refactoring and fixes. Decoders are correctly released after use; determining optimal thread count for multithreading.	2016-06-29 21:47:25 +02:00
Daniel Wolf	f13449f810	Added thread info to logging	2016-06-29 21:47:25 +02:00
Daniel Wolf	75407dab54	Augmenting each detected voice activity to give recognizer some silence samples to work with	2016-06-29 21:47:25 +02:00
Daniel Wolf	2a5ed95698	Improved animation quality through new algorithm Using "lazy" ruleset instead of 1:1 mapping from phones	2016-06-29 21:46:08 +02:00
Daniel Wolf	8c9466bcf3	Removed mouth shape H (special shape for 'L' sound)	2016-06-26 21:06:22 +02:00
Daniel Wolf	9bf8355742	Sped up recognition via multithreading	2016-06-26 21:06:21 +02:00
Daniel Wolf	3a0a38575f	Sped up VAD via multithreading	2016-06-26 21:06:21 +02:00
Daniel Wolf	84097756c8	Added ThreadPool class	2016-06-26 14:02:17 +02:00
Daniel Wolf	0aeb35c42e	Fixed deprecated library calls	2016-06-26 11:06:44 +02:00
Daniel Wolf	c9b17e1937	Improved tokenization by taking dictionary into account	2016-06-25 21:52:04 +02:00
Daniel Wolf	f275267ac7	Small VAD improvements: RAII; slightly fewer false positives	2016-06-24 22:35:33 +02:00
Daniel Wolf	faa3f2b4bb	Fixed overflow with long audio files	2016-06-24 21:51:17 +02:00
Daniel Wolf	c6c31a831c	Using WebRTC for voice activity detection (VAD). My simple power-based approach wasn't reliable enough.	2016-06-21 22:20:18 +02:00
Daniel Wolf	97f172282d	Fixed off-by-one error in wave file reader	2016-06-21 21:47:08 +02:00
Daniel Wolf	0e00e58d91	Gracefully handling failed audio alignment	2016-06-21 19:20:27 +02:00
Daniel Wolf	944c374415	Migrated to latest CMU Sphinx version	2016-06-19 21:18:40 +02:00
Daniel Wolf	b2f702c8f4	Fixed OS X build	2016-06-16 19:41:49 +02:00
Daniel Wolf	6c9612d2c3	Raised low-pass threshold to better cope with high-pitched voices	2016-06-15 20:14:51 +02:00
Daniel Wolf	4346552312	Improved speed of voice activity detection ... by factor 2 by removing second pass. Also added voice activity detection to progress calculation.	2016-06-15 20:14:51 +02:00
Daniel Wolf	c4b054176c	Fixed WAVE file reader position calculation. The bug only showed through massive seek times.	2016-06-15 20:14:44 +02:00
Daniel Wolf	522f6c2019	Made audio stream handling safe for long streams	2016-06-15 20:14:43 +02:00
Daniel Wolf	d1bbe8538e	Added more logging	2016-06-15 20:14:43 +02:00
Daniel Wolf	542a5ee3d8	Added join function for strings	2016-06-15 20:07:51 +02:00
Daniel Wolf	1e29151974	Fixed string conversion for Timed<void>	2016-06-14 17:36:54 +02:00
Daniel Wolf	5cc13cb16f	Improved error message	2016-06-14 17:36:18 +02:00
Daniel Wolf	0d488e8de2	Restored dialog option, this time based on language model. This approach should be more robust and error-tolerant.	2016-06-10 22:35:27 +02:00
Daniel Wolf	4ed5908627	Implemented US-English G2P using sound change rules	2016-06-03 20:02:34 +02:00
Daniel Wolf	8be6485685	Implemented string conversion from Latin-1 to Unicode	2016-06-02 22:21:37 +02:00
Daniel Wolf	4d45bf7c89	Merged ascii.cpp into stringTools.cpp	2016-06-02 20:09:37 +02:00
Daniel Wolf	4d95b4c2c5	Implemented text tokenization using Flite	2016-06-02 18:24:27 +02:00
Daniel Wolf	d4b9a8e0c6	Implemented simple conversion from Unicode string to ASCII	2016-06-02 18:24:25 +02:00
Daniel Wolf	f1563919e1	Removing redundant prefixes from PocketSphinx log output	2016-05-17 17:56:11 +02:00
Daniel Wolf	c67e916185	Splitting audio into utterances before processing. Advantages: no problems with long silences (PocketSphinx doesn't like them); potential for parallelization; potential for improved phone timing accuracy.	2016-05-17 16:01:10 +02:00
Daniel Wolf	bbc933a821	Temporarily removed --dialog option	2016-05-17 14:28:18 +02:00
Daniel Wolf	2f31c5aa61	Refactoring: rewriting Timeline<T> to be sparse, i.e., allow gaps; added specialized subclasses BoundedTimeline<T> and ContinuousTimeline<T>; Timed<T> and TimeRange: has-a, not is-a; introducing Timed<void>	2016-05-17 14:28:18 +02:00
Daniel Wolf	9eef09145e	Added getPairs function	2016-05-12 21:44:46 +02:00
Daniel Wolf	baf2423b27	Added time manipulation functions to TimeRange and Timeline	2016-04-19 22:06:20 +02:00
Daniel Wolf	895b942df3	Implemented AudioStreamSegment	2016-04-19 22:04:43 +02:00
Daniel Wolf	ce204c68de	Fixed constness	2016-04-19 21:12:44 +02:00
Daniel Wolf	c14fb1c7b2	Fixed output format for structured logging	2016-04-19 19:30:38 +02:00
Daniel Wolf	8d2d100376	Refactored enum serialization/deserialization	2016-04-17 20:22:16 +02:00

206 Commits