rhubarb-lip-sync

Commit Graph

Author	SHA1	Message	Date
Daniel Wolf	81111ef96a	Fixed infinite loop with short recordings	2016-08-11 15:45:16 +02:00
Daniel Wolf	78027ea63c	Thread count can be limited via command-line argument	2016-08-11 10:29:01 +02:00
Daniel Wolf	206cde4658	Supporting noises (breathing, smacking, etc.)	2016-08-11 10:18:03 +02:00
Daniel Wolf	734d06ad38	Disabling PocketSphinx's VAD. We're performing VAD ourselves.	2016-08-10 20:46:32 +02:00
Daniel Wolf	6888dadd04	Speedup through better multithreading: fixed excessive locking; using more threads for voice recognition.	2016-08-04 19:39:43 +02:00
Daniel Wolf	1cb41b8309	Workaround for another kind of decoder corruption	2016-08-03 21:33:13 +02:00
Daniel Wolf	26cae93478	Refactored audio handling. Now audio clips can be passed around as const references and don't carry state any more.	2016-07-27 21:58:37 +02:00
Daniel Wolf	b3b2366468	Re-written library code for parallel execution. The new implementation correctly re-throws exceptions on the calling thread instead of terminating the application.	2016-07-27 21:44:39 +02:00
Daniel Wolf	ed27b8470c	Workaround for PocketSphinx bug (see https://sourceforge.net/p/cmusphinx/discussion/help/thread/f1dd91c5/#7529). Also minor refactoring.	2016-06-30 20:06:38 +02:00
Daniel Wolf	2d314f4bc7	Multithreaded recognition: refactoring and fixes. Decoders are correctly released after use; determining optimal thread count for multithreading.	2016-06-29 21:47:25 +02:00
Daniel Wolf	9bf8355742	Sped up recognition via multithreading	2016-06-26 21:06:21 +02:00
Daniel Wolf	c9b17e1937	Improved tokenization by taking dictionary into account	2016-06-25 21:52:04 +02:00
Daniel Wolf	c6c31a831c	Using WebRTC for voice activity detection (VAD). My simple power-based approach wasn't reliable enough.	2016-06-21 22:20:18 +02:00
Daniel Wolf	0e00e58d91	Gracefully handling failed audio alignment	2016-06-21 19:20:27 +02:00
Daniel Wolf	944c374415	Migrated to latest CMU Sphinx version	2016-06-19 21:18:40 +02:00
Daniel Wolf	4346552312	Improved speed of voice activity detection by factor 2 by removing second pass. Also added voice activity detection to progress calculation.	2016-06-15 20:14:51 +02:00
Daniel Wolf	d1bbe8538e	Added more logging	2016-06-15 20:14:43 +02:00
Daniel Wolf	0d488e8de2	Restored dialog option, this time based on language model. This approach should be more robust and error-tolerant.	2016-06-10 22:35:27 +02:00
Daniel Wolf	f1563919e1	Removing redundant prefixes from PocketSphinx log output	2016-05-17 17:56:11 +02:00
Daniel Wolf	c67e916185	Splitting audio into utterances before processing. Advantages: no problems with long silences (PocketSphinx doesn't like them); potential for parallelization; potential for improved phone timing accuracy.	2016-05-17 16:01:10 +02:00
Daniel Wolf	bbc933a821	Temporarily removed --dialog option	2016-05-17 14:28:18 +02:00
Daniel Wolf	2f31c5aa61	Refactoring: rewriting Timeline<T> to be sparse, i.e., allow gaps; added specialized subclasses BoundedTimeline<T> and ContinuousTimeline<T>; Timed<T> and TimeRange: has-a, not is-a; introducing Timed<void>.	2016-05-17 14:28:18 +02:00
Daniel Wolf	8d2d100376	Refactored enum serialization/deserialization	2016-04-17 20:22:16 +02:00
Daniel Wolf	7ce79f9c08	Replaced Boost.Log with small custom logger. Boost.Log is a complex monstrosity and I can't get it to build on OS X.	2016-04-14 09:42:47 +02:00
Daniel Wolf	90e1375f1b	Handling zero-length audio files	2016-04-12 20:45:47 +02:00
Daniel Wolf	04c828506d	Simplified code using Timeline<T>	2016-04-09 22:07:25 +02:00
Daniel Wolf	a8900f80ec	Removing DC offset from audio. Also a bit of refactoring regarding audio processing.	2016-03-16 21:01:43 +01:00
Daniel Wolf	35ec1f8a45	Introduced template functions to unify enum<->string conversions	2016-03-08 22:20:40 +01:00
Daniel Wolf	ad9d8e6567	Renamed `audioInput` directory to `audio`	2016-03-08 18:21:17 +01:00
Daniel Wolf	b78e418a8f	Refactored audio streams: all streams are now mono (simplifies reasoning about samples); streams can be cloned; streams can be seeked within.	2016-03-07 21:28:31 +01:00
Daniel Wolf	04ca644cca	Added structured logging	2016-03-03 22:31:16 +01:00
Daniel Wolf	cdffb56613	Redirecting pocketsphinx log to main log	2016-03-03 22:31:16 +01:00
Daniel Wolf	7a1f446ca3	Using GSL	2016-02-29 20:58:58 +01:00
Daniel Wolf	667edf9485	Improved dialog handling	2016-02-10 21:53:58 +01:00
Daniel Wolf	05ef692706	Added (primitive) option to explicitly supply the dialog	2016-02-09 22:08:11 +01:00
Daniel Wolf	75872fe45d	Using -dither to prevent recognition errors in connection with zero silence	2016-02-01 20:26:14 +01:00
Daniel Wolf	7aa6057b8e	Allowing for long pauses in speech without breaking sync	2016-01-28 21:52:50 +01:00
Daniel Wolf	c425885929	Showing combined progress for entire task	2016-01-28 19:13:40 +01:00
Daniel Wolf	8e7fcc4efe	Implemented two-step phone detection for better accuracy	2016-01-28 14:19:32 +01:00
Daniel Wolf	2bfe671f82	Simplified directory structure to make Visual Studio build work	2016-01-08 16:59:18 +01:00
Daniel Wolf	0f33fcfbd0	Removing zero silence, seems like Sphinx doesn't like it (see http://cmusphinx.sourceforge.net/wiki/faq#qwhy_my_accuracy_is_poor). I couldn't reproduce the original problem, but it doesn't seem to hurt, either.	2016-01-08 16:44:03 +01:00
Daniel Wolf	31cb3b195c	Showing progress bar	2016-01-08 10:53:35 +01:00
Daniel Wolf	5c0fe24fae	Refactoring: Using camelCase throughout	2016-01-06 20:47:37 +01:00

43 Commits