rhubarb-lip-sync

Commit Graph

Author	SHA1	Message	Date
Daniel Wolf	2a5ed95698	Improved animation quality through new algorithm. Using "lazy" ruleset instead of 1:1 mapping from phones.	2016-06-29 21:46:08 +02:00
Daniel Wolf	8c9466bcf3	Removed mouth shape H (special shape for 'L' sound)	2016-06-26 21:06:22 +02:00
Daniel Wolf	9bf8355742	Sped up recognition via multithreading	2016-06-26 21:06:21 +02:00
Daniel Wolf	3a0a38575f	Sped up VAD via multithreading	2016-06-26 21:06:21 +02:00
Daniel Wolf	84097756c8	Added ThreadPool class	2016-06-26 14:02:17 +02:00
Daniel Wolf	0aeb35c42e	Fixed deprecated library calls	2016-06-26 11:06:44 +02:00
Daniel Wolf	c9b17e1937	Improved tokenization by taking dictionary into account	2016-06-25 21:52:04 +02:00
Daniel Wolf	f275267ac7	Small VAD improvements: RAII; slightly fewer false positives	2016-06-24 22:35:33 +02:00
Daniel Wolf	faa3f2b4bb	Fixed overflow with long audio files	2016-06-24 21:51:17 +02:00
Daniel Wolf	c6c31a831c	Using WebRTC for voice activity detection (VAD). My simple power-based approach wasn't reliable enough.	2016-06-21 22:20:18 +02:00
Daniel Wolf	97f172282d	Fixed off-by-one error in wave file reader	2016-06-21 21:47:08 +02:00
Daniel Wolf	0e00e58d91	Gracefully handling failed audio alignment	2016-06-21 19:20:27 +02:00
Daniel Wolf	944c374415	Migrated to latest CMU Sphinx version	2016-06-19 21:18:40 +02:00
Daniel Wolf	b2f702c8f4	Fixed OS X build	2016-06-16 19:41:49 +02:00
Daniel Wolf	6c9612d2c3	Raised low-pass threshold to better cope with high-pitched voices	2016-06-15 20:14:51 +02:00
Daniel Wolf	4346552312	Improved speed of voice activity detection ... by factor 2 by removing second pass. Also added voice activity detection to progress calculation.	2016-06-15 20:14:51 +02:00
Daniel Wolf	c4b054176c	Fixed WAVE file reader position calculation. The bug only showed through massive seek times.	2016-06-15 20:14:44 +02:00
Daniel Wolf	522f6c2019	Made audio stream handling safe for long streams	2016-06-15 20:14:43 +02:00
Daniel Wolf	d1bbe8538e	Added more logging	2016-06-15 20:14:43 +02:00
Daniel Wolf	542a5ee3d8	Added join function for strings	2016-06-15 20:07:51 +02:00
Daniel Wolf	1e29151974	Fixed string conversion for Timed<void>	2016-06-14 17:36:54 +02:00
Daniel Wolf	5cc13cb16f	Improved error message	2016-06-14 17:36:18 +02:00
Daniel Wolf	0d488e8de2	Restored dialog option, this time based on language model. This approach should be more robust and error-tolerant.	2016-06-10 22:35:27 +02:00
Daniel Wolf	4ed5908627	Implemented US-English G2P using sound change rules	2016-06-03 20:02:34 +02:00
Daniel Wolf	8be6485685	Implemented string conversion from Latin-1 to Unicode	2016-06-02 22:21:37 +02:00
Daniel Wolf	4d45bf7c89	Merged ascii.cpp into stringTools.cpp	2016-06-02 20:09:37 +02:00
Daniel Wolf	4d95b4c2c5	Implemented text tokenization using Flite	2016-06-02 18:24:27 +02:00
Daniel Wolf	d4b9a8e0c6	Implemented simple conversion from Unicode string to ASCII	2016-06-02 18:24:25 +02:00
Daniel Wolf	f1563919e1	Removing redundant prefixes from PocketSphinx log output	2016-05-17 17:56:11 +02:00
Daniel Wolf	c67e916185	Splitting audio into utterances before processing. Advantages: no problems with long silences (PocketSphinx doesn't like them); potential for parallelization; potential for improved phone timing accuracy.	2016-05-17 16:01:10 +02:00
Daniel Wolf	bbc933a821	Temporarily removed --dialog option	2016-05-17 14:28:18 +02:00
Daniel Wolf	2f31c5aa61	Refactoring: rewriting Timeline<T> to be sparse, i.e., allow gaps; added specialized subclasses BoundedTimeline<T> and ContinuousTimeline<T>; Timed<T> and TimeRange: has-a, not is-a; introducing Timed<void>	2016-05-17 14:28:18 +02:00
Daniel Wolf	9eef09145e	Added getPairs function	2016-05-12 21:44:46 +02:00
Daniel Wolf	baf2423b27	Added time manipulation functions to TimeRange and Timeline	2016-04-19 22:06:20 +02:00
Daniel Wolf	895b942df3	Implemented AudioStreamSegment	2016-04-19 22:04:43 +02:00
Daniel Wolf	ce204c68de	Fixed constness	2016-04-19 21:12:44 +02:00
Daniel Wolf	c14fb1c7b2	Fixed output format for structured logging	2016-04-19 19:30:38 +02:00
Daniel Wolf	8d2d100376	Refactored enum serialization/deserialization	2016-04-17 20:22:16 +02:00
Daniel Wolf	44d18d00f8	Added header file to CMakeLists.txt. This makes navigation easier for me. Plus, ReSharper didn't like not knowing the header files.	2016-04-14 22:14:57 +02:00
Daniel Wolf	7ce79f9c08	Replaced Boost.Log with small custom logger. Boost.Log is a complex monstrosity and I can't get it to build on OS X.	2016-04-14 09:42:47 +02:00
Daniel Wolf	4941bff739	Replaced strerror_s with (less safe) strerror. libc++ (Xcode) doesn't seem to support it.	2016-04-13 10:37:10 +02:00
Daniel Wolf	d8fbd3596b	Fixed UnboundedStream constructor	2016-04-13 10:37:10 +02:00
Daniel Wolf	db6f2e076b	Fixed GCC build	2016-04-12 23:04:16 +02:00
Daniel Wolf	4b8e38970a	Added hanging indent to help output to make it more readable	2016-04-12 21:23:15 +02:00
Daniel Wolf	fd6b3b1e2f	Supporting multiple export formats: simplified XML export format; added TSV and JSON formats; using TSV as standard export format	2016-04-12 21:08:23 +02:00
Daniel Wolf	90e1375f1b	Handling zero-length audio files	2016-04-12 20:45:47 +02:00
Daniel Wolf	7bc4e37a1a	Improved error handling and error messages	2016-04-12 18:02:52 +02:00
Daniel Wolf	04c828506d	Simplified code using Timeline<T>	2016-04-09 22:07:25 +02:00
Daniel Wolf	83291aa96c	Implemented class Timeline<T>	2016-04-09 20:56:25 +02:00
Daniel Wolf	2be3751a4f	Renamed TimeSegment to TimeRange	2016-03-28 20:30:55 +02:00
Daniel Wolf	8c1e24e9c8	Implemented voice activity detection	2016-03-16 21:01:44 +01:00
Daniel Wolf	425f47491c	Fixed compiler warnings	2016-03-16 21:01:43 +01:00
Daniel Wolf	a8900f80ec	Removing DC offset from audio. Also a bit of refactoring regarding audio processing.	2016-03-16 21:01:43 +01:00
Daniel Wolf	af5a6649c1	Implemented logging to log file	2016-03-08 22:59:44 +01:00
Daniel Wolf	35ec1f8a45	Introduced template functions to unify enum<->string conversions	2016-03-08 22:20:40 +01:00
Daniel Wolf	ad9d8e6567	Renamed `audioInput` directory to `audio`	2016-03-08 18:21:17 +01:00
Daniel Wolf	b78e418a8f	Refactored audio streams: all streams are now mono (simplifies reasoning about samples); streams can be cloned; streams can be seeked within	2016-03-07 21:28:31 +01:00
Daniel Wolf	419b0ec469	Making sure log is written in case of exception	2016-03-06 20:40:31 +01:00
Daniel Wolf	04ca644cca	Added structured logging	2016-03-03 22:31:16 +01:00
Daniel Wolf	cdffb56613	Redirecting pocketsphinx log to main log	2016-03-03 22:31:16 +01:00
Daniel Wolf	7efea6f56b	Prepared for logging using Boost.Log v2	2016-02-29 21:48:27 +01:00
Daniel Wolf	7a1f446ca3	Using GSL	2016-02-29 20:58:58 +01:00
Daniel Wolf	667edf9485	Improved dialog handling	2016-02-10 21:53:58 +01:00
Daniel Wolf	05ef692706	Added (primitive) option to explicitly supply the dialog	2016-02-09 22:08:11 +01:00
Daniel Wolf	9b10f38bcb	Added missing include	2016-02-02 10:13:07 +01:00
Daniel Wolf	f09155e486	Using raw pointers instead of iterators for string manipulation. This avoids an assertion error when I temporarily move 1 past end.	2016-02-01 20:47:27 +01:00
Daniel Wolf	75872fe45d	Using -dither to prevent recognition errors in connection with zero silence	2016-02-01 20:26:14 +01:00
Daniel Wolf	0cb0153874	Improved phone-to-mouth mapping	2016-01-31 21:39:49 +01:00
Daniel Wolf	7aa6057b8e	Allowing for long pauses in speech without breaking sync	2016-01-28 21:52:50 +01:00
Daniel Wolf	c425885929	Showing combined progress for entire task	2016-01-28 19:13:40 +01:00
Daniel Wolf	8e7fcc4efe	Implemented two-step phone detection for better accuracy	2016-01-28 14:19:32 +01:00
Daniel Wolf	2bfe671f82	Simplified directory structure to make Visual Studio build work	2016-01-08 16:59:18 +01:00
Daniel Wolf	0f33fcfbd0	Removing zero silence, seems like Sphinx doesn't like it. See http://cmusphinx.sourceforge.net/wiki/faq#qwhy_my_accuracy_is_poor I couldn't reproduce the original problem, but it doesn't seem to hurt, either.	2016-01-08 16:44:03 +01:00
Daniel Wolf	31cb3b195c	Showing progress bar	2016-01-08 10:53:35 +01:00
Daniel Wolf	f14feefeb0	Using #pragma once instead of include guards. Just looks cleaner.	2016-01-06 21:08:39 +01:00
Daniel Wolf	9e9a432f70	Improved formatting of command-line output	2016-01-06 21:08:39 +01:00
Daniel Wolf	5c0fe24fae	Refactoring: Using camelCase throughout	2016-01-06 20:47:37 +01:00
Daniel Wolf	acd13e2890	Added a number of string-related tools.	2016-01-06 20:47:29 +01:00
Daniel Wolf	3e5d6e3625	Using TCLAP to parse command line	2016-01-06 20:47:27 +01:00
Daniel Wolf	e2840dba3f	Fixed warning	2015-12-21 13:26:56 +01:00
Daniel Wolf	4baab9b207	Fixed Windows build	2015-12-21 13:17:14 +01:00
Daniel Wolf	932803d5ad	Ported platform-dependent code. Added code for Windows, OS X, Solaris, BSD, and Linux. Right now, only the Windows version has been tested.	2015-12-14 20:46:31 +01:00
Daniel Wolf	e4b5b39504	Fixed corner cases: handling silences and last mouth shape	2015-12-03 23:07:15 +01:00
Daniel Wolf	7b282ce50f	Using std::string instead of std::wstring for command-line args. Turns out that even if I manage to get Unicode command line args, there still is no portable way of opening a file from a Unicode path.	2015-12-03 23:07:15 +01:00
Daniel Wolf	27ba3ef357	Generating XML output	2015-12-03 23:07:15 +01:00
Daniel Wolf	2ef99119b0	Generating mouth shapes using simple lookup table	2015-12-01 22:55:53 +01:00
Daniel Wolf	994e2be314	Redirecting PocketSphinx log output	2015-12-01 22:55:53 +01:00
Daniel Wolf	d6f5c2ed1e	Reading sound file name from command line	2015-12-01 22:55:53 +01:00
Daniel Wolf	132adb1083	Improved error handling, plus some refactoring	2015-12-01 22:55:53 +01:00
Daniel Wolf	f2f6f75932	Refactoring: moved phone recognition code to phone_extraction.cpp; introduced type centiseconds; code reorganization	2015-12-01 22:55:52 +01:00
Daniel Wolf	713e8b5d7f	Fixed comment	2015-10-31 20:41:17 +01:00
Daniel Wolf	d96bf12c96	Fixed model path; enabled fast mode	2015-10-19 22:03:29 +02:00
Daniel Wolf	3cd82e89f8	Reading WAVE file	2015-10-19 22:03:29 +02:00
Daniel Wolf	641f64022d	Implemented WAVE reading, writing, and conversion	2015-10-19 22:03:20 +02:00

194 Commits