Commit Graph

304 Commits

Author SHA1 Message Date
Daniel Wolf 26cae93478 Refactored audio handling
Now audio clips can be passed around as const references
and don't carry state any more.
2016-07-27 21:58:37 +02:00
Daniel Wolf 799f334fa7 Using unique_ptr instead of raw pointers in object pool 2016-07-27 21:44:39 +02:00
Daniel Wolf b3b2366468 Rewrote library code for parallel execution
The new implementation correctly re-throws exceptions on the calling thread
instead of terminating the application.
2016-07-27 21:44:39 +02:00
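The behavior described in b3b2366468 (exceptions re-thrown on the calling thread rather than terminating the application) can be illustrated with a minimal sketch. This is not the project's actual code: `runParallelSketch` is a hypothetical helper, and spawning one `std::thread` per task is an assumption made to keep the example short. The idea is that each worker captures its exception in a `std::exception_ptr`, and the caller re-throws the first one after all workers have been joined.

```cpp
#include <exception>
#include <functional>
#include <mutex>
#include <stdexcept>
#include <thread>
#include <vector>

// Hypothetical helper (not the project's API): runs every task on its own
// thread and re-throws the first captured exception on the calling thread.
inline void runParallelSketch(const std::vector<std::function<void()>>& tasks) {
    std::exception_ptr firstError;  // first exception seen by any worker
    std::mutex errorMutex;          // guards firstError
    std::vector<std::thread> workers;
    workers.reserve(tasks.size());

    for (const auto& task : tasks) {
        workers.emplace_back([&task, &firstError, &errorMutex] {
            try {
                task();
            } catch (...) {
                // An exception escaping a thread function would call
                // std::terminate, so capture it instead.
                std::lock_guard<std::mutex> lock(errorMutex);
                if (!firstError) firstError = std::current_exception();
            }
        });
    }

    // Wait for all workers before touching the captured exception.
    for (auto& worker : workers) worker.join();

    // Re-throw on the calling thread, where normal handlers can catch it.
    if (firstError) std::rethrow_exception(firstError);
}
```

A caller can then wrap `runParallelSketch(tasks)` in an ordinary `try`/`catch` block. Later exceptions are dropped in this sketch; a fuller implementation might collect all of them.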
Daniel Wolf 5198ee9230 Made Lazy<T> copyable 2016-07-20 20:16:23 +02:00
Daniel Wolf 17b43ad205 Added class Lazy<T> 2016-07-19 21:33:07 +02:00
Daniel Wolf ddcadad710 Introduced user-defined literal "cs" for centiseconds
Now that ReSharper supports it (see https://youtrack.jetbrains.com/issue/RSCPP-14653)
2016-07-05 21:17:51 +02:00
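As a rough illustration of what a centisecond literal from ddcadad710 might look like, here is a sketch built on `std::chrono`. The type alias, the `_cs` spelling (suffixes without a leading underscore are reserved for the standard library) and the integer representation are assumptions, not taken from the commit itself.

```cpp
#include <chrono>
#include <ratio>

// Assumed definition: a duration counted in hundredths of a second.
using centiseconds = std::chrono::duration<long long, std::centi>;

// Hypothetical user-defined literal: 150_cs is 150 centiseconds (1.5 s).
constexpr centiseconds operator""_cs(unsigned long long count) {
    return centiseconds(static_cast<long long>(count));
}
```

With this in place, an expression such as `150_cs` converts implicitly to finer-grained `std::chrono` durations like `std::chrono::milliseconds`.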
Daniel Wolf 0447cbb4ff Refactored VAD multithreading 2016-06-30 20:52:29 +02:00
Daniel Wolf 8fa494fb77 Improved VAD quality via dry run 2016-06-30 20:42:36 +02:00
Daniel Wolf 6de7ba020a Fixed VAD error handling 2016-06-30 20:17:28 +02:00
Daniel Wolf ed27b8470c Workaround for PocketSphinx bug
See https://sourceforge.net/p/cmusphinx/discussion/help/thread/f1dd91c5/#7529
Also minor refactoring.
2016-06-30 20:06:38 +02:00
Daniel Wolf 2c0471e79f Improved lip animation for B/P and L sounds 2016-06-29 22:35:14 +02:00
Daniel Wolf 2d314f4bc7 Multithreaded recognition: refactoring and fixes
* Decoders are correctly released after use
* Determining optimal thread count for multithreading
2016-06-29 21:47:25 +02:00
Daniel Wolf f13449f810 Added thread info to logging 2016-06-29 21:47:25 +02:00
Daniel Wolf 75407dab54 Augmenting each detected voice activity to give the recognizer some silence samples to work with 2016-06-29 21:47:25 +02:00
Daniel Wolf 2a5ed95698 Improved animation quality through new algorithm
Using "lazy" ruleset instead of 1:1 mapping from phones
2016-06-29 21:46:08 +02:00
Daniel Wolf 8c9466bcf3 Removed mouth shape H (special shape for 'L' sound) 2016-06-26 21:06:22 +02:00
Daniel Wolf 9bf8355742 Sped up recognition via multithreading 2016-06-26 21:06:21 +02:00
Daniel Wolf 3a0a38575f Sped up VAD via multithreading 2016-06-26 21:06:21 +02:00
Daniel Wolf 84097756c8 Added ThreadPool class 2016-06-26 14:02:17 +02:00
Daniel Wolf 0aeb35c42e Fixed deprecated library calls 2016-06-26 11:06:44 +02:00
Daniel Wolf 96b0ad9b1d Switched to better acoustic model 2016-06-25 22:07:28 +02:00
Daniel Wolf da78375a10 Added CMU Sphinx US English acoustic model 2016-06-25 22:00:47 +02:00
Daniel Wolf c9b17e1937 Improved tokenization by taking dictionary into account 2016-06-25 21:52:04 +02:00
Daniel Wolf 8502256241 Updated LICENSE.md 2016-06-25 21:51:06 +02:00
Daniel Wolf f275267ac7 Small VAD improvements
* RAII
* Slightly fewer false positives
2016-06-24 22:35:33 +02:00
Daniel Wolf faa3f2b4bb Fixed overflow with long audio files 2016-06-24 21:51:17 +02:00
Daniel Wolf c6c31a831c Using WebRTC for voice activity detection (VAD)
My simple power-based approach wasn't reliable enough.
2016-06-21 22:20:18 +02:00
Daniel Wolf aec3dbae01 Added WebRTC library 2016-06-21 22:13:05 +02:00
Daniel Wolf 97f172282d Fixed off-by-one error in wave file reader 2016-06-21 21:47:08 +02:00
Daniel Wolf 0e00e58d91 Gracefully handling failed audio alignment 2016-06-21 19:20:27 +02:00
Daniel Wolf 944c374415 Migrated to latest CMU Sphinx version 2016-06-19 21:18:40 +02:00
Daniel Wolf 478766ff6e Updated CMU SphinxBase and PocketSphinx 2016-06-19 20:53:24 +02:00
Daniel Wolf b2f702c8f4 Fixed OS X build 2016-06-16 19:41:49 +02:00
Daniel Wolf 6c9612d2c3 Raised low-pass threshold to better cope with high-pitched voices 2016-06-15 20:14:51 +02:00
Daniel Wolf 4346552312 Improved speed of voice activity detection
... by a factor of 2 by removing the second pass.
Also added voice activity detection to progress calculation.
2016-06-15 20:14:51 +02:00
Daniel Wolf c4b054176c Fixed WAVE file reader position calculation
The bug only showed through massive seek times.
2016-06-15 20:14:44 +02:00
Daniel Wolf 522f6c2019 Made audio stream handling safe for long streams 2016-06-15 20:14:43 +02:00
Daniel Wolf d1bbe8538e Added more logging 2016-06-15 20:14:43 +02:00
Daniel Wolf 542a5ee3d8 Added join function for strings 2016-06-15 20:07:51 +02:00
Daniel Wolf 1e29151974 Fixed string conversion for Timed<void> 2016-06-14 17:36:54 +02:00
Daniel Wolf 5cc13cb16f Improved error message 2016-06-14 17:36:18 +02:00
Daniel Wolf 0d488e8de2 Restored dialog option, this time based on language model
This approach should be more robust and error-tolerant.
2016-06-10 22:35:27 +02:00
Daniel Wolf 4ed5908627 Implemented US-English G2P using sound change rules 2016-06-03 20:02:34 +02:00
Daniel Wolf 7a763e8755 Fixed syntax error in sound change data 2016-06-03 20:00:46 +02:00
Daniel Wolf bf19d267ee Added sound change code and data 2016-06-03 10:37:47 +02:00
Daniel Wolf 8be6485685 Implemented string conversion from Latin-1 to Unicode 2016-06-02 22:21:37 +02:00
Daniel Wolf 4d45bf7c89 Merged ascii.cpp into stringTools.cpp 2016-06-02 20:09:37 +02:00
Daniel Wolf 4d95b4c2c5 Implemented text tokenization using Flite 2016-06-02 18:24:27 +02:00
Daniel Wolf 8d1c618cec Patched Flite to prevent name collision with PocketSphinx 2016-06-02 18:24:27 +02:00
Daniel Wolf 942cabd773 Added Flite as library 2016-06-02 18:24:26 +02:00