Daniel Wolf
c19ad1c8d0
Using biased language model to handle dialog more forgivingly
...
Using a fixed 0.1-0.9 ratio between default and dialog language models
2016-10-21 21:41:50 +02:00
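Note on c19ad1c8d0: one way to read the fixed 0.1-0.9 ratio is as a linear interpolation between the two models, with weight 0.1 on the default model and 0.9 on the dialog model. The sketch below illustrates that reading only. The unigram probability maps and the names `WordProbabilities`, `lookup`, and `interpolate` are hypothetical, and the commit message does not say how the models are actually combined.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>

// Hypothetical stand-in for a language model: word -> unigram probability.
using WordProbabilities = std::map<std::string, double>;

// Returns the probability of a word, or 0.0 if the model does not know it.
inline double lookup(const WordProbabilities& model, const std::string& word) {
    const auto it = model.find(word);
    return it != model.end() ? it->second : 0.0;
}

// Linear interpolation of two models. The default weight of 0.1 for the
// default model (and thus 0.9 for the dialog model) is an assumption
// taken from the ratio named in the commit message.
inline double interpolate(
    const WordProbabilities& defaultModel,
    const WordProbabilities& dialogModel,
    const std::string& word,
    double defaultWeight = 0.1
) {
    assert(defaultWeight >= 0.0 && defaultWeight <= 1.0);
    return defaultWeight * lookup(defaultModel, word)
        + (1.0 - defaultWeight) * lookup(dialogModel, word);
}
```

With a default-model probability of 0.2 and a dialog-model probability of 0.8 for the same word, the combined estimate is 0.1 * 0.2 + 0.9 * 0.8 = 0.74, so words from the dialog dominate while unexpected words remain possible.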
Daniel Wolf
9cfe577612
Fixed bad config when creating language model from dialog
2016-10-21 21:17:17 +02:00
Daniel Wolf
529a32e1b2
Better animation of short pauses
2016-10-14 20:25:30 +02:00
Daniel Wolf
503ba9104a
Treating schwa as a separate phone
2016-09-30 17:12:10 +02:00
Daniel Wolf
1f6f6d6175
Added convenience function Timed<T>.getDuration()
2016-09-29 12:06:47 +02:00
Daniel Wolf
f5b7971f52
Refactoring: Replaced audio "length" with "duration"
2016-09-29 12:06:28 +02:00
Daniel Wolf
f44baaa05f
Improve noise detection heuristic
2016-09-29 12:06:06 +02:00
Daniel Wolf
760f6c2ce6
Refactoring and better logging
2016-09-29 10:44:34 +02:00
Daniel Wolf
750078618c
Sharing audio buffer between operations
2016-09-26 13:11:01 +02:00
Daniel Wolf
de05f69507
Fixed compiler warning
2016-09-23 21:15:55 +02:00
Daniel Wolf
2fdd98f5b3
Removed potentially unsafe conversion
2016-09-23 21:15:34 +02:00
Daniel Wolf
938079a75f
Renamed phoneExtraction to phoneRecognition
2016-09-21 10:32:26 +02:00
Daniel Wolf
600b3429a7
No longer discarding "burnt" decoders
...
See https://sourceforge.net/p/cmusphinx/discussion/help/thread/f1dd91c5/#1d89/0491/7f0c/60fc
2016-09-21 10:28:31 +02:00
Daniel Wolf
eea1eb381c
Refactored ObjectPool to correctly handle custom deleters
2016-09-21 10:25:08 +02:00
Daniel Wolf
d97c880754
Performing per-utterance cepstral mean normalization
...
See discussion in https://sourceforge.net/p/cmusphinx/discussion/help/thread/51e2979b/
2016-09-18 22:02:02 +02:00
Daniel Wolf
f4f9ffe883
Logging bin path, hoping to crack that elusive segfault
2016-09-18 22:00:55 +02:00
Daniel Wolf
cf13499158
Caching bin path
2016-09-18 22:00:08 +02:00
Daniel Wolf
0ab009e17a
Workaround for off-by-one error in whereami library
2016-09-11 13:17:52 +02:00
Daniel Wolf
2607b9a12b
Fixed Boost version check
2016-09-11 12:59:09 +02:00
Daniel Wolf
c679b8fb71
Using different xml_writer_settings signature for old Boost versions
2016-09-11 11:40:17 +02:00
Daniel Wolf
261a768e0d
Removed Boost.Predef since it's not available in Boost 1.54
2016-09-11 11:40:17 +02:00
Daniel Wolf
d4b86357cf
Using boost::optional<T>.get_value_or() instead of value_or() for old Boost versions
2016-09-11 11:40:16 +02:00
Daniel Wolf
d98de34b98
Replaced calls to boost::optional<T>.value() with operator*
...
Boost 1.54 doesn't support value() yet, plus * is cleaner
2016-09-11 11:40:16 +02:00
Daniel Wolf
2aef178eb0
Better error messages for incompatible WAVE files
2016-09-10 21:19:12 +02:00
Daniel Wolf
b95a3f621c
Fixed Linux build
2016-08-31 22:21:53 +02:00
Daniel Wolf
8fd78d63cf
Animating pauses only between words, not at start or end of recording
2016-08-11 16:28:04 +02:00
Daniel Wolf
a632e7a3b3
Fixed TSV export
...
Exporter now terminates with shape X rather than A.
2016-08-11 15:49:51 +02:00
Daniel Wolf
81111ef96a
Fixed infinite loop with short recordings
2016-08-11 15:45:16 +02:00
Daniel Wolf
78027ea63c
Thread count can be limited via command-line argument
2016-08-11 10:29:01 +02:00
Daniel Wolf
206cde4658
Supporting noises (breathing, smacking, etc.)
2016-08-11 10:18:03 +02:00
Daniel Wolf
bd1f8226ec
Added TimeRange.trim() method
2016-08-11 10:16:50 +02:00
Daniel Wolf
734d06ad38
Disabling PocketSphinx's VAD
...
We're performing VAD ourselves
2016-08-10 20:46:32 +02:00
Daniel Wolf
a851a76ce5
Minor improvements to animation rules
2016-08-10 20:13:05 +02:00
Daniel Wolf
8b025a3522
Fixed predictive mouth animation
2016-08-10 18:53:01 +02:00
Daniel Wolf
16892ae991
Fixed OS X build
2016-08-10 18:24:24 +02:00
Daniel Wolf
b22378221f
Better AH animation
2016-08-07 20:38:02 +02:00
Daniel Wolf
c65c8b4eb3
Better animation of pauses in speech
2016-08-05 19:34:57 +02:00
Daniel Wolf
1c50ece142
Refactoring
2016-08-05 17:17:25 +02:00
Daniel Wolf
b62fe8af98
Improved timing of bilabial stops ("B", "P")
2016-08-04 22:21:48 +02:00
Daniel Wolf
c566ac56cc
Suppressing log messages in console for non-debug builds
2016-08-04 21:02:40 +02:00
Daniel Wolf
229105a965
Fixed erratic progress display
2016-08-04 20:39:40 +02:00
Daniel Wolf
6888dadd04
Speedup through better multithreading
...
* Fixed excessive locking
* Using more threads for voice recognition
2016-08-04 19:39:43 +02:00
Daniel Wolf
1cb41b8309
Workaround for another kind of decoder corruption
2016-08-03 21:33:13 +02:00
Daniel Wolf
0a577d1947
Fixed audio resampling
...
Audio was cut off due to incorrect length calculation
2016-08-03 20:55:45 +02:00
Daniel Wolf
f356855bbd
Implemented tweening for smoother animation
2016-08-02 22:02:59 +02:00
Daniel Wolf
95d46ef0b7
Rewritten animation code
...
* Still uses (almost) the same rules, but more powerful underlying concept
* Re-introduced shape H for "L" sounds
* Introduced shape X for idle position
2016-07-31 21:42:37 +02:00
Daniel Wolf
26cae93478
Refactored audio handling
...
Now audio clips can be passed around as const references
and don't carry state any more.
2016-07-27 21:58:37 +02:00
Daniel Wolf
799f334fa7
Using unique_ptr instead of raw pointers in object pool
2016-07-27 21:44:39 +02:00
Daniel Wolf
b3b2366468
Rewritten library code for parallel execution
...
The new implementation correctly re-throws exceptions on the calling thread
instead of terminating the application.
2016-07-27 21:44:39 +02:00
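Note on b3b2366468: the behavior described above (re-throwing on the calling thread instead of terminating) can be sketched with `std::exception_ptr`. This is a minimal illustration, not the project's implementation. `runParallel` is a hypothetical name, and the sketch starts one thread per task rather than using a pool.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <mutex>
#include <stdexcept>
#include <thread>
#include <vector>

// Hypothetical helper: runs each task on its own thread. An exception that
// escapes a task is captured instead of escaping the thread (which would call
// std::terminate). After all threads have joined, the first captured
// exception is re-thrown on the calling thread.
inline void runParallel(const std::vector<std::function<void()>>& tasks) {
    std::exception_ptr firstError;
    std::mutex errorMutex;

    std::vector<std::thread> threads;
    threads.reserve(tasks.size());
    for (const auto& task : tasks) {
        threads.emplace_back([&task, &firstError, &errorMutex] {
            try {
                task();
            } catch (...) {
                // Remember only the first failure; later ones are dropped.
                std::lock_guard<std::mutex> lock(errorMutex);
                if (!firstError) firstError = std::current_exception();
            }
        });
    }

    for (auto& thread : threads) thread.join();
    if (firstError) std::rethrow_exception(firstError);
}
```

A caller can then wrap `runParallel(tasks)` in an ordinary `try`/`catch` block and handle a worker's failure like any other exception.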
Daniel Wolf
5198ee9230
Made Lazy<T> copyable
2016-07-20 20:16:23 +02:00
Daniel Wolf
17b43ad205
Added class Lazy<T>
2016-07-19 21:33:07 +02:00
Daniel Wolf
ddcadad710
Introduced user-defined literal "cs" for centiseconds
...
Now that ReSharper supports it (see https://youtrack.jetbrains.com/issue/RSCPP-14653 )
2016-07-05 21:17:51 +02:00
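Note on ddcadad710: a user-defined literal for centiseconds can be built on `std::chrono`. The sketch below is an assumption about the general shape, not the project's code. The `centiseconds` alias is hypothetical, and the suffix is spelled `_cs` because the C++ standard reserves suffixes without a leading underscore for the standard library.

```cpp
#include <cassert>
#include <chrono>
#include <ratio>

// Hypothetical duration type: one tick equals 1/100 of a second.
using centiseconds = std::chrono::duration<long long, std::centi>;

// User-defined literal: 15_cs yields a duration of 15 centiseconds.
constexpr centiseconds operator""_cs(unsigned long long value) {
    return centiseconds(static_cast<long long>(value));
}

// Durations convert losslessly to finer units, so 15_cs equals 150 ms.
static_assert(
    std::chrono::milliseconds(15_cs) == std::chrono::milliseconds(150),
    "15 centiseconds should equal 150 milliseconds"
);
```

Timing constants can then be written as `15_cs` instead of `centiseconds(15)`.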
Daniel Wolf
0447cbb4ff
Refactored VAD multithreading
2016-06-30 20:52:29 +02:00
Daniel Wolf
8fa494fb77
Improved VAD quality via dry run
2016-06-30 20:42:36 +02:00
Daniel Wolf
6de7ba020a
Fixed VAD error handling
2016-06-30 20:17:28 +02:00
Daniel Wolf
ed27b8470c
Workaround for PocketSphinx bug
...
See https://sourceforge.net/p/cmusphinx/discussion/help/thread/f1dd91c5/#7529
Also minor refactoring.
2016-06-30 20:06:38 +02:00
Daniel Wolf
2c0471e79f
Improved lip animation for B/P and L sounds
2016-06-29 22:35:14 +02:00
Daniel Wolf
2d314f4bc7
Multithreaded recognition: refactoring and fixes
...
* Decoders are correctly released after use
* Determining optimal thread count for multithreading
2016-06-29 21:47:25 +02:00
Daniel Wolf
f13449f810
Added thread info to logging
2016-06-29 21:47:25 +02:00
Daniel Wolf
75407dab54
Augmenting each detected voice activity to give recognizer some silence samples to work with
2016-06-29 21:47:25 +02:00
Daniel Wolf
2a5ed95698
Improved animation quality through new algorithm
...
Using "lazy" ruleset instead of 1:1 mapping from phones
2016-06-29 21:46:08 +02:00
Daniel Wolf
8c9466bcf3
Removed mouth shape H (special shape for 'L' sound)
2016-06-26 21:06:22 +02:00
Daniel Wolf
9bf8355742
Sped up recognition via multithreading
2016-06-26 21:06:21 +02:00
Daniel Wolf
3a0a38575f
Sped up VAD via multithreading
2016-06-26 21:06:21 +02:00
Daniel Wolf
84097756c8
Added ThreadPool class
2016-06-26 14:02:17 +02:00
Daniel Wolf
0aeb35c42e
Fixed deprecated library calls
2016-06-26 11:06:44 +02:00
Daniel Wolf
c9b17e1937
Improved tokenization by taking dictionary into account
2016-06-25 21:52:04 +02:00
Daniel Wolf
f275267ac7
Small VAD improvements
...
* RAII
* Slightly fewer false positives
2016-06-24 22:35:33 +02:00
Daniel Wolf
faa3f2b4bb
Fixed overflow with long audio files
2016-06-24 21:51:17 +02:00
Daniel Wolf
c6c31a831c
Using WebRTC for voice activity detection (VAD)
...
My simple power-based approach wasn't reliable enough.
2016-06-21 22:20:18 +02:00
Daniel Wolf
97f172282d
Fixed off-by-one error in wave file reader
2016-06-21 21:47:08 +02:00
Daniel Wolf
0e00e58d91
Gracefully handling failed audio alignment
2016-06-21 19:20:27 +02:00
Daniel Wolf
944c374415
Migrated to latest CMU Sphinx version
2016-06-19 21:18:40 +02:00
Daniel Wolf
b2f702c8f4
Fixed OS X build
2016-06-16 19:41:49 +02:00
Daniel Wolf
6c9612d2c3
Raised low-pass threshold to better cope with high-pitched voices
2016-06-15 20:14:51 +02:00
Daniel Wolf
4346552312
Improved speed of voice activity detection
...
... by a factor of 2 by removing the second pass.
Also added voice activity detection to progress calculation.
2016-06-15 20:14:51 +02:00
Daniel Wolf
c4b054176c
Fixed WAVE file reader position calculation
...
The bug only manifested as very long seek times.
2016-06-15 20:14:44 +02:00
Daniel Wolf
522f6c2019
Made audio stream handling safe for long streams
2016-06-15 20:14:43 +02:00
Daniel Wolf
d1bbe8538e
Added more logging
2016-06-15 20:14:43 +02:00
Daniel Wolf
542a5ee3d8
Added join function for strings
2016-06-15 20:07:51 +02:00
Daniel Wolf
1e29151974
Fixed string conversion for Timed<void>
2016-06-14 17:36:54 +02:00
Daniel Wolf
5cc13cb16f
Improved error message
2016-06-14 17:36:18 +02:00
Daniel Wolf
0d488e8de2
Restored dialog option, this time based on language model
...
This approach should be more robust and error-tolerant.
2016-06-10 22:35:27 +02:00
Daniel Wolf
4ed5908627
Implemented US-English G2P using sound change rules
2016-06-03 20:02:34 +02:00
Daniel Wolf
8be6485685
Implemented string conversion from Latin-1 to Unicode
2016-06-02 22:21:37 +02:00
Daniel Wolf
4d45bf7c89
Merged ascii.cpp into stringTools.cpp
2016-06-02 20:09:37 +02:00
Daniel Wolf
4d95b4c2c5
Implemented text tokenization using Flite
2016-06-02 18:24:27 +02:00
Daniel Wolf
d4b9a8e0c6
Implemented simple conversion from Unicode string to ASCII
2016-06-02 18:24:25 +02:00
Daniel Wolf
f1563919e1
Removing redundant prefixes from PocketSphinx log output
2016-05-17 17:56:11 +02:00
Daniel Wolf
c67e916185
Splitting audio into utterances before processing
...
Advantages:
* No problems with long silences (PocketSphinx doesn't like them)
* Potential for parallelization
* Potential for improved phone timing accuracy
2016-05-17 16:01:10 +02:00
Daniel Wolf
bbc933a821
Temporarily removed --dialog option
2016-05-17 14:28:18 +02:00
Daniel Wolf
2f31c5aa61
Refactoring
...
* Rewriting Timeline<T> to be sparse, i.e., allow gaps
* Added specialized subclasses BoundedTimeline<T> and ContinuousTimeline<T>
* Timed<T> now has a TimeRange instead of being one (has-a, not is-a)
* Introducing Timed<void>
2016-05-17 14:28:18 +02:00
Daniel Wolf
9eef09145e
Added getPairs function
2016-05-12 21:44:46 +02:00
Daniel Wolf
baf2423b27
Added time manipulation functions to TimeRange and Timeline
2016-04-19 22:06:20 +02:00
Daniel Wolf
895b942df3
Implemented AudioStreamSegment
2016-04-19 22:04:43 +02:00
Daniel Wolf
ce204c68de
Fixed constness
2016-04-19 21:12:44 +02:00
Daniel Wolf
c14fb1c7b2
Fixed output format for structured logging
2016-04-19 19:30:38 +02:00
Daniel Wolf
8d2d100376
Refactored enum serialization/deserialization
2016-04-17 20:22:16 +02:00
Daniel Wolf
44d18d00f8
Added header file to CMakeLists.txt
...
This makes navigation easier for me. Plus, ReSharper didn't like not knowing the header files.
2016-04-14 22:14:57 +02:00
Daniel Wolf
7ce79f9c08
Replaced Boost.Log with small custom logger
...
Boost.Log is a complex monstrosity and I can't get it to build on OS X.
2016-04-14 09:42:47 +02:00