Daniel Wolf
8e1d1fbdd3
Unified acronym capitalization
See http://stackoverflow.com/a/27172000/52041
2016-11-16 11:56:52 +01:00
Daniel Wolf
3e34425c11
Refactoring: Split code into multiple projects
2016-11-16 11:01:01 +01:00
Daniel Wolf
1f6f6d6175
Added convenience function Timed<T>.getDuration()
2016-09-29 12:06:47 +02:00
Daniel Wolf
f5b7971f52
Refactoring: Replaced audio "length" with "duration"
2016-09-29 12:06:28 +02:00
Daniel Wolf
f44baaa05f
Improve noise detection heuristic
2016-09-29 12:06:06 +02:00
Daniel Wolf
750078618c
Sharing audio buffer between operations
2016-09-26 13:11:01 +02:00
Daniel Wolf
d97c880754
Performing per-utterance cepstral mean normalization
See discussion in https://sourceforge.net/p/cmusphinx/discussion/help/thread/51e2979b/
2016-09-18 22:02:02 +02:00
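For readers unfamiliar with the term, the arithmetic behind per-utterance cepstral mean normalization can be sketched as follows. This is not the code from the commit, which presumably delegates the work to the recognizer. The names `Frame` and `normalizeCepstralMean` are hypothetical, and the sketch assumes that every frame of an utterance has the same number of coefficients.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical type: one frame of cepstral coefficients.
using Frame = std::vector<double>;

// Subtracts, for each coefficient, its mean over the whole utterance.
// Assumes all frames have the same number of coefficients.
inline std::vector<Frame> normalizeCepstralMean(const std::vector<Frame>& utterance) {
    if (utterance.empty()) return {};

    const std::size_t dimension = utterance.front().size();

    // Accumulate the per-coefficient sum, then divide to get the mean.
    Frame mean(dimension, 0.0);
    for (const Frame& frame : utterance) {
        for (std::size_t i = 0; i < dimension; ++i) {
            mean[i] += frame[i];
        }
    }
    for (double& value : mean) {
        value /= static_cast<double>(utterance.size());
    }

    // Return a normalized copy; the input is left untouched.
    std::vector<Frame> result = utterance;
    for (Frame& frame : result) {
        for (std::size_t i = 0; i < dimension; ++i) {
            frame[i] -= mean[i];
        }
    }
    return result;
}
```

Computing the mean per utterance rather than over the whole recording lets each utterance be normalized independently of the others.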
Daniel Wolf
2aef178eb0
Better error messages for incompatible WAVE files
2016-09-10 21:19:12 +02:00
Daniel Wolf
78027ea63c
Thread count can be limited via command-line argument
2016-08-11 10:29:01 +02:00
Daniel Wolf
206cde4658
Supporting noises (breathing, smacking, etc.)
2016-08-11 10:18:03 +02:00
Daniel Wolf
16892ae991
Fixed OS X build
2016-08-10 18:24:24 +02:00
Daniel Wolf
229105a965
Fixed erratic progress display
2016-08-04 20:39:40 +02:00
Daniel Wolf
0a577d1947
Fixed audio resampling
Audio was cut off due to incorrect length calculation
2016-08-03 20:55:45 +02:00
Daniel Wolf
26cae93478
Refactored audio handling
Now audio clips can be passed around as const references
and don't carry state any more.
2016-07-27 21:58:37 +02:00
Daniel Wolf
b3b2366468
Re-written library code for parallel execution
The new implementation correctly re-throws exceptions on the calling thread
instead of terminating the application.
2016-07-27 21:44:39 +02:00
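One common way to get the behaviour described in this commit is to run each task through `std::async` and collect the results on the caller, since `std::future::get()` rethrows whatever the task threw. The sketch below illustrates that technique only. It is not the project's implementation, and the helper name `runParallel` is hypothetical.

```cpp
#include <functional>
#include <future>
#include <vector>

// Hypothetical helper: runs all tasks concurrently and rethrows the first
// stored exception on the calling thread.
inline void runParallel(const std::vector<std::function<void()>>& tasks) {
    std::vector<std::future<void>> futures;
    futures.reserve(tasks.size());
    for (const auto& task : tasks) {
        // std::launch::async forces each task onto its own thread.
        futures.push_back(std::async(std::launch::async, task));
    }

    // Wait for every task first, so nothing is still running when we rethrow.
    for (auto& future : futures) {
        future.wait();
    }

    // get() rethrows an exception thrown inside the task, here on the caller.
    for (auto& future : futures) {
        future.get();
    }
}
```

An exception that escapes a plain `std::thread` function calls `std::terminate`, which matches the failure mode the commit message mentions. Routing the exception through a future avoids that.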
Daniel Wolf
ddcadad710
Introduced user-defined literal "cs" for centiseconds
Now that ReSharper supports it (see https://youtrack.jetbrains.com/issue/RSCPP-14653 )
2016-07-05 21:17:51 +02:00
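A user-defined literal for centiseconds can be sketched on top of `std::chrono` as shown below. The alias `centiseconds` and the exact spelling of the suffix are assumptions. The sketch uses `_cs` because suffixes without a leading underscore are reserved for the standard library.

```cpp
#include <chrono>
#include <ratio>

// Assumed alias: a duration counted in hundredths of a second.
using centiseconds = std::chrono::duration<long long, std::centi>;

// User-defined literal, so that 25_cs reads as "25 centiseconds".
constexpr centiseconds operator""_cs(unsigned long long count) {
    return centiseconds(static_cast<long long>(count));
}

// std::chrono converts between units at compile time.
static_assert(150_cs == std::chrono::milliseconds(1500),
              "150 centiseconds equal 1500 milliseconds");
```

With this in place, a duration can be written as `25_cs` instead of `centiseconds(25)`.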
Daniel Wolf
0447cbb4ff
Refactored VAD multithreading
2016-06-30 20:52:29 +02:00
Daniel Wolf
8fa494fb77
Improved VAD quality via dry run
2016-06-30 20:42:36 +02:00
Daniel Wolf
6de7ba020a
Fixed VAD error handling
2016-06-30 20:17:28 +02:00
Daniel Wolf
2d314f4bc7
Multithreaded recognition: refactoring and fixes
* Decoders are correctly released after use
* Determining optimal thread count for multithreading
2016-06-29 21:47:25 +02:00
Daniel Wolf
75407dab54
Augmenting each detected voice activity to give recognizer some silence samples to work with
2016-06-29 21:47:25 +02:00
Daniel Wolf
3a0a38575f
Sped up VAD via multithreading
2016-06-26 21:06:21 +02:00
Daniel Wolf
0aeb35c42e
Fixed deprecated library calls
2016-06-26 11:06:44 +02:00
Daniel Wolf
f275267ac7
Small VAD improvements
* RAII
* Slightly fewer false positives
2016-06-24 22:35:33 +02:00
Daniel Wolf
faa3f2b4bb
Fixed overflow with long audio files
2016-06-24 21:51:17 +02:00
Daniel Wolf
c6c31a831c
Using WebRTC for voice activity detection (VAD)
My simple power-based approach wasn't reliable enough.
2016-06-21 22:20:18 +02:00
Daniel Wolf
97f172282d
Fixed off-by-one error in wave file reader
2016-06-21 21:47:08 +02:00
Daniel Wolf
6c9612d2c3
Raised low-pass threshold to better cope with high-pitched voices
2016-06-15 20:14:51 +02:00
Daniel Wolf
4346552312
Improved speed of voice activity detection
... by factor 2 by removing second pass.
Also added voice activity detection to progress calculation.
2016-06-15 20:14:51 +02:00
Daniel Wolf
c4b054176c
Fixed WAVE file reader position calculation
The bug only showed through massive seek times.
2016-06-15 20:14:44 +02:00
Daniel Wolf
522f6c2019
Made audio stream handling safe for long streams
2016-06-15 20:14:43 +02:00
Daniel Wolf
d1bbe8538e
Added more logging
2016-06-15 20:14:43 +02:00
Daniel Wolf
c67e916185
Splitting audio into utterances before processing
Advantages:
* No problems with long silences (PocketSphinx doesn't like them)
* Potential for parallelization
* Potential for improved phone timing accuracy
2016-05-17 16:01:10 +02:00
Daniel Wolf
2f31c5aa61
Refactoring
* Rewriting Timeline<T> to be sparse, i.e., allow gaps
* Added specialized subclasses BoundedTimeline<T> and ContinuousTimeline<T>
* Timed<T> and TimeRange: has-a, not is-a
* Introducing Timed<void>
2016-05-17 14:28:18 +02:00
Daniel Wolf
895b942df3
Implemented AudioStreamSegment
2016-04-19 22:04:43 +02:00
Daniel Wolf
ce204c68de
Fixed constness
2016-04-19 21:12:44 +02:00
Daniel Wolf
7ce79f9c08
Replaced Boost.Log with small custom logger
Boost.Log is a complex monstrosity and I can't get it to build on OS X.
2016-04-14 09:42:47 +02:00
Daniel Wolf
4941bff739
Replaced strerror_s with (less safe) strerror
libc++ (Xcode) doesn't seem to support it.
2016-04-13 10:37:10 +02:00
Daniel Wolf
d8fbd3596b
Fixed UnboundedStream constructor
2016-04-13 10:37:10 +02:00
Daniel Wolf
db6f2e076b
Fixed GCC build
2016-04-12 23:04:16 +02:00
Daniel Wolf
90e1375f1b
Handling zero-length audio files
2016-04-12 20:45:47 +02:00
Daniel Wolf
7bc4e37a1a
Improved error handling and error messages
2016-04-12 18:02:52 +02:00
Daniel Wolf
04c828506d
Simplified code using Timeline<T>
2016-04-09 22:07:25 +02:00
Daniel Wolf
2be3751a4f
Renamed TimeSegment to TimeRange
2016-03-28 20:30:55 +02:00
Daniel Wolf
8c1e24e9c8
Implemented voice activity detection
2016-03-16 21:01:44 +01:00
Daniel Wolf
425f47491c
Fixed compiler warnings
2016-03-16 21:01:43 +01:00
Daniel Wolf
a8900f80ec
Removing DC offset from audio
Also a bit of refactoring regarding audio processing
2016-03-16 21:01:43 +01:00
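The simplest form of DC offset removal subtracts the mean sample value from every sample. The sketch below shows that idea only. The commit may estimate or apply the offset differently, and the function name `removeDcOffset` is hypothetical.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helper: returns a copy of the samples centered around zero.
inline std::vector<float> removeDcOffset(const std::vector<float>& samples) {
    if (samples.empty()) return {};

    // Accumulate in double to limit rounding error on long clips.
    const double mean =
        std::accumulate(samples.begin(), samples.end(), 0.0) / samples.size();

    std::vector<float> result;
    result.reserve(samples.size());
    for (float sample : samples) {
        result.push_back(static_cast<float>(sample - mean));
    }
    return result;
}
```

Returning a new vector instead of modifying the input keeps the function free of side effects. The price is one extra copy of the audio data.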
Daniel Wolf
ad9d8e6567
Renamed `audioInput` directory to `audio`
2016-03-08 18:21:17 +01:00