Daniel Wolf
c9b17e1937
Improved tokenization by taking dictionary into account
2016-06-25 21:52:04 +02:00
Daniel Wolf
c6c31a831c
Using WebRTC for voice activity detection (VAD)
My simple power-based approach wasn't reliable enough.
2016-06-21 22:20:18 +02:00
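The commit does not say how the earlier power-based approach worked. As a rough illustration only, a common form of power-based voice activity detection computes the RMS level of each fixed-size frame and compares it against a threshold. The function name, the frame size parameter, and the threshold below are assumptions, not the project's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: flags each frame of `frameSize` samples as active
// when its RMS level reaches `rmsThreshold`. The last frame may be shorter.
std::vector<bool> detectActivityByPower(
    const std::vector<float>& samples,
    std::size_t frameSize,
    float rmsThreshold
) {
    assert(frameSize > 0);
    std::vector<bool> frameIsActive;
    for (std::size_t start = 0; start < samples.size(); start += frameSize) {
        const std::size_t end = std::min(start + frameSize, samples.size());
        double sumOfSquares = 0.0;
        for (std::size_t i = start; i < end; ++i) {
            sumOfSquares += static_cast<double>(samples[i]) * samples[i];
        }
        const double rms = std::sqrt(sumOfSquares / (end - start));
        frameIsActive.push_back(rms >= rmsThreshold);
    }
    return frameIsActive;
}
```

A fixed threshold like this tends to misfire on background noise or quiet speech, which is one plausible reason such an approach would be unreliable.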
Daniel Wolf
0e00e58d91
Gracefully handling failed audio alignment
2016-06-21 19:20:27 +02:00
Daniel Wolf
944c374415
Migrated to latest CMU Sphinx version
2016-06-19 21:18:40 +02:00
Daniel Wolf
4346552312
Improved speed of voice activity detection
... by a factor of 2 by removing the second pass.
Also added voice activity detection to progress calculation.
2016-06-15 20:14:51 +02:00
Daniel Wolf
d1bbe8538e
Added more logging
2016-06-15 20:14:43 +02:00
Daniel Wolf
0d488e8de2
Restored dialog option, this time based on language model
This approach should be more robust and error-tolerant.
2016-06-10 22:35:27 +02:00
Daniel Wolf
f1563919e1
Removing redundant prefixes from PocketSphinx log output
2016-05-17 17:56:11 +02:00
Daniel Wolf
c67e916185
Splitting audio into utterances before processing
Advantages:
* No problems with long silences (PocketSphinx doesn't like them)
* Potential for parallelization
* Potential for improved phone timing accuracy
2016-05-17 16:01:10 +02:00
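Splitting audio into utterances can be sketched as grouping consecutive voice-active frames into ranges, so that each range can be processed on its own. This is a minimal sketch under assumptions: the per-frame activity flags, the function name, and the half-open frame ranges are illustrative and not taken from the project.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: groups consecutive active frames into utterances.
// Each utterance is a half-open frame range [first, second).
std::vector<std::pair<std::size_t, std::size_t>> splitIntoUtterances(
    const std::vector<bool>& frameIsActive
) {
    std::vector<std::pair<std::size_t, std::size_t>> utterances;
    std::size_t start = 0;
    bool inUtterance = false;
    for (std::size_t i = 0; i < frameIsActive.size(); ++i) {
        if (frameIsActive[i] && !inUtterance) {
            // An utterance begins at the first active frame after silence.
            start = i;
            inUtterance = true;
        } else if (!frameIsActive[i] && inUtterance) {
            // An utterance ends at the first inactive frame.
            utterances.emplace_back(start, i);
            inUtterance = false;
        }
    }
    if (inUtterance) {
        // Close an utterance that runs to the end of the audio.
        utterances.emplace_back(start, frameIsActive.size());
    }
    return utterances;
}
```

Because the resulting ranges are independent, silences never reach the recognizer and the ranges could be handed to separate workers.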
Daniel Wolf
bbc933a821
Temporarily removed --dialog option
2016-05-17 14:28:18 +02:00
Daniel Wolf
2f31c5aa61
Refactoring
* Rewriting Timeline<T> to be sparse, i.e., allow gaps
* Added specialized subclasses BoundedTimeline<T> and ContinuousTimeline<T>
* Timed<T> and TimeRange: has-a, not is-a
* Introducing Timed<void>
2016-05-17 14:28:18 +02:00
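The has-a relationship between Timed<T> and TimeRange, together with the Timed<void> specialization, can be illustrated with a minimal sketch. The member names, accessors, and the plain integer time unit below are assumptions for illustration, not the project's actual code.

```cpp
#include <cassert>
#include <utility>

// Illustrative time range with integer start and end times.
class TimeRange {
public:
    TimeRange(int start, int end) : start_(start), end_(end) {
        assert(start <= end);
    }
    int getStart() const { return start_; }
    int getEnd() const { return end_; }
    int getDuration() const { return end_ - start_; }
private:
    int start_;
    int end_;
};

// Composition: a Timed<T> has a TimeRange rather than being one.
template<typename T>
class Timed {
public:
    Timed(TimeRange range, T value) : range_(range), value_(std::move(value)) {}
    const TimeRange& getTimeRange() const { return range_; }
    const T& getValue() const { return value_; }
private:
    TimeRange range_;
    T value_;
};

// Specialization for elements that carry only a time range and no value.
template<>
class Timed<void> {
public:
    explicit Timed(TimeRange range) : range_(range) {}
    const TimeRange& getTimeRange() const { return range_; }
private:
    TimeRange range_;
};
```

With composition, a Timed<void> can mark a stretch of time (for example, a span of voice activity) without inventing a dummy value type.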
Daniel Wolf
8d2d100376
Refactored enum serialization/deserialization
2016-04-17 20:22:16 +02:00
Daniel Wolf
7ce79f9c08
Replaced Boost.Log with small custom logger
Boost.Log is a complex monstrosity and I can't get it to build on OS X.
2016-04-14 09:42:47 +02:00
Daniel Wolf
90e1375f1b
Handling zero-length audio files
2016-04-12 20:45:47 +02:00
Daniel Wolf
04c828506d
Simplified code using Timeline<T>
2016-04-09 22:07:25 +02:00
Daniel Wolf
a8900f80ec
Removing DC offset from audio
Also a bit of refactoring regarding audio processing
2016-03-16 21:01:43 +01:00
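Removing a DC offset usually means subtracting the mean sample value so that the signal is centered on zero. The sketch below assumes the whole signal is available in memory as floats. The function name is hypothetical, and the project may well do this differently (for example, on a stream).

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: returns a copy of the samples with their arithmetic
// mean (the DC offset) subtracted. Empty input yields empty output.
std::vector<float> removeDcOffset(const std::vector<float>& samples) {
    if (samples.empty()) return {};
    const double sum = std::accumulate(samples.begin(), samples.end(), 0.0);
    const float offset = static_cast<float>(sum / samples.size());
    std::vector<float> result;
    result.reserve(samples.size());
    for (float sample : samples) {
        result.push_back(sample - offset);
    }
    return result;
}
```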
Daniel Wolf
35ec1f8a45
Introduced template functions to unify enum<->string conversions
2016-03-08 22:20:40 +01:00
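One way to unify enum<->string conversions with template functions is to have each enum supply a table of its members and let two generic functions do the lookups. This is only a sketch of that idea. The names getEnumMembers, enumToString, parseEnum, and the Level enum are all hypothetical, and the commit's actual design is not shown here.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Each enum specializes this function to list its members with their names.
template<typename TEnum>
const std::vector<std::pair<TEnum, std::string>>& getEnumMembers();

// Generic enum -> string conversion based on the member table.
template<typename TEnum>
std::string enumToString(TEnum value) {
    for (const auto& member : getEnumMembers<TEnum>()) {
        if (member.first == value) return member.second;
    }
    throw std::invalid_argument("Unknown enum value.");
}

// Generic string -> enum conversion based on the member table.
template<typename TEnum>
TEnum parseEnum(const std::string& name) {
    for (const auto& member : getEnumMembers<TEnum>()) {
        if (member.second == name) return member.first;
    }
    throw std::invalid_argument("Unknown enum name: " + name);
}

// Example enum, made up for this sketch.
enum class Level { Debug, Info };

template<>
const std::vector<std::pair<Level, std::string>>& getEnumMembers<Level>() {
    static const std::vector<std::pair<Level, std::string>> members = {
        { Level::Debug, "Debug" },
        { Level::Info, "Info" }
    };
    return members;
}
```

Adding a new enum then only requires one table, and both conversion directions stay consistent with each other.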
Daniel Wolf
ad9d8e6567
Renamed `audioInput` directory to `audio`
2016-03-08 18:21:17 +01:00
Daniel Wolf
b78e418a8f
Refactored audio streams
* All streams are now mono (simplifies reasoning about samples)
* Streams can be cloned
* Streams support seeking
2016-03-07 21:28:31 +01:00
Daniel Wolf
04ca644cca
Added structured logging
2016-03-03 22:31:16 +01:00
Daniel Wolf
cdffb56613
Redirecting PocketSphinx log to main log
2016-03-03 22:31:16 +01:00
Daniel Wolf
7a1f446ca3
Using GSL
2016-02-29 20:58:58 +01:00
Daniel Wolf
667edf9485
Improved dialog handling
2016-02-10 21:53:58 +01:00
Daniel Wolf
05ef692706
Added (primitive) option to explicitly supply the dialog
2016-02-09 22:08:11 +01:00
Daniel Wolf
75872fe45d
Using -dither to prevent recognition errors in connection with zero silence
2016-02-01 20:26:14 +01:00
Daniel Wolf
7aa6057b8e
Allowing for long pauses in speech without breaking sync
2016-01-28 21:52:50 +01:00
Daniel Wolf
c425885929
Showing combined progress for entire task
2016-01-28 19:13:40 +01:00
Daniel Wolf
8e7fcc4efe
Implemented two-step phone detection for better accuracy
2016-01-28 14:19:32 +01:00
Daniel Wolf
2bfe671f82
Simplified directory structure to make Visual Studio build work
2016-01-08 16:59:18 +01:00
Daniel Wolf
0f33fcfbd0
Removing zero silence, since Sphinx doesn't seem to like it
See http://cmusphinx.sourceforge.net/wiki/faq#qwhy_my_accuracy_is_poor
I couldn't reproduce the original problem, but it doesn't seem to hurt, either.
2016-01-08 16:44:03 +01:00
Daniel Wolf
31cb3b195c
Showing progress bar
2016-01-08 10:53:35 +01:00
Daniel Wolf
5c0fe24fae
Refactoring: Using camelCase throughout
2016-01-06 20:47:37 +01:00