The monitor command lets you nominate two existing counters to track the
maximum sample value the VOX encounters and the maximum sibilant count. When
the enhanced VOX is not in use the sample range is zero to 32768 and the
sibilant counter is not used (you still need to specify it, as I doubt anyone
will ever not use the enhanced VOX). When the enhanced VOX is in use the
values are a sum of the samples over the window duration, so for the default
30 millisecond window at the 22 kHz the VOX runs at that's roughly
22 * 30 = 660 samples per window, and values can be in the hundreds of
thousands even for silence.
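The window arithmetic above can be sketched in a few lines of Python. This is only an illustration of the numbers quoted in the text, not anything DMDX itself runs; the function names are made up for the example, and the per-sample background level of 200 is an invented figure chosen to show how silence can still sum to six digits.

```python
# Illustration only: constants are the nominal figures quoted in the text.
SAMPLE_RATE_KHZ = 22   # the VOX runs at roughly 22 kHz
MAX_SAMPLE = 32768     # top of the unenhanced sample range


def samples_per_window(window_ms, rate_khz=SAMPLE_RATE_KHZ):
    """Approximate number of samples summed in one sliding window."""
    return rate_khz * window_ms


def max_window_sum(window_ms, rate_khz=SAMPLE_RATE_KHZ):
    """Largest sum possible if every sample in the window were at full scale."""
    return MAX_SAMPLE * samples_per_window(window_ms, rate_khz)


print(samples_per_window(30))        # 660 samples in the default 30 ms window
print(max_window_sum(30))            # 21626880, the ceiling for a 30 ms window
print(200 * samples_per_window(30))  # 132000, a hypothetical quiet background
```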
The parameters command stores the following parameter list in the registry,
overwriting whatever settings might have been set with the VOX calibration
dialog, and tells the VOX to reload its parameters from those registry keys,
meaning that once set all subsequent runs of DMDX will use those parameters
for the VOX. Parameter N1 is the sample threshold that triggers the VOX (so
0..32768 for unenhanced and up to 32768 * 22 * N3 for enhanced, with N3 being
the window duration in milliseconds). Parameter N2 is a boolean switch that
turns the enhanced VOX on (so 0 or 1). Parameter N3 is the sliding window
duration in milliseconds when the enhanced VOX is in use. Parameter N4 is the
sibilant threshold and parameter N5 is another boolean switch that turns the
high frequency 11 kHz filter on (it affects the sibilant count significantly).
N1 sample threshold
N2 enhanced VOX
N3 sliding window duration
N4 sibilant threshold
N5 high frequency filter
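As a rough aid for picking N1, the ranges described above can be written out in Python. This is a sketch under the assumptions stated in the text (32768 full scale, roughly 22 samples per millisecond); the helper names n1_ceiling and vox_parameters_keyword are hypothetical and are not part of DMDX. The second helper only assembles the keyword text in the form used in this section.

```python
# Illustration only: helper names are hypothetical, not DMDX functions.
MAX_SAMPLE = 32768      # top of the unenhanced sample range
SAMPLES_PER_MS = 22     # nominal 22 kHz sample rate


def n1_ceiling(enhanced, window_ms):
    """Largest value the VOX could ever compare against N1."""
    if not enhanced:
        return MAX_SAMPLE
    return MAX_SAMPLE * SAMPLES_PER_MS * window_ms


def vox_parameters_keyword(n1, n2, n3, n4, n5):
    """Build the keyword text for the five parameters N1..N5."""
    for name, flag in (("N2", n2), ("N5", n5)):
        if flag not in (0, 1):
            raise ValueError(f"{name} is a boolean switch and must be 0 or 1")
    return f"<vox parameters {n1}, {n2}, {n3}, {n4}, {n5}>"


ceiling = n1_ceiling(True, 30)
print(ceiling)               # 21626880
print(999999999 > ceiling)   # True: a threshold this high can never be reached
print(1000000 <= ceiling)    # True: a threshold in the reachable range
print(vox_parameters_keyword(1000000, 1, 30, 200, 1))
```

A threshold above the ceiling is not an error; it just means the VOX cannot trigger.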
So for example here's the script I used for testing the VOX commands; it
monitors the energy the VOX is encountering. First it sets the VOX parameters
high enough that the enhanced VOX can't trigger and end the task, then it
loops back on itself until the user responds with a positive keyboard
response, displaying the highest values seen in the previous half second's
recording. Then it switches to unenhanced operation and does the same thing,
and finally sets the parameters back to something more reasonable.
<ep> <VideoMode desktop> <t 500> f15 <id Keyboard>
<id recordvocal 500,77> <id digitalvox> <nfb> <cr>
</ep>
0 <vox parameters 999999999, 1, 30, 999999999, 1>
  <dfm 2 stat> "VOX command test", "response to terminate" @2
  <set c1=0> <set c2=0> <vox monitor c1,c2>;
+1 d2 " sam %-6d sib %-6d" <sprintf c1,c2>
  <set c1=0> <set c2=0> /
  ! "recording" <ln 2> * <binr -1>;
0 <vox parameters 32768, 0, 0, 0, 0>
  <dfm 2 stat> "VOX non enhanced", "response to terminate" @2;
+1 d2 " sam %-6d sib %-6d" <sprintf c1,c2>
  <set c1=0> <set c2=0> /
  ! "recording" <ln 2> * <binr -1>;
0 <vox parameters 1000000, 1, 30, 200, 1>
  "Thank you, that's the end.";
And here is the heart of what was going to be the self-titrating VOX
calibration task that's in the remote testing section. It wound up not being
titrating because it gets to actually sample the energy the VOX is seeing, as
opposed to having to guess wildly and refine its guess over multiple trials,
which is what I was initially thinking I'd be doing. Basically it determines
what the background noise is, then samples some actual syllables and sets the
thresholds appropriately (it might be a little aggressive in borderline cases,
using only 1.3 times the noise in poor situations; time will tell I suppose):