DMDX Help, RecordVocal notes

It's impressive that over the years the RecordVocal input device has remained as problematic as it has, but given the singular predilection sound device driver writers have for making the same mistakes over and over, DMDX continues to expose them. A lot of the time just downloading new drivers will fix the issue; sometimes, however, you just have to work with the drivers you have. To do that you have to experiment with the parameters to RecordVocal when you specify it as an input device. All of this is covered elsewhere (principally the two pages I've linked to here), but I might as well go over it in some detail in one location.

How do you tell when it's not working? Principally you see that the vocalizations are all over the place in the saved WAV files. Of course if you're running an experiment this might not be so obvious, as subjects' vocalizations do tend to vary in their onsets. So you use the caputuretest.rtf item file in demos.zip on the web site, which is similar to this:

<ep> <cr> <azk> f10 <t 4000> <id "keyboard"> <vm desktop>
<id recordvocal 150,777> <id digitalvox 77> </ep>
0 "This itemfile is for RecordVocal/DigitalVox testing." @-2 <set 1,1000>,
"WAV output must be looped to Input" @-1,
"(so the TimeDX Sound Latency test works)",
"RTs should be in the 1000ms range" @1;

+1 "first" / <wav 2> "1sec50ms" %0 <svp start> / * <%ms 1000> / "beep" ;
+2 "second" / <wav 2> "1sec50ms" %0 <svp start> / * <%ms 1000> / "beep" ;
+3 "third" / <wav 2> "1sec50ms" %0 <svp start> / * <%ms 1000> / "beep" <dec 1> <bicgt 1,0,-1> ;
0 L "The End" ;

This file plays a 50 ms beep 1000 ms after the clock is turned on, and assuming you've set the Digital VOX up correctly and your sound card is mixing the wave output into the sound input, you'll see 1000 ms RTs. You can achieve this by a number of means: by connecting the line output to the mic or line inputs, by setting the Recording Device to use the wave output in the sound card control panel, or by setting the Recording Device to be the Stereo Mix instead of the mic, an option present these days under Windows 10 that a lot of sound cards provide. This Stereo Mix device mixes the sound output into the input, which is of course exactly what we want. It seems to be disabled by default, but that is easily overcome, and once Stereo Mix is selected as the default recording device you should be good. Of course nothing is ever perfect, so you'll see a few milliseconds of variation (typically less than 5 ms), and should your machine have latencies recording and playing audio files you'll see those in there as well (30 ms is not unusual on older hardware; on later Windows 10 machines where DirectSound and Capture are emulated I see 50 ms of latency). You could remove them with the TimeDX Sound Latency test. If, however, you see wildly varying RTs (hundreds of milliseconds), then you'll have to play with the overrun protection that RecordVocal has. This basically makes the audio buffers DMDX uses larger, and for some reason this appears to get around the errors. Overrun protection is specified with the second parameter. A typical value I've seen work has been 500 ms, although others report values as large as 1500 being needed (so <id recordvocal 150,1500>); the default value is 50 ms, and here I've used 777 ms.
Pure experimentation determines the overrun protection parameter. There are no metrics I know of, and I've always used a binary chop sort of method when I've needed to specify overrun. The one constraint is that the overrun protection should never be longer than the timeout period if you're using the DigitalVOX device as well as the RecordVocal device: the DigitalVOX code uses the overrun protection value as its wake-up period to examine newly acquired voice data, and if that period is longer than the timeout it will never examine any data, unless of course you specify a polling period in your DigitalVOX specification (so <t 500> <id recordvocal 150,1500> <id digitalvox 100> for example). There should not be any penalty for having a large overrun protection other than the fact that you're wasting memory, but with the amount of RAM in machines these days, so what.
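
The binary chop and the timeout rule described above can be sketched in a few lines of Python. This is only an illustration and not part of DMDX: the function names are made up, and `is_stable` is a hypothetical stand-in for you running the test item file with a given overrun value and judging whether the RTs have settled down. The sketch also assumes that once some overrun value works, every larger value works too.

```python
from typing import Callable, Optional


def overrun_is_usable(overrun_ms: int, timeout_ms: int,
                      vox_polling_ms: Optional[int] = None) -> bool:
    """Apply the one hard rule from the notes above.

    With DigitalVOX in use, an overrun protection longer than the timeout
    means the VOX never examines any data, unless a polling period is given.
    """
    return vox_polling_ms is not None or overrun_ms <= timeout_ms


def smallest_stable_overrun(is_stable: Callable[[int], bool],
                            low_ms: int = 50, high_ms: int = 1500,
                            step_ms: int = 50) -> Optional[int]:
    """Binary chop for a small overrun value that gives steady RTs.

    is_stable(ms) is hypothetical: in practice it is a manual test run
    with <id recordvocal 150,ms>. Returns None if even high_ms fails.
    """
    if is_stable(low_ms):
        return low_ms
    if not is_stable(high_ms):
        return None
    # Invariant: low_ms is known unstable, high_ms is known stable.
    while high_ms - low_ms > step_ms:
        mid = (low_ms + high_ms) // 2
        if is_stable(mid):
            high_ms = mid
        else:
            low_ms = mid
    return high_ms


if __name__ == "__main__":
    # Simulated driver that only behaves from 500 ms upwards (made-up figure).
    found = smallest_stable_overrun(lambda ms: ms >= 500)
    print("overrun to try:", found)
    print("ok with <t 4000>:", overrun_is_usable(777, 4000))
    print("ok with <t 500>:", overrun_is_usable(1500, 500))
```

The default bounds of 50 and 1500 are simply the default value and the largest reported value mentioned above; the 50 ms step is an arbitrary stopping tolerance.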

It occurred to me that one could of course automate the calibration of the capture latency with test mode 9, so I did, using this item file. On my Windows 10 laptop, where DirectSoundCapture is emulated, I see these sorts of results:

! Test Mode 9 results
! RT Latency Mean: 1051.59, Standard Deviation: 4.81
! RT Latency of 1048.0ms happened 63 times
! RT Latency of 1058.0ms happened 36 times
! RT Latency of 1047.0ms happened 1 time
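
As a cross-check on output like this, the histogram lines can be parsed and the summary recomputed. The Python below is a sketch that assumes the line format shown above. It treats the nominal RT as exactly 1000 ms (the beep is scheduled 1000 ms after the clock is turned on), so the excess over 1000 is read as the combined play and capture latency. The helper names are invented for illustration.

```python
import math
import re

RESULTS = """\
! RT Latency of 1048.0ms happened 63 times
! RT Latency of 1058.0ms happened 36 times
! RT Latency of 1047.0ms happened 1 time
"""

# Matches the per-value histogram lines only, not the summary line.
LINE = re.compile(r"RT Latency of ([\d.]+)ms happened (\d+) times?")


def parse_counts(text):
    """Return [(rt_ms, count), ...] from the histogram lines."""
    return [(float(rt), int(n)) for rt, n in LINE.findall(text)]


def mean_and_sd(counts):
    """Weighted mean and population SD (divide by N, not N - 1)."""
    total = sum(n for _, n in counts)
    mean = sum(rt * n for rt, n in counts) / total
    var = sum(n * (rt - mean) ** 2 for rt, n in counts) / total
    return mean, math.sqrt(var)


counts = parse_counts(RESULTS)
mean, sd = mean_and_sd(counts)
latency = mean - 1000.0  # assumption: nominal RT is exactly 1000 ms
print(f"mean {mean:.2f}  sd {sd:.2f}  latency {latency:.2f}")
```

For these three lines the divide-by-N form reproduces the reported mean and standard deviation (1051.59 and 4.81). That is an observation about this sample, not a statement about how DMDX computes the figure.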

It's fairly curious having a 10 ms gap in the RTs. I'm guessing that's some sort of internal Windows thing, as 10 ms just isn't a time that occurs in DMDX (the refresh rate here was 60 Hz; indeed, I ran the test with the freesync video modifier to totally decouple display considerations and the 10 ms artifact remained). Concerned that perhaps it was the RecordVocal device interfering, I commented it out, and while that changed the results a little (because the audio buffers are different lengths) it's clearly not the culprit. And no, I wasn't using the enhanced vox either, so its window wasn't affecting things. YMMV of course, but I suspect it's just another argument for rewriting the whole audio layer to use a more modern interface; after doing a bit of reading, though, it looks like Windows 10 audio buffers are 10 ms in length, which is awfully suspicious given our results. Admittedly there are new interfaces that would allow us to have smaller audio buffers that would probably lower that 10 ms, but that's not likely to happen any time soon...

! Test Mode 9 results
! RT Latency Mean: 1051.88, Standard Deviation: 3.13
! RT Latency of 1053.0ms happened 88 times
! RT Latency of 1043.0ms happened 11 times
! RT Latency of 1051.0ms happened 1 time
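
To make the 10 ms artifact explicit, the gap between the two most frequent RT values in each set of results can be computed directly. This Python sketch uses the counts from the two listings on this page; `mode_gap`, `first_run` and `second_run` are illustrative names only.

```python
def mode_gap(counts):
    """Gap in ms between the two most frequent RT values."""
    ranked = sorted(counts, key=lambda c: c[1], reverse=True)
    (rt_a, _), (rt_b, _) = ranked[:2]
    return abs(rt_a - rt_b)


# (rt_ms, count) pairs copied from the two Test Mode 9 listings.
first_run = [(1048.0, 63), (1058.0, 36), (1047.0, 1)]
second_run = [(1053.0, 88), (1043.0, 11), (1051.0, 1)]

for name, run in (("first", first_run), ("second", second_run)):
    print(f"{name} run: {mode_gap(run):.1f} ms between the two main RTs")
```

Both runs give a 10 ms gap, which fits the 10 ms audio buffer explanation, although two runs on one machine are hardly proof.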
