DMDX help, audioinput

    To set up an audio task, first get a microphone that works with your sound card. Set the recording properties of the volume control to use the microphone, and if there's a 20dB (or 10dB or whatever) microphone boost, turn it on. Alternatively, if you have access to a pre-amp (in our experience this is a better setup), plug the mic into that, plug the pre-amp's line output into the line input on the sound card, and set the volume control's recording settings to line input. Then check that you can see signal in DMDX's VOX calibration dialog and set its settings. Then include <id DigitalVOX> and/or <id RecordVocal> in your item file (and if you use RecordVocal you're probably going to want to set up some overrun protection for it).

    The RecordVocal device simply records from the default audio capture device to a file; it does not generate any button presses. Using this device creates a file for every <ClockOn> in the item file, and DMDX uses the name of the item file, the subject ID and the item number to make up the file name. Data is stored as a single channel of 22 kHz 16-bit data. If RecordVocal is running in legacy mode (without any parameters), the recording is as long as the current response time limit (see the <Timeout> keyword). If a parameter was specified and the recording is triggered by the VOX, the recording is that number of milliseconds longer than the response (the parameter being the number of extra milliseconds of recording). Output wave files also contain an RT cue if the Digital VOX device is used in conjunction with the RecordVocal device. Recorded files go in the directory the item file is in unless something (like the Avast antivirus) blocks DMDX from creating the .WAV files there, in which case you'll get a warning in the diagnostics and the data file indicating that the file has been created in the %temp% directory. You can even play out earlier recordings with the <wav> keyword if you don't use a subject ID, or these days if you do use a subject ID you can use the macro S in a file name to expand to the subject ID.
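
    To get a feel for how much disk space those recordings take, here is a minimal Python sketch. It assumes "22 kHz" means 22,050 samples per second (the exact rate is not stated here) and ignores the WAV header; the function names are made up for illustration and are not part of DMDX.

```python
# Rough size estimate for a RecordVocal recording.
# Assumption: "22 kHz" is taken to be 22,050 samples/second; the WAV
# header is ignored, so real files will be slightly larger.
SAMPLE_RATE_HZ = 22050   # assumed exact rate
BYTES_PER_SAMPLE = 2     # 16-bit data
CHANNELS = 1             # single channel


def recording_bytes(duration_ms: int) -> int:
    """Bytes of audio data for a recording of the given length."""
    samples = SAMPLE_RATE_HZ * duration_ms // 1000
    return samples * BYTES_PER_SAMPLE * CHANNELS


def vox_recording_ms(rt_ms: int, extra_ms: int) -> int:
    """Length of a VOX-triggered recording: the response time plus the
    extra milliseconds given as the RecordVocal parameter (this is one
    reading of the description above, not DMDX's actual code)."""
    return rt_ms + extra_ms


if __name__ == "__main__":
    # Legacy mode with a 2500 ms response time limit.
    print(recording_bytes(2500))
    # VOX triggered at 640 ms with 300 ms of extra recording.
    print(recording_bytes(vox_recording_ms(640, 300)))
```

    So a legacy-mode recording with a 2500 ms time limit is a bit over 100 KB per item, which adds up quickly over a long item file with many subjects.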

    The DigitalVOX device (which can be used in conjunction with the RecordVocal device) monitors audio data as it is captured, and when its energy rises beyond a preset threshold the button press +DigitalVOX is generated, which is by default mapped to the VOX input. Setting the threshold is done with the Digital VOX calibration dialog.
    The redesigned DigitalVOX code no longer monitors the audio data for the duration of the response time limit (unless the RecordVocal device has also been included and is running in legacy mode with no parameters), but instead stops as soon as a VOX trigger has been detected.
    There are a couple of special keywords designed to be used in conjunction with the Digital VOX, the first being the <DVOXBeepSpec> keyword. Here, whenever a correct response is received, DMDX will play a specified wave file to facilitate naming task negation tasks where lip smacks and so on can falsely trigger a response.
    Another is the <SuppressAudioCapture> keyword, which can be used to momentarily stop the actions of both DigitalVOX and RecordVocal in multimodal item files.
    The period parameter (see <InputDevice>) specifies how often the Digital VOX checks the audio stream for a sample above the threshold, the default being every 100ms. This does not affect the accuracy of the Digital VOX, as it calculates the RT from the position of the sample in the stream; however, it does affect when the <DVOXBeepSpec> beep can be played. For instance, with the Digital VOX checking for samples every 100ms (with <id digitalvox>) the beep will be delayed up to 100ms from the time the sample that was above the threshold was recorded; with <id digitalvox 50> the beep will only be delayed up to 50ms.
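
    The difference between RT accuracy and beep delay can be illustrated with a small Python simulation. This is only a sketch of the idea described above, not DMDX's code: it treats the absolute sample value as a stand-in for energy, and the function name, the toy 1000 Hz sample rate and the sample values are all assumptions chosen to keep the arithmetic easy.

```python
# Toy model of a polled voice key: the stream is examined once per
# polling period, but the RT comes from the position of the first
# sample above the threshold, not from the time of the poll.
def first_trigger(samples, sample_rate_hz, threshold, period_ms=100):
    """Return (rt_ms, detected_ms) for the first sample whose absolute
    value exceeds the threshold, or None if nothing triggers."""
    chunk = sample_rate_hz * period_ms // 1000  # samples per poll
    for start in range(0, len(samples), chunk):
        block = samples[start:start + chunk]
        for offset, value in enumerate(block):
            if abs(value) > threshold:
                # RT: position of the sample in the stream.
                rt_ms = (start + offset) * 1000 / sample_rate_hz
                # The poll that sees it happens at the end of the block.
                detected_ms = (start + len(block)) * 1000 / sample_rate_hz
                return rt_ms, detected_ms
    return None


if __name__ == "__main__":
    # 500 ms of silence at a toy 1000 Hz rate with one loud sample at 230 ms.
    stream = [0] * 230 + [5000] + [0] * 269
    for period in (100, 50):
        rt, seen = first_trigger(stream, 1000, threshold=1000, period_ms=period)
        print(period, rt, seen - rt)  # period, RT, earliest beep delay
```

    In both runs the RT is 230 ms; only the earliest moment a beep could be played changes (70 ms after the sample with a 100 ms period, 20 ms after it with a 50 ms period).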

    Normally there is only one audio capture device on a system and the sound card mixes all the various inputs together (like a MIC input and LINE input) and the sound card's software mixer control panel is usually used to separate the various inputs as it has the ability to mute certain inputs and to set their relative gains. There are several things to note here:
    1/ DMDX uses DirectX for its audio input and some older audio cards may not provide DirectX drivers (although they may well function as normal Windows audio input devices).
    2/ Due to the rather well integrated nature of sound output in DMDX, sound cards are likely to have to support what is known as Full Duplex data flow: they should be able to support simultaneous output and input (as opposed to Half Duplex, where only one operation at a time can be performed). Happily the rather old Sound Blaster 16 Plug and Play in the test system is capable of this, so I imagine this won't be much of a problem. Should TimeDX or DMDX not be able to initialize the capture device, you might be able to assign two DMA channels to the sound card if it has the capacity for two but only has one assigned.
    3/ Earlier sound cards also tended to mix the output into the input as well, so if you are playing wave files with the <wav> keyword at the same time as using the Audio Input devices there is a strong chance you will see that audio output in the audio input. Indeed, this is the mechanism I used to test how well everything was working; these days with Windows 10 you probably have to select the Stereo Mix device as the default recording device to achieve the same results. I made a sound be played 1000ms after the clockon and checked the RTs: they were all 1000ms +/- 2ms (under Windows XP; latencies are much higher under Windows 10 where DirectSoundCapture is emulated, as is DirectSound for that matter). This could even be used to calibrate the sound latency, assuming you could somehow tease out capture buffer startup time from sound output startup times. These files (capturetest.rtf and 1sec50ms.wav) are now in demos.zip on the web site (and the whole process is elaborated upon in the RecordVocal notes).
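
    If you run that capture test yourself, checking the resulting RTs is simple enough to script. The Python below is a hypothetical helper, not something shipped with DMDX; it assumes you have already pulled the RTs (in milliseconds) out of the data file by whatever means, and the RT values in the example are made up.

```python
# Check loopback capture-test RTs against the expected sound onset.
from statistics import mean


def check_capture_rts(rts_ms, expected_ms=1000.0, tolerance_ms=2.0):
    """Return (mean offset from expected_ms, list of RTs outside the
    tolerance). Defaults mirror the 1000ms +/- 2ms figure above."""
    offsets = [rt - expected_ms for rt in rts_ms]
    outliers = [rt for rt, off in zip(rts_ms, offsets)
                if abs(off) > tolerance_ms]
    return mean(offsets), outliers


if __name__ == "__main__":
    # Made-up RTs for illustration only.
    avg_offset, bad = check_capture_rts([999.0, 1001.0, 1000.0, 1004.0])
    print(avg_offset, bad)
```

    A consistently large mean offset suggests a fixed latency (as expected under Windows 10), whereas scattered outliers point at capture starting at a variable time.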

    Another thing to do with microphones and win32 is the mixer's irritating habit of not displaying the control that enables the microphone input on a sound card. With the standard mixer you have to use the Options / Properties / Recording button, then under "Show Volume controls" select the Microphone if it isn't already and then after you've pressed Ok you can select the Microphone and you can actually then use the thing. Urgh.

    Some people need to change the overrun protection parameter to stop the audio device getting out of phase with DMDX. I usually recommend something like <id RecordVocal 300,500> <id digitalVOX>, which makes it record 300 ms of vocalization after the VOX triggers. The 500 ms of overrun protection may need to be bigger or may need to be smaller; the symptom is that the recorded vocalization isn't starting with the clockon (to test it you can play a wave file some fixed time after the clockon and examine the recorded wave files, see the capture test in the Record Vocal notes). Note that in this instance, if your timeout was 500 ms or less, the DigitalVOX code would never actually poll for voice energy, as it uses the overrun period for its polling period when one is provided (otherwise it's 100 ms). Instead you should explicitly specify a polling period for the DigitalVOX (so <t 500> <id recordvocal 150,500> <id digitalvox 100> for example).
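
    The interaction between the timeout, the overrun period and the polling period is easy to get wrong, so here is a small Python model of the rule as described above. It is a simplified reading of this paragraph, not DMDX's actual logic, and the function names are invented for illustration.

```python
# Simplified model of which polling period the DigitalVOX ends up with
# and whether it gets a chance to poll before the timeout.
DEFAULT_POLL_MS = 100


def effective_poll_period_ms(explicit_period_ms=None, overrun_ms=None):
    """An explicit <id digitalvox N> period wins; otherwise the
    RecordVocal overrun period is used if given; otherwise 100 ms."""
    if explicit_period_ms is not None:
        return explicit_period_ms
    if overrun_ms is not None:
        return overrun_ms
    return DEFAULT_POLL_MS


def vox_can_poll(timeout_ms, explicit_period_ms=None, overrun_ms=None):
    """The VOX only polls if the timeout is longer than its period."""
    period = effective_poll_period_ms(explicit_period_ms, overrun_ms)
    return timeout_ms > period


if __name__ == "__main__":
    # <t 500> <id recordvocal 150,500> <id digitalvox>
    print(vox_can_poll(500, overrun_ms=500))
    # <t 500> <id recordvocal 150,500> <id digitalvox 100>
    print(vox_can_poll(500, explicit_period_ms=100, overrun_ms=500))
```

    The first case never polls because the 500 ms overrun period is taken as the polling period and the timeout is no longer than that; giving the DigitalVOX its own 100 ms period fixes it.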

    A number of people use Athanassios Protopapas's CheckVocal to remove lip smacks and so on from recorded naming task wave files. Here's an explanation of CheckVocal's use from Thanassi:

1. We don't use the VOX from DMDX; there is nothing wrong with it as far as I know, but it is easier with the automatic marking plus manual check and correction in CheckVocal. We only use RecordVocal in DMDX and then just get the WAV files through CheckVocal, selecting "use RT marks from CheckVocal" for auto-triggering. In this way we don't need to be concerned with setting/adjusting/checking the voice key threshold, and the computer has one less time-intensive thing to do.

2. Audio cards have been flaky in many ways, most of which have turned out to be benign, except one really nasty problem, which, unfortunately pops up occasionally without warning: On some machines recording does not really start at time zero (the * timing mark) but some random time later (varies by trial). We realize this is a problem when we encounter chopped off responses or otherwise unreasonably low RTs. You need to check each machine exactly as configured through a few runs to minimize the probability of this (unfortunately it is not systematic, so there are no guarantees; mercifully it occurs very rarely). Ensuring updated audio drivers may help.

3. Using somewhat decent headsets can make a huge difference both in sensitivity and noise. Expensive desktop microphones on stands can be worse than much less expensive headsets due to the distance and sensitivity to head motion altering the angle. If you don't spend $10 but $30+ on a headset, that can save you hours of work. You'd think you need a powered and ground-balanced mike for good performance, but actually this does not make a lot of difference, you can just ask CheckVocal to "Remove DC" and then auto-triggering will work fine. We used to only work with audio jack but have found that USB headsets actually work fine (although I am still a bit nervous about the actual timing); I guess there's enough temporal noise in the responses that this won't be a decisive factor.

4. Ambient noise is VERY important. You must conduct the experiment in a quiet room without any extraneous speech or other noises (i.e., no window to a street with passing motorcycles) and no buzzing devices (refrigerators, A/C, fluorescent lights).

5. Proper placement of the headset microphone is key. The microphone should lie well to the side of the lips, a bit below or perhaps above, but certainly outside the air streams of both mouth and nose, otherwise you will get a lot of air noise and mini "explosions" with stop consonants.

We always use CheckVocal with the left hand fingers (ring, middle, index) on C-V-B and the right hand on the mouse, this makes this go really fast after practice with a few participants' responses. Listen to what's captured before and after the mark (G/space), use N to hop over to the next sound, and do NOT zoom (just wastes your time). Use the spectrogram (lower panel) to see where speech begins (you will learn to perceive this with practice). You will probably need to adjust boundaries for words starting with /h/, /f/, /θ/ as these are often missed by auto triggering.

An experienced experimenter can process upwards of 1000 single-word trials per hour with CheckVocal when recorded in noise-free conditions with the default threshold of 45 dB (and closer to 2000 trials per hour if they are repetitive, e.g., as in a Stroop task).