The Unicode code path in DMDX can be turned on either with the Unicode check box in the main DMDX dialog or with the -unicode command line switch; the command line switch overrides the Unicode check box. With the Unicode code path enabled, DMDX uses any Unicode \u control words present in the RTF file instead of their ANSI equivalents, and marks up all regular ANSI characters by putting \'00 after them so they become 16 bit Unicode characters.
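The \'00 markup can be pictured with a minimal Python sketch. The function names mark_up and as_16_bit are hypothetical and this is not DMDX's actual code; it only mirrors the description above, on the assumption that each 8 bit character followed by a zero byte is read as a little-endian 16 bit character.

```python
def mark_up(text):
    # Append the RTF hex escape for a zero byte (backslash, quote, 00)
    # after every character of an ANSI string.
    return "".join(ch + "\\'00" for ch in text)


def as_16_bit(text):
    # The bytes that result once each escape is read as a real zero byte:
    # every 8 bit character becomes a low byte followed by a zero high byte.
    return b"".join(bytes([b, 0]) for b in text.encode("latin-1"))


if __name__ == "__main__":
    print(mark_up("ABC"))
    # The marked up bytes are the same as the UTF-16 little-endian encoding.
    print(as_16_bit("ABC") == "ABC".encode("utf-16-le"))
```

Under that assumption the marked up text is byte for byte what a 16 bit Unicode string of the same characters would be, which is why plain ANSI text keeps working on the Unicode path.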
While the Unicode code path through DMDX was initially added to see if we could get DMDX to handle a Tamil font, it's also useful with double code page Asian fonts that have MS Word smart quotes in them. Because DMDX uses the Unicode RTF keywords when the Unicode code path is on, it can differentiate between Word's smart quotes and normal characters, so smart quote detection is permanently off when the Unicode code path is on. Once the Unicode path was added, the problem with the Tamil font became that Windows GDI didn't render the vowel modifiers in the correct places until we went to the Control Panel Regional and Language options and checked both the Complex right to left font handling and East Asian boxes (so you might need to do this as well). (In passing, we also noticed in another instance a machine rendering a Hebrew font, so right to left text, with Unicode off; it rendered the Hebrew backwards until we installed the right to left language option, with no other changes made to DMDX, the Unicode option or the RTF file.) Once these options were installed DMDX's displayed output matched Word's, and while Word's output is not technically correct (certain glyphs are not displayed) I figure once DMDX's output matches Word's my responsibility is pretty much over. There is another whole Windows Unicode module called Uniscribe that has a couple of new APIs that maybe some day we'll delve into, but that day isn't today.
The key indicator that you need to turn the Unicode code path on is getting question marks rendered instead of the desired characters, as that's what the Tamil initially looked like. In the past Asian fonts have had ANSI double code page alternatives to the Unicode characters; not so the Tamil. From what I've read, using a ? as the alternative to a Unicode character instead of some ANSI code page alternative would appear to be a widespread practice, so perhaps others will come across this as well. Note that you don't want to turn Unicode on regardless, not the way DMDX is currently designed anyway, as there are files that contain double code page font references without any Unicode alternative. When you force DMDX to use the Unicode path with such a file it takes the double code page references as Unicode references and you'll get gibberish rendered (although see the next paragraph, they may get converted properly with 220.127.116.11 or later). The problem arises because DMDX expects you to use either double code page values or Unicode values and not a mix of the two. When not using Unicode, there's arcane code in the GDI font rendering routines that takes F0, for instance, the double code page value for a bar-d, and realizes it should use Unicode 111 (hex). Turn DMDX's Unicode renderer on, however, and it passes 16 bit values to GDI instead of 8 bit ones, so that F0 gets interpreted as Unicode F0 (an Icelandic eth, not what we want). Sometimes the double code page value is the same as the Unicode value, but not always. And you can't just override it by editing the raw RTF and changing \'F0 to \u273 (273 decimal is 111 hex) because the next time you run Word it'll go and revert them all back to code page values (WordPad operates the same way). Maybe there's some way to force Word to use Unicode but I'm not seeing it. I have reports that LibreOffice will leave characters as Unicode; you might have to cut and paste through Notepad++ or EditPad to get it right though.
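The F0 mismatch can be reproduced in a few lines of Python. This is only an illustration, not DMDX or GDI code: the choice of Windows code page 1250 (where F0 is the bar-d) is an assumption, since the text above doesn't say which code page the font in question used, and the function names are hypothetical.

```python
import unicodedata


def via_code_page(byte_value, codec="cp1250"):
    # 8 bit path: the byte is looked up in the font's code page
    # (cp1250 is an assumed example code page).
    return bytes([byte_value]).decode(codec)


def as_unicode_value(byte_value):
    # Unicode path without conversion: the byte value is used
    # directly as a 16 bit Unicode code point.
    return chr(byte_value)


if __name__ == "__main__":
    for ch in (via_code_page(0xF0), as_unicode_value(0xF0)):
        # Show the code point in hex and decimal along with its Unicode name.
        print(hex(ord(ch)), ord(ch), unicodedata.name(ch))
```

The first line printed is the bar-d at 111 hex (273 decimal, hence \u273), the second is the eth at F0 hex.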
So push came to shove and in version 18.104.22.168 of DMDX we finally added code to convert code page references to Unicode 16 bit words. This works fine when there's a code page reference DMDX can recognize in the RTF font specification, but recognition is not guaranteed (our table is not complete, in fact it may not be possible to build a complete one as, for instance, I can't find a Japanese code page definition) and the table is not necessarily error free, so conversions are by no means guaranteed. If you come up with a font that doesn't work you're more than welcome to contact me and I'll see if I can add your code page to the table DMDX uses. The conversion can also fail if multi-byte code page characters are used, as DMDX currently only looks up one byte at a time. Again, should someone actually find such a usage, contact me; it appears that editors these days use Unicode instead of multi-byte code page references...
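A table driven conversion of this kind might look like the following Python sketch. It is not DMDX's table or code: the names CHARSET_TO_CODEC and to_unicode_words are hypothetical, and keying the table on the font's \fcharset value is an assumption, as the text above doesn't say which part of the RTF font specification DMDX reads. Unrecognized code pages are passed through unchanged here, which corresponds to the gibberish case described earlier.

```python
# Assumed mapping from RTF \fcharset values to single-byte Windows code pages.
# Deliberately incomplete, much like the table described above.
CHARSET_TO_CODEC = {
    0: "cp1252",    # ANSI (Western)
    161: "cp1253",  # Greek
    162: "cp1254",  # Turkish
    163: "cp1258",  # Vietnamese
    177: "cp1255",  # Hebrew
    178: "cp1256",  # Arabic
    186: "cp1257",  # Baltic
    204: "cp1251",  # Russian
    238: "cp1250",  # Eastern European
}


def to_unicode_words(data, charset):
    """Convert code page bytes to 16 bit Unicode values, one byte at a time."""
    codec = CHARSET_TO_CODEC.get(charset)
    if codec is None:
        # Unrecognized code page: the byte values are used as they are.
        return list(data)
    words = []
    for byte_value in data:
        try:
            words.append(ord(bytes([byte_value]).decode(codec)))
        except UnicodeDecodeError:
            # Byte has no definition in this code page; leave it unchanged.
            words.append(byte_value)
    return words


if __name__ == "__main__":
    print([hex(w) for w in to_unicode_words(b"\xf0", 238)])  # known code page
    print([hex(w) for w in to_unicode_words(b"\xf0", 128)])  # not in the table
```

Because each byte is looked up on its own, a multi-byte code page such as a Japanese one can't be handled by a table like this, which is the limitation noted above.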
Macros will require a little extra attention when the Unicode code path is on. This is because regions outside of text segments are still ANSI coded 8 bit characters just as they always were; Unicode is only found inside text segments, and even there it's not real Unicode but marked up RTF hex sequences. So the text segment "~M" with the Unicode path turned on gets marked up to "~\'00M\'00". The macro code is smart enough to detect macros used outside of quotes and leave them as ANSI sequences, and also to detect a macro used within quotes and mark the macro body up with \'00 correctly. The problem comes if you want to have special Unicode characters in the macro body, as the RTF parser is going to strip them out because they won't be in quotes. And you can't use the double quote as a macro delimiter if there's already another text segment, as the item will get flagged as a frame with multiple text segments. So this would fail:
The solution is to use the keyword version of the macro definition (that I suspect never worked till I fixed it in 22.214.171.124) where the text segment is within the keyword and won't get flagged as multiple text segments:
But care is still needed as use of macro U outside any text segment will freak things out totally as there's no code that strips Unicode sequences out of a macro body when used outside quotes (if it's a real problem I'll add the code but I can't see someone wanting it and only mention it for completeness' sake).
NOTE:- When the Unicode code path is active, text in display frame text segments displayed in the diagnostics will be marked up Unicode. So the syntax check example on the DMDX How to Use It page would have looked like this under Windows 8: