Hi Robert,
Great work! We need ICU, and I do in fact have a private build of System.Data.SQLite running with the ICU tokenizer enabled. It works fine, as long as the ICU data files are available.

I was thinking you could enable ICU by default and distribute data files that only support a sane subset of locales, such as English, Spanish, French, German, etc. That would shrink the icudtxx.dll data file from ~16MB to roughly 3-4MB. If you're not familiar with how to do this, it's quite easy: use the ICU Data Library Customizer (http://apps.icu-project.org/datacustom/) and remove the extraneous conversion tables and locale information (see http://userguide.icu-project.org/icudata). Users who want a different locale could provide their own icudtxx.dll file, which would be used in lieu of the default one.

Once ICU is enabled, we can specify locale-specific tokenizers in table creation statements and resolve most of the internationalization issues.
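To make the last point concrete, here is a rough sketch of what a locale-specific tokenizer looks like in a table creation statement. It uses Python's sqlite3 module rather than System.Data.SQLite only so it is easy to try. The `tokenize=icu <locale>` form is the FTS3 syntax from the SQLite documentation; the table name, column names, and the helpers `icu_fts_ddl` and `try_create` are made up for illustration. On a build without ICU (or without FTS3) the statement fails, which the sketch reports instead of raising.

```python
import sqlite3


def icu_fts_ddl(table, columns, locale):
    """Build an FTS3 CREATE VIRTUAL TABLE statement using the ICU tokenizer.

    Hypothetical helper: table, columns, and locale are illustrative inputs.
    """
    cols = ", ".join(columns)
    return f"CREATE VIRTUAL TABLE {table} USING fts3({cols}, tokenize=icu {locale})"


def try_create(ddl):
    """Run the statement on an in-memory database.

    Returns True if it succeeded, False if SQLite rejected it
    (for example because this build has no ICU tokenizer or no FTS3).
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(ddl)
        return True
    except sqlite3.OperationalError:
        return False
    finally:
        conn.close()


ddl = icu_fts_ddl("docs", ["title", "body"], "de_DE")
print(ddl)
print("ICU tokenizer available:", try_create(ddl))
```

With a trimmed-down icudtxx.dll, a statement like this would presumably only work as expected for the locales whose data was kept in the file.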