uchardet-enhanced / langstats / README.txt

langstats/README.txt

    mkcharstats french/french_cp1252.txt | sort -nr +2 > \
          french/charstats_french_cp1252.txt
 
- - Edit the resulting file, get rid of punctuation and numbers keep the rest
+ - Edit the resulting file: just get rid of the few lines that break the
+   following step (the first one, the last one and the one for space (0x20)).
 
  - Run mkpairmodel.py to produce the c++ language model. There are two
    phases, to produce a correspondence table from code point to order in
    mkpairmodel.py french/charstats_french_cp1252.txt \
                   french/french_cp1252.txt             > LangFrenchModel.cpp
 
- - Add header, license etc. to cpp file and integrate with the rest of the
-   models 
+ - Integrate with the library's C++ code (3 files to change, to resize the
+   array and declare/define the tables: nsSBCharSetProber.h,
+   nsSBCSGroupProber.cpp, nsSBCSGroupProber.h)
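
The manual edit described in the new step (drop the first line, the last line and the line for space) could probably be scripted. What follows is only a minimal sketch and not part of the repository. The helper name `drop_breaking_lines` is hypothetical, and it assumes that the first whitespace-separated field of each line in the charstats file is the byte value written as hex (for example `0x20`). The README does not document the actual mkcharstats output format, so check it before relying on this.

```python
def drop_breaking_lines(lines, space_token="0x20"):
    """Return the stats lines that the next step can digest.

    Assumption (not confirmed by the README): the first field of each
    line is the byte value in hex, e.g. "0x20" for the space character.
    """
    # Drop the first and the last line, as the README step describes.
    body = lines[1:-1]

    kept = []
    for line in body:
        fields = line.split()
        # Drop the entry for the space character (0x20).
        if fields and fields[0].lower() == space_token:
            continue
        kept.append(line)
    return kept


if __name__ == "__main__":
    # Hypothetical sample data, only to show the filtering behaviour.
    sample = [
        "first line",
        "0x65 e 1500",
        "0x20 sp 1200",
        "0x61 a 900",
        "last line",
    ]
    for line in drop_breaking_lines(sample):
        print(line)
```

The function works on a list of lines rather than on a file path, so the same filter can be applied to whatever the sort step produced before the result is written back to french/charstats_french_cp1252.txt.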