medoc committed b912dac

Utf8: treat the first coding error as fatal and stop processing. Previously, coding errors were not handled at all: a few random valid sequences were enough for UTF-8 to be detected. This happened, for example, with TIS-620 input, for some reason.

Files changed (1)

libcharsetdetect/mozilla/extensions/universalchardet/src/base/nsUTF8Prober.cpp

 {
   nsSMState codingState;
 
+  if (mState == eNotMe)
+    return eNotMe;
+
   for (PRUint32 i = 0; i < aLen; i++)
   {
     codingState = mCodingSM->NextState(aBuf[i]);
       if (mCodingSM->GetCurrentCharLen() >= 2)
         mNumOfMBChar++;
     }
+    else if (codingState == eError)
+    {
+      return mState = eNotMe;
+    }
   }
 
   if (mState == eDetecting)
 
 float nsUTF8Prober::GetConfidence(void)
 {
+  if (mState == eNotMe)
+    return 0.001;
   float unlike = (float)0.99;
 
   if (mNumOfMBChar < 6)
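
The behaviour this change aims for can be illustrated with a small standalone sketch. This is not the Mozilla coding state machine: Utf8ProbeSketch, ProbeState, feed and multibyteCount are hypothetical names chosen for illustration, and the byte checks are a simplified approximation of UTF-8 validity. It only shows the two ideas in the diff: the first coding error switches the prober to a "not me" state, and that state is sticky across later calls.

```cpp
#include <cstddef>

// Hypothetical stand-in for the prober states named in the diff.
enum class ProbeState { Detecting, NotMe };

// Simplified UTF-8 prober sketch (assumption: NOT the real nsUTF8Prober).
// It checks lead/continuation byte structure only; it does not apply the
// second-byte range restrictions for E0, ED, F0 and F4 lead bytes.
class Utf8ProbeSketch {
public:
    ProbeState feed(const unsigned char* buf, std::size_t len) {
        // Once rejected, stay rejected: later buffers are ignored.
        if (state_ == ProbeState::NotMe)
            return state_;

        for (std::size_t i = 0; i < len; ++i) {
            const unsigned char b = buf[i];

            if (pending_ > 0) {
                // Inside a multi-byte sequence: expect 10xxxxxx.
                if ((b & 0xC0) != 0x80)
                    return state_ = ProbeState::NotMe;  // first error is fatal
                if (--pending_ == 0)
                    ++multibyte_;  // one complete multi-byte character
                continue;
            }

            if (b < 0x80)
                continue;  // plain ASCII

            if ((b & 0xE0) == 0xC0 && b >= 0xC2)
                pending_ = 1;  // 2-byte lead (C0 and C1 are never valid)
            else if ((b & 0xF0) == 0xE0)
                pending_ = 2;  // 3-byte lead
            else if ((b & 0xF8) == 0xF0 && b <= 0xF4)
                pending_ = 3;  // 4-byte lead
            else
                return state_ = ProbeState::NotMe;  // invalid lead byte
        }
        return state_;
    }

    ProbeState state() const { return state_; }
    unsigned multibyteCount() const { return multibyte_; }

private:
    ProbeState state_ = ProbeState::Detecting;
    unsigned pending_ = 0;    // continuation bytes still expected
    unsigned multibyte_ = 0;  // completed multi-byte characters seen so far
};
```

With this sketch, feeding the bytes 0xCA 0xC7 0xD1 (a run of high bytes such as a single-byte Thai encoding produces) is rejected at the second byte, because 0xC7 is not a continuation byte. Any buffer fed afterwards returns NotMe immediately, which mirrors the early return added at the top of the loop in the diff.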