Encode/decode JS entities works on one byte at a time and is not reversible

Issue #1 resolved
Former user created an issue
  • Updated to Notepad++ 8.3.3 x64
  • HtmlTag was missing after update
  • Re-installed HtmlTag

Test: Following characters require encoding: ä ö ü ß
After encoding: Following characters require encoding: ä ö ü ß
After decoding encoded text: Following characters require encoding: ä ö ü ß

Comments (4)

  1. rdipardo repo owner

    The decoding algorithm can only handle single-byte sequences. So, this works:

    \u00E4 \u00F6 \u00FC \u00DF (decode =>) ä ö ü ß
    

    But this is broken:

    ä ö ü ß (encode =>) \u00C3\u00A4 \u00C3\u00B6 \u00C3\u00BC \u00C3\u0178
    

    In a UTF-8 file, each of these characters occupies 2 bytes, and the algorithm encodes each byte separately.

    That's a limitation of the original author's design (based on pre-Unicode Notepad++). It affects both the 32- and 64-bit versions.

    As I said in 82f9b0e,

    More work still needed before utf8mb4 can be encoded *correctly*

    Fixing this will be part of that overall task.
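
The two results quoted in this comment can be reproduced with a short Python sketch. It is only an illustration, not the plugin's actual code: the function names are hypothetical, and the choice of Windows-1252 as the system code page is an assumption inferred from the `\u0178` in the broken output (byte 0x9F maps to U+0178 in that code page).

```python
def encode_bytewise(text: str, codepage: str = "cp1252") -> str:
    """Mimic the reported bug: escape every UTF-8 byte on its own.

    `codepage` stands in for the system default encoding (an assumption).
    """
    out = []
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ch)  # ASCII passes through unchanged
            continue
        for byte in ch.encode("utf-8"):
            try:
                # Reinterpret the single byte as a character in the code page
                as_char = bytes([byte]).decode(codepage)
            except UnicodeDecodeError:
                # Python leaves a few cp1252 bytes undefined; keep the raw value
                as_char = chr(byte)
            out.append(f"\\u{ord(as_char):04X}")
    return "".join(out)


def encode_by_code_point(text: str) -> str:
    """Escape each character once, using its Unicode code point.

    Only covers the Basic Multilingual Plane (code points up to U+FFFF).
    """
    return "".join(
        ch if ord(ch) < 0x80 else f"\\u{ord(ch):04X}" for ch in text
    )


sample = "ä ö ü ß"
print(encode_bytewise(sample))       # \u00C3\u00A4 \u00C3\u00B6 \u00C3\u00BC \u00C3\u0178
print(encode_by_code_point(sample))  # \u00E4 \u00F6 \u00FC \u00DF
```

The second function yields exactly the escapes that the decoder already handles, which is why decoding works while encoding does not.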

  2. rdipardo repo owner

    If you're running at least Windows 10, here's a way to resolve this issue for the time being:

    • Go to the Control Panel, then "Clock and Region", and select "Change date, time, or number formats"
    • Click the "Administrative" tab
    • Click "Change system locale..."
    • Check the box labelled "Beta: Use Unicode UTF-8 for worldwide language support" (a reboot is required)

    Here is N++ 8.3.3 (64-bit) on Windows 10 21H2, with the updated system encoding: [screenshot]

    The plugin is most likely calling a standard library function that uses the system's default encoding. What it should do is encode the document's text as Unicode every time, not rely on Windows.
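
As a rough sketch of what "encode as Unicode every time" could look like, the Python below works on code points rather than bytes, so the result does not depend on the system locale. The names `encode_js_entities` and `decode_js_entities` are hypothetical and this is not the plugin's implementation. Characters above U+FFFF are written as UTF-16 surrogate pairs, which is how JavaScript `\uXXXX` escapes represent them.

```python
import re


def encode_js_entities(text: str) -> str:
    """Escape every non-ASCII character as one or two \\uXXXX sequences."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04X}")
        else:
            # Split a supplementary-plane code point into a surrogate pair
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)
            low = 0xDC00 + (cp & 0x3FF)
            out.append(f"\\u{high:04X}\\u{low:04X}")
    return "".join(out)


_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")


def decode_js_entities(text: str) -> str:
    """Turn \\uXXXX sequences back into characters, joining surrogate pairs."""
    units = _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
    # Round-trip through UTF-16 so adjacent surrogates merge into one character
    return units.encode("utf-16", "surrogatepass").decode("utf-16")


original = "ä ö ü ß 😀"
encoded = encode_js_entities(original)
print(encoded)  # \u00E4 \u00F6 \u00FC \u00DF \uD83D\uDE00
print(decode_js_entities(encoded) == original)  # True
```

One caveat of this sketch: text that already contains a literal `\uXXXX` sequence is left alone by the encoder but converted by the decoder, so the round trip is only exact for input without such sequences.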

  3. Björn Klug

    I tried your workaround (the "Beta: Use Unicode UTF-8 for worldwide language support" checkbox), but it broke all my MS Access 2010 applications, so that is not a viable solution for me.

    Since I use this plugin quite frequently, I'd be very interested in your estimate of when this bug will be fixed.

    [EDIT] Just found the download of version 1.2.2 at https://bitbucket.org/rdipardo/htmltag/downloads/ which works fine again. Thanks!
