dbz export fails if non-latin character is in the destination file path

Issue #253 resolved
Roman Evstifeev created an issue

plugin version: current master

When I export a dbz file from daz, it says that the file was exported successfully, but the output dir is empty. This happens if any non-latin character is in the file path:

Comments (38)

  1. Alessandro Padovani

    Roman, I guess it may also help if you can post your daz folders in the global settings, for example below is mine. Personally I find it odd that python can’t handle unicode folders and labels as in commit 390268a, but I know very little of python.

  2. Roman Evstifeev reporter

    How are my daz content directories related to this issue? The DAZ dbz export script asks where to store the file every time, and it doesn’t use any of the dirs configured in the blender add-on.

  3. engetudouiti

    @Roman

    It seems to be a daz script problem (I do not know whether it can handle such file paths).

    Can you save the scene as a duf in that directory from daz studio without problems?

    I suppose this plugin can import a duf (with dbz) file that is not saved in a daz content directory, but in most cases users simply export the dbz into a subfolder of one of the daz content directories, where the daz scene file (duf) is saved like any other duf file (preset, scene subset, etc.).

    Even if this gets improved, I have often seen the same issue: when non-latin characters are used in a file path or file name, daz content fails to load correctly. (This is not about this plugin but about daz studio itself, loading morphs etc.) So you had better avoid non-latin characters in file paths and file names unless you really need them.

  4. Alessandro Padovani

    @Roman That’s for the blender part, it may help to check the importer when you have to load the duf and dbz that are to be stored together. Also keep in mind that the plugin is designed to work with the daz folders in the global settings.

    @engetudouiti As for unicode I believe we have to distinguish. What's inside the daz content folders has to be ascii to work internationally, because the daz assets are distributed in zip format, which doesn’t support unicode. As for the content folder path itself, which is what we have in the global settings, I believe it should be allowed to be unicode, because that’s inside the windows user folder.

    And anyway if we use unicode for local distribution the zip format should work fine because non-ascii characters are translated back with the same system locale. The issue should arise only for international distribution.

  5. Thomas Larsson repo owner

    As reported in #246, exporting with non-ascii but utf-8 characters seems to work. 16 bit unicode is a different matter, though. According to the information at http://docs.daz3d.com/doku.php/public/software/dazstudio/4/referenceguide/scripting/api_reference/object_index/script_dz, DS supports three types of script files:

    TextScriptFile - Plain text file - no Unicode support
    DAZScriptFile - Binary DAZ file format - supports Unicode characters
    EncDAZScriptFile - Encrypted Binary DAZ file format - supports Unicode characters
    

    which I think have extensions .dsa, .dsb, and .dse, respectively. So I saved the export script as a .dsb file.

    There are now two files in the Scripts > Diffeomorphic folder with the same Export Blender icon. Could you try the .dsb file and see if the problem persists? I also suspect that even if the file can be exported, there will be problems loading it into Blender.

  6. Roman Evstifeev reporter

    The dsb file does not work at all. When I double-click on it, this error appears in the log:
    2020-11-10 16:39:17.200 Loading script: C:/users/Public/Documents/My DAZ 3D Library/Scripts/Scripts/Diffeomorphic/export_to_blender.dsb
    2020-11-10 16:39:17.201 Failed to load script: C:/users/Public/Documents/My DAZ 3D Library/Scripts/Scripts/Diffeomorphic/export_to_blender.dsb

  7. Thomas Larsson repo owner

    I see. Well, it was a long shot anyway. On my file system with only utf-8-compatible paths, both the dsa and dsb script work equally well.

    If anyone who is on a unicode file system (engetudouiti?) has an idea, I would like to know.

  8. engetudouiti

    I tested the same thing. Yes, if I use a special character such as “Ω” in the directory name, the script cannot generate the dbz or json file. (I cannot catch any error log; I do not know how.) DS itself can save a duf to the same file path.

    Maybe the file path that the script gets from doFileDialog needs to be converted with encodeURI or encodeURIComponent?

    I ran the test in the script IDE with the dsa only, but could not catch any error at all. I am afraid it may be generating another folder somewhere.

    http://docs.daz3d.com/doku.php/public/software/dazstudio/4/referenceguide/scripting/api_reference/object_index/global#a_1a19f7c8dbe468902c60c710403037058f

    Also, the script tries to set the path automatically from the duf, and the file dialog at least finds the path correctly and opens the directory (which includes the character in its path).

  9. engetudouiti

    I found the problem. If I edit the script to use DzFile instead of DzGZFile, it works.

    So it seems DzGZFile (zip) cannot create a file in that directory.

    (The data may become huge, but that may be necessary until daz changes DzGZFile, I think.)

    So I think the daz script forum is the best place to ask about it.

  10. Thomas Larsson repo owner

    So you are saying that you can save the file as ascii but not as zip. Can you import the file into Blender? I suspect that you have to do some special trick in python to read unicode.

    As for creating the dsb, there is simply a menu item in the script IDE pane that does that. You can make encrypted dse files too.

  11. engetudouiti

    Yes, I could save (export) the json into the “Ω” directory as an uncompressed json using DzFile (which is what we did until you offered the zip version, I suppose).

    I did not test importing it into blender. If the file path causes problems on import into blender after all, I would just recommend: “it does not work, so you had better not use a path that includes special letters”.

    Even if it can be solved with some trick (export from ds, then import by tweaking the importer code), the same question will come up for any new plugin or daz script, and the user will need to ask that author.

    (I never use Japanese letters in paths or file names, simply because there are many cases where it does not work.)

    But anyway, I suppose python at least offers a way to handle unicode paths.

    https://docs.python.org/3/howto/unicode.html

  12. engetudouiti

    I tested today. Yes, if we export the scene as json, we can use those unicode directory and file names.

    At least for me, I could import “…./λ/Ω.json” into blender without any effort. The scene is not very complex, but it has some outfits and three hairs with one grafted item.

    But I am just reporting what I saw, so I am not saying it works for everyone. And I will not request it, because basically I never recommend using such letters in paths, file names, labels etc. for an app unless it is made as a localized version.

  13. Thomas Larsson repo owner

    Should work now. I have tested with a character called Наташа and she can be both exported from DS and imported into Blender.

    As engetudouiti suggested, it is gzip that is the problem, while uncompressed files can be exported. So the simplest fix would be to never compress, but I don’t like that because compression saves a lot of space if it works. Instead, the export script writes to a temporary text file. If it can be compressed, the temporary file is deleted, otherwise it becomes the dbz file.
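On the Blender side, the fallback described above means a dbz file may arrive either gzip-compressed or as plain text. Below is a minimal Python sketch of a loader that accepts both. It is not the plugin's actual code: the function name `load_dbz`, the UTF-8 decoding, and the sample payload are assumptions made for illustration.

```python
import gzip
import json
import os
import tempfile


def load_dbz(path):
    """Load a dbz file that is either gzip-compressed or plain JSON text."""
    with open(path, "rb") as fp:
        raw = fp.read()
    # gzip streams start with the magic bytes 0x1f 0x8b.
    if raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    # Assumption: the text is ASCII or UTF-8 (ASCII is a subset of UTF-8).
    return json.loads(raw.decode("utf-8"))


# Round trip through a directory with a non-latin name, as in the report.
with tempfile.TemporaryDirectory() as tmp:
    folder = os.path.join(tmp, "Наташа")
    os.mkdir(folder)
    payload = {"figures": []}  # hypothetical content, not the real dbz schema

    plain = os.path.join(folder, "plain.dbz")
    with open(plain, "w", encoding="utf-8") as fp:
        json.dump(payload, fp)

    packed = os.path.join(folder, "packed.dbz")
    with gzip.open(packed, "wt", encoding="utf-8") as fp:
        json.dump(payload, fp)

    results = [load_dbz(plain), load_dbz(packed)]
```

Checking the magic bytes rather than the file extension matters here, because both variants end up with the same .dbz name.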

  14. Alessandro Padovani

    As far as I know, zip doesn’t support unicode. But this doesn’t mean it can’t compress unicode text files. It just doesn’t support unicode file names inside the zip folder. The uncompressed unicode text file is still unicode, so it is portable among different regional settings.

    The case is different when the text file is saved as ascii, thus non unicode. In this case the uncompressed text file will get different characters with different regional settings. But it will get the same characters with the same regional settings.

    The ascii table for character conversion is determined by the windows system locale, which the user can change for non unicode programs, but this is a global setting so it affects all the non unicode programs, not only zip.

    For example, suppose I have to uncompress a zip file that I know was zipped in japan and contains both japanese file names and ascii japanese text files. Then I change the system locale to japanese, unzip the file so I get the japanese file names, re-save the ascii text files as unicode so I get the international version of them, and then I can go back to my system locale.

    As a side note rar does support unicode so we can have unicode file names inside a rar folder. But if the rar archive contains ascii text files they will anyway be extracted wrong in a different system locale.

  15. Alessandro Padovani

    Since daz distributes content in zip format the following rules should apply to PAs for international distribution. Please note that for local distribution the zip format and ascii files do just fine.

    1. There should not be international characters in the file names or path inside the daz content folders, since these will not transfer over different countries.
    2. If the duf file contains international characters then it has to be saved as unicode, not as ascii, otherwise it will not transfer over different countries.

    Then I do not know if duf files are intended to support unicode and if daz studio saves duf files as unicode. If not then international characters should be avoided for international distribution. In this case I guess it is not a bug for the plugin not to support international characters if daz studio itself doesn’t.

  16. Alessandro Padovani

    Did a quick test. I opened the attached dsa, when I save it notepad shows the dsa is ansi, so it was not saved as unicode by daz studio.

  17. Alessandro Padovani

    Did another quick test. I created a cube scene in daz studio and renamed the cube to キューブ, which is the google translation of cube in japanese. This way I see with the notepad that the duf file is unicode. But if I only use ansi characters in the scene, so I leave the cube named cube, then the duf file is ansi and not unicode.

    Cube scene attached. My system locale is USA.

    Also, with commit 806e73d, the japanese cube is imported fine in blender.

  18. Alessandro Padovani

    This time I changed my system locale to hindi and imported a scene with cube names in japanese, russian and greek. It seems to work fine too.

  19. Thomas Larsson repo owner

    The cube files work fine here too, both with unmorphed and dbz fitting. The dbz files are created but not compressed. The encoding is ANSI, but the dbz files don’t contain any non-ascii characters since labels are not exported anymore. Probably there will be problems if the files contain assets whose names or relative paths, and not just their labels, contain non-ascii characters. But that is probably a much less common case than the user saving the files to a path containing unicode characters.

  20. engetudouiti

    The problem is not whether zip can compress or uncompress files whose paths include unicode. Even if it can, we need to use a daz script class to manage compressed files (create a new file at the path, then write the data and save).

    To manage files we can choose among three classes: DzFile, DzGZFile, and DzZipFile.

    For zip files we need to use the DzGZFile or DzZipFile class. When I tested, if I create a DzGZFile with a unicode directory or file name as the path, it seems to create the instance and keep writing once I set the path (and the file name to be saved), and at that point I see no error. But in the end, when the code calls close() on the file, no gzip (compressed) file is generated at the path. I cannot get any error or log when the script fails.

    I have not tested it, but even if DzGZFile could open an already existing zip file in the directory, we need to create a new file, edit it, and save it. If saving (or creating a new file) does not work, we cannot export a zip file via daz script.

    From the daz documentation: “A high-level interface for zipping/unzipping files is provided with the functions zip() and unzip(). Also, a low-level interface is provided, allowing scripts to read and write compressed files directly.”

    So I wondered whether using zip() and unzip() in the code could generate a dbz with “Ω” in the file path. But frankly, I do not think it is good to offer that option, because the user may run into the same problem with another script or add-on anyway.

    If I bought an expensive application that was already localized, I might complain and send mail to the developer or the vendor.

    But as a basic rule, when I use an application that is provided mainly for English users, I simply follow the rule and do not spend much time on it. E.g. we cannot write python functions in japanese, and we cannot write daz script classes in japanese, but we do not complain about that.

  21. Alessandro Padovani

    @engetudouiti

    The zip format can’t handle unicode for file names and path, they will be stored as ansi in the zip archive and will only work in the same system locale. This is not a daz library limit, it is a zip format limit. This is why PAs should only use ansi characters for file names.

    Please note that my cube examples are zipped duf and contain international labels. They also have unicode names for the duf file but the duf file itself is not zipped, so they work fine.

    edit. Then I don’t understand why DzGZFile fails with a unicode file name; maybe it is supposed to get the file name as ansi. This also makes sense since the zip archive expects ansi file names anyway.

    @Thomas

    I’d suggest to use utf-8 to encode the dbz, so we can handle international labels if we’ll need them. Also there is no reason to avoid zipping the dbz, since unicode will be preserved. You can see that my cube examples are zipped dufs, and the duf itself is coded as utf-8, this is done by daz studio when I save the scene I did nothing special myself. And they work fine with the international labels.

    What will not work are file names and path with international characters inside the dbz, if they are used on a different system locale. But they will not work anyway either if you zip or don’t, and either if you use ansi or unicode. The issue with file names and path is that they are always encoded as ansi in the zip archive and this way they can’t work in a different system locale if international characters are used and the zip archive is used to distribute the content.

    Please note that the rar format does support unicode file names. So if rar is used instead of zip to distribute the content, then file names and path with international characters will work on a different system locale if the dbz is encoded as unicode, but will not work if the dbz is encoded as ansi.

    edit. What I mean by “international characters”. The ansi (or, strictly, ascii) encoding defines 128 standard characters that are always the same regardless of the regional settings, so these characters work fine on any pc in the world. Then the windows system locale extends the ansi set to add regional characters, and the same codes are reused to map different characters for each region, so these extended characters only work within the same system locale. Then unicode (or utf-8) also extends the ansi set, but with a variable length encoding, so it can include all the languages in the world and more, and it always works independently of the regional settings.

    The zip format only supports the ansi set (not extended) for file names and path inside the zip archive. Then within the same system locale an extended ansi also works with zip because it will refer to the same regional table. The rar format supports unicode for file names and path inside the rar archive.
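The difference between a locale-dependent 8-bit code page and utf-8 described in this comment can be seen in a few lines of Python. This is only an illustration using Python's built-in codecs; the choice of the byte 0xE9 and of the two windows code pages is arbitrary.

```python
# One byte above 127 means different things under different system locales.
raw = bytes([0xE9])
as_western = raw.decode("cp1252")   # Western European code page
as_cyrillic = raw.decode("cp1251")  # Cyrillic code page

# The first 128 codes (plain ascii) are the same under every code page.
ascii_same = b"A".decode("cp1252") == b"A".decode("cp1251") == "A"

# utf-8 is variable length: 1 byte for ascii, more for other characters.
# "\u00e9" is e with acute accent, "\u30ad" is the katakana KI.
utf8_lengths = [len(ch.encode("utf-8")) for ch in ("A", "\u00e9", "\u30ad")]

print(as_western, as_cyrillic, ascii_same, utf8_lengths)
```

The same byte decodes to a latin letter under one locale and to a cyrillic letter under the other, which is why an ansi file only reads back correctly under the system locale that wrote it.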

  22. engetudouiti

    @Alessandro

    I partially understand what you mean, because after generating the (uncompressed) json I can zip it with the same path and file name, e.g. I:\りんご\Ω.json to Ω.zip, without problems. The file can be located in the same directory.

    You say it only works locally. But for most users, we only use those file paths and names locally anyway (unless we want to publish them as a vendor etc.).

    So the problem only happens (at least when opening and generating files on my local PC) when creating a new compressed gzip file via daz script, because the class cannot create a dbz file with that file path and name.

    For me the problem is caused by a limit of daz script. Or perhaps there is a good way (in daz script) to generate the gzip file correctly, at least on my local pc.

    Maybe it is related to the decode and encode functions, but I still cannot find a way, and you recommend “utf-8 to encode the dbz”.

    So how do you actually do that in daz code? That is why I am talking about daz script and not about how zip or rar works: I already know that those file paths and names can be zipped on my local pc.

  23. Thomas Larsson repo owner

    Encoding the dbz with utf-8 sounds good, but how do I do that? In the documentation for DzFile (http://docs.daz3d.com/doku.php/public/software/dazstudio/4/referenceguide/scripting/api_reference/object_index/file_dz#a_1a45153e842709466a940079b370bd6f40a35d0dd9a40755601b657244976bfc14b) I don’t find any way to choose the encoding. There is a Text flag, but that just seems to handle newlines (\n vs \r\n). The encoding of the dsa file itself does not seem to matter, as long as it is 8 bit.

  24. Alessandro Padovani

    @Thomas @engetudouiti

    Well the daz script is definitely out of my domain of expertise. But I see the string type is unicode, so I suspect it may save as utf-8 when international characters are used. I can’t copy and paste from google translator to the dsa though since dsa only supports ansi characters. I don’t seem to be able to edit the dsb either, it doesn’t open in the daz script ide.

    I may do some tests with fromCharCode() in the dsa and let you know if it saves as utf-8.

    http://docs.daz3d.com/doku.php/public/software/dazstudio/4/referenceguide/scripting/api_reference/object_index/string

  25. Alessandro Padovani

    If someone could tell me why the script below doesn’t work, I would be able to do some tests. It should write a dbz file but it doesn’t.

    var fname = Scene.getFilename();
    var fname2 = fname.left(fname.length - 4) + ".dbz";
    
    MessageBox.information("fname2 = " + fname2,"","");
    
    var fp = new DzGZFile(fname2);
    fp.writeLine("fname2 = " + fname2);
    fp.close();
    

  26. Thomas Larsson repo owner

    You have to open the file before writing to it. Here is a version that writes a zipped file if it can, and otherwise a text file. The code is wrapped in a function because otherwise DS tends to crash for me.

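    function exportDbz()
    {
        // Derive the .dbz path from the scene path by dropping the 4-character extension.
        var fname = Scene.getFilename();
        var fname2 = fname.left(fname.length - 4) + ".dbz";
        // Name of the temporary uncompressed file.
        var fname20 = fname2 + "0";

        MessageBox.information("fname2 = " + fname2, "", "");

        // Write the uncompressed text first. The file must be opened before writing.
        var fp0 = new DzFile(fname20);
        fp0.open( fp0.WriteOnly );
        fp0.writeLine("fname2 = " + fname2);
        fp0.close();

        // Try to compress the temporary file into the final .dbz.
        var fp = new DzGZFile(fname2);
        var ok = fp.zip(fname20);
        fp.close();
        if (ok) {
            // Compression worked, so the temporary file is no longer needed.
            fp0.remove();
        }
        else {
            // Compression failed: the uncompressed file becomes the .dbz.
            var oDir = fp.dir();
            oDir.move(fname20, fname2);
        }
    }

    exportDbz();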
    

  27. Alessandro Padovani

    Thank you Thomas for your help. I see I’m really rusty at programming.

    Well I did some tests and it seems to me that the issue is not DzGZFile, but something in the string type itself. Below is my example where I write to file the scene name containing japanese characters. The string is good in the dialog box but it gets converted to ???? when I write it to file. I checked with a hex editor just to be sure.

    Now I see there’s the ByteArray object that can convert among strings and utf8, but I was not able to solve the issue. I always end up with ???? in the file. Anyway this means that either using DzGZFile or DzFile we can’t write unicode characters to file. Unless we understand why and find a solution.

    http://docs.daz3d.com/doku.php/public/software/dazstudio/4/referenceguide/scripting/api_reference/object_index/bytearray

    edit. Also opened a discussion at daz if they may help.

    https://www.daz3d.com/forums/discussion/454371/how-do-we-write-unicode-to-file-help-for-diffeomorphic
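For comparison, this is what a lossy conversion to an 8-bit character set looks like in Python, where it also produces one question mark per character. Whether daz studio performs the same kind of conversion internally when writing a string to file is an assumption; the sketch only shows that the ???? pattern is the typical signature of such a conversion, and that encoding as utf-8 keeps the text.

```python
label = "キューブ"  # the cube label used in the test scene above

# Forcing the text into ascii replaces every unsupported character with "?".
lossy = label.encode("ascii", errors="replace")

# Encoding as utf-8 keeps all characters and round-trips cleanly.
utf8_bytes = label.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

print(lossy, len(utf8_bytes), round_trip == label)
```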

  28. Thomas Larsson repo owner

    DzGZFile is an issue too. It cannot create the dbz files for your japanese cube, even though the file itself only contains ascii characters. The name of the cube is "pCube(11cg000)1". The japanese characters only appear in the label, which is not exported because it is not needed to identify the object, and because non-ascii (even utf-8) characters are not accepted by Blender’s json loader. Again, this has probably something to do with the text file being ansi.

  29. Alessandro Padovani

    Thomas I see what you mean, it is the same reported by @engetudouiti. That is, DzGZFile can’t create a file having a unicode file name, while DzFile can. This makes sense though, since the zip format can only handle ansi file names. As for the blender json only accepting ansi files, I didn’t know it. Then it doesn’t matter if we can export Unicode labels if then blender can’t read them.

    Then I guess the final answer is: due to the zip format and blender json limits unicode file names and labels can’t be exported.

    Though python can read unicode strings from the daz duf files. I mean the japanese cube is saved as utf-8 by daz studio and the unicode labels are imported fine in blender as shown above.
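As a side check in plain Python (outside Blender), the standard json module reads non-ascii labels as long as the bytes are decoded as utf-8, and fails when the same text was written with a legacy code page. This is a minimal sketch, not the plugin's loader; the dictionary layout is hypothetical, and Shift JIS stands in for whatever ansi code page the writer used.

```python
import json

# Hypothetical record: the ascii name from the thread plus a japanese label.
scene = {"name": "pCube(11cg000)1", "label": "キューブ"}
text = json.dumps(scene, ensure_ascii=False)

# Written and read as utf-8: the label survives.
loaded = json.loads(text.encode("utf-8").decode("utf-8"))

# Written with a legacy code page but read as utf-8: decoding fails.
legacy_bytes = text.encode("shift_jis")
try:
    json.loads(legacy_bytes.decode("utf-8"))
    legacy_ok = True
except UnicodeDecodeError:
    legacy_ok = False

# With the default ensure_ascii=True the output is pure ascii (\uXXXX escapes),
# so it is readable regardless of the encoding used for the file.
escaped = json.dumps(scene)
escaped_is_ascii = all(ord(ch) < 128 for ch in escaped)
```

So a loader failure on non-ascii labels would point at the encoding of the file on disk rather than at json itself, under the assumption that Blender's bundled Python behaves like standard Python here.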

  30. engetudouiti

    “zip format can only handle ansi file names” is actually old behavior.

    https://www.artpol-software.com/ZipArchive/KB/0610051525.aspx

    zip already offers a way to handle utf-8 unicode file names. (So many japanese users use zip archives that include non-ansi file names with their zip/unzip tools.)

    Many modern zip archivers can handle them correctly with the right options. (So I can easily zip and unzip japanese file paths, names, and text as utf-8 without problems.)

    The actual problem is how an application generates, compresses, and uncompresses zip files in a way that is compatible with non-ansi users. I say again, zip already offers a way.

    In this case, the daz script library does not seem to offer a way to handle utf-8 file paths and letters.
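To illustrate that the zip format itself can carry utf-8 file names, here is a small Python sketch using the standard zipfile module, which sets the utf-8 flag (bit 11 of the general purpose flags) when a member name is not plain ascii. It says nothing about how DzGZFile or DzZipFile are implemented; the member name is just the example path from this thread.

```python
import io
import zipfile

# Write an archive in memory with a non-ascii member name.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("りんご/Ω.json", "{}")

# Read it back and inspect the stored name and flags.
with zipfile.ZipFile(buffer) as archive:
    info = archive.infolist()[0]
    name = info.filename
    utf8_flag = bool(info.flag_bits & 0x800)  # bit 11: name is utf-8
    content = archive.read(info).decode("utf-8")

print(name, utf8_flag, content)
```

An unzip tool that honours this flag restores the name independently of the system locale; a tool that ignores it falls back to a legacy code page, which matches the mixed results reported in this thread.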

  31. Alessandro Padovani

    Thank you @engetudouiti, I was not up to date and it didn’t occur to me to check for new features. It seems the pkware specification 6.3.0 has supported utf-8 file names since 2006, so it’s been a while now. Also 7-zip, which is the implementation I use, defaults to utf-8.

    As for windows, NTFS stores file names in utf-16, and EXFAT uses utf-16 as well, while FAT32 uses the OEM charset. So be aware when you store on usb drives formatted in FAT32. The windows zip utility supports utf-8 from windows 7.

    So yes it seems the daz library needs some update.

    edit. Also recently I did have issues with japanese vendors for daz assets because unzipping didn’t extract the unicode file names. So now I wonder why, since from the zip specifications everything should work fine. How do they end up not using unicode? @engetudouiti Assuming you do have a japanese windows version, could you please zip some dummy files with japanese names, using the windows zip utility, so I can check if I can unzip them? I could do it myself by changing the system locale to japanese, but now I’m not sure I would get the same result as a japanese windows version.

    edit. The mystery gets worse. Now it seems the windows zip utility can’t compress unicode names, despite the zip specifications. Here I am on windows 10 pro. I tried with a simple folder containing dummy japanese files and folders translated with google. I can compress it fine with 7-zip though, and I attach the sample zipped folder for anyone interested. So now maybe I understand why some japanese daz assets don’t come in right. If they use the windows zip utility it probably uses the OEM charset.

    With the windows zip utility I can extract the unicode archive created with 7-zip though. So it seems it can read but it can’t write unicode.

    references:

    https://en.wikipedia.org/wiki/ZIP_(file_format)

    https://documentation.help/7-Zip/charset.htm

    https://docs.microsoft.com/en-us/windows/win32/intl/character-sets-used-in-file-names

  32. Thomas Larsson repo owner

    This seems to work as well as it will. Zipping does not work with non-latin letters, but in that case the dbz file is left unzipped and can still be loaded. And as long as only labels contain unicode characters, the dbz file can be loaded. There are probably other cases, and if so the problems remain for those, but I don't think we can do better.
