hipMAGMA build issue

Issue #27 resolved
joubert_ornl Joubert created an issue

When I run the translation from CUDA to HIP, I get failures in tools/codegen.py like the following, caused by non-ASCII characters in source files (used in comments in the C source code). Perhaps it is from using Python 3.6 (?). After modifying several source files, I am able to do the build.

      File "tools/codegen.py", line 360, in <module>
        main()
      File "tools/codegen.py", line 322, in main
        src = SourceFile( filename )
      File "tools/codegen.py", line 171, in __init__
        self._text = fd.read()
      File "/usr/lib64/python3.6/encodings/ascii.py", line 26, in decode
        return codecs.ascii_decode(input, self.errors)[0]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 2301: ordinal not in range(128)

Comments (10)

  1. Cade Brown

    Hello,

    Thanks for the report. I am checking it out. What version of MAGMA are you using? I am on the latest commit on the ‘hipMAGMA’ branch.

    I did not write the ‘codegen.py’ script myself, so I can’t say for sure there aren’t Unicode literals in that file. However, running these one-liners:

    $ python3 -c 'print ([*map(ord, open("tools/codegen.py", "r").read()[2300:2310])])'
    [109, 112, 108, 97, 116, 101, 115, 41, 32, 36]
    $ python3 -c 'print (all(x in range(128) for x in map(ord, open("tools/codegen.py", "r").read()[2300:2310])))'
    True
    

    So, for my copy, that particular range does not contain any Unicode literals and is entirely ASCII. Could you run those one-liners and report back? Additionally, try a fresh download/clone and see if the error persists.

    EDIT: I see now that the file in question is not tools/codegen.py. What file is it trying to open when it fails?

  2. joubert_ornl Joubert reporter

    Thanks. The git hash of the version I am using is f50e717b. The Unicode characters are in source code files, in particular on the lines deleted by these commands:

        sed -i -e '122d' magma_blas_hip/zlarfg.hip.cpp
        sed -i -e '93d' magma_blas_hip/zlarfg-v2.hip.cpp
        sed -i -e '100d' magma_blas_hip/zlarfgx-v2.hip.cpp
        sed -i -e '128d' magma_blas_hip/zlarfgx-v2.hip.cpp
        sed -i -e '595d' testing/magma*generate.cpp
        sed -i -e '13d' sparse/blas/zgeellrtmv.cu
        sed -i -e '115d' sparse/blas/zgeellrtmv.cu
        sed -i -e '64d' sparse/blas/zgeellrtmv.cu
    

  3. Cade Brown

    We would probably like to keep those, since the documentation generates HTML, which can support those symbols (I see they are mainly +/-). Therefore, we need to fix the codegen.py script.

    This is an odd error, since Python 3 has a unified string model. However, the codegen.py file was written for Python 2 (judging from the shebang #!/usr/bin/env python). This is definitely problematic.

    Try replacing the shebang with:

    #!/usr/bin/env python3
    

    If you wouldn’t mind, try replacing the lines around line 171 (the fd.read() call shown in the traceback) with:

            fd = open( filename, 'rb' )
            self._text = fd.read().decode('utf8')
    

    We definitely need to change this (since older versions of Python are EOL), but this may not be the (complete) solution to this issue.
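
A minimal sketch of the idea above, not the actual codegen.py code: the helper name `read_source_text` and the demo file are hypothetical, introduced only for illustration. It uses `io.open`, which accepts an `encoding` argument on both Python 2.6+ and Python 3, so the result does not depend on the locale or on which interpreter the Makefile selects.

```python
import io
import os
import tempfile


def read_source_text(filename):
    """Hypothetical helper: read a file as text, always decoding as UTF-8."""
    # io.open takes encoding= on Python 2.6+ and Python 3 alike, so the
    # locale's preferred encoding is never consulted.
    with io.open(filename, 'r', encoding='utf-8') as fd:
        return fd.read()


# Demo on a throwaway file containing the UTF-8 bytes for a plus-minus sign.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.cu')
with io.open(demo_path, 'wb') as fd:
    fd.write(b'// beta = \xc2\xb1norm\n')

demo_text = read_source_text(demo_path)
print(repr(demo_text))
```

Compared with opening in 'rb' mode and calling decode, this keeps the file object in text mode, which may matter if other code relies on newline translation.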

  4. Cade Brown

    Does replacing those two lines fix the problem? If so, I can go ahead and push that change (and it does work on our machine as well). Otherwise, we can try other things.

    The weird thing is that your message displays Python 3 standard library paths, so I’m wondering why it’s trying to force ASCII… In any case, reading all bytes and decoding as UTF-8 should work, since the file is UTF-8 (file -i magmablas/zlarfg.cu).
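
One possible explanation, offered as an assumption rather than a confirmed diagnosis: in Python 3, `open()` in text mode without an `encoding` argument decodes with `locale.getpreferredencoding(False)`, and under a C/POSIX locale Python 3.6 typically reports an ASCII codec there. Python 2 returns raw bytes from `read()`, and Python 3.7 added C-locale coercion (PEP 538) and a UTF-8 mode (PEP 540), which could be why the error shows up on 3.6 but not on 2.7 or 3.7. The sketch below prints the locale's encoding and reproduces the same kind of failure on a made-up byte string.

```python
import locale

# Encoding that open() falls back to when no encoding= argument is given.
print(locale.getpreferredencoding(False))

# Illustrative bytes: UTF-8 for a plus-minus sign starts with 0xc2,
# the same byte value named in the traceback.
raw = b'beta = \xc2\xb1norm'

message = None
try:
    raw.decode('ascii')
except UnicodeDecodeError as err:
    # Same exception type and wording as the reported failure.
    message = str(err)
    print(message)

# Decoding the same bytes as UTF-8 succeeds.
decoded = raw.decode('utf-8')
print(repr(decoded))
```

Running the reporter's build with `LANG` or `LC_ALL` set to a UTF-8 locale would be one way to check whether the locale is the cause.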

  5. Mark Gates

    Oddly, I can’t reproduce this error. For me, codegen.py (via make generate) works with both python2 and python3, as stated in its header ("Tested with python 2.7.9 and 3.4.3."). Currently I'm using python 2.7.16 and 3.7.6.

    magma> rm magmablas/[sdc]larfg.cu
    magma> make generate codegen='python2 tools/codegen.py'
    python2 tools/codegen.py -p s magmablas/zlarfg.cu
    python2 tools/codegen.py -p d magmablas/zlarfg.cu
    python2 tools/codegen.py -p c magmablas/zlarfg.cu
    
    magma> rm magmablas/[sdc]larfg.cu
    magma> make generate codegen='python3 tools/codegen.py'
    python3 tools/codegen.py -p s magmablas/zlarfg.cu
    python3 tools/codegen.py -p d magmablas/zlarfg.cu
    python3 tools/codegen.py -p c magmablas/zlarfg.cu
    

    The Python interpreter is specified in the Makefile, so changing the #! line won’t change anything. I wouldn’t hard-code python3 since that may not exist. It’s not usually hard to support both Python 2 and 3.

    However, I recommend removing non-ASCII characters, per the MAGMA Contributors' Guide (http://icl.cs.utk.edu/projectsfiles/magma/doxygen/contributors-guide.html). Since the encoding isn’t specified as UTF-8 in the file, other tools like editors may choke on it. For instance, instead of beta = ±norm, use |beta| = norm.

    There are only a few instances. This should check for them:

    grep -P '[^\x0-\x7f]' `git ls-files`
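
Where grep lacks the -P option, a rough Python equivalent of the check above might look like the following. This is a sketch only: the function name `find_non_ascii` and the demo file are hypothetical, and in practice the file list would come from `git ls-files`.

```python
import io
import os
import tempfile


def find_non_ascii(path):
    """Hypothetical helper: list (line, column, byte) for each byte above 0x7f."""
    hits = []
    # Read in binary mode so no decoding (and no decode error) is involved.
    with io.open(path, 'rb') as fd:
        for lineno, line in enumerate(fd, start=1):
            # bytearray yields integers on both Python 2 and Python 3.
            for col, byte in enumerate(bytearray(line), start=1):
                if byte > 0x7f:
                    hits.append((lineno, col, byte))
    return hits


# Demo on a throwaway file; line 2 contains a UTF-8 plus-minus sign.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.cu')
with io.open(demo_path, 'wb') as fd:
    fd.write(b'// ascii only\n// beta = \xc2\xb1norm\n')

hits = find_non_ascii(demo_path)
for lineno, col, byte in hits:
    print('%s:%d:%d: byte 0x%02x' % (demo_path, lineno, col, byte))
```

Reporting byte columns rather than character columns keeps the scan independent of the file's encoding, at the cost of listing each multi-byte character more than once.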
    

  6. Cade Brown

    I’ll just remove the non-ASCII and replace them with equivalent constructions (like |x|=y, as opposed to x=U+XXXXy).

    I’m a little bit confused about zlarfgx… It uses a unicode symbol which is a boolean negation (¬). I have seen it used in some contexts as complex conjugation, but I do not want to put something incorrect (considering that it lists it as applied to dxnorm[0], which is a double). @Mark Gates what semantics are they trying to describe here?

  7. Cade Brown

    Removed non-ASCII code points from the source code (including some authors' names in sparse/blas; I just replaced the characters with ASCII equivalents without accent marks).

    resolves issue #27

    → <<cset bf2f1d78bead>>
