hipMAGMA build issue
When I run the translation from CUDA to HIP, I get failures in tools/codegen.py like the following, caused by non-ASCII characters in source files (used in comments in the C source code). Perhaps it comes from using Python 3.6 (?). After I modify several source files, I am able to do the build.
  File "tools/codegen.py", line 360, in <module>
    main()
  File "tools/codegen.py", line 322, in main
    src = SourceFile( filename )
  File "tools/codegen.py", line 171, in __init__
    self._text = fd.read()
  File "/usr/lib64/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 2301: ordinal not in range(128)
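For illustration, here is a minimal sketch of how this kind of error can arise. It assumes the offending character is a plus-minus sign, whose UTF-8 encoding starts with the byte 0xc2 named in the traceback. It also prints the encoding that Python 3's open() falls back to in text mode when none is given. Whether a locale setting is the actual cause on the reporter's machine is an assumption that this thread does not confirm; the helper name try_ascii_decode is hypothetical.

```python
import locale

# UTF-8 encoding of the plus-minus sign (U+00B1); its first byte is 0xc2,
# the byte named in the traceback above.
raw = u"\u00b1".encode("utf-8")

def try_ascii_decode(data):
    """Return the decoded text, or the UnicodeDecodeError raised by the ASCII codec."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError as err:
        return err

result = try_ascii_decode(raw)
print(repr(raw), "->", result)

# Encoding that open() uses in text mode on Python 3 when no encoding is passed.
# If this reports an ASCII codec, reading a UTF-8 file can fail as shown above.
print("preferred encoding:", locale.getpreferredencoding(False))
```

If the last line reports an ASCII codec on the failing machine, that would be consistent with the traceback, though it would not rule out other causes.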
Comments (10)
reporter Thanks. The git hash of the version I am using is f50e717b. The unicode characters are in source code files, in particular:
sed -i -e '122d' magma_blas_hip/zlarfg.hip.cpp
sed -i -e '93d' magma_blas_hip/zlarfg-v2.hip.cpp
sed -i -e '100d' magma_blas_hip/zlarfgx-v2.hip.cpp
sed -i -e '128d' magma_blas_hip/zlarfgx-v2.hip.cpp
sed -i -e '595d' testing/magma*generate.cpp
sed -i -e '13d' sparse/blas/zgeellrtmv.cu
sed -i -e '115d' sparse/blas/zgeellrtmv.cu
sed -i -e '64d' sparse/blas/zgeellrtmv.cu
-
We would probably like to keep those, since the documentation generates HTML, which can support those symbols (I see they are mainly +/-). Therefore, we need to fix the codegen.py script.
This is an odd error, since Python 3 has a unified string model. However, the codegen.py file was written for Python 2 (judging from the shebang, #!/usr/bin/env python). This is definitely problematic.
Try replacing the shebang with:
#!/usr/bin/env python3
If you wouldn’t mind, try replacing lines ~70 with:
fd = open( filename, 'rb' )
self._text = fd.read().decode('utf8')
We definitely need to change this (since older versions of Python are EOL), but this may not be the (complete) solution to this issue.
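As a sketch of the same idea in a form that runs on both Python 2.7 and Python 3, the file can be opened through io.open with an explicit encoding, so the result does not depend on the locale. The function name read_source and the temporary test file are hypothetical and are not taken from codegen.py.

```python
import io
import os
import tempfile

def read_source(filename):
    """Read a source file as text, decoding it as UTF-8 regardless of locale."""
    # io.open accepts encoding= on both Python 2.7 and Python 3.
    with io.open(filename, "r", encoding="utf-8") as fd:
        return fd.read()

# Small self-check: write a temporary file containing a non-ASCII comment,
# then read it back through read_source.
sample = u"// beta = \u00b1norm\n"
with tempfile.NamedTemporaryFile(suffix=".cu", delete=False) as tmp:
    tmp.write(sample.encode("utf-8"))
text = read_source(tmp.name)
os.remove(tmp.name)
```

Compared with reading bytes and calling decode, this form also closes the file through the with block; either approach makes the encoding explicit.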
-
reporter See the .cu files --
-
reporter Thanks --
-
Does replacing those two lines fix the problem? If so, I can go ahead and push that if it solves it (and it does work on our machine as well). Otherwise, we can try other things as well.
The weird thing is that your message displays python3 standard library paths, so I’m wondering why it’s trying to force ASCII… In any case, reading all bytes and decoding as UTF8 should work, since the file is UTF8 (file -i magmablas/zlarfg.cu).
-
Oddly, I can’t reproduce this error. For me, codegen.py (via make generate) works with both python2 and python3, as stated in its header ("Tested with python 2.7.9 and 3.4.3."). Currently I'm using python 2.7.16 and 3.7.6.
magma> rm magmablas/[sdc]larfg.cu
magma> make generate codegen='python2 tools/codegen.py'
python2 tools/codegen.py -p s magmablas/zlarfg.cu
python2 tools/codegen.py -p d magmablas/zlarfg.cu
python2 tools/codegen.py -p c magmablas/zlarfg.cu
magma> rm magmablas/[sdc]larfg.cu
magma> make generate codegen='python3 tools/codegen.py'
python3 tools/codegen.py -p s magmablas/zlarfg.cu
python3 tools/codegen.py -p d magmablas/zlarfg.cu
python3 tools/codegen.py -p c magmablas/zlarfg.cu
The python is specified in the Makefile, so changing the #! line won’t change anything. I wouldn’t hard code python3 since that may not exist. It’s not usually hard to support both python 2 and 3.
However, I recommend removing non-ASCII characters, per the MAGMA Contributors' Guide (http://icl.cs.utk.edu/projectsfiles/magma/doxygen/contributors-guide.html). Since the encoding isn’t specified as UTF-8 in the file, other tools like editors may choke on it. For instance, instead of beta = ±norm, use |beta| = norm.
There are few instances. This should check for them:
grep -P '[^\x0-\x7f]' `git ls-files`
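Where grep -P is not available, a rough Python equivalent of the check above might look like the following sketch. The function name non_ascii_lines is hypothetical, and the sample input is made up for illustration. It works on bytes, so it does not depend on the locale or on how the file is encoded.

```python
import re

# Matches any single byte outside the 7-bit ASCII range.
NON_ASCII = re.compile(br"[^\x00-\x7f]")

def non_ascii_lines(data):
    """Return 1-based line numbers in `data` (bytes) that contain non-ASCII bytes."""
    return [
        number
        for number, line in enumerate(data.splitlines(), start=1)
        if NON_ASCII.search(line)
    ]

# Made-up sample: only the second line contains a non-ASCII character.
sample = u"a = 1\n// beta = \u00b1norm\nb = 2\n".encode("utf-8")
hits = non_ascii_lines(sample)
print(hits)
```

To scan real files, the same function could be applied to the bytes of each path reported by git ls-files, opened in 'rb' mode.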
-
I’ll just remove the non-ASCII and replace them with equivalent constructions (like |x|=y, as opposed to x=U+XXXXy).
I’m a little bit confused about zlarfgx… It uses a unicode symbol which is a boolean negation (¬). I have seen it used in some contexts as complex conjugation, but I do not want to put something incorrect (considering that it lists it as applied to dxnorm[0], which is a double). @Mark Gates what semantics are they trying to describe here?
-
I think the ¬ are typos. Just |beta| = norm… = dxnorm.
-
- changed status to resolved
Removed non-ASCII codepoints from the source code (including some authors' names in sparse/blas; just replaced the characters with ASCII equivalents with no accent marks). Resolves issue #27 → <<cset bf2f1d78bead>>
Hello,
Thanks for the report. I am checking it out. What version of MAGMA are you using? I am on the latest commit on the ‘hipMAGMA’ branch.
I did not write the ‘codegen.py’ script myself, so I can’t say for sure there aren’t unicode literals in that file. However, running this one liner:
So, for my code it shows that that particular range does not have any unicode literals, and is entirely ASCII-only. Could you run those scripts and report back? Additionally, try a fresh download/clone and see if the error persists.
EDIT: I see now that the file in question is not tools/codegen.py. What file is it trying to open when it fails?