Note that this is already done in PyPy's version of numpy random using cffi. NumPy handles the seed state differently, so this cannot be reused as-is. The corresponding setup.py builds both the Mersenne Twister DLL and the cffi wrapper DLL, but may not work on Windows.
Hello, I was led here by the documentation, which says this is a good issue for a newcomer to work on:
So unless there is any objection, I guess I'll try my hand at sorting it out. If anyone has any helpful ideas of a possible approach (other than the info in this thread and all the info in the devguide), I'm all ears!
I have an initial attempt at implementing this at this branch/folder of my repo:
I added a README.md file with a lot of information to that folder, but I'll paste the contents of the README here so no one needs to needlessly jump over there. I don't have a full implementation and I'm not even sure if what I'm doing is right/possible, but either way this has been kind of a fun intro to pypy and cffi. Hopefully my work here is useful, but the experience has been nice nonetheless.
I tried to only take the pieces that were necessary (i.e. I did not bring any of the distributions code). The C code is found in the mtrand/ folder and it is built using cffi by the _mtrand_build.py script. Note that I simplified the _mtrand_build.py and setup.py scripts so that I could understand them myself. By doing so, I removed some of the parts that are probably needed to compile on Windows. I figure I can add that back in once I actually understand things correctly.
Assuming you have cloned this repo and built everything, you should be able to just run commands like:
There's nothing requiring you to use these make commands, but they are convenient for me to automate things. Also, some commands don't work well in the wrong order, but honestly I don't care about that. The test in tests/ is the same test as for the original random implementation found here
except there is one added test in my version, test_state_assignment. All tests pass except for test_translate. That one fails because apparently the cffi code objects are not valid rpython.
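To give an idea of what a state-assignment test can look like, here is a minimal sketch. It is not the actual test_state_assignment from my folder: it uses CPython's stdlib random.Random as a stand-in generator, and the helper name check_state_round_trip is made up for illustration. The assumption is that such a test checks that assigning a saved state back reproduces the same sequence of draws.

```python
import random


def check_state_round_trip(rng, draws=5):
    """Return True if restoring a saved state replays the same draws.

    `rng` is assumed to expose getstate()/setstate()/random(), as the
    stdlib random.Random does; the rpython class has a different API.
    """
    saved = rng.getstate()                      # snapshot the internal state
    first = [rng.random() for _ in range(draws)]
    rng.setstate(saved)                         # assign the old state back
    second = [rng.random() for _ in range(draws)]
    return first == second


if __name__ == "__main__":
    print(check_state_round_trip(random.Random(42)))
```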
At a minimum there are the following things left to do:
1. Move this into the main library. For this I'm a bit confused. The original file is rpython, so if I were to do the same thing I think I should change everything here from using cffi to using rffi. I looked at the repo code and docs for this and don't really understand how to do that (or whether it's even the right approach).
2. Make the test_translate test pass.
3. Clean stuff up. E.g. right now I have a weird proxy class StateRepr that allows access to the internal state in a way that keeps the old api consistent (at least from the perspective of the tests). This may be the wrong approach, though, and a different proxy (or no proxy) might be better.
4. Time this to make sure it's actually faster in the first place.
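For context on the StateRepr item in the list above, this is roughly the shape of proxy I mean. It is a simplified sketch, not the code in my branch: the array buffer stands in for the C-side state, FakeRandom is a made-up holder class, and only integer indexing is handled.

```python
from array import array


class StateRepr:
    """List-like view over a generator's state words (illustrative only)."""

    def __init__(self, words):
        # `words` stands in for the C state buffer; here it is a plain array.
        self._words = words

    def __len__(self):
        return len(self._words)

    def __getitem__(self, index):
        return self._words[index]

    def __setitem__(self, index, value):
        # Integer indices only. Mask to 32 bits, the MT19937 word size.
        self._words[index] = value & 0xFFFFFFFF

    def __eq__(self, other):
        # Lets tests compare the proxy against a plain list of words.
        return list(self) == list(other)


class FakeRandom:
    """Hypothetical holder showing how the proxy is exposed as `.state`."""

    N = 624  # number of state words in MT19937

    def __init__(self):
        self._buf = array("L", [0] * self.N)
        self.state = StateRepr(self._buf)
```

With this, code written against the old api can keep doing things like `rng.state[0] = 123` and the write lands in the underlying buffer.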
Once 1-4 here are done, then certain optimizations can maybe be attempted in the code I wrote. But there seems to be no reason to try to optimize it until things fit together correctly and timing scripts exist.
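A timing script could be as small as the sketch below. It is only an assumption about how the comparison might be set up: time_generator is a hypothetical helper, the stdlib random.Random is used as a stand-in, and the only requirement on the generator is a random() method.

```python
import random
import timeit


def time_generator(make_rng, draws=100_000, repeat=3):
    """Return the best wall-clock time (seconds) for `draws` random() calls.

    `make_rng` is any zero-argument callable returning an object with a
    random() method; taking the minimum over repeats reduces noise.
    """
    rng = make_rng()

    def run():
        for _ in range(draws):
            rng.random()

    return min(timeit.repeat(run, number=1, repeat=repeat))


if __name__ == "__main__":
    baseline = time_generator(random.Random)
    print("stdlib random.Random: %.4f s" % baseline)
```

The same function could be pointed at the old and new implementations in turn, which keeps the loop overhead identical for both measurements.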
By the way, at this point there are still fairly weird stylistic issues as well as minor missing functionality. Right now I'm mainly focusing on getting the build totally sorted out so that I can stop thinking about architecture and think more about those other details. In any case, I'm extracting the changes from the README and posting them here for convenience:
I extracted the necessary portions of the numpy random
I tried to only take the pieces that were necessary (i.e. I did not bring any
of the distributions code). The C code is found in the mtrand/ folder. I
integrated that code into the codebase in two separate ways. The first, meant
as a guide, is a separate rpython script in this folder, rmtrand.py. That script
can be compiled by running
Basically that file does the following:
It defines the interface to the C code using rffi.
It embeds the tests (as many as possible anyway) from
rpython/rlib/tests/test_rrandom.py. The tests that are in there do pass.
The second way that this has been embedded into the codebase is directly in
the file rpython/rlib/rrandom.py. Notice that this file is basically the same
as the rmtrand.py file in this folder except with some paths changed to help
the full build. I also moved the C files themselves to the folder
../rpython/translator/c/mtrand/. I'm not sure if this is a reasonable place
for them, but it seems to put them with other C files in the interpreter.
In the current setup the interpreter actually will build in its current form
(i.e. with just a regular make in the root of the repo). However, the code is
not working at run-time. If you go into the pypy/goal folder and try to use
the module, it will fail as follows:
It seems like some sort of linking problem. The code does seem to compile and
hook up correctly while the interpreter is being built, so the problem is
probably that I'm not including the compiled C code in the final interpreter
binary correctly (presumably due to a misunderstanding of what llexternal
actually does).