PyPy and its future challenges
-Obviously I'm biased, but I think PyPy is going pretty good along. I would like
-to however write down some point where I think pypy is staying behind,
-it's not living up to the promises or simply the design decisions didn't
-turn out as good as we hoped for them. In a fairly arbitrary order
+Obviously I'm biased, but I think PyPy is progressing fairly well. However,
+I would like to mention some areas where I think PyPy is lagging: where it is
+not living up to its promises, or where the design decisions simply didn't
+turn out as well as we hoped. In a fairly arbitrary order:
-* **Whole program type inference**. This is the decision that has been haunting
+* **Whole program type inference**. This decision has been haunting
separate compilation effort for a while. It's also one of the reasons
why RPython errors are confusing and why the compilation time is so long.
This is less of a concern for users, but more of a concern for developers
and potential developers.
-* **Memory impact**. The problem really is we never scientifically measured
- memory impact of PyPy on examples. There are reports about outrageous pypy
+* **Memory impact**. We never scientifically measured
+ the memory impact of PyPy on examples. There are reports of outrageous PyPy
memory usage, but they're usually very cryptic "my app uses 300M" and not
really reported in a way that's reproducible for us. We simply have to start
- measuring memory impact on benchmarks, you can definitely help by providing
+ measuring memory impact on benchmarks. You can definitely help by providing
us with reproducible examples (they don't have to be small, but they have
-The next bunch is all connected. The real question is - what to do in the
-situation where JIT does not help? It can be for various reasons, but in general
-PyPy is inferior to CPython most of the time in all of them. A good hard example
-is running tests. Ideally, for perfect unit tests each piece of code is executed
-only once. There are other examples like short running scripts. It all can
-be addressed by one or more of the below:
+The next group of issues is all connected. The fundamental question is: what
+to do in the situation where the JIT does not help? There are many causes, but,
+in general, PyPy is often inferior to CPython in all of these cases.
+A representative, difficult example is running tests. Ideally, for
+perfect unit tests, each piece of code should be executed only once. There
+are other examples, like short-running scripts. These can all
+be addressed by one or more of the following:
-* **Slow runtime**. Our runtime is slow. It is a combination of using a higher
+* **Slow runtime**. Our runtime is slow. This is caused by a combination of using a higher
level language than C and a relative immaturity compared to CPython. The
former is at least partly a GCC problem. We emit code that does not look
like hand-written C and GCC is doing worse job at optimizing it. A good
- example are operations on longs, which are about 2x slower than cpython's,
- partly because GCC is unable to correctly optimized our code.
+ example is operations on longs, which are about 2x slower than CPython's,
+ partly because GCC is unable to effectively optimize code generated
* **Too large JIT warmup time**. This is again a combination of issues.
Partly this is one of the design decisions of tracing on the metalevel,
- which takes more time, but partly this is the issue with our current
+ which takes more time, but partly this is an issue with our current
implementation that can be addressed. It's also true that in some edge
cases, like running large and complex programs with lots and lots
- of megamorphic call sites, we don't do a very good job tracing. Since a good
- example of this case is running pypy's own test suite, I would expect
- we'll invest some work into that.
+ of megamorphic call sites, we don't do a very good job tracing. Because
+ a good example of this case is running PyPy's own test suite, I expect
+ we will invest some work into this.
* **Slow interpreter**. This one is very similar to the slow runtime - it's
a combination of using RPython and the fact that we did not spend much
time optimizing it. Unlike the runtime, we might solve it by having an
- unoptimizing JIT or some other medium-level solution that would work well
- enough. There were some efforts invested, but as usual we lack manpower to
+ unoptimizing JIT or some other medium-level solution that would work well
+ enough. There were some efforts invested, but, as usual, we lack enough
+ manpower to proceed as rapidly as we would like.
-Thanks for bearing with me that far. This blog post was partly influenced
-by accusations that we're doing disohnest PR that pypy is always fast. I don't
-think this is the case and I hope I clarified some of the week spots both here
-and on `performance page`_.
+Thanks for bearing with me this far. This blog post was partly influenced
+by accusations that we're doing dishonest PR that PyPy is always fast. I don't
+think this is the case and I hope I clarified some of the weak spots, both here
+and on the `performance page`_.