Commits

Author Commit Message Labels Comments Date
Bryan O'Sullivan
Correct the documentation for streaming decoding
Bryan O'Sullivan
streamDecodeUtf8With: accumulate undecoded chunks correctly
We had previously gotten the accounting and reporting wrong if an incomplete input was fed in over the course of several continuations, such that we'd report only the incomplete input seen by the most recent continuation. This fixes gh-70.
Adam Vogt
documentation fix
Yuras Shumovich
Mark some internal modules as not-home for Haddock
It simplifies navigation through docs
Bryan O'Sullivan
encodeUtf8: squeeze out a little more fast path performance
If we're about to drop from the fast to the slow path, try to cut our losses by pushing out a few more bytes before we give up.
Bryan O'Sullivan
Added tag 1.1.0.0 for changeset 62674a9bbc83
Bryan O'Sullivan
Update changelog (tagged 1.1.0.0)
Bryan O'Sullivan
Tidy up imports
Bryan O'Sullivan
Drop a redundant import
Bryan O'Sullivan
Drop the old pure-Haskell implementation of encodeUtf8
Bryan O'Sullivan
Drop the Builder-based encodeUtf8 implementation
While it is very cool indeed, it is slower than the new C code under all circumstances, sometimes by a factor of two or more.
Bryan O'Sullivan
Let's see if this tweak helps the automated tests
Bryan O'Sullivan
encodeUtf8_2: drop parallel range checks
Once I noticed that I'd screwed up the range checking and fixed it, it became slow enough to not be worth it. All test cases are about 10% faster with this extra complexity removed, with the exception of pure Russian, which is about 50% slower.
Bryan O'Sullivan
encodeUtf8_2: fix parallel range check
This makes it rather expensive, alas.
Bryan O'Sullivan
Improve Arbitrary instances
Bryan O'Sullivan
I am enjoying these changelog edits
Bryan O'Sullivan
encodeUtf8_2: add fast paths for x86_64 and i386
This helps performance a lot in most cases: up to 2x faster, in fact. The exception seems to be Japanese, which is slowed down by about 10%.
Bryan O'Sullivan
Add a multibyte HTML document benchmark
Bryan O'Sullivan
Revise changelog perf note (yay!)
Bryan O'Sullivan
encodeUtf8_1: so long, it's been nice knowing you!
Since encodeUtf8_2 wins under all circumstances, there's no reason to keep the intermediate version around.
Bryan O'Sullivan
encodeUtf8_2: fix an off-by-one-bit error (!)
Bryan O'Sullivan
encodeUtf8_2: cap the number of wasted bytes at 2x
This has the odd side effect of improving tiny-string performance from 20% slower than encodeUtf8_1 to about 5% faster. Never stop being weird, GHC optimizer!
Bryan O'Sullivan
encodeUtf8_2: a C-based encoding function
Not surprisingly, this is a lot faster than encodeUtf8_1 and the Builder-based rewrite under almost all circumstances. It's slower on tiny inputs (20%), but roughly twice as fast as encodeUtf8_1 on longer inputs.
Simon Meier
Improve small string performance for UTF-8 encoding to bytestrings
On a 5 byte string the conversion of strict text to a strict bytestring is still a factor 2x slower than the custom 'encodeUtf8_1' routine. However, this is much better than the factor 4.5x that we started with. I attribute the slowdown to the more expensive startup cost for the bytestring-builder-based solution. Note that this startup cost is shared in case a small string is encoded as part of a…
Bryan O'Sullivan
Begin 1.1 release notes
Bryan O'Sullivan
encodeUtf8_1: get my arithmetic right :-(
Bryan O'Sullivan
Export both encodeUtf8 variants
Bryan O'Sullivan
Drop now-redundant imports
Bryan O'Sullivan
encodeUtf8_1: drop an unnecessary type signature
The value that was having too general a type inferred is now a pointer, so inference doesn't accidentally overgeneralize.
Bryan O'Sullivan
encodeUtf8_1: drop a loop induction variable
This helps performance quite a bit! Now encoding Japanese text is 2x faster than encodeUtf8, as opposed to 30% faster before. Not bad!