text / Data / Text / Encoding.hs

Author Commit Message Labels Comments Date
tibbe
Use unsafeDupablePerformIO where possible. unsafeDupablePerformIO is much faster than unsafePerformIO and can be used safely as long as the underlying operation is pure and we're fine risking duplicating it in a multi-core scenario. unsafeDupablePerformIO helps performance a lot on short strings, where the overhead of unsafePerformIO dominates. Closes #41.
Bryan O'Sullivan
Ensure that an encoding error handler's result is safe
Bryan O'Sullivan
Merge pull request #18 from hvr/pull-req-16: Add new `Data.Text.Encoding.decodeLatin1` ISO-8859-1 decoding function
Herbert Valerio Riedel
Add new `Data.Text.Encoding.decodeLatin1` ISO-8859-1 decoding function. This has about an order of magnitude lower runtime and/or call-overhead as compared to the more generic `text-icu` approach, e.g. according to criterion with GHC 7.4.1 on Linux/x86_64:
* 12 times faster for empty input strings,
* 6 times faster for 16-byte strings, and
* 3 times faster for 1024-byte strings.
`decodeLatin1` is also faster compared to using `decodeUtf8` for plain ASCII:
* 2 t…
deian
Top-level interfaces are safe, marked trustworthy
Bryan O'Sullivan
Switch decodeUtf8 to runText. This improves performance by 10% on small strings, and reduces the amount of memory allocated by 17%.
Bryan O'Sullivan
Many improvements, all small. Add a span_ function, using unboxed tuples, to Text.Private. Use span_ in a few places where it can help a little. Relax the constraint on rational to Fractional. Specialize over many more integral types.
Tags
0.11.1.12
Bryan O'Sullivan
Reduce pointer arithmetic for better speed.
Bryan O'Sullivan
Improve ASCII encoding performance in a safer way.
Bryan O'Sullivan
Merge the performance- and correctness-affecting commits away
Bryan O'Sullivan
Oops! Back out part of 59aad6977070 - it was wrong. My assertion that it was safe to skip the "do I have 1 byte available?" check was incorrect.
Bryan O'Sullivan
A valiant attempt at improving UTF-8 encoding performance. This didn't actually work - it slowed down aeson encoding by almost 2x!
Bryan O'Sullivan
Make encoding slightly faster. The improvement mainly comes from dropping a redundant check when decoding an ASCII byte.
Bryan O'Sullivan
Silence a compiler warning.
Bryan O'Sullivan
Mark the ASCII decoding functions as deprecated.
Bryan O'Sullivan
Portable native UTF-8 decoder gives 3.7x faster decoding. This code is derived from Björn Höhrmann's UTF-8 decoder. Compared to the original Haskell decoder from cac7dbcbc392, it's between 2.17 and 3.68 times faster. It's even between 1.18 and 3.58 times faster than the improved Haskell decoder from 71ead801296a. The x86-specific decoding path gives a substantial win for entirely and partly ASCII text, e.g. HTML and XML, at the cost of being about 17%…
Bryan O'Sullivan
Speed up UTF-8 decoding by a little over 2x. The previous code was more concise, but alas GHC boxed each Word8 it read from the ByteString, which resulted in poor performance. This mankier code adds (seemingly required) strictness annotations, along with a little bit of manual CSE. Timing of the DecodeUtf8/Strict benchmark went from 41.8ms to 19.6ms, a pleasing improvement.
Bryan O'Sullivan
Oh noes! I was miscalculating the initial buffer size! When performance testing encodeUtf8, I noticed that for some reason I was still seeing "ensure" show up in the profile, when I expected it shouldn't have been. Turns out I was using a "min" where I should have been using a "max", and thus allocating an initial bytestring that would almost always be too small, thus forcing reallocations and copying. Boo!
Bryan O'Sullivan
Eliminate unnecessary resizes from encodeUtf8. We had been performing a resize any time that (a) we had data to write and (b) we got to within 4 bytes of filling the target bytestring. This was safe, but suboptimal, as it meant that in the common case of encoding ASCII text, we would *always* perform a resize. Now, we check the exact number of bytes we need to fit, and resize only if they won't fit. This eliminates resizes for ASCII data, an…
Bryan O'Sullivan
Improve error message.
Bryan O'Sullivan
Add decodeUtf8'.
Bryan O'Sullivan
Many small documentation improvements.
Bryan O'Sullivan
Get rid of the old decode function
Bryan O'Sullivan
Add a rewrite rule for fusion
Bryan O'Sullivan
Write a faster UTF-8 decoder
Bryan O'Sullivan
Remove old UTF-8 encoding functions
Bryan O'Sullivan
Update copyright
Bryan O'Sullivan
Rewrite encodeUtf8 for speed. This was inspired by a patch from Simon Meier, who wrote a direct implementation of encodeUtf8 using his 'blaze-builder' package. His code showed a very impressive speedup. My code is similar in both structure and performance, its chief difference being that it doesn't require 'blaze-builder'.
Bryan O'Sullivan
Change Tom's email address
Bryan O'Sullivan
Add controllable error handling and recovery code.