Commits

Author Commit Message Labels Comments Date
Bryan O'Sullivan
encodeUtf8_1: hoist ensure up a level
Bryan O'Sullivan
encodeUtf8_1: refactor go to accept a pointer parameter
Bryan O'Sullivan
encodeUtf8_1: hoist poke8 up a level
Bryan O'Sullivan
Duplicate encodeUtf8 as encodeUtf8_1 temporarily
Bryan O'Sullivan
Add new encoding benchmarks These require at least the following version of the text-test-data, dated January 6: git: 2183e3e5423fbf0d9d0187a4455df699c5e04b74 hg: 6c0e2b527bbbc6e18c622e452d16634b5d953b34
Bryan O'Sullivan
Merge pull request #63 from meiersi/polish-text-bytestring-builder-integration Polish UTF-8 bytestring builder support
Simon Meier
Add back 'ensure 1' to avoid overflowing an output buffer The counter-example for the existing code is a string of length '2*n' that starts with 'n' characters with codepoints in the range (0x7F, 0x7FF) and ends with 'n' ASCII characters. All 'n' ASCII characters will be written after the end of the output buffer.
Simon Meier
Polish UTF-8 bytestring builder support - adjust function names to 'encodeUtf8Builder' and 'encodeUtf8BuilderEscaped' - expose the same conversion to builders for both lazy and strict text - ensure 'Escaped' versions are inlined to allow specialization for specific escaping primitives - fix some Haddock references - add Haddock comment about bytestring >= 0.10.4.0 dependency - remove stream-to-builder encoding functions. There is no d…
Bryan O'Sullivan
Drop some special-casing for ASCII during UTF-8 encoding I somehow forgot that we allocate the initial ByteString to contain the same number of bytes as the Text contains code units. This means that we never need to ensure that the ByteString is big enough, nor (with this observation) does a special-cased ASCII-only loop help performance.
Bryan O'Sullivan
Merge the new bytestring builder code
Simon Meier
Merge branch 'master' into feature-new-bytestring-builder - newest benchmark results: 8.2 -> 7.2 ms for EncodeUtf8/Text benchmark 18.2 -> 10.0 ms for EncodeUtf8/TextLazy benchmark ==> 13% and 81% speed improvement :-) Conflicts: Data/Text/Encoding.hs text.cabal
Simon Meier
Merge branch 'master' of https://github.com/bos/text
Simon Meier
implement 'encodeUtf8Builder' using 'encodeUtf8Escaped'
Simon Meier
implemented 'Text -> Builder' UTF-8 encoders It uses a coupled end-of-input-and-output boundary and exploits the UTF-16 representation of the 'Text' value. According to preliminary benchmarks, it is 25% faster than the existing 'encodeUtf8 :: Text -> ByteString' function. We also support an 'encodeUtf8AsciiEscaped' encoder that allows to special case encoding of ASCII characters. This is a very useful function for implementing escaping enco…
Simon Meier
implement strict Text to Builder encoder using BoundedEncodings A first test of the infrastructure can be found in my 'aeson' branch.
Simon Meier
.gitignore 'dist' and 'cabal-dev'
Bryan O'Sullivan
Small generator fix for invalid UTF-8
Bryan O'Sullivan
Oops, missed a macro definition spot
Bryan O'Sullivan
Merge from 1.0 branch
Bryan O'Sullivan
Fix test suite execution when run under Jenkins
Bookmarks: 1.0
Bryan O'Sullivan
Fix test suite build with GHC 7.0.x
Bryan O'Sullivan
Merge from 1.0 branch again
Bryan O'Sullivan
Added tag 1.0.0.1 for changeset 7cba97c86467
Bryan O'Sullivan
Add two more kinds of invalid UTF-8 to generate As far as I know, this completes the set of possible invalid encodings.
Tags: 1.0.0.1
Bryan O'Sullivan
Merge from 1.0 branch
Bryan O'Sullivan
Amend release notes
Bryan O'Sullivan
Bump version to 1.0.0.1
Bryan O'Sullivan
Merge fix for gh-61 into 1.0 branch
Bryan O'Sullivan
Merge
Bryan O'Sullivan
Ensure that t_utf8_err gets fed *only* invalid UTF-8 inputs This test currently fails due to gh-61.