text-builder-linear-0.1.3/0000755000000000000000000000000007346545000013564 5ustar0000000000000000text-builder-linear-0.1.3/LICENSE0000644000000000000000000000277307346545000014602 0ustar0000000000000000Copyright Andrew Lelechenko (c) 2022 All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of Andrew Lelechenko nor the names of other contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. text-builder-linear-0.1.3/README.md0000644000000000000000000002540107346545000015045 0ustar0000000000000000# text-builder-linear [![Hackage](http://img.shields.io/hackage/v/text-builder-linear.svg)](https://hackage.haskell.org/package/text-builder-linear) [![Stackage LTS](http://stackage.org/package/text-builder-linear/badge/lts)](http://stackage.org/lts/package/text-builder-linear) [![Stackage Nightly](http://stackage.org/package/text-builder-linear/badge/nightly)](http://stackage.org/nightly/package/text-builder-linear) _Linear types for linear times!_ Builder for strict `Text` and `ByteString`, based on linear types. It consistently outperforms lazy `Builder` from `text` as well as a strict builder from `text-builder`, and scales better. ## Example ```haskell > :set -XOverloadedStrings > import Data.Text.Builder.Linear > fromText "foo" <> fromChar '_' <> fromDec (42 :: Int) "foo_42" ``` ## Design String builders in Haskell serve the same purpose as `StringBuilder` in Java to prevent quadratic slow down in concatenation. Classic builders such as `Data.Text.Lazy.Builder` are lazy and fundamentally are [`dlist`](https://hackage.haskell.org/package/dlist) with bells and whistles: instead of actually concatenating substrings we compose actions, which implement concatenation, building a tree of thunks. The tree can be forced partially, left-to-right, producing chunks of strict `Text`, combined into a lazy one. Neither input, nor output need to be materialized in full, which potentially allows for fusion. Such builders allow linear time complexity, but constant factors are relatively high, because thunks are expensive. To a certain degree this is mitigated by inlining, which massively reduces number of nodes. Strict builders such as [`text-builder`](https://hackage.haskell.org/package/text-builder) offer another design: they first inspect their input in full to determine output length, then allocate a buffer of required size and fill it in one go. If everything inlines nicely, the length may be known in compile time, which gives blazingly fast runtime. In more complex cases it still builds a tree of thunks and forces all inputs to be materialized. This package offers two interfaces. One is a mutable `Buffer` with linear API, which operates very similar to `StringBuilder` in Java. It allocates a buffer with extra space at the ends to append new strings. If there is not enough free space to insert new data, it allocates a twice larger buffer and copies itself there. The dispatch happens in runtime, so we do not need to inspect and materialize all inputs beforehand; and inlining is mostly irrelevant. Exponential growth provides for amortized linear time. Such structure can be implemented without linear types, but that would greatly affect user experience by polluting everything with `ST` monad. Users are encouraged to use `Buffer` API, and built-in benchmarks refer to it. The second interface is more traditional `newtype Builder = Builder (Buffer ⊸ Buffer)` with `Monoid` instance. This type provides easy migration from other builders, but may suffer from insufficient inlining, allocating a tree of thunks. It is still significantly faster than `Data.Text.Lazy.Builder`, as witnessed by benchmarks for `blaze-builder` below. ## Case study Let's benchmark builders, which concatenate all `Char` from `minBound` to `maxBound`, producing a large `Text`: ```haskell #!/usr/bin/env cabal {- cabal: build-depends: base, tasty-bench, text, text-builder, text-builder-linear ghc-options: -O2 -} import qualified Data.Text as T import qualified Data.Text.Lazy as TL import qualified Data.Text.Lazy.Builder as TLB import qualified Text.Builder as TB import qualified Data.Text.Builder.Linear as TBL import System.Environment (getArgs) import Test.Tasty.Bench mkBench :: Monoid a => String -> (Char -> a) -> (a -> Int) -> Benchmark mkBench s f g = bench s $ nf (g . foldMap f . enumFromTo minBound) maxBound {-# INLINE mkBench #-} main :: IO () main = defaultMain [ mkBench "text, lazy" TLB.singleton (fromIntegral . TL.length . TLB.toLazyText) , mkBench "text, strict" TLB.singleton (T.length . TL.toStrict . TLB.toLazyText) , mkBench "text-builder" TB.char (T.length . TB.run) , mkBench "text-builder-linear" TBL.fromChar (T.length . TBL.runBuilder) ] ``` Running this program with `cabal run Main.hs -- +RTS -T` yields following results: ``` text, lazy: 4.25 ms ± 107 μs, 11 MB allocated, 912 B copied text, strict: 7.18 ms ± 235 μs, 24 MB allocated, 10 MB copied text-builder: 80.1 ms ± 3.0 ms, 218 MB allocated, 107 MB copied text-builder-linear: 5.37 ms ± 146 μs, 44 MB allocated, 78 KB copied ``` The first result seems the best both in time and memory and corresponds to the usual `Text` builder, where we do not materialize the entire result at all. It builds chunks of lazy `Text` lazily and consumes them at once by `TL.length`. Thus there are 11 MB of allocations in nursery, none of which survive generation 0 garbage collector, so nothing is copied. The second result is again the usual `Text` builder, but emulates a strict consumer: we materialize a strict `Text` before computing length. Allocation are doubled, and half of them (corresponding to the strict `Text`) survive to the heap. Time is also almost twice longer, but still quite good. The third result is for `text-builder` and demonstrates how bad things could go with strict builders, aiming to precompute the precise length of the buffer: allocating a thunk per char is tremendously slow and expensive. The last result corresponds to the current package. We generate a strict `Text` by growing and reallocating the buffer, thus allocations are quite high. Nevertheless, it is already faster than the usual `Text` builder with strict consumer and does not strain the garbage collector. Things get very different if we remove `{-# INLINE mkBench #-}`: ``` text, lazy: 36.9 ms ± 599 μs, 275 MB allocated, 30 KB copied text, strict: 44.7 ms ± 1.3 ms, 287 MB allocated, 25 MB copied text-builder: 77.6 ms ± 2.2 ms, 218 MB allocated, 107 MB copied text-builder-linear: 5.35 ms ± 212 μs, 44 MB allocated, 79 KB copied ``` Builders from `text` package degrade rapidly, 6-8x slower and 10-20x more allocations. That's because their constant factors rely crucially on everything getting inlined, which makes their performance fragile and unreliable in large-scale applications. On the bright side of things, our builder remains as fast as before and now is a clear champion. ## Benchmarks for `Text` Measured with GHC 9.6 on aarch64: |Group / size|`text`|`text-builder`| |This package| | |------------|-----:|-------------:|-:|-----------:|-:| | **Text** |||||| |1|47.4 ns|24.2 ns|0.51x|35.2 ns|0.74x| |10|509 ns|195 ns|0.38x|197 ns|0.39x| |100|4.94 μs|1.74 μs|0.35x|1.66 μs|0.34x| |1000|52.6 μs|17.0 μs|0.32x|15.0 μs|0.28x| |10000|646 μs|206 μs|0.32x|155 μs|0.24x| |100000|12.2 ms|3.34 ms|0.27x|2.60 ms|0.21x| |1000000|159 ms|55.3 ms|0.35x|16.1 ms|0.10x| | **Char** |||||| |1|46.9 ns|21.1 ns|0.45x|22.3 ns|0.48x| |10|229 ns|152 ns|0.66x|79.9 ns|0.35x| |100|2.00 μs|1.23 μs|0.61x|618 ns|0.31x| |1000|21.9 μs|10.3 μs|0.47x|6.28 μs|0.29x| |10000|285 μs|153 μs|0.54x|68.5 μs|0.24x| |100000|7.70 ms|4.08 ms|0.53x|992 μs|0.13x| |1000000|110 ms|106 ms|0.96x|9.19 ms|0.08x| | **Decimal** |||||| |1|97.7 ns|872 ns|8.92x|80.2 ns|0.82x| |10|864 ns|8.72 μs|10.09x|684 ns|0.79x| |100|9.07 μs|93.5 μs|10.32x|7.25 μs|0.80x| |1000|92.4 μs|1.06 ms|11.44x|67.5 μs|0.73x| |10000|1.13 ms|13.4 ms|11.88x|667 μs|0.59x| |100000|18.7 ms|141 ms|7.57x|7.57 ms|0.41x| |1000000|229 ms|1.487 s|6.48x|67.8 ms|0.30x| | **Hexadecimal** |||||| |1|403 ns|749 ns|1.86x|43.9 ns|0.11x| |10|3.94 μs|7.66 μs|1.94x|308 ns|0.08x| |100|42.8 μs|89.0 μs|2.08x|2.88 μs|0.07x| |1000|486 μs|986 μs|2.03x|27.7 μs|0.06x| |10000|7.10 ms|12.6 ms|1.77x|283 μs|0.04x| |100000|80.1 ms|133 ms|1.65x|3.53 ms|0.04x| |1000000|867 ms|1.340 s|1.55x|28.9 ms|0.03x| | **Double** |||||| |1|7.56 μs|18.3 μs|2.42x|414 ns|0.05x| |10|76.5 μs|188 μs|2.46x|4.23 μs|0.06x| |100|754 μs|2.35 ms|3.11x|44.4 μs|0.06x| |1000|7.94 ms|25.8 ms|3.25x|436 μs|0.05x| |10000|79.1 ms|285 ms|3.60x|4.90 ms|0.06x| |100000|796 ms|2.938 s|3.69x|45.1 ms|0.06x| |1000000|8.003 s|32.411 s|4.05x|436 ms|0.05x| If you are not convinced by synthetic data, here are benchmarks for [`blaze-markup` after migration to `Data.Text.Builder.Linear`](https://github.com/Bodigrim/blaze-markup): ``` bigTable 992 μs ± 80 μs, 49% less than baseline basic 4.35 μs ± 376 ns, 47% less than baseline wideTree 1.26 ms ± 85 μs, 53% less than baseline wideTreeEscaping 217 μs ± 7.8 μs, 58% less than baseline deepTree 242 μs ± 23 μs, 48% less than baseline manyAttributes 811 μs ± 79 μs, 58% less than baseline customAttribute 1.68 ms ± 135 μs, 56% less than baseline ``` ## Benchmarks for `ByteString` Somewhat surprisingly, `text-builder-linear` now offers rendering to strict `ByteString` as well. It is consistently faster than `bytestring` when a string gets over 32k (which is `defaultChunkSize` for `bytestring` builder). For mid-sized strings `bytestring` is slightly faster in certain disciplines, mostly by virtue of using `cbits` via FFI, while this package remains 100% native Haskell. Benchmarks below were measured with GHC 9.6 on aarch64 and include comparison to [`bytestring-strict-builder`](https://hackage.haskell.org/package/bytestring-strict-builder): |Group / size|`bytestring`|`…-strict-builder`| |This package| | |------------|-----------:|-----------------:|-:|-----------:|-:| | **Text** |||||| |1|106 ns|33.5 ns|0.32x|35.2 ns|0.33x| |10|322 ns|217 ns|0.68x|197 ns|0.61x| |100|2.49 μs|1.89 μs|0.76x|1.66 μs|0.67x| |1000|21.8 μs|18.5 μs|0.85x|15.0 μs|0.69x| |10000|231 μs|212 μs|0.92x|155 μs|0.67x| |100000|3.97 ms|3.54 ms|0.89x|2.60 ms|0.66x| |1000000|81.2 ms|51.5 ms|0.63x|16.1 ms|0.20x| | **Char** |||||| |1|99.0 ns|19.4 ns|0.20x|22.3 ns|0.23x| |10|270 ns|82.9 ns|0.31x|79.9 ns|0.30x| |100|1.77 μs|723 ns|0.41x|618 ns|0.35x| |1000|20.4 μs|8.37 μs|0.41x|6.28 μs|0.31x| |10000|322 μs|129 μs|0.40x|68.5 μs|0.21x| |100000|10.4 ms|2.50 ms|0.24x|992 μs|0.10x| |1000000|143 ms|67.4 ms|0.47x|9.19 ms|0.06x| | **Decimal** |||||| |1|152 ns|174 ns|1.14x|80.2 ns|0.53x| |10|685 ns|1.55 μs|2.26x|684 ns|1.00x| |100|5.88 μs|17.2 μs|2.93x|7.25 μs|1.23x| |1000|60.3 μs|196 μs|3.25x|67.5 μs|1.12x| |10000|648 μs|4.25 ms|6.57x|667 μs|1.03x| |100000|11.2 ms|62.8 ms|5.62x|7.57 ms|0.68x| |1000000|150 ms|655 ms|4.37x|67.8 ms|0.45x| | **Hexadecimal** |||||| |1|94.7 ns|||43.9 ns|0.46x| |10|255 ns|||308 ns|1.21x| |100|1.72 μs|||2.88 μs|1.67x| |1000|18.9 μs|||27.7 μs|1.46x| |10000|250 μs|||283 μs|1.13x| |100000|6.94 ms|||3.53 ms|0.51x| |1000000|93.2 ms|||28.9 ms|0.31x| | **Double** |||||| |1|457 ns|||414 ns|0.91x| |10|3.94 μs|||4.23 μs|1.07x| |100|40.3 μs|||44.4 μs|1.10x| |1000|398 μs|||436 μs|1.10x| |10000|5.65 ms|||4.90 ms|0.87x| |100000|63.3 ms|||45.1 ms|0.71x| |1000000|673 ms|||436 ms|0.65x| text-builder-linear-0.1.3/bench/0000755000000000000000000000000007346545000014643 5ustar0000000000000000text-builder-linear-0.1.3/bench/BenchChar.hs0000644000000000000000000001760207346545000017022 0ustar0000000000000000-- | -- Copyright: (c) 2022 Andrew Lelechenko -- Licence: BSD3 -- Maintainer: Andrew Lelechenko module BenchChar (benchChar) where import qualified Data.ByteString as B import qualified Data.ByteString.Builder as B import Data.Char import qualified Data.Text as T import Data.Text.Builder.Linear.Buffer import qualified Data.Text.Lazy as TL import Data.Text.Lazy (toStrict) import qualified Data.Text.Lazy.Builder as TB import Data.Text.Lazy.Builder (toLazyText, singleton) import qualified Data.Text.Internal.Fusion.Common as Fusion import qualified Data.Text.Internal.Fusion as Fusion import Test.Tasty.Bench #ifdef MIN_VERSION_text_builder import qualified Text.Builder #endif #ifdef MIN_VERSION_bytestring_strict_builder import qualified ByteString.StrictBuilder #endif -------------------------------------------------------------------------------- -- Single char -------------------------------------------------------------------------------- benchLazyBuilder ∷ Int → T.Text benchLazyBuilder = toStrict . toLazyText . go mempty where go !acc 0 = acc go !acc n = let ch = chr n in go (singleton ch <> (acc <> singleton ch)) (n - 1) benchLazyBuilderBS ∷ Int → B.ByteString benchLazyBuilderBS = B.toStrict . B.toLazyByteString . go mempty where go !acc 0 = acc go !acc n = let ch = chr n in go (B.charUtf8 ch <> (acc <> B.charUtf8 ch)) (n - 1) #ifdef MIN_VERSION_text_builder benchStrictBuilder ∷ Int → T.Text benchStrictBuilder = Text.Builder.run . go mempty where go !acc 0 = acc go !acc n = let ch = chr n in go (Text.Builder.char ch <> (acc <> Text.Builder.char ch)) (n - 1) #endif #ifdef MIN_VERSION_bytestring_strict_builder benchStrictBuilderBS ∷ Int → B.ByteString benchStrictBuilderBS = ByteString.StrictBuilder.builderBytes . go mempty where go !acc 0 = acc go !acc n = let ch = chr n in go (ByteString.StrictBuilder.utf8Char ch <> (acc <> ByteString.StrictBuilder.utf8Char ch)) (n - 1) #endif benchLinearBuilder ∷ Int → T.Text benchLinearBuilder m = runBuffer (\b → go b m) where go ∷ Buffer ⊸ Int → Buffer go !acc 0 = acc go !acc n = let ch = chr n in go (ch .<| (acc |>. ch)) (n - 1) benchSingleChar ∷ Benchmark benchSingleChar = bgroup "Single" $ map mkGroupChar [1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6] mkGroupChar :: Int → Benchmark mkGroupChar n = bgroup (show n) [ bench "Data.Text.Lazy.Builder" $ nf benchLazyBuilder n , bench "Data.ByteString.Builder" $ nf benchLazyBuilderBS n #ifdef MIN_VERSION_text_builder , bench "Text.Builder" $ nf benchStrictBuilder n #endif #ifdef MIN_VERSION_bytestring_strict_builder , bench "ByteString.StrictBuilder" $ nf benchStrictBuilderBS n #endif , bench "Data.Text.Builder.Linear" $ nf benchLinearBuilder n ] -------------------------------------------------------------------------------- -- Multiple chars -------------------------------------------------------------------------------- charCount :: Word charCount = 3 benchCharsLazyBuilder ∷ Int → T.Text benchCharsLazyBuilder = TL.toStrict . TB.toLazyText . go mempty where go !acc 0 = acc go !acc n = let ch = chr n in go (replicateChar ch <> (acc <> replicateChar ch)) (n - 1) replicateChar ch = TB.fromText (Fusion.unstream (Fusion.replicateCharI charCount ch)) {- [FIXME] bad performance benchCharsLazyBuilderBS ∷ Int → B.ByteString benchCharsLazyBuilderBS = B.toStrict . B.toLazyByteString . go mempty where go !acc 0 = acc go !acc n = let ch = chr n in go (replicateChar ch <> (acc <> replicateChar ch)) (n - 1) replicateChar ch = stimes charCount (B.charUtf8 ch) -} #ifdef MIN_VERSION_text_builder benchCharsStrictBuilder ∷ Int → T.Text benchCharsStrictBuilder = Text.Builder.run . go mempty where go !acc 0 = acc go !acc n = let ch = chr n in go (replicateChar ch <> (acc <> replicateChar ch)) (n - 1) -- [TODO] Is there a better way? replicateChar ch = Text.Builder.padFromRight (fromIntegral charCount) ch mempty #endif {- [TODO] #ifdef MIN_VERSION_bytestring_strict_builder benchCharsStrictBuilderBS ∷ Int → B.ByteString benchCharsStrictBuilderBS = ByteString.StrictBuilder.builderBytes . go mempty where go !acc 0 = acc go !acc n = let ch = chr n in go _ (n - 1) #endif -} benchCharsLinearBuilder ∷ Int → T.Text benchCharsLinearBuilder m = runBuffer (\b → go b m) where go ∷ Buffer ⊸ Int → Buffer go !acc 0 = acc go !acc n = let ch = chr n in go (prependChars charCount ch (appendChars charCount ch acc)) (n - 1) benchMultipleChars ∷ Benchmark benchMultipleChars = bgroup "Multiple" $ map mkGroupChars [1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6] mkGroupChars :: Int → Benchmark mkGroupChars n = bgroup (show n) [ bench "Data.Text.Lazy.Builder" $ nf benchCharsLazyBuilder n -- , bench "Data.ByteString.Builder" $ nf benchCharsLazyBuilderBS n #ifdef MIN_VERSION_text_builder , bench "Text.Builder" $ nf benchCharsStrictBuilder n #endif -- #ifdef MIN_VERSION_bytestring_strict_builder -- , bench "ByteString.StrictBuilder" $ nf benchCharsStrictBuilderBS n -- #endif , bench "Data.Text.Builder.Linear" $ nf benchCharsLinearBuilder n ] -------------------------------------------------------------------------------- -- Padding -------------------------------------------------------------------------------- benchPaddingLazyBuilder ∷ Int → T.Text benchPaddingLazyBuilder = toStrict . toLazyText . go mempty 0 where go !acc !_ 0 = acc go !acc l n = let ch = chr n !l' = l + 2 * fromIntegral charCount in go (withText (T.justifyLeft l' ch) (withText (T.justifyRight (l + fromIntegral charCount) ch) acc)) l' (n - 1) withText f = TB.fromText . f . TL.toStrict . TB.toLazyText {- [TODO] benchPaddingLazyBuilderBS ∷ Int → B.ByteString benchPaddingLazyBuilderBS = B.toStrict . B.toLazyByteString . go mempty where go !acc 0 = acc go !acc n = let ch = chr n in go _ (n - 1) -} #ifdef MIN_VERSION_text_builder benchPaddingStrictBuilder ∷ Int → T.Text benchPaddingStrictBuilder = Text.Builder.run . go mempty 0 where go !acc !_ 0 = acc go !acc l n = let ch = chr n !l' = l + 2 * fromIntegral charCount in go (Text.Builder.padFromRight l' ch (Text.Builder.padFromLeft (l + fromIntegral charCount) ch acc)) l' (n - 1) #endif {- [TODO] #ifdef MIN_VERSION_bytestring_strict_builder benchPaddingStrictBuilderBS ∷ Int → B.ByteString benchPaddingStrictBuilderBS = ByteString.StrictBuilder.builderBytes . go mempty where go !acc 0 = acc go !acc n = let ch = chr n in go _ (n - 1) #endif -} benchPaddingLinearBuilder ∷ Int → T.Text benchPaddingLinearBuilder m = runBuffer (\b → go b 0 m) where go ∷ Buffer ⊸ Word → Int → Buffer go !acc !_ 0 = acc go !acc l n = let ch = chr n !l' = l + 2 * charCount in go (justifyLeft l' ch (justifyRight (l + charCount) ch acc)) l' (n - 1) benchPadding ∷ Benchmark benchPadding = bgroup "Padding" $ map mkGroupPadding [1e0, 1e1, 1e2, 1e3, 1e4{-, 1e5, 1e6-}] -- NOTE: too long with 1e5 mkGroupPadding :: Int → Benchmark mkGroupPadding n = bgroup (show n) [ bench "Data.Text.Lazy.Builder" $ nf benchPaddingLazyBuilder n -- , bench "Data.ByteString.Builder" $ nf benchPaddingLazyBuilderBS n #ifdef MIN_VERSION_text_builder , bench "Text.Builder" $ nf benchPaddingStrictBuilder n #endif -- #ifdef MIN_VERSION_bytestring_strict_builder -- , bench "ByteString.StrictBuilder" $ nf benchPaddingStrictBuilderBS n -- #endif , bench "Data.Text.Builder.Linear" $ nf benchPaddingLinearBuilder n ] -------------------------------------------------------------------------------- -- All benchmarks -------------------------------------------------------------------------------- benchChar ∷ Benchmark benchChar = bgroup "Char" [ benchSingleChar , benchMultipleChars , benchPadding ] text-builder-linear-0.1.3/bench/BenchDecimal.hs0000644000000000000000000001273107346545000017501 0ustar0000000000000000-- | -- Copyright: (c) 2022 Andrew Lelechenko -- Licence: BSD3 -- Maintainer: Andrew Lelechenko module BenchDecimal (benchDecimal) where import Data.ByteString qualified as B import Data.ByteString.Builder qualified as B import Data.Text qualified as T import Data.Text.Builder.Linear.Buffer (Buffer, runBuffer, ($$<|), ($<|), (|>$), (|>$$)) import Data.Text.Lazy (toStrict) import Data.Text.Lazy.Builder (toLazyText) import Data.Text.Lazy.Builder.Int (decimal) import Test.Tasty.Bench (Benchmark, bench, bgroup, nf) #ifdef MIN_VERSION_text_builder import qualified Text.Builder #endif #ifdef MIN_VERSION_bytestring_strict_builder import qualified ByteString.StrictBuilder #endif benchDecimal ∷ Benchmark benchDecimal = bgroup "Decimal" [benchBoundedDecimal, benchUnboundedDecimal] -------------------------------------------------------------------------------- -- Bounded -------------------------------------------------------------------------------- int ∷ Int int = 123456789123456789 benchLazyBuilder ∷ Integral a ⇒ a → Int → T.Text benchLazyBuilder k = toStrict . toLazyText . go mempty where go !acc 0 = acc go !acc n = let i = fromIntegral n * k in go (decimal i <> (acc <> decimal i)) (n - 1) {-# SPECIALIZE benchLazyBuilder ∷ Int → Int → T.Text #-} {-# SPECIALIZE benchLazyBuilder ∷ Integer → Int → T.Text #-} benchLazyBuilderBS ∷ Int → Int → B.ByteString benchLazyBuilderBS k = B.toStrict . B.toLazyByteString . go mempty where go !acc 0 = acc go !acc n = let i = n * k in go (B.intDec i <> (acc <> B.intDec i)) (n - 1) #ifdef MIN_VERSION_text_builder benchStrictBuilder ∷ (Integral a) ⇒ a → Int → T.Text benchStrictBuilder k = Text.Builder.run . go mempty where go !acc 0 = acc go !acc n = let i = fromIntegral n * k in go (Text.Builder.decimal i <> (acc <> Text.Builder.decimal i)) (n - 1) {-# SPECIALIZE benchStrictBuilder ∷ Int → Int → T.Text #-} {-# SPECIALIZE benchStrictBuilder ∷ Integer → Int → T.Text #-} #endif #ifdef MIN_VERSION_bytestring_strict_builder benchStrictBuilderBS ∷ (Integral a) ⇒ a → Int → B.ByteString benchStrictBuilderBS k = ByteString.StrictBuilder.builderBytes . go mempty where go !acc 0 = acc go !acc n = let i = fromIntegral n * k in go (ByteString.StrictBuilder.asciiIntegral i <> (acc <> ByteString.StrictBuilder.asciiIntegral i)) (n - 1) {-# SPECIALIZE benchStrictBuilderBS ∷ Int → Int → B.ByteString #-} {-# SPECIALIZE benchStrictBuilderBS ∷ Integer → Int → B.ByteString #-} #endif benchBoundedLinearBuilder ∷ Int → Int → T.Text benchBoundedLinearBuilder k m = runBuffer (\b → go b m) where go ∷ Buffer ⊸ Int → Buffer go !acc 0 = acc go !acc n = let i = n * k in go (i $<| (acc |>$ i)) (n - 1) benchBoundedDecimal ∷ Benchmark benchBoundedDecimal = bgroup "Bounded" $ map mkBoundedGroup [1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6] mkBoundedGroup ∷ Int → Benchmark mkBoundedGroup n = bgroup (show n) [ bench "Data.Text.Lazy.Builder" $ nf (benchLazyBuilder int) n , bench "Data.ByteString.Builder" $ nf (benchLazyBuilderBS int) n #ifdef MIN_VERSION_text_builder , bench "Text.Builder" $ nf (benchStrictBuilder int) n #endif #ifdef MIN_VERSION_bytestring_strict_builder , bench "ByteString.StrictBuilder" $ nf (benchStrictBuilderBS int) n #endif , bench "Data.Text.Builder.Linear" $ nf (benchBoundedLinearBuilder int) n ] -------------------------------------------------------------------------------- -- Unbounded -------------------------------------------------------------------------------- integerSmall ∷ Integer integerSmall = toInteger (div @Word maxBound 20) integerBig ∷ Integer integerBig = toInteger (maxBound @Word - 1) ^ (10 ∷ Word) integerHuge ∷ Integer integerHuge = toInteger (maxBound @Word - 1) ^ (100 ∷ Word) benchUnboundedDecimal ∷ Benchmark benchUnboundedDecimal = bgroup "Unbounded" [ bgroup "Small" $ map (mkUnboundedGroup integerSmall) [1e0, 1e1, 1e2, 1e3, 1e4, 1e5] , bgroup "Big" $ map (mkUnboundedGroup integerBig) [1e0, 1e1, 1e2, 1e3, 1e4] , bgroup "Huge" $ map (mkUnboundedGroup integerHuge) [1e0, 1e1, 1e2, 1e3] ] -- NOTE: In the following benchmarks, the ByteString builder would share work -- if the prepender and the appender are identical, while our linear buffer does -- not. So we increment the appender to get a fair benchmark. benchUnboundedLazyBuilderBS ∷ Integer → Int → B.ByteString benchUnboundedLazyBuilderBS k = B.toStrict . B.toLazyByteString . go mempty where go !acc 0 = acc go !acc n = let i = fromIntegral n * k in go (B.integerDec i <> (acc <> B.integerDec (i + 1))) (n - 1) benchUnboundedLinearBuilder ∷ Integer → Int → T.Text benchUnboundedLinearBuilder k m = runBuffer (\b → go b m) where go ∷ Buffer ⊸ Int → Buffer go !acc 0 = acc go !acc n = let i = fromIntegral n * k in go (i $$<| (acc |>$$ (i + 1))) (n - 1) mkUnboundedGroup ∷ Integer → Int → Benchmark mkUnboundedGroup integer n = bgroup (show n) [ bench "Data.Text.Lazy.Builder" $ nf (benchLazyBuilder integer) n , bench "Data.ByteString.Builder" $ nf (benchUnboundedLazyBuilderBS integer) n #ifdef MIN_VERSION_text_builder , bench "Text.Builder" $ nf (benchStrictBuilder integer) n #endif #ifdef MIN_VERSION_bytestring_strict_builder , bench "ByteString.StrictBuilder" $ nf (benchStrictBuilderBS integer) n #endif , bench "Data.Text.Builder.Linear" $ nf (benchUnboundedLinearBuilder integer) n ] text-builder-linear-0.1.3/bench/BenchDecimalUnbounded.hs0000644000000000000000000001610407346545000021343 0ustar0000000000000000{-# LANGUAGE NumDecimals #-} -- | -- Copyright: (c) 2022 Andrew Lelechenko -- Licence: BSD3 -- Maintainer: Andrew Lelechenko module BenchDecimalUnbounded (benchDecimalUnbounded) where import Data.ByteString qualified as B import Data.ByteString.Builder qualified as B import Data.Text qualified as T import Data.Text.Builder.Linear.Buffer (Buffer, runBuffer, ($$<|), (|>$$)) import Data.Text.Lazy qualified as TL import Data.Text.Lazy.Builder qualified as TB import Data.Text.Lazy.Builder.Int qualified as TB import Test.Tasty.Bench (Benchmark, bench, bgroup, nf) benchUnboundedLinearBuilderAppend ∷ Integer → Int → T.Text benchUnboundedLinearBuilderAppend k m = runBuffer (`go` m) where go ∷ Buffer ⊸ Int → Buffer go !acc 0 = acc go !acc n = let i = fromIntegral n * k in go (acc |>$$ i) (n - 1) benchUnboundedLinearBuilderPrepend ∷ Integer → Int → T.Text benchUnboundedLinearBuilderPrepend k m = runBuffer (`go` m) where go ∷ Buffer ⊸ Int → Buffer go !acc 0 = acc go !acc n = let i = fromIntegral n * k in go (i $$<| acc) (n - 1) -- NOTE: In the following benchmark, the ByteString builder would share work -- if the prepender and the appender are identical, while our linear buffer does -- not. So we increment the appender to get a fair benchmark. benchUnboundedLinearBuilder ∷ Integer → Int → T.Text benchUnboundedLinearBuilder k m = runBuffer (`go` m) where go ∷ Buffer ⊸ Int → Buffer go !acc 0 = acc go !acc n = let i = fromIntegral n * k in go (i $$<| (acc |>$$ (i + 1))) (n - 1) benchUnboundedLazyBuilderBSAppend ∷ Integer → Int → B.ByteString benchUnboundedLazyBuilderBSAppend k = B.toStrict . B.toLazyByteString . go mempty where go !acc 0 = acc go !acc n = let i = fromIntegral n * k in go (acc <> B.integerDec i) (n - 1) benchUnboundedLazyBuilderBSPrepend ∷ Integer → Int → B.ByteString benchUnboundedLazyBuilderBSPrepend k = B.toStrict . B.toLazyByteString . go mempty where go !acc 0 = acc go !acc n = let i = fromIntegral n * k in go (B.integerDec i <> acc) (n - 1) benchUnboundedLazyBuilderBS ∷ Integer → Int → B.ByteString benchUnboundedLazyBuilderBS k = B.toStrict . B.toLazyByteString . go mempty where go !acc 0 = acc go !acc n = let i = fromIntegral n * k in go (B.integerDec i <> (acc <> B.integerDec (i + 1))) (n - 1) benchLazyBuilderAppend ∷ Integer → Int → T.Text benchLazyBuilderAppend k = TL.toStrict . TB.toLazyText . go mempty where go !acc 0 = acc go !acc n = let i = fromIntegral n * k in go (acc <> TB.decimal i) (n - 1) benchLazyBuilderPrepend ∷ Integer → Int → T.Text benchLazyBuilderPrepend k = TL.toStrict . TB.toLazyText . go mempty where go !acc 0 = acc go !acc n = let i = fromIntegral n * k in go (TB.decimal i <> acc) (n - 1) benchLazyBuilder ∷ Integer → Int → T.Text benchLazyBuilder k = TL.toStrict . TB.toLazyText . go mempty where go !acc 0 = acc go !acc n = let i = fromIntegral n * k in go (TB.decimal i <> (acc <> TB.decimal (i + 1))) (n - 1) data NamedInteger = I !String !Integer mkGroup ∷ String → [Int] → (Integer → Int → T.Text) → (Integer → Int → B.ByteString) → (Integer → Int → T.Text) → [NamedInteger] → Benchmark mkGroup name counts f g h = bgroup name . map mkBenches where mkBenches (I benchName i) = bgroup benchName (map (\count → bgroup (show count) (mkBench i count)) counts) mkBench i count = [ bench "Data.Text.Lazy.Builder" $ nf (f i) count , bench "Data.ByteString.Builder" $ nf (g i) count , bench "Data.Text.Builder.Linear" $ nf (h i) count ] {-# INLINE mkGroup #-} integers ∷ [NamedInteger] integers = [ I "Small" (toInteger (div @Word maxBound 20)) -- ~ 9e17 , I "Big01" (toInteger (maxBound @Word - 1) ^ (2 ∷ Word)) -- ~3e38 , I "Big02" (toInteger (maxBound @Word - 1) ^ (5 ∷ Word)) -- ~2e96 , I "Big03" (toInteger (maxBound @Word - 1) ^ (10 ∷ Word)) -- ~5e192 , I "Big04" (toInteger (maxBound @Word - 1) ^ (15 ∷ Word)) -- ~1e289 , I "Big05" (toInteger (maxBound @Word - 1) ^ (20 ∷ Word)) -- ~2e385 -- , I "Big05a" (toInteger (maxBound @Word - 1) ^ (21 ∷ Word)) -- ~4e404 -- , I "Big05b" (toInteger (maxBound @Word - 1) ^ (22 ∷ Word)) -- ~7e423 -- , I "Big05c" (toInteger (maxBound @Word - 1) ^ (23 ∷ Word)) -- ~1e443 -- , I "Big05d" (toInteger (maxBound @Word - 1) ^ (24 ∷ Word)) -- ~2e462 , I "Big06" (toInteger (maxBound @Word - 1) ^ (25 ∷ Word)) -- ~4e481 -- , I "Big06a" (toInteger (maxBound @Word - 1) ^ (26 ∷ Word)) -- ~8e500 -- , I "Big06b" (toInteger (maxBound @Word - 1) ^ (27 ∷ Word)) -- ~2e520 -- , I "Big06c" (toInteger (maxBound @Word - 1) ^ (28 ∷ Word)) -- , I "Big06d" (toInteger (maxBound @Word - 1) ^ (29 ∷ Word)) , I "Big07" (toInteger (maxBound @Word - 1) ^ (30 ∷ Word)) -- ~ 9e577 , I "Big08" (toInteger (maxBound @Word - 1) ^ (35 ∷ Word)) -- ~ 2e674 , I "Big09" (toInteger (maxBound @Word - 1) ^ (40 ∷ Word)) -- ~ 4e770 , I "Big10" (toInteger (maxBound @Word - 1) ^ (45 ∷ Word)) -- ~ 9e866 , I "Big11" (toInteger (maxBound @Word - 1) ^ (50 ∷ Word)) -- ~ 2e963 , I "Huge01" (toInteger (maxBound @Word - 1) ^ (75 ∷ Word)) -- ~9e1444 , I "Huge02" (toInteger (maxBound @Word - 1) ^ (100 ∷ Word)) -- ~4e1926 , I "Huge03" (toInteger (maxBound @Word - 1) ^ (200 ∷ Word)) -- ~2e3853 , I "Huge04" (toInteger (maxBound @Word - 1) ^ (300 ∷ Word)) -- ~6e5779 , I "Huge05" (toInteger (maxBound @Word - 1) ^ (400 ∷ Word)) -- ~2e7706 -- , I "Huge05a" (toInteger (maxBound @Word - 1) ^ (450 ∷ Word)) , I "Huge06" (toInteger (maxBound @Word - 1) ^ (500 ∷ Word)) -- ~9e9632 -- , I "Huge06b" (toInteger (maxBound @Word - 1) ^ (600 ∷ Word)) , I "Huge07" (toInteger (maxBound @Word - 1) ^ (700 ∷ Word)) -- ~1e13486 , I "Huge08" (toInteger (maxBound @Word - 1) ^ (1000 ∷ Word)) -- ~8e19265 , I "Huge09" (toInteger (maxBound @Word - 1) ^ (3000 ∷ Word)) -- ~6e57797 , I "Huge10" (toInteger (maxBound @Word - 1) ^ (5000 ∷ Word)) -- ~4e96329 , I "Huge11" (toInteger (maxBound @Word - 1) ^ (10000 ∷ Word)) -- ~2e192659 , I "Huge12" (toInteger (maxBound @Word - 1) ^ (100000 ∷ Word)) -- ~9e1926591 -- , I "Huge13" (toInteger (maxBound @Word - 1) ^ (1000000 ∷ Word)) , I "1e20" 1e20 , I "1e100" 1e100 , I "1e300" (10 ^ (300 ∷ Word)) , I "1e500" (10 ^ (500 ∷ Word)) , I "1e1000" (10 ^ (1000 ∷ Word)) ] benchDecimalUnbounded ∷ Benchmark benchDecimalUnbounded = bgroup "Decimal: detailed unbounded" [ mkGroup "Append" counts benchLazyBuilderAppend benchUnboundedLazyBuilderBSAppend benchUnboundedLinearBuilderAppend integers , mkGroup "Prepend" counts benchLazyBuilderPrepend benchUnboundedLazyBuilderBSPrepend benchUnboundedLinearBuilderPrepend integers , mkGroup "Both" counts benchLazyBuilder benchUnboundedLazyBuilderBS benchUnboundedLinearBuilder integers ] where counts ∷ [Int] counts = [1e0, 1e1, 1e2] text-builder-linear-0.1.3/bench/BenchDouble.hs0000644000000000000000000000372307346545000017356 0ustar0000000000000000-- | -- Copyright: (c) 2022 Andrew Lelechenko -- Licence: BSD3 -- Maintainer: Andrew Lelechenko module BenchDouble (benchDouble) where import qualified Data.ByteString as B import qualified Data.ByteString.Builder as B import qualified Data.Text as T import Data.Text.Builder.Linear.Buffer import Data.Text.Lazy (toStrict) import Data.Text.Lazy.Builder (toLazyText) import Data.Text.Lazy.Builder.RealFloat (realFloat) import Test.Tasty.Bench #ifdef MIN_VERSION_text_builder import qualified Text.Builder #endif dbl :: Double dbl = - pi * 1e300 benchLazyBuilder ∷ Int → T.Text benchLazyBuilder = toStrict . toLazyText . go mempty where go !acc 0 = acc go !acc n = let d = fromIntegral n * dbl in go (realFloat d <> (acc <> realFloat d)) (n - 1) benchLazyBuilderBS ∷ Int → B.ByteString benchLazyBuilderBS = B.toStrict . B.toLazyByteString . go mempty where go !acc 0 = acc go !acc n = let d = fromIntegral n * dbl in go (B.doubleDec d <> (acc <> B.doubleDec d)) (n - 1) #ifdef MIN_VERSION_text_builder benchStrictBuilder ∷ Int → T.Text benchStrictBuilder = Text.Builder.run . go mempty where go !acc 0 = acc go !acc n = let d = fromIntegral n * dbl in go (Text.Builder.fixedDouble 17 d <> (acc <> Text.Builder.fixedDouble 17 d)) (n - 1) #endif benchLinearBuilder ∷ Int → T.Text benchLinearBuilder m = runBuffer (\b → go b m) where go ∷ Buffer ⊸ Int → Buffer go !acc 0 = acc go !acc n = let d = fromIntegral n * dbl in go (d %<| (acc |>% d)) (n - 1) benchDouble ∷ Benchmark benchDouble = bgroup "Double" $ map mkGroup [1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6] mkGroup :: Int → Benchmark mkGroup n = bgroup (show n) [ bench "Data.Text.Lazy.Builder" $ nf benchLazyBuilder n , bench "Data.ByteString.Builder" $ nf benchLazyBuilderBS n #ifdef MIN_VERSION_text_builder , bench "Text.Builder" $ nf benchStrictBuilder n #endif , bench "Data.Text.Builder.Linear" $ nf benchLinearBuilder n ] text-builder-linear-0.1.3/bench/BenchHexadecimal.hs0000644000000000000000000000440407346545000020345 0ustar0000000000000000-- | -- Copyright: (c) 2022 Andrew Lelechenko -- Licence: BSD3 -- Maintainer: Andrew Lelechenko module BenchHexadecimal (benchHexadecimal) where import qualified Data.ByteString as B import qualified Data.ByteString.Builder as B import qualified Data.Text as T import Data.Text.Builder.Linear.Buffer import Data.Text.Lazy (toStrict) import Data.Text.Lazy.Builder (toLazyText) import Data.Text.Lazy.Builder.Int (hexadecimal) import Test.Tasty.Bench #ifdef MIN_VERSION_text_builder import qualified Text.Builder #endif word :: Word word = 123456789123456789 benchLazyBuilder ∷ Word → T.Text benchLazyBuilder = toStrict . toLazyText . go mempty where go !acc 0 = acc go !acc n = let i = n * word in go (hexadecimal i <> (acc <> hexadecimal i)) (n - 1) benchLazyBuilderBS ∷ Word → B.ByteString benchLazyBuilderBS = B.toStrict . B.toLazyByteString . go mempty where go !acc 0 = acc go !acc n = go (B.wordHex n <> (acc <> B.wordHex n)) (n - 1) #ifdef MIN_VERSION_text_builder benchStrictBuilder ∷ Word → T.Text benchStrictBuilder = Text.Builder.run . go mempty where go !acc 0 = acc go !acc n = let i = n * word in go (Text.Builder.hexadecimal i <> (acc <> Text.Builder.hexadecimal i)) (n - 1) #endif benchLinearBuilderWord ∷ Word → T.Text benchLinearBuilderWord m = runBuffer (\b → go b m) where go ∷ Buffer ⊸ Word → Buffer go !acc 0 = acc go !acc n = let i = n * word in go (i &<| (acc |>& i)) (n - 1) benchLinearBuilderInt ∷ Word → T.Text benchLinearBuilderInt m = runBuffer (\b → go b (fromIntegral m)) where go ∷ Buffer ⊸ Int → Buffer go !acc 0 = acc go !acc n = let i = n * fromIntegral word in go (i &<| (acc |>& i)) (n - 1) benchHexadecimal ∷ Benchmark benchHexadecimal = bgroup "Hexadecimal" $ map mkGroup [1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6] mkGroup :: Word → Benchmark mkGroup n = bgroup (show n) [ bench "Data.Text.Lazy.Builder" $ nf benchLazyBuilder n , bench "Data.ByteString.Builder" $ nf benchLazyBuilderBS n #ifdef MIN_VERSION_text_builder , bench "Text.Builder" $ nf benchStrictBuilder n #endif , bench "Data.Text.Builder.Linear (Word)" $ nf benchLinearBuilderWord n , bench "Data.Text.Builder.Linear (Int)" $ nf benchLinearBuilderInt n ] text-builder-linear-0.1.3/bench/BenchText.hs0000644000000000000000000000460207346545000017065 0ustar0000000000000000-- | -- Copyright: (c) 2022 Andrew Lelechenko -- Licence: BSD3 -- Maintainer: Andrew Lelechenko module BenchText (benchText) where import qualified Data.ByteString as B import qualified Data.ByteString.Builder as B import qualified Data.Text as T import qualified Data.Text.Encoding as T import Data.Text.Builder.Linear.Buffer import Data.Text.Lazy (toStrict) import Data.Text.Lazy.Builder (toLazyText, fromText) import Test.Tasty.Bench #ifdef MIN_VERSION_text_builder import qualified Text.Builder #endif #ifdef MIN_VERSION_bytestring_strict_builder import qualified ByteString.StrictBuilder #endif txt ∷ T.Text txt = T.pack "Haskell + Linear Types = ♡" benchLazyBuilder ∷ Int → T.Text benchLazyBuilder = toStrict . toLazyText . go mempty where txtB = fromText txt go !acc 0 = acc go !acc n = go (txtB <> (acc <> txtB)) (n - 1) benchLazyBuilderBS ∷ Int → B.ByteString benchLazyBuilderBS = B.toStrict . B.toLazyByteString . go mempty where txtB = B.byteString $ T.encodeUtf8 txt go !acc 0 = acc go !acc n = go (txtB <> (acc <> txtB)) (n - 1) #ifdef MIN_VERSION_text_builder benchStrictBuilder ∷ Int → T.Text benchStrictBuilder = Text.Builder.run . go mempty where txtB = Text.Builder.text txt go !acc 0 = acc go !acc n = go (txtB <> (acc <> txtB)) (n - 1) #endif #ifdef MIN_VERSION_bytestring_strict_builder benchStrictBuilderBS ∷ Int → B.ByteString benchStrictBuilderBS = ByteString.StrictBuilder.builderBytes . go mempty where txtB = ByteString.StrictBuilder.bytes $ T.encodeUtf8 txt go !acc 0 = acc go !acc n = go (txtB <> (acc <> txtB)) (n - 1) #endif benchLinearBuilder ∷ Int → T.Text benchLinearBuilder m = runBuffer (\b → go b m) where go ∷ Buffer ⊸ Int → Buffer go !acc 0 = acc go !acc n = go (txt <| (acc |> txt)) (n - 1) benchText ∷ Benchmark benchText = bgroup "Text" $ map mkGroup [1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6] mkGroup :: Int → Benchmark mkGroup n = bgroup (show n) [ bench "Data.Text.Lazy.Builder" $ nf benchLazyBuilder n , bench "Data.ByteString.Builder" $ nf benchLazyBuilderBS n #ifdef MIN_VERSION_text_builder , bench "Text.Builder" $ nf benchStrictBuilder n #endif #ifdef MIN_VERSION_bytestring_strict_builder , bench "ByteString.StrictBuilder" $ nf benchStrictBuilderBS n #endif , bench "Data.Text.Builder.Linear" $ nf benchLinearBuilder n ] text-builder-linear-0.1.3/bench/Main.hs0000644000000000000000000000156507346545000016072 0ustar0000000000000000-- | -- Copyright: (c) 2022 Andrew Lelechenko -- Licence: BSD3 -- Maintainer: Andrew Lelechenko module Main where import Test.Tasty.Bench import Test.Tasty.Patterns.Printer import BenchChar import BenchDecimal import BenchDecimalUnbounded (benchDecimalUnbounded) import BenchDouble import BenchHexadecimal import BenchText main ∷ IO () main = defaultMain $ map (mapLeafBenchmarks addCompare) $ [ benchText , benchChar , benchDecimal , benchDecimalUnbounded , benchHexadecimal , benchDouble ] textBenchName ∷ String -- textBenchName = "Data.Text.Lazy.Builder" textBenchName = "Data.ByteString.Builder" addCompare ∷ ([String] → Benchmark → Benchmark) addCompare (name : path) | name /= textBenchName = bcompare (printAwkExpr (locateBenchmark (textBenchName : path))) addCompare _ = id text-builder-linear-0.1.3/changelog.md0000644000000000000000000000144007346545000016034 0ustar0000000000000000## 0.1.3 * Add decimal builders for unbounded inputs: `fromUnboundedDec`, `(|>$$)` and `($$<|)`. ## 0.1.2 * Fix unsound behaviour caused by inlining of `runBuffer` / `runBufferBS` and CSE (common subexpression elimination). * Fix hexadecimal builder, looping on negative inputs. * Fix decimal builder for non-standard bitness of the input. * Add `(#<|)` and deprecate `(|>#)`. * Add `newEmptyBuffer`. * Add `prependChars` and `appendChars`. * Add `justifyLeft`, `justifyRight` and `center`. ## 0.1.1.1 * Support `text-2.1`. ## 0.1.1 * Introduce `ByteString` backend (thanks @oberblastmeister for the idea). * Fix decimal builder for 30- and 31-bit wide types. * Speed up decimal builder on aarch64. * Speed up hexadecimal builder. * Support 32-bit architectures. ## 0.1 * Initial release. text-builder-linear-0.1.3/src/Data/Text/Builder/0000755000000000000000000000000007346545000017536 5ustar0000000000000000text-builder-linear-0.1.3/src/Data/Text/Builder/Linear.hs0000644000000000000000000001074607346545000021314 0ustar0000000000000000-- | -- Copyright: (c) 2022 Andrew Lelechenko -- Licence: BSD3 -- Maintainer: Andrew Lelechenko -- -- Builder for strict 'Text' and 'ByteString', based on linear types. It consistently -- outperforms "Data.Text.Lazy.Builder" -- from @text@ as well as a strict builder from @text-builder@, -- and scales better. module Data.Text.Builder.Linear ( Builder (..), runBuilder, runBuilderBS, fromText, fromChar, fromAddr, fromDec, fromUnboundedDec, fromHex, fromDouble, ) where import Data.Bits (FiniteBits) import Data.ByteString.Internal (ByteString (..)) import Data.Text.Internal (Text (..)) import GHC.Exts (Addr#, IsString (..)) import Data.Text.Builder.Linear.Buffer -- | Thin wrapper over 'Buffer' with a handy 'Semigroup' instance. -- -- >>> :set -XOverloadedStrings -XMagicHash -- >>> fromText "foo" <> fromChar '_' <> fromAddr "bar"# -- "foo_bar" -- -- Remember: this is a strict builder, so on contrary to "Data.Text.Lazy.Builder" -- for optimal performance you should use strict left folds instead of lazy right ones. -- -- Note that (similar to other builders) concatenation of 'Builder's allocates -- thunks. This is to a certain extent mitigated by aggressive inlining, -- but it is faster to use 'Buffer' directly. newtype Builder = Builder {unBuilder ∷ Buffer ⊸ Buffer} -- | Run 'Builder' computation on an empty 'Buffer', returning strict 'Text'. -- -- >>> :set -XOverloadedStrings -XMagicHash -- >>> runBuilder (fromText "foo" <> fromChar '_' <> fromAddr "bar"#) -- "foo_bar" -- -- This function has a polymorphic arrow and thus can be used both in -- usual and linear contexts. runBuilder ∷ ∀ m. Builder %m → Text runBuilder (Builder f) = runBuffer f {-# INLINE runBuilder #-} -- | Same as 'runBuilder', but returning a UTF-8 encoded strict 'ByteString'. runBuilderBS ∷ ∀ m. Builder %m → ByteString runBuilderBS (Builder f) = runBufferBS f {-# INLINE runBuilderBS #-} instance Show Builder where show (Builder f) = show (runBuffer f) instance Semigroup Builder where Builder f <> Builder g = Builder $ \b → g (f b) {-# INLINE (<>) #-} instance Monoid Builder where mempty = Builder (\b → b) {-# INLINE mempty #-} -- | Use 'fromString' to create 'Builder' from 'String'. instance IsString Builder where fromString = fromText . fromString {-# INLINE fromString #-} -- | Create 'Builder', containing a given 'Text'. -- -- >>> :set -XOverloadedStrings -- >>> fromText "foo" <> fromText "bar" -- "foobar" -- -- For literal strings it is faster to use 'fromAddr' instead of 'fromText'. fromText ∷ Text → Builder fromText x = Builder $ \b → b |> x {-# INLINE fromText #-} -- | Create 'Builder', containing a given 'Char'. -- -- >>> fromChar 'x' <> fromChar 'y' -- "xy" -- -- In contrast to 'Data.Text.Lazy.Builder.singleton', it's a responsibility -- of the caller to sanitize surrogate code points with 'Data.Text.Internal.safe'. fromChar ∷ Char → Builder fromChar x = Builder $ \b → b |>. x {-# INLINE fromChar #-} -- | Create 'Builder', containing a null-terminated UTF-8 string, specified by 'Addr#'. -- -- >>> :set -XMagicHash -- >>> fromAddr "foo"# <> fromAddr "bar"# -- "foobar" -- -- The literal string must not contain zero bytes @\\NUL@ and must be a valid UTF-8, -- these conditions are not checked. fromAddr ∷ Addr# → Builder fromAddr x = Builder $ \b → b |># x {-# INLINE fromAddr #-} -- | Create 'Builder', containing decimal representation of a given /bounded/ integer. -- -- >>> fromChar 'x' <> fromDec (123 :: Int) -- "x123" fromDec ∷ (Integral a, FiniteBits a) ⇒ a → Builder fromDec x = Builder $ \b → b |>$ x {-# INLINE fromDec #-} -- | Create 'Builder', containing decimal representation of a given /unbounded/ integer. -- -- >>> fromChar 'x' <> fromUnboundedDec (1e24 :: Integer) -- "x1000000000000000000000000" -- -- @since 0.1.3 fromUnboundedDec ∷ Integral a ⇒ a → Builder fromUnboundedDec x = Builder $ \b → b |>$$ x {-# INLINE fromUnboundedDec #-} -- | Create 'Builder', containing hexadecimal representation of a given integer. -- -- >>> :set -XMagicHash -- >>> fromAddr "0x"# <> fromHex (0x123def :: Int) -- "0x123def" fromHex ∷ (Integral a, FiniteBits a) ⇒ a → Builder fromHex x = Builder $ \b → b |>& x {-# INLINE fromHex #-} -- | Create 'Builder', containing decimal representation of a given 'Double'. -- -- >>> :set -XMagicHash -- >>> fromAddr "pi="# <> fromDouble pi -- "pi=3.141592653589793" fromDouble ∷ Double → Builder fromDouble x = Builder $ \b → b |>% x {-# INLINE fromDouble #-} text-builder-linear-0.1.3/src/Data/Text/Builder/Linear/0000755000000000000000000000000007346545000020750 5ustar0000000000000000text-builder-linear-0.1.3/src/Data/Text/Builder/Linear/Array.hs0000644000000000000000000000445407346545000022371 0ustar0000000000000000{-# LANGUAGE CPP #-} -- | -- Copyright: (c) 2022 Andrew Lelechenko -- (c) 2023 Pierre Le Marre -- Licence: BSD3 -- Maintainer: Andrew Lelechenko -- -- Low-level routines for 'A.MArray' manipulations. module Data.Text.Builder.Linear.Array ( unsafeThaw, sizeofByteArray, isPinned, unsafeTile, unsafeReplicate, ) where import Data.Text.Array qualified as A import GHC.Exts (Int (..), isByteArrayPinned#, isTrue#, setByteArray#, sizeofByteArray#) import GHC.ST (ST (..)) #if __GLASGOW_HASKELL__ >= 909 import GHC.Exts (unsafeThawByteArray#) #else import GHC.Exts (unsafeCoerce#) #endif unsafeThaw ∷ A.Array → ST s (A.MArray s) #if __GLASGOW_HASKELL__ >= 909 unsafeThaw (A.ByteArray a) = ST $ \s# → case unsafeThawByteArray# a s# of (# s'#, ma #) -> (# s'#, A.MutableByteArray ma #) #else unsafeThaw (A.ByteArray a) = ST $ \s# → (# s#, A.MutableByteArray (unsafeCoerce# a) #) #endif sizeofByteArray ∷ A.Array → Int sizeofByteArray (A.ByteArray a) = I# (sizeofByteArray# a) isPinned ∷ A.Array → Bool isPinned (A.ByteArray a) = isTrue# (isByteArrayPinned# a) -- | Replicate an ASCII character -- -- __Warning:__ it is the responsibility of the caller to ensure that the 'Int' -- is a valid ASCII character. unsafeReplicate ∷ A.MArray s -- ^ Mutable array → Int -- ^ Offset → Int -- ^ Count → Int -- ^ ASCII character → ST s () unsafeReplicate (A.MutableByteArray dst#) (I# dstOff#) (I# count#) (I# w#) = ST (\s# → (# setByteArray# dst# dstOff# count# w# s#, () #)) {-# INLINE unsafeReplicate #-} -- | Duplicate a portion of an array in-place. -- -- Example of use: -- -- @ -- -- Write @count@ times the char @c@ -- let cLen = utf8Length c; totalLen = cLen * count -- in unsafeWrite dst dstOff ch *> 'unsafeTile' dst dstOff totalLen cLen -- @ unsafeTile ∷ A.MArray s -- ^ Mutable array → Int -- ^ Start of the portion to duplicate → Int -- ^ Total length of the duplicate → Int -- ^ Length of the portion to duplicate → ST s () unsafeTile dest destOff totalLen = go where -- Adapted from Data.Text.Array.tile go l | 2 * l > totalLen = A.copyM dest (destOff + l) dest destOff (totalLen - l) | otherwise = A.copyM dest (destOff + l) dest destOff l *> go (2 * l) {-# INLINE unsafeTile #-} text-builder-linear-0.1.3/src/Data/Text/Builder/Linear/Buffer.hs0000644000000000000000000001261207346545000022517 0ustar0000000000000000-- | -- Copyright: (c) 2022 Andrew Lelechenko -- (c) 2023 Pierre Le Marre -- Licence: BSD3 -- Maintainer: Andrew Lelechenko -- -- 'Buffer' for strict 'Text', based on linear types. module Data.Text.Builder.Linear.Buffer ( -- * Type Buffer, -- * Basic interface runBuffer, runBufferBS, dupBuffer, consumeBuffer, eraseBuffer, foldlIntoBuffer, newEmptyBuffer, (><), -- * Single character (|>.), (.<|), -- * Multiple characters -- ** Character replication prependChars, appendChars, -- ** Text (|>), (<|), (|>…), (…<|), -- ** Raw 'Addr#' (|>#), ( #<| ), -- NOTE: extra spaces required because of -XUnboxedTuples (<|#), -- * Padding justifyLeft, justifyRight, center, -- * Number formatting -- ** Decimal -- *** Bounded numbers (|>$), ($<|), -- *** Unbounded numbers (|>$$), ($$<|), -- ** Hexadecimal -- *** Lower-case (|>&), (&<|), -- *** Upper-case and padding -- $custom_hexadecimal -- ** Double (|>%), (%<|), ) where import Data.Text.Array qualified as A import Data.Text.Internal (Text (..)) import GHC.Exts (Addr#, Int (..), Ptr (..), cstringLength#, setByteArray#) import GHC.ST (ST (..)) import Data.Text.Builder.Linear.Char import Data.Text.Builder.Linear.Core import Data.Text.Builder.Linear.Dec.Bounded import Data.Text.Builder.Linear.Dec.Unbounded import Data.Text.Builder.Linear.Double import Data.Text.Builder.Linear.Hex -- | Append 'Text' suffix to a 'Buffer' by mutating it. -- If a suffix is statically known, consider using '(|>#)' for optimal performance. -- -- >>> :set -XOverloadedStrings -XLinearTypes -- >>> runBuffer (\b -> b |> "foo" |> "bar") -- "foobar" (|>) ∷ Buffer ⊸ Text → Buffer infixl 6 |> buffer |> (Text src srcOff srcLen) = appendExact srcLen (\dst dstOff → A.copyI srcLen dst dstOff src srcOff) buffer -- | Prepend 'Text' prefix to a 'Buffer' by mutating it. -- If a prefix is statically known, consider using '(#<|)' for optimal performance. -- -- >>> :set -XOverloadedStrings -XLinearTypes -- >>> runBuffer (\b -> "foo" <| "bar" <| b) -- "foobar" (<|) ∷ Text → Buffer ⊸ Buffer infixr 6 <| Text src srcOff srcLen <| buffer = prependExact srcLen (\dst dstOff → A.copyI srcLen dst dstOff src srcOff) buffer -- | Append a null-terminated UTF-8 string -- to a 'Buffer' by mutating it. E. g., -- -- >>> :set -XOverloadedStrings -XLinearTypes -XMagicHash -- >>> runBuffer (\b -> b |># "foo"# |># "bar"#) -- "foobar" -- -- The literal string must not contain zero bytes @\\NUL@ and must be a valid UTF-8, -- these conditions are not checked. (|>#) ∷ Buffer ⊸ Addr# → Buffer infixl 6 |># buffer |># addr# = appendExact srcLen (\dst dstOff → A.copyFromPointer dst dstOff (Ptr addr#) srcLen) buffer where srcLen = I# (cstringLength# addr#) -- | Prepend a null-terminated UTF-8 string -- to a 'Buffer' by mutating it. E. g., -- -- >>> :set -XOverloadedStrings -XLinearTypes -XMagicHash -- >>> runBuffer (\b -> "foo"# #<| "bar"# #<| b) -- "foobar" -- -- The literal string must not contain zero bytes @\\NUL@ and must be a valid UTF-8, -- these conditions are not checked. -- -- /Note:/ When the syntactic extensions @UnboxedTuples@ or @UnboxedSums@ are -- enabled, extra spaces are required when using parentheses: i.e. use @( '#<|' )@ -- instead of @('#<|')@. See the GHC User Guide chapter -- “” -- for further information. ( #<| ) ∷ Addr# → Buffer ⊸ Buffer infixr 6 #<|, <|# addr# #<| buffer = prependExact srcLen (\dst dstOff → A.copyFromPointer dst dstOff (Ptr addr#) srcLen) buffer where srcLen = I# (cstringLength# addr#) -- | Alias for @'(#<|)'@. {-# DEPRECATED (<|#) "Use '(#<|)' instead" #-} (<|#) ∷ Addr# → Buffer ⊸ Buffer (<|#) = ( #<| ) -- NOTE: extra spaces required because of -XUnboxedTuples {-# INLINE (<|#) #-} -- | Append given number of spaces. (|>…) ∷ Buffer ⊸ Word → Buffer infixr 6 |>… buf |>… 0 = buf buffer |>… (fromIntegral → spaces@(I# spaces#)) = appendExact spaces ( \(A.MutableByteArray dst#) (I# dstOff#) → ST ( \s# → (# setByteArray# dst# dstOff# spaces# 32# s#, () #) ) ) buffer -- | Prepend given number of spaces. (…<|) ∷ Word → Buffer ⊸ Buffer infixr 6 …<| 0 …<| buf = buf (fromIntegral → spaces@(I# spaces#)) …<| buffer = prependExact spaces ( \(A.MutableByteArray dst#) (I# dstOff#) → ST ( \s# → (# setByteArray# dst# dstOff# spaces# 32# s#, () #) ) ) buffer -- | This is just a normal 'Data.List.foldl'', but with a linear arrow -- and unlifted accumulator. foldlIntoBuffer ∷ ∀ a. (Buffer ⊸ a → Buffer) → Buffer ⊸ [a] → Buffer foldlIntoBuffer f = go where go ∷ Buffer ⊸ [a] → Buffer go !acc [] = acc go !acc (x : xs) = go (f acc x) xs -- $custom_hexadecimal -- -- Note that neither /upper/ case nor padded hexadecimal formatting is provided. -- This package provides a minimal API with utility functions only for common cases. -- For other use cases, please adapt the code of this package, e.g. as shown in -- the [Unicode code point example](https://github.com/Bodigrim/linear-builder/blob/master/examples/src/Examples/Unicode.hs). text-builder-linear-0.1.3/src/Data/Text/Builder/Linear/Char.hs0000644000000000000000000001563607346545000022174 0ustar0000000000000000-- | -- Copyright: (c) 2022 Andrew Lelechenko -- Licence: BSD3 -- Maintainer: Andrew Lelechenko module Data.Text.Builder.Linear.Char ( -- * Single character (|>.), (.<|), -- * Multiple characters prependChars, appendChars, -- * Padding justifyLeft, justifyRight, center, ) where import Data.Char (isAscii) import Data.Text.Array qualified as A import Data.Text.Internal.Encoding.Utf8 (ord2, ord3, ord4, utf8Length) import Data.Text.Internal.Unsafe.Char (ord, unsafeWrite) import GHC.ST (ST) import Unsafe.Coerce (unsafeCoerce) import Data.Text.Builder.Linear.Array (unsafeReplicate, unsafeTile) import Data.Text.Builder.Linear.Core -------------------------------------------------------------------------------- -- Single char -------------------------------------------------------------------------------- -- | Append 'Char' to a 'Buffer' by mutating it. -- -- >>> :set -XLinearTypes -- >>> runBuffer (\b -> b |>. 'q' |>. 'w') -- "qw" -- -- __Warning:__ In contrast to 'Data.Text.Lazy.Builder.singleton', it is the -- responsibility of the caller to sanitize surrogate code points with -- 'Data.Text.Internal.safe'. (|>.) ∷ Buffer ⊸ Char → Buffer infixl 6 |>. buffer |>. ch = appendBounded 4 (\dst dstOff → unsafeWrite dst dstOff ch) buffer -- | Prepend 'Char' to a 'Buffer' by mutating it. -- -- >>> :set -XLinearTypes -- >>> runBuffer (\b -> 'q' .<| 'w' .<| b) -- "qw" -- -- __Warning:__ In contrast to 'Data.Text.Lazy.Builder.singleton', it is the -- responsibility of the caller to sanitize surrogate code points with -- 'Data.Text.Internal.safe'. (.<|) ∷ Char → Buffer ⊸ Buffer infixr 6 .<| ch .<| buffer = prependBounded 4 (\dst dstOff → unsafePrependCharM dst dstOff ch) (\dst dstOff → unsafeWrite dst dstOff ch) buffer -- | Similar to 'Data.Text.Internal.Unsafe.Char.unsafeWrite', -- but writes _before_ a given offset. unsafePrependCharM ∷ A.MArray s → Int → Char → ST s Int unsafePrependCharM marr off c = case utf8Length c of 1 → do let n0 = fromIntegral (ord c) A.unsafeWrite marr (off - 1) n0 pure 1 2 → do let (n0, n1) = ord2 c A.unsafeWrite marr (off - 2) n0 A.unsafeWrite marr (off - 1) n1 pure 2 3 → do let (n0, n1, n2) = ord3 c A.unsafeWrite marr (off - 3) n0 A.unsafeWrite marr (off - 2) n1 A.unsafeWrite marr (off - 1) n2 pure 3 _ → do let (n0, n1, n2, n3) = ord4 c A.unsafeWrite marr (off - 4) n0 A.unsafeWrite marr (off - 3) n1 A.unsafeWrite marr (off - 2) n2 A.unsafeWrite marr (off - 1) n3 pure 4 -------------------------------------------------------------------------------- -- Multiple chars -------------------------------------------------------------------------------- -- | Prepend a given count of a 'Char' to a 'Buffer'. -- -- >>> :set -XLinearTypes -- >>> runBuffer (\b -> prependChars 3 'x' (b |>. 'A')) -- "xxxA" prependChars ∷ Word → Char → Buffer ⊸ Buffer prependChars count ch buff | count == 0 = buff | otherwise = case utf8Length ch of cLen → case cLen * fromIntegral count of totalLen → prependExact totalLen ( if isAscii ch then \dst dstOff → unsafeReplicate dst dstOff (fromIntegral count) (ord ch) else \dst dstOff → unsafeWrite dst dstOff ch *> unsafeTile dst dstOff totalLen cLen ) buff -- | Apppend a given count of a 'Char' to a 'Buffer'. -- -- >>> :set -XLinearTypes -- >>> runBuffer (\b -> appendChars 3 'x' (b |>. 'A')) -- "Axxx" appendChars ∷ Word → Char → Buffer ⊸ Buffer appendChars count ch buff | count == 0 = buff | otherwise = case utf8Length ch of cLen → case cLen * fromIntegral count of totalLen → appendExact totalLen ( if isAscii ch then \dst dstOff → unsafeReplicate dst dstOff (fromIntegral count) (ord ch) else \dst dstOff → unsafeWrite dst dstOff ch *> unsafeTile dst dstOff totalLen cLen ) buff -------------------------------------------------------------------------------- -- Padding -------------------------------------------------------------------------------- -- | Pad a builder from the /left/ side to the specified length with the specified -- character. -- -- >>> :set -XLinearTypes -- >>> runBuffer (\b -> justifyRight 10 'x' (appendChars 3 'A' b)) -- "xxxxxxxAAA" -- >>> runBuffer (\b -> justifyRight 5 'x' (appendChars 6 'A' b)) -- "AAAAAA" -- -- Note that 'newEmptyBuffer' is needed in some situations. The following example creates -- a utility function that justify a text and then append it to a buffer. -- -- >>> :set -XOverloadedStrings -XLinearTypes -XUnboxedTuples -- >>> import Data.Text.Builder.Linear.Buffer -- >>> import Data.Text (Text) -- >>> :{ -- appendJustified :: Buffer %1 -> Text -> Buffer -- appendJustified b t = case newEmptyBuffer b of -- -- Note that we need to create a new buffer from the text, in order -- -- to justify only the text and not the input buffer. -- (# b', empty #) -> b' >< justifyRight 12 ' ' (empty |> t) -- :} -- -- >>> runBuffer (\b -> (b |> "Test:") `appendJustified` "AAA" `appendJustified` "BBBBBBB") -- "Test: AAA BBBBBBB" justifyRight ∷ Word → Char → Buffer ⊸ Buffer justifyRight n ch buff = case lengthOfBuffer buff of (# buff', len #) → toLinearWord (\l b → if n <= l then b else prependChars (n - l) ch b) len buff' -- | Pad a builder from the /right/ side to the specified length with the specified -- character. -- -- >>> :set -XLinearTypes -- >>> runBuffer (\b -> justifyLeft 10 'x' (appendChars 3 'A' b)) -- "AAAxxxxxxx" -- >>> runBuffer (\b -> justifyLeft 5 'x' (appendChars 6 'A' b)) -- "AAAAAA" -- -- Note that 'newEmptyBuffer' is needed in some situations. See 'justifyRight' -- for an example. justifyLeft ∷ Word → Char → Buffer ⊸ Buffer justifyLeft n ch buff = case lengthOfBuffer buff of (# buff', len #) → toLinearWord (\l b → if n <= l then b else appendChars (n - l) ch b) len buff' -- | Center a builder to the specified length with the specified character. -- -- >>> :set -XLinearTypes -- >>> runBuffer (\b -> center 10 'x' (appendChars 3 'A' b)) -- "xxxxAAAxxx" -- >>> runBuffer (\b -> center 5 'x' (appendChars 6 'A' b)) -- "AAAAAA" -- -- Note that 'newEmptyBuffer' is needed in some situations. See 'justifyRight' -- for an example. center ∷ Word → Char → Buffer ⊸ Buffer center n ch buff = case lengthOfBuffer buff of (# buff', len #) → toLinearWord ( \l b → if n <= l then b else case n - l of !d → case d `quot` 2 of !r → appendChars r ch (prependChars (d - r) ch b) ) len buff' -- Despite the use of unsafeCoerce, this is safe. toLinearWord ∷ (Word → a) → (Word ⊸ a) toLinearWord = unsafeCoerce text-builder-linear-0.1.3/src/Data/Text/Builder/Linear/Core.hs0000644000000000000000000000115407346545000022175 0ustar0000000000000000-- | -- Copyright: (c) 2022 Andrew Lelechenko -- (c) 2023 Pierre Le Marre -- Licence: BSD3 -- Maintainer: Andrew Lelechenko -- -- Low-level routines for 'Buffer' manipulations. module Data.Text.Builder.Linear.Core ( -- * Type Buffer, -- * Basic interface runBuffer, runBufferBS, dupBuffer, consumeBuffer, eraseBuffer, byteSizeOfBuffer, lengthOfBuffer, dropBuffer, takeBuffer, newEmptyBuffer, -- * Text concatenation appendBounded, appendExact, prependBounded, prependExact, (><), ) where import Data.Text.Builder.Linear.Internal text-builder-linear-0.1.3/src/Data/Text/Builder/Linear/Dec/0000755000000000000000000000000007346545000021443 5ustar0000000000000000text-builder-linear-0.1.3/src/Data/Text/Builder/Linear/Dec/Bounded.hs0000644000000000000000000001414107346545000023360 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE TemplateHaskell #-} -- | -- Copyright: (c) 2022 Andrew Lelechenko -- Licence: BSD3 -- Maintainer: Andrew Lelechenko #ifdef aarch64_HOST_ARCH {-# OPTIONS_GHC -Wno-unused-imports -Wno-unused-top-binds #-} #endif module Data.Text.Builder.Linear.Dec.Bounded ( (|>$), ($<|), unsafePrependDec, unsafeAppendDec, maxDecLen, quotRem100, digits, ) where #include "MachDeps.h" import Data.Bits (Bits (..), FiniteBits (..)) import Data.Int (Int16, Int32, Int8) import Data.Text.Array qualified as A import Data.Word (Word16, Word32, Word8) import Foreign.C.String (CString) import GHC.Exts (Int (..), Ptr (..), dataToTag#, (>=#)) import GHC.Ptr (plusPtr) import GHC.ST (ST (..)) import Numeric.QuoteQuot (assumeNonNegArg, astQuot, quoteAST, quoteQuot) import Data.Text.Builder.Linear.Core -- | Append the decimal representation of a /bounded/ integral number. (|>$) ∷ (Integral a, FiniteBits a) ⇒ Buffer ⊸ a → Buffer infixl 6 |>$ buffer |>$ n = appendBounded (maxDecLen n) (\dst dstOff → unsafeAppendDec dst dstOff n) buffer {-# INLINEABLE (|>$) #-} -- | Prepend the decimal representation of a /bounded/ integral number. ($<|) ∷ (Integral a, FiniteBits a) ⇒ a → Buffer ⊸ Buffer infixr 6 $<| n $<| buffer = prependBounded (maxDecLen n) (\dst dstOff → unsafePrependDec dst dstOff n) (\dst dstOff → unsafeAppendDec dst dstOff n) buffer {-# INLINEABLE ($<|) #-} -- | ceiling (fbs a * logBase 10 2) < ceiling (fbs a * 5 / 16) < 1 + floor (fbs a * 5 / 16) maxDecLen ∷ FiniteBits a ⇒ a → Int maxDecLen a | isSigned a = 2 + (finiteBitSize a * 5) `shiftR` 4 | otherwise = 1 + (finiteBitSize a * 5) `shiftR` 4 {-# INLINEABLE maxDecLen #-} exactDecLen ∷ (Integral a, FiniteBits a) ⇒ a → Int exactDecLen n | n < 0 = go 2 (complement n + fromIntegral (I# (dataToTag# (n > bit (finiteBitSize n - 1))))) | otherwise = go 1 n where go ∷ (Integral a, FiniteBits a) ⇒ Int → a → Int go acc k | finiteBitSize k >= if isSigned k then 31 else 30, k >= 1e9 = go (acc + 9) (quotBillion k) | otherwise = acc + exactIntDecLen (fromIntegral k) exactIntDecLen ∷ Int → Int exactIntDecLen l@(I# l#) | l >= 1e5 = 5 + I# (l# >=# 100_000_000#) + I# (l# >=# 10_000_000#) + I# (l# >=# 1_000_000#) | otherwise = I# (l# >=# 10_000#) + I# (l# >=# 1_000#) + I# (l# >=# 100#) + I# (l# >=# 10#) {-# INLINEABLE exactDecLen #-} unsafeAppendDec ∷ (Integral a, FiniteBits a) ⇒ A.MArray s → Int → a → ST s Int unsafeAppendDec marr off n = unsafePrependDec marr (off + exactDecLen n) n {-# INLINEABLE unsafeAppendDec #-} unsafePrependDec ∷ ∀ s a. (Integral a, FiniteBits a) ⇒ A.MArray s → Int → a → ST s Int unsafePrependDec marr !off n | n < 0 , n == bit (finiteBitSize n - 1) = do A.unsafeWrite marr (off - 1) (fromIntegral (0x30 + minBoundLastDigit n)) go (off - 2) (abs (bit (finiteBitSize n - 1) `quot` 10)) >>= sign | n == 0 = do A.unsafeWrite marr (off - 1) 0x30 >> pure 1 | otherwise = go (off - 1) (abs n) >>= sign where sign !o | n > 0 = pure (off - o) | otherwise = do A.unsafeWrite marr (o - 1) 0x2d -- '-' pure (off - o + 1) go ∷ Int → a → ST s Int go o k | k >= 10 = do let (q, r) = quotRem100 k A.copyFromPointer marr (o - 1) (digits `plusPtr` (fromIntegral r `shiftL` 1)) 2 if k < 100 then pure (o - 1) else go (o - 2) q | otherwise = do A.unsafeWrite marr o (fromIntegral (0x30 + k)) pure o {-# INLINEABLE unsafePrependDec #-} digits ∷ CString digits = Ptr "00010203040506070809101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899"# {-# NOINLINE digits #-} -- Compute rem minBound 10 efficiently. Given that: -- • minBound = 1 `shiftL` (finiteBitSize a - 1) = -2^(finiteBitSize a - 1) -- • the last digit of 2^k forms a cycle for k≥1: 2,4,8,6 -- Then it is enough to pattern-match rem (finiteBitSize a) 4, -- i.e. finiteBitSize a .&. 3 minBoundLastDigit ∷ FiniteBits a ⇒ a → Int minBoundLastDigit a = case finiteBitSize a .&. 3 of 0 → 8 1 → 6 2 → 2 _ → 4 {-# INLINEABLE minBoundLastDigit #-} quotRem100 ∷ (Integral a, FiniteBits a) ⇒ a → (a, a) -- https://gitlab.haskell.org/ghc/ghc/-/issues/22933 #ifdef aarch64_HOST_ARCH quotRem100 a = a `quotRem` 100 #else quotRem100 a = let q = quot100 a in (q, a - 100 * q) #endif {-# INLINEABLE quotRem100 #-} quot100 ∷ (Integral a, FiniteBits a) ⇒ a → a quot100 a = case (finiteBitSize a, isSigned a) of (64, True) | finiteBitSize (0 ∷ Int) == 64 → cast $$(quoteAST $ assumeNonNegArg $ astQuot (100 ∷ Int)) (64, False) | finiteBitSize (0 ∷ Word) == 64 → cast $$(quoteQuot (100 ∷ Word)) (32, True) → cast $$(quoteAST $ assumeNonNegArg $ astQuot (100 ∷ Int32)) (32, False) → cast $$(quoteQuot (100 ∷ Word32)) (16, True) → cast $$(quoteAST $ assumeNonNegArg $ astQuot (100 ∷ Int16)) (16, False) → cast $$(quoteQuot (100 ∷ Word16)) (8, True) → cast $$(quoteAST $ assumeNonNegArg $ astQuot (100 ∷ Int8)) (8, False) → cast $$(quoteQuot (100 ∷ Word8)) _ → a `quot` 100 where cast ∷ (Integral a, Integral b) ⇒ (b → b) → a cast f = fromIntegral (f (fromIntegral a)) {-# INLINEABLE quot100 #-} quotBillion ∷ (Integral a, FiniteBits a) ⇒ a → a #ifdef aarch64_HOST_ARCH quotBillion a = a `quot` 1e9 #else quotBillion a = case (finiteBitSize a, isSigned a) of (64, True) | finiteBitSize (0 :: Int) == 64 → cast $$(quoteAST $ assumeNonNegArg $ astQuot (1e9 :: Int)) (64, False) | finiteBitSize (0 :: Word) == 64 → cast $$(quoteQuot (1e9 :: Word)) (32, True) → cast $$(quoteAST $ assumeNonNegArg $ astQuot (1e9 :: Int32)) (32, False) → cast $$(quoteQuot (1e9 :: Word32)) _ → a `quot` 1e9 where cast :: (Integral a, Integral b) => (b → b) → a cast f = fromIntegral (f (fromIntegral a)) #endif {-# INLINEABLE quotBillion #-} text-builder-linear-0.1.3/src/Data/Text/Builder/Linear/Dec/Unbounded.hs0000644000000000000000000002652407346545000023733 0ustar0000000000000000-- | -- Copyright: (c) 2024 Pierre Le Marre -- Licence: BSD3 -- Maintainer: Andrew Lelechenko module Data.Text.Builder.Linear.Dec.Unbounded ( (|>$$), ($$<|), -- prependUnboundedDecimal, -- Strategy (..), ) where import Data.Bits (Bits (..), FiniteBits (..)) import Data.Text.Array qualified as A import Data.Word (Word64) import GHC.Exts ( Int (..), Int#, State#, Word (..), Word#, word2Int#, (-#), ) import GHC.Num.BigNat qualified as BN import GHC.Num.Integer qualified as I import GHC.Num.Natural qualified as N import GHC.Ptr (plusPtr) import GHC.ST (ST (..)) import Data.Text.Builder.Linear.Array (unsafeReplicate) import Data.Text.Builder.Linear.Core (Buffer) import Data.Text.Builder.Linear.Dec.Bounded (digits, maxDecLen, quotRem100) import Data.Text.Builder.Linear.Dec.Bounded qualified as Bounded import Data.Text.Builder.Linear.Internal (appendBounded', prependBounded') -------------------------------------------------------------------------------- -- Append -------------------------------------------------------------------------------- -- | Append the decimal representation of an /unbounded/ integral number. -- -- @since 0.1.3 (|>$$) ∷ Integral a ⇒ Buffer ⊸ a → Buffer infixl 6 |>$$ buffer |>$$ n = case toInteger n of !n' → appendBounded' (maxIntegerDecLen n') (unsafeAppendDec n') buffer {-# INLINEABLE (|>$$) #-} -- • For small 'Integers', `unsafeAppendDec` -- • For 'BigNat's, use a buffer with `unsafePrependUnboundedDec`, then copy it. -- -- For *bounded* integers we used the exact size of the decimal representation to -- compute the offset from which we can use the prepend action to actually append. -- -- But the exact size of an (unbounded) 'Integer' could be expensive to compute. -- So it is faster to use a buffer and then copy it. unsafeAppendDec ∷ ∀ s x . Integer → ((A.MArray s → Int → ST s Int) → ST s x) → ((A.MArray s → Int → ST s Int) → ST s x) → ST s x unsafeAppendDec n = case n of I.IS i# → \append _ → append (\marr off → Bounded.unsafeAppendDec marr off (I# i#)) _ → \_ prepend → prepend (\marr off → unsafePrependDec marr off n) {-# INLINEABLE unsafeAppendDec #-} -------------------------------------------------------------------------------- -- Prepend -------------------------------------------------------------------------------- -- | Prepend the decimal representation of an /unbounded/ integral number. -- -- @since 0.1.3 ($$<|) ∷ Integral a ⇒ a → Buffer ⊸ Buffer infixr 6 $$<| n $$<| buffer = case toInteger n of !n' → prependBounded' (maxIntegerDecLen n') (\dst dstOff → unsafePrependDec dst dstOff n') buffer {-# INLINEABLE ($$<|) #-} unsafePrependDec ∷ ∀ s. A.MArray s → Int → Integer → ST s Int unsafePrependDec marr off@(I# off#) n = case n of I.IS i# → Bounded.unsafePrependDec marr off (I# i#) _ → unsafePrependBigNatDec marr (off# -# 1#) (integerToBigNat# n) >>= prependSign where prependSign !off' = if n < 0 then do A.unsafeWrite marr (off' - 1) 0x2d -- '-' pure (off - off' + 1) else pure (off - off') {-# INLINEABLE unsafePrependDec #-} type DigitsWriter s = Int# → BN.BigNat# → ST s Int -- Use the fastest writer depending on the BigNat size unsafePrependBigNatDec ∷ ∀ s. A.MArray s → DigitsWriter s unsafePrependBigNatDec marr !off0 !n0 | BN.bigNatSize n0 < hugeSizeThreshold = prependSmallNat marr off0 n0 | otherwise = prependHugeNat marr off0 n0 where hugeSizeThreshold ∷ Word hugeSizeThreshold = 80 -- Writer for “small” 'BigNat's. -- -- Divide repeatedly by poweredBase. prependSmallNat ∷ ∀ s. A.MArray s → DigitsWriter s prependSmallNat marr = go where !(# power, poweredBase, _poweredBase² #) = selectPower (# #) go ∷ DigitsWriter s go !o1 !n = case n `BN.bigNatQuotRemWord#` poweredBase of (# q, r #) → do !o2 ← unsafePrependWordDec marr (I# o1) (W# r) if BN.bigNatIsZero q then pure o2 else do let !o3 = o1 -# (word2Int# power -# 1#) padWithZeros marr (I# o3) (o2 - I# o3) go (o3 -# 1#) q -- Use the raw state in order to avoid boxed Int in `scaleWriter` type DigitsWriter# s = Int# → BN.BigNat# → State# s → (# State# s, Int# #) -- Writer for “huge” 'BigNat's. -- -- Algorithm used in bytestring-0.12.1 (simplified): -- -- 1. Find k0 = min k such that pow10 ^ (2 ^ (k + 1)) > n0 -- 2. Set k to k0 and n to n0 -- 3. Set (q, r) = n `quotRem` (pow10 ^ (2 ^ k)) -- 4. if k = 0, then write decimal representation of q and r -- else repeat recursively 3 and 4 with n = {q,r} and k = k - 1 prependHugeNat ∷ ∀ s. A.MArray s → DigitsWriter s prependHugeNat marr off n = ST $ \s1 → case go prependTiny# poweredBase² off n s1 of (# s2, off'# #) → (# s2, I# off'# #) where !(# power, poweredBase, poweredBase² #) = selectPower (# #) go ∷ (Bool → DigitsWriter# s) → BN.BigNat# → DigitsWriter# s go !write !pow10 !o !n# = if BN.bigNatLt n# pow10 then write True o n# else go (scaleWriter write pow10) (BN.bigNatMul pow10 pow10) o n# scaleWriter ∷ (Bool → DigitsWriter# s) → BN.BigNat# → Bool → DigitsWriter# s scaleWriter !write !pow10 = \ !high !o1 !n# s1 → case BN.bigNatQuotRem# n# pow10 of (# q, r #) | high && BN.bigNatIsZero q → write high o1 r s1 | otherwise → case write False o1 r s1 of (# s2, o2 #) → write high (o2 -# 1#) q s2 prependTiny# ∷ Bool → DigitsWriter# s prependTiny# !high !o1 !n# = case prependTiny high o1 n# of ST f → \s1 → case f s1 of (# s2, I# o2 #) → (# s2, o2 #) -- Use ST instead of raw state as the utils functions do. -- `prependTiny` must inline to leave no boxing/unboxing roundtrip. {-# INLINE prependTiny #-} prependTiny ∷ Bool → DigitsWriter s prependTiny !high !o1 !n# = case BN.bigNatQuotRemWord# n# poweredBase of (# q, r #) → do !o2 ← unsafePrependWordDec marr (I# o1) (W# r) if high && BN.bigNatIsZero q then pure o2 else do let !o3 = I# o1 - (fromIntegral (W# power) - 1) padWithZeros marr o3 (o2 - o3) !o4 ← unsafePrependWordDec marr (o3 - 1) (BN.bigNatToWord q) if high then pure o4 else do let !o5 = o3 - fromIntegral (W# power) padWithZeros marr o5 (o4 - o5) pure o5 -------------------------------------------------------------------------------- -- Prepend word -------------------------------------------------------------------------------- unsafePrependWordDec ∷ ∀ s. A.MArray s → Int → Word → ST s Int unsafePrependWordDec = f where f marr !o !k | k >= 10 = do let (q, r) = quotRem100 k A.copyFromPointer marr (o - 1) (digits `plusPtr` (fromIntegral r `shiftL` 1)) 2 if k < 100 then pure (o - 1) else f marr (o - 2) q | otherwise = do A.unsafeWrite marr o (fromIntegral (0x30 + k)) pure o -------------------------------------------------------------------------------- -- Utils -------------------------------------------------------------------------------- maxIntegerDecLen ∷ Integer → Int maxIntegerDecLen a = case a of I.IS i# → maxDecLen (I# i#) I.IP n# → maxBitNatDecLen n# I.IN n# → 1 + maxBitNatDecLen n# {-# INLINEABLE maxIntegerDecLen #-} -- | ceiling (fbs a * logBase 10 2) < ceiling (fbs a * 5 / 16) < 1 + floor (fbs a * 5 / 16) -- -- We approximate @fbs a@ to @bigNatSize a * word_size@. maxBitNatDecLen ∷ BN.BigNat# → Int maxBitNatDecLen n# -- This can overflow in theory, but in practice it would overflow for a BigNat# -- of at least: -- -- • On 32 bits platform: 6.4 GiB, out of max 4 GiB RAM -- → BN.bigNatSize n# = 214748364 = -- (maxBound @Int32 - 1) `div` fromIntegral (shiftR (finiteBitSize @Word32 0 * 5) 4) -- • On 64 bits platform: 3276 PiB -- → BN.bigNatSize n# = 461168601842738790 = -- (maxBound @Int64 - 1) `div` fromIntegral (shiftR (finiteBitSize @Word64 0 * 5) 4) -- -- These thresholds are too big to be realistic (32 bits: more than available RAM, 64 -- bits: integer size in petabytes), so it is perfectly reasonable to have no -- special handling of overflow here. -- Word bit size is multiple of 16 (e.g. 32 and 64 bits arch) | rem (finiteBitSize @Word 0) 16 == 0 = 1 + fromIntegral (BN.bigNatSize n# * shiftR (fromIntegral (finiteBitSize @Word 0) * 5) 4) -- Other cases (non-standard arch) | otherwise = 1 + fromIntegral @Word64 ( (fromIntegral (BN.bigNatSize n#) * fromIntegral (finiteBitSize @Word 0) * 5) `shiftR` 4 ) {-# INLINEABLE maxBitNatDecLen #-} integerToBigNat# ∷ Integer → BN.BigNat# integerToBigNat# n = case I.integerToBigNatSign# n of (# _, n# #) → n# {-# INLINE integerToBigNat# #-} -- Maximal power of 10 fitting into a 'Word': -- • 10 ^ 9 for 32 bit words (32 * log 2 / log 10 ≈ 9.63) -- • 10 ^ 19 for 64 bit words (64 * log 2 / log 10 ≈ 19.27) -- -- Why (# #)? We can't have top-level unlifted bindings -- (see: https://gitlab.haskell.org/ghc/ghc/-/issues/17521). So we use a function -- that take an empty argument (# #) that will be discarded at compile time. selectPower ∷ (# #) → (# Word#, Word#, BN.BigNat# #) selectPower _ = case finiteBitSize @Word 0 of 64 → (# 19##, 10000000000000000000##, N.naturalToBigNat# tenPower38 #) -- Not 64 bits: assume 32 bits _ → (# 9##, 1000000000##, N.naturalToBigNat# tenPower18 #) -- NOTE: ensure to not inline the following numbers, in order to avoid allocations. tenPower18 ∷ N.Natural tenPower18 = 1e18 {-# NOINLINE tenPower18 #-} tenPower38 ∷ N.Natural tenPower38 = 1e38 {-# NOINLINE tenPower38 #-} padWithZeros ∷ ∀ s. A.MArray s → Int → Int → ST s () padWithZeros marr off count = unsafeReplicate marr off count 0x30 {-# INLINE padWithZeros #-} -------------------------------------------------------------------------------- -- For testing purpose only -------------------------------------------------------------------------------- -- data Strategy = SmallOnly | HugeOnly -- prependUnboundedDecimal ∷ Integral a ⇒ Strategy → a → Buffer ⊸ Buffer -- prependUnboundedDecimal strategy n buffer = case toInteger n of -- !n' → -- prependBounded' -- (maxIntegerDecLen n') -- (\dst dstOff → unsafePrependDec' strategy dst dstOff n') -- buffer -- unsafePrependDec' ∷ ∀ s. Strategy → A.MArray s → Int → Integer → ST s Int -- unsafePrependDec' s marr off@(I# off#) n' = case n' of -- I.IS i# → Bounded.unsafePrependDec marr off (I# i#) -- _ → unsafePrependBigNatDec' s marr (off# -# 1#) (integerToBigNat# n') >>= prependSign -- where -- prependSign !off' = -- if n' < 0 -- then do -- A.unsafeWrite marr (off' - 1) 0x2d -- '-' -- pure (off - off' + 1) -- else pure (off - off') -- {-# INLINEABLE unsafePrependDec' #-} -- unsafePrependBigNatDec' ∷ ∀ s. Strategy → A.MArray s → DigitsWriter s -- unsafePrependBigNatDec' strategy marr !off0 !n0 = case strategy of -- SmallOnly → prependSmallNat marr off0 n0 -- HugeOnly → prependHugeNat marr off0 n0 text-builder-linear-0.1.3/src/Data/Text/Builder/Linear/Double.hs0000644000000000000000000000526707346545000022530 0ustar0000000000000000-- | -- Copyright: (c) 2022 Andrew Lelechenko -- Licence: BSD3 -- Maintainer: Andrew Lelechenko module Data.Text.Builder.Linear.Double ( (|>%), (%<|), ) where import Data.ByteString.Builder qualified as BB import Data.ByteString.Builder.Internal qualified as BBI import Data.Text.Array qualified as A import Data.Word (Word8) import GHC.Exts (Ptr (..)) import GHC.ForeignPtr (ForeignPtr, touchForeignPtr, unsafeForeignPtrToPtr, unsafeWithForeignPtr) import GHC.IO (unsafeDupablePerformIO, unsafeIOToST, unsafeSTToIO) import GHC.Ptr (minusPtr) import GHC.ST (ST) import Data.Text.Builder.Linear.Core -- | Append the decimal representation of a 'Double'. -- -- Matches 'show' in displaying in standard or scientific notation: -- -- >>> runBuffer (\b -> b |>% 123.456) -- "123.456" -- -- >>> runBuffer (\b -> b |>% 1.23e7) -- "1.23e7" (|>%) ∷ Buffer ⊸ Double → Buffer infixl 6 |>% buffer |>% x = appendBounded maxDblLen (\dst dstOff → unsafeAppendDouble dst dstOff x) buffer -- | Prepend the decimal representation of a 'Double'. -- -- Matches 'show' in displaying in standard or scientific notation -- (see examples in @'(|>%)'@). (%<|) ∷ Double → Buffer ⊸ Buffer infixr 6 %<| x %<| buffer = prependBounded maxDblLen (\dst dstOff → unsafePrependDouble dst dstOff x) (\dst dstOff → unsafeAppendDouble dst dstOff x) buffer unsafeAppendDouble ∷ A.MArray s → Int → Double → ST s Int unsafeAppendDouble dst !dstOff !x = do let (fp, !srcLen) = runDoubleBuilder x unsafeIOToST $ unsafeWithForeignPtr fp $ \(Ptr addr#) → unsafeSTToIO $ A.copyFromPointer dst dstOff (Ptr addr#) srcLen pure srcLen unsafePrependDouble ∷ A.MArray s → Int → Double → ST s Int unsafePrependDouble dst !dstOff !x = do let (fp, !srcLen) = runDoubleBuilder x unsafeIOToST $ unsafeWithForeignPtr fp $ \(Ptr addr#) → unsafeSTToIO $ A.copyFromPointer dst (dstOff - srcLen) (Ptr addr#) srcLen pure srcLen runDoubleBuilder ∷ Double → (ForeignPtr Word8, Int) runDoubleBuilder = unsafeDupablePerformIO . buildStepToFirstChunk . BBI.runBuilder . BB.doubleDec {-# INLINE runDoubleBuilder #-} buildStepToFirstChunk ∷ BBI.BuildStep a → IO (ForeignPtr Word8, Int) buildStepToFirstChunk = \step → BBI.newBuffer maxDblLen >>= fill step where fill !step (BBI.Buffer fpbuf br) = do let doneH op' _ = pure (fpbuf, op' `minusPtr` unsafeForeignPtrToPtr fpbuf) fullH _ _ nextStep = BBI.newBuffer maxDblLen >>= fill nextStep res ← BBI.fillWithBuildStep step doneH fullH undefined br touchForeignPtr fpbuf return res maxDblLen ∷ Int maxDblLen = 24 -- length (show (-1.0000000000000004e-308 :: Double)) text-builder-linear-0.1.3/src/Data/Text/Builder/Linear/Hex.hs0000644000000000000000000001002407346545000022025 0ustar0000000000000000-- | -- Copyright: (c) 2022 Andrew Lelechenko -- Licence: BSD3 -- Maintainer: Andrew Lelechenko module Data.Text.Builder.Linear.Hex ( (|>&), (&<|), ) where import Data.Bits (Bits (..), FiniteBits (..)) import Data.Text.Array qualified as A import Data.Word (Word16, Word32, Word64, Word8) import GHC.Exts (Int (..), (>#)) import GHC.ST (ST) import Data.Text.Builder.Linear.Core -- | Append the lower-case hexadecimal representation of a /bounded/ integral -- number. -- -- Negative numbers are interpreted as their corresponding unsigned number: -- -- >>> :set -XOverloadedStrings -XLinearTypes -- >>> import Data.Int (Int8, Int16) -- >>> runBuffer (\b -> b |>& (-1 :: Int8)) == "ff" -- True -- >>> runBuffer (\b -> b |>& (-1 :: Int16)) == "ffff" -- True (|>&) ∷ (Integral a, FiniteBits a) ⇒ Buffer ⊸ a → Buffer infixl 6 |>& buffer |>& n = appendBounded (maxHexLen n) (\dst dstOff → unsafeAppendHex dst dstOff n) buffer {-# INLINEABLE (|>&) #-} -- | Prepend the lower-case hexadecimal representation of a /bounded/ integral -- number. -- -- Negative numbers are interpreted as their corresponding unsigned number: -- -- >>> :set -XOverloadedStrings -XLinearTypes -- >>> import Data.Int (Int8, Int16) -- >>> runBuffer (\b -> (-1 :: Int8) &<| b) == "ff" -- True -- >>> runBuffer (\b -> (-1 :: Int16) &<| b) == "ffff" -- True (&<|) ∷ (Integral a, FiniteBits a) ⇒ a → Buffer ⊸ Buffer infixr 6 &<| n &<| buffer = prependBounded (maxHexLen n) (\dst dstOff → unsafePrependHex dst dstOff n) (\dst dstOff → unsafeAppendHex dst dstOff n) buffer {-# INLINEABLE (&<|) #-} -- | Compute the number of nibbles that an integral type can hold, rounded up. maxHexLen ∷ (Integral a, FiniteBits a) ⇒ a → Int maxHexLen n = 1 + ((finiteBitSize n - 1) `shiftR` 2) {-# INLINEABLE maxHexLen #-} unsafeAppendHex ∷ (Integral a, FiniteBits a) ⇒ A.MArray s → Int → a → ST s Int unsafeAppendHex marr !off 0 = A.unsafeWrite marr off 0x30 >> pure 1 unsafeAppendHex marr !off n = go (off + len - 1) n where len = lengthAsHex n go !_ 0 = pure len go !o m = do let nibble = m .&. 0x0f writeNibbleAsHex marr o (fromIntegral nibble) go (o - 1) (dropNibble m) {-# INLINEABLE unsafeAppendHex #-} unsafePrependHex ∷ (Integral a, FiniteBits a) ⇒ A.MArray s → Int → a → ST s Int unsafePrependHex marr !off 0 = A.unsafeWrite marr (off - 1) 0x30 >> pure 1 unsafePrependHex marr !off n = go (off - 1) n where go !o 0 = pure (off - 1 - o) go !o m = do let nibble = m .&. 0x0f writeNibbleAsHex marr o (fromIntegral nibble) go (o - 1) (dropNibble m) {-# INLINEABLE unsafePrependHex #-} -- | The usual 'shiftR' performs sign extension on signed number types, -- filling the top bits with 1 if the argument is negative. -- We don't want this behaviour here. -- -- It would suffice to clean the sign bit only once -- instead of doing it on every iteration of unsafe{Ap,Pre}pendHex.go, -- but the performance impact is likely negligible. dropNibble ∷ (Integral a, FiniteBits a) ⇒ a → a dropNibble x = case (isSigned x, finiteBitSize x) of -- This is morally 'iShiftRL#', 'uncheckedIShiftRA64#', etc., -- but there is no polymorphic interface to access them. (True, 8) → fromIntegral @Word8 (shiftR (fromIntegral x) 4) (True, 16) → fromIntegral @Word16 (shiftR (fromIntegral x) 4) (True, 32) → fromIntegral @Word32 (shiftR (fromIntegral x) 4) (True, 64) → fromIntegral @Word64 (shiftR (fromIntegral x) 4) (True, _) → shiftR x 4 .&. ((1 `shiftL` (finiteBitSize x - 4)) - 1) _ → shiftR x 4 {-# INLINE dropNibble #-} -- | This assumes n /= 0. Round the number of nibbles up, as in 'maxHexLen'. lengthAsHex ∷ FiniteBits a ⇒ a → Int lengthAsHex n = 1 + shiftR (finiteBitSize n - countLeadingZeros n - 1) 2 {-# INLINEABLE lengthAsHex #-} writeNibbleAsHex ∷ A.MArray s → Int → Int → ST s () writeNibbleAsHex marr off n@(I# n#) = A.unsafeWrite marr off (fromIntegral hex) where hex = 0x30 + n + I# (n# ># 9#) * (0x60 - 0x39) text-builder-linear-0.1.3/src/Data/Text/Builder/Linear/Internal.hs0000644000000000000000000004210407346545000023061 0ustar0000000000000000-- | -- Copyright: (c) 2022 Andrew Lelechenko -- (c) 2023 Pierre Le Marre -- Licence: BSD3 -- Maintainer: Andrew Lelechenko -- -- Internal routines for 'Buffer' manipulations. module Data.Text.Builder.Linear.Internal ( -- * Type Buffer, -- * Basic interface runBuffer, runBufferBS, dupBuffer, consumeBuffer, eraseBuffer, byteSizeOfBuffer, lengthOfBuffer, dropBuffer, takeBuffer, newEmptyBuffer, -- * Text concatenation appendBounded, appendExact, prependBounded, prependBounded', appendBounded', prependExact, (><), ) where import Data.ByteString.Internal (ByteString (..)) import Data.Text qualified as T import Data.Text.Array qualified as A import Data.Text.Internal (Text (..)) import GHC.Exts (Int (..), Levity (..), RuntimeRep (..), TYPE, byteArrayContents#, plusAddr#, unsafeCoerce#) import GHC.ForeignPtr (ForeignPtr (..), ForeignPtrContents (..)) import GHC.ST (ST (..), runST) import Data.Text.Builder.Linear.Array -- | Internally 'Buffer' is a mutable buffer. -- If a client gets hold of a variable of type 'Buffer', -- they'd be able to pass a mutable buffer to concurrent threads. -- That's why API below is carefully designed to prevent such possibility: -- clients always work with linear functions 'Buffer' ⊸ 'Buffer' instead -- and run them on an empty 'Buffer' to extract results. -- -- In terms of [@linear-base@](https://hackage.haskell.org/package/linear-base) -- 'Buffer' is [@Consumable@](https://hackage.haskell.org/package/linear-base/docs/Prelude-Linear.html#t:Consumable) -- (see 'consumeBuffer') -- and [@Dupable@](https://hackage.haskell.org/package/linear-base/docs/Prelude-Linear.html#t:Dupable) -- (see 'dupBuffer'), -- but not [@Movable@](https://hackage.haskell.org/package/linear-base/docs/Prelude-Linear.html#t:Movable). -- -- >>> :set -XOverloadedStrings -XLinearTypes -- >>> import Data.Text.Builder.Linear.Buffer -- >>> runBuffer (\b -> '!' .<| "foo" <| (b |> "bar" |>. '.')) -- "!foobar." -- -- Remember: this is a strict builder, so on contrary to "Data.Text.Lazy.Builder" -- for optimal performance you should use strict left folds instead of lazy right ones. -- -- 'Buffer' is an unlifted datatype, -- so you can put it into an unboxed tuple @(# ..., ... #)@, -- but not into @(..., ...)@. data Buffer ∷ TYPE ('BoxedRep 'Unlifted) where Buffer ∷ {-# UNPACK #-} !Text → Buffer -- | Unwrap 'Buffer', no-op. -- Most likely, this is not the function you're looking for -- and you need 'runBuffer' instead. unBuffer ∷ Buffer ⊸ Text unBuffer (Buffer x) = x -- | Run a linear function on an empty 'Buffer', producing a strict 'Text'. -- -- Be careful to write @runBuffer (\\b -> ...)@ instead of @runBuffer $ \\b -> ...@, -- because current implementation of linear types lacks special support for '($)'. -- Another option is to enable @{-# LANGUAGE BlockArguments #-}@ -- and write @runBuffer \\b -> ...@. -- Alternatively, you can import -- [@($)@](https://hackage.haskell.org/package/linear-base/docs/Prelude-Linear.html#v:-36-) -- from [@linear-base@](https://hackage.haskell.org/package/linear-base). -- -- 'runBuffer' is similar in spirit to mutable arrays API in -- [@Data.Array.Mutable.Linear@](https://hackage.haskell.org/package/linear-base/docs/Data-Array-Mutable-Linear.html), -- which provides functions like -- [@fromList@](https://hackage.haskell.org/package/linear-base/docs/Data-Array-Mutable-Linear.html#v:fromList) ∷ [@a@] → (@Vector@ @a@ ⊸ [@Ur@](https://hackage.haskell.org/package/linear-base-0.3.0/docs/Prelude-Linear.html#t:Ur) b) ⊸ [@Ur@](https://hackage.haskell.org/package/linear-base-0.3.0/docs/Prelude-Linear.html#t:Ur) @b@. -- Here the initial buffer is always empty and @b@ is 'Text'. Since 'Text' is -- [@Movable@](https://hackage.haskell.org/package/linear-base/docs/Prelude-Linear.html#t:Movable), -- 'Text' and [@Ur@](https://hackage.haskell.org/package/linear-base-0.3.0/docs/Prelude-Linear.html#t:Ur) 'Text' are equivalent. runBuffer ∷ (Buffer ⊸ Buffer) ⊸ Text runBuffer f = unBuffer (shrinkBuffer (f (Buffer mempty))) {-# NOINLINE runBuffer #-} {- See https://github.com/Bodigrim/linear-builder/issues/19 and https://github.com/tweag/linear-base/pull/187#discussion_r489081926 for the discussion why NOINLINE here and below in 'runBufferBS' is necessary. Without it CSE (common subexpression elimination) can pull out 'Buffer's from different 'runBuffer's and share them, which is absolutely not what we want. -} -- | Same as 'runBuffer', but returning a UTF-8 encoded strict 'ByteString'. runBufferBS ∷ (Buffer ⊸ Buffer) ⊸ ByteString runBufferBS f = case shrinkBuffer (f (Buffer memptyPinned)) of Buffer (Text (A.ByteArray arr) (I# from) len) → BS fp len where addr# = byteArrayContents# arr `plusAddr#` from fp = ForeignPtr addr# (PlainPtr (unsafeCoerce# arr)) {-# NOINLINE runBufferBS #-} shrinkBuffer ∷ Buffer ⊸ Buffer shrinkBuffer (Buffer (Text arr from len)) = Buffer $ runST $ do arrM ← unsafeThaw arr A.shrinkM arrM (from + len) arr' ← A.unsafeFreeze arrM pure $ Text arr' from len memptyPinned ∷ Text memptyPinned = runST $ do marr ← A.newPinned 0 arr ← A.unsafeFreeze marr pure $ Text arr 0 0 -- | Create an empty 'Buffer'. -- -- The first 'Buffer' is the input and the second is a new empty 'Buffer'. -- -- This function is needed in some situations, e.g. with -- 'Data.Text.Builder.Linear.Buffer.justifyRight'. The following example creates -- a utility function that justify a text and then append it to a buffer. -- -- >>> :set -XOverloadedStrings -XLinearTypes -XUnboxedTuples -- >>> import Data.Text.Builder.Linear.Buffer -- >>> import Data.Text (Text) -- >>> :{ -- appendJustified :: Buffer %1 -> Text -> Buffer -- appendJustified b t = case newEmptyBuffer b of -- -- Note that we need to create a new buffer from the text, in order -- -- to justify only the text and not the input buffer. -- (# b', empty #) -> b' >< justifyRight 12 ' ' (empty |> t) -- :} -- -- >>> runBuffer (\b -> (b |> "Test:") `appendJustified` "AAA" `appendJustified` "BBBBBBB") -- "Test: AAA BBBBBBB" -- -- Note: a previous buffer is necessary in order to create an empty buffer with -- the same characteristics. newEmptyBuffer ∷ Buffer ⊸ (# Buffer, Buffer #) newEmptyBuffer (Buffer t@(Text arr _ _)) = (# Buffer t, Buffer (if isPinned arr then memptyPinned else mempty) #) -- | Duplicate builder. Feel free to process results in parallel threads. -- Similar to -- [@Dupable@](https://hackage.haskell.org/package/linear-base/docs/Prelude-Linear.html#t:Dupable) -- from [@linear-base@](https://hackage.haskell.org/package/linear-base). -- -- It is a bit tricky to use because of -- -- of linear types with regards to @let@ and @where@. E. g., one cannot write -- -- > let (# b1, b2 #) = dupBuffer b in ("foo" <| b1) >< (b2 |> "bar") -- -- Instead write: -- -- >>> :set -XOverloadedStrings -XLinearTypes -XUnboxedTuples -- >>> import Data.Text.Builder.Linear.Buffer -- >>> runBuffer (\b -> case dupBuffer b of (# b1, b2 #) -> ("foo" <| b1) >< (b2 |> "bar")) -- "foobar" -- -- Note the unboxed tuple: 'Buffer' is an unlifted datatype, -- so it cannot be put into @(..., ...)@. dupBuffer ∷ Buffer ⊸ (# Buffer, Buffer #) dupBuffer (Buffer x) = (# Buffer x, Buffer (T.copy x) #) -- | Consume buffer linearly, -- similar to -- [@Consumable@](https://hackage.haskell.org/package/linear-base/docs/Prelude-Linear.html#t:Consumable) -- from [@linear-base@](https://hackage.haskell.org/package/linear-base). consumeBuffer ∷ Buffer ⊸ () consumeBuffer Buffer {} = () -- | Erase buffer's content, replacing it with an empty 'Text'. eraseBuffer ∷ Buffer ⊸ Buffer eraseBuffer (Buffer (Text arr _ _)) = Buffer (if isPinned arr then memptyPinned else mempty) -- | Return buffer's size in __bytes__ (not in 'Char's). -- This could be useful to implement a lazy builder atop of a strict one. byteSizeOfBuffer ∷ Buffer ⊸ (# Buffer, Word #) byteSizeOfBuffer (Buffer t@(Text _ _ len)) = (# Buffer t, fromIntegral len #) -- | Return buffer's length in 'Char's (not in bytes). -- This could be useful to implement @dropEndBuffer@ and @takeEndBuffer@, e. g., -- -- @ -- import Data.Unrestricted.Linear -- -- dropEndBuffer :: Word -> Buffer %1 -> Buffer -- dropEndBuffer n buf = case lengthOfBuffer buf of -- (# buf', len #) -> case move len of -- Ur len' -> takeBuffer (len' - n) buf' -- @ lengthOfBuffer ∷ Buffer ⊸ (# Buffer, Word #) lengthOfBuffer (Buffer t) = (# Buffer t, fromIntegral (T.length t) #) -- | Slice 'Buffer' by dropping given number of 'Char's. dropBuffer ∷ Word → Buffer ⊸ Buffer dropBuffer nChar (Buffer t@(Text arr off len)) | nByte <= 0 = Buffer (Text arr (off + len) 0) | otherwise = Buffer (Text arr (off + nByte) (len - nByte)) where nByte = T.measureOff (fromIntegral nChar) t -- | Slice 'Buffer' by taking given number of 'Char's. takeBuffer ∷ Word → Buffer ⊸ Buffer takeBuffer nChar (Buffer t@(Text arr off _)) | nByte <= 0 = Buffer t | otherwise = Buffer (Text arr off nByte) where nByte = T.measureOff (fromIntegral nChar) t -- | Low-level routine to append data of unknown size to a 'Buffer'. appendBounded ∷ Int -- ^ Upper bound for the number of bytes, written by an action → (∀ s. A.MArray s → Int → ST s Int) -- ^ Action, which writes bytes __starting__ from the given offset -- and returns an actual number of bytes written. → Buffer ⊸ Buffer appendBounded maxSrcLen appender (Buffer (Text dst dstOff dstLen)) = Buffer $ runST $ do let dstFullLen = sizeofByteArray dst newFullLen = dstOff + 2 * (dstLen + maxSrcLen) newM ← if dstOff + dstLen + maxSrcLen <= dstFullLen then unsafeThaw dst else do tmpM ← (if isPinned dst then A.newPinned else A.new) newFullLen A.copyI dstLen tmpM dstOff dst dstOff pure tmpM srcLen ← appender newM (dstOff + dstLen) new ← A.unsafeFreeze newM pure $ Text new dstOff (dstLen + srcLen) {-# INLINE appendBounded #-} -- | Low-level routine to append data of unknown size to a 'Buffer', giving -- the action the choice between two strategies. -- -- See also: 'appendBounded'. -- -- @since 0.1.3 appendBounded' ∷ Int -- ^ Upper bound for the number of bytes, written by an action → (∀ s x. ((A.MArray s → Int → ST s Int) → ST s x) → ((A.MArray s → Int → ST s Int) → ST s x) → ST s x) -- ^ Action, which appends bytes using one of the following strategies: -- -- * writes bytes __starting__ from the given offset, using its first argument, -- * writes bytes __finishing__ before the given offset, using its second argument. -- -- The function passed to either argument returns the actual number of bytes written. → Buffer ⊸ Buffer appendBounded' maxSrcLen writer (Buffer (Text dst dstOff dstLen)) = Buffer $ runST $ do let dstFullLen = sizeofByteArray dst newFullLen = dstOff + 2 * (dstLen + maxSrcLen) newM ← if dstOff + dstLen + maxSrcLen <= dstFullLen then unsafeThaw dst else do tmpM ← (if isPinned dst then A.newPinned else A.new) newFullLen A.copyI dstLen tmpM dstOff dst dstOff pure tmpM let append = \appender → do count ← appender newM (dstOff + dstLen) pure (dstOff, count) -- Action that prepends then copies the result to the final destination, if necessary let prepend = \prepender → case dstLen of 0 → do -- Buffer is empty: prepend to final destination count ← prepender newM maxSrcLen pure (maxSrcLen - count, count) _ → do -- Require extra buffer + copy to final destination let off' -- Reuse space before current data (no overlap) | dstOff >= maxSrcLen = dstOff -- Reuse space after current data (overlap) | otherwise = dstOff + dstLen + maxSrcLen count ← prepender newM off' -- Note: we rely on copyM allowing overlaps A.copyM newM (dstOff + dstLen) newM (off' - count) count pure (dstOff, count) !(dstOff', srcLen) ← writer append prepend new ← A.unsafeFreeze newM pure $ Text new dstOff' (dstLen + srcLen) {-# INLINE appendBounded' #-} -- | Low-level routine to append data of known size to a 'Buffer'. appendExact ∷ Int -- ^ Exact number of bytes, written by an action → (∀ s. A.MArray s → Int → ST s ()) -- ^ Action, which writes bytes __starting__ from the given offset → Buffer ⊸ Buffer appendExact srcLen appender = appendBounded srcLen (\dst dstOff → appender dst dstOff >> pure srcLen) {-# INLINE appendExact #-} -- | Low-level routine to prepend data of unknown size to a 'Buffer'. prependBounded ∷ Int -- ^ Upper bound for the number of bytes, written by an action → (∀ s. A.MArray s → Int → ST s Int) -- ^ Action, which writes bytes __finishing__ before the given offset -- and returns an actual number of bytes written. → (∀ s. A.MArray s → Int → ST s Int) -- ^ Action, which writes bytes __starting__ from the given offset -- and returns an actual number of bytes written. → Buffer ⊸ Buffer prependBounded maxSrcLen prepender appender (Buffer (Text dst dstOff dstLen)) | maxSrcLen <= dstOff = Buffer $ runST $ do newM ← unsafeThaw dst srcLen ← prepender newM dstOff new ← A.unsafeFreeze newM pure $ Text new (dstOff - srcLen) (srcLen + dstLen) | otherwise = Buffer $ runST $ do let dstFullLen = sizeofByteArray dst newOff = dstLen + maxSrcLen newFullLen = 2 * newOff + (dstFullLen - dstOff - dstLen) newM ← (if isPinned dst then A.newPinned else A.new) newFullLen srcLen ← appender newM newOff A.copyI dstLen newM (newOff + srcLen) dst dstOff new ← A.unsafeFreeze newM pure $ Text new newOff (dstLen + srcLen) {-# INLINE prependBounded #-} -- | Low-level routine to prepend data of unknown size to a 'Buffer'. -- -- Contrary to 'prependBounded', only use a prepend action. -- -- @since 0.1.3 prependBounded' ∷ Int -- ^ Upper bound for the number of bytes, written by an action → (∀ s. A.MArray s → Int → ST s Int) -- ^ Action, which writes bytes __finishing__ before the given offset -- and returns an actual number of bytes written. → Buffer ⊸ Buffer prependBounded' maxSrcLen prepender (Buffer (Text dst dstOff dstLen)) | maxSrcLen <= dstOff = Buffer $ runST $ do newM ← unsafeThaw dst srcLen ← prepender newM dstOff new ← A.unsafeFreeze newM pure $ Text new (dstOff - srcLen) (srcLen + dstLen) | otherwise = Buffer $ runST $ do let dstFullLen = sizeofByteArray dst off = dstLen + 2 * maxSrcLen newFullLen = off + (dstFullLen - dstOff) newM ← (if isPinned dst then A.newPinned else A.new) newFullLen srcLen ← prepender newM off A.copyI dstLen newM off dst dstOff new ← A.unsafeFreeze newM pure $ Text new (off - srcLen) (dstLen + srcLen) {-# INLINE prependBounded' #-} -- | Low-level routine to append data of known size to a 'Buffer'. prependExact ∷ Int -- ^ Exact number of bytes, written by an action → (∀ s. A.MArray s → Int → ST s ()) -- ^ Action, which writes bytes __starting__ from the given offset → Buffer ⊸ Buffer prependExact srcLen appender = prependBounded srcLen (\dst dstOff → appender dst (dstOff - srcLen) >> pure srcLen) (\dst dstOff → appender dst dstOff >> pure srcLen) {-# INLINE prependExact #-} -- | Concatenate two 'Buffer's, potentially mutating both of them. -- -- You likely need to use 'dupBuffer' to get hold on two builders at once: -- -- >>> :set -XOverloadedStrings -XLinearTypes -XUnboxedTuples -- >>> import Data.Text.Builder.Linear.Buffer -- >>> runBuffer (\b -> case dupBuffer b of (# b1, b2 #) -> ("foo" <| b1) >< (b2 |> "bar")) -- "foobar" (><) ∷ Buffer ⊸ Buffer ⊸ Buffer infix 6 >< Buffer (Text left leftOff leftLen) >< Buffer (Text right rightOff rightLen) = Buffer $ runST $ do let leftFullLen = sizeofByteArray left rightFullLen = sizeofByteArray right canCopyToLeft = leftOff + leftLen + rightLen <= leftFullLen canCopyToRight = leftLen <= rightOff shouldCopyToLeft = canCopyToLeft && (not canCopyToRight || leftLen >= rightLen) if shouldCopyToLeft then do newM ← unsafeThaw left A.copyI rightLen newM (leftOff + leftLen) right rightOff new ← A.unsafeFreeze newM pure $ Text new leftOff (leftLen + rightLen) else if canCopyToRight then do newM ← unsafeThaw right A.copyI leftLen newM (rightOff - leftLen) left leftOff new ← A.unsafeFreeze newM pure $ Text new (rightOff - leftLen) (leftLen + rightLen) else do let fullLen = leftOff + leftLen + rightLen + (rightFullLen - rightOff - rightLen) newM ← (if isPinned left || isPinned right then A.newPinned else A.new) fullLen A.copyI leftLen newM leftOff left leftOff A.copyI rightLen newM (leftOff + leftLen) right rightOff new ← A.unsafeFreeze newM pure $ Text new leftOff (leftLen + rightLen) text-builder-linear-0.1.3/test/0000755000000000000000000000000007346545000014543 5ustar0000000000000000text-builder-linear-0.1.3/test/Main.hs0000644000000000000000000004275507346545000016000 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE DataKinds #-} {-# LANGUAGE GADTs #-} #if __GLASGOW_HASKELL__ >= 907 {-# LANGUAGE TypeAbstractions #-} #endif -- | -- Copyright: (c) 2022 Andrew Lelechenko -- Licence: BSD3 -- Maintainer: Andrew Lelechenko module Main where import Prelude hiding (Foldable(..)) import Data.Bits (Bits(..), FiniteBits(..), bitDefault) import Data.Foldable (Foldable(..)) import Data.Int import Data.List (intersperse) import Data.Proxy (Proxy(..)) import qualified Data.Text as T import qualified Data.Text.Encoding as T import Data.Text.Builder.Linear.Buffer import Data.Text.Internal (Text(..)) import Data.Text.Lazy (toStrict) import Data.Text.Lazy.Builder qualified as TB import Data.Text.Lazy.Builder (toLazyText) import Data.Text.Lazy.Builder.Int (decimal, hexadecimal) import Data.Text.Lazy.Builder.RealFloat (realFloat) import Data.Word import GHC.Generics import GHC.TypeLits (KnownNat, OrderingI (..), SomeNat (..), cmpNat, natVal, sameNat, someNatVal) import Numeric.Natural (Natural) import Test.Tasty import Test.Tasty.QuickCheck hiding ((><), (.&.)) instance Arbitrary Text where arbitrary = do xs ← T.pack <$> arbitrary d ← (`mod` (T.length xs + 1)) <$> arbitrary pure $ T.drop d xs shrink t@(Text arr off len) = map (T.drop d . T.pack) (shrink ys) ++ map (\d' → T.drop d' $ T.pack $ drop (d - d') ys) (shrink d) where xs = T.unpack t ys = T.unpack (Text arr 0 (off + len)) d = length ys - length xs data Action = AppendText Text | PrependText Text | AppendChar Char | PrependChar Char | AppendChars Word Char | PrependChars Word Char | JustifyLeft Word Char | JustifyRight Word Char | Center Word Char | HexInt Int8 Int16 (IntN 30) (IntN 31) Int32 (IntN 33) Int64 | HexWord Word8 Word16 Word32 Word64 | AppendHexI SomeIntN | PrependHexI SomeIntN | AppendHexW SomeWordN | PrependHexW SomeWordN | DecInt Int8 Int16 (IntN 30) (IntN 31) Int32 (IntN 33) Int64 | DecWord Word8 Word16 (WordN 30) (WordN 31) Word32 (WordN 33) Word64 | AppendDecW Word | PrependDecW Word | AppendDecI Int | PrependDecI Int | AppendDecI30 (IntN 30) | PrependDecI30 (IntN 30) | AppendDecInteger Integer | PrependDecInteger Integer | AppendDouble Double | PrependDouble Double | AppendSpaces Word | PrependSpaces Word deriving (Eq, Ord, Show, Generic) instance Arbitrary Action where arbitrary = oneof [ AppendText <$> arbitrary , PrependText <$> arbitrary , AppendChar <$> arbitraryUnicodeChar , PrependChar <$> arbitraryUnicodeChar , AppendChars <$> arbitraryCharCount <*> arbitraryUnicodeChar , PrependChars <$> arbitraryCharCount <*> arbitraryUnicodeChar , JustifyLeft <$> arbitraryTotalLength <*> arbitraryUnicodeChar , JustifyRight <$> arbitraryTotalLength <*> arbitraryUnicodeChar , Center <$> arbitraryTotalLength <*> arbitraryUnicodeChar , AppendHexI <$> arbitrary , PrependHexI <$> arbitrary , AppendHexW <$> arbitrary , PrependHexW <$> arbitrary , AppendDecW <$> arbitraryBoundedIntegral , PrependDecW <$> arbitraryBoundedIntegral , AppendDecI <$> arbitraryBoundedIntegral , PrependDecI <$> arbitraryBoundedIntegral , AppendDecI30 <$> arbitraryBoundedIntegral , PrependDecI30 <$> arbitraryBoundedIntegral , AppendDecInteger <$> arbitraryInteger , PrependDecInteger <$> arbitraryInteger , pure $ HexWord minBound minBound minBound minBound , pure $ HexWord maxBound maxBound maxBound maxBound , pure $ HexInt minBound minBound minBound minBound minBound minBound minBound , pure $ HexInt maxBound maxBound maxBound maxBound maxBound maxBound maxBound , pure $ HexInt 0 0 0 0 0 0 0 , pure $ DecInt minBound minBound minBound minBound minBound minBound minBound , pure $ DecInt maxBound maxBound maxBound maxBound maxBound maxBound maxBound , pure $ DecInt 0 0 0 0 0 0 0 , pure $ DecWord minBound minBound minBound minBound minBound minBound minBound , pure $ DecWord maxBound maxBound maxBound maxBound maxBound maxBound maxBound , AppendDouble <$> arbitrary , PrependDouble <$> arbitrary , AppendSpaces . getNonNegative <$> arbitrary , PrependSpaces . getNonNegative <$> arbitrary ] where arbitraryCharCount = chooseBoundedIntegral (0, 6) arbitraryTotalLength = chooseBoundedIntegral (3, 20) arbitraryInteger = chooseInteger ( fromIntegral @Int minBound ^ (3 :: Word) , fromIntegral @Int maxBound ^ (3 :: Word) ) shrink = genericShrink interpretOnText ∷ [Action] → Text → Text interpretOnText xs z = foldl' go z xs where go ∷ Text → Action → Text go b (AppendText x) = b <> x go b (PrependText x) = x <> b go b (AppendChar x) = T.snoc b x go b (PrependChar x) = T.cons x b go b (AppendChars n x) = b <> T.replicate (fromIntegral n) (T.singleton x) go b (PrependChars n x) = T.replicate (fromIntegral n) (T.singleton x) <> b go b (JustifyLeft n x) = T.justifyLeft (fromIntegral n) x b go b (JustifyRight n x) = T.justifyRight (fromIntegral n) x b go b (Center n x) = T.center (fromIntegral n) x b go b (HexInt r s t u v w x) = intersperseText [ hexadecimal (fromIntegral @Int16 @Word16 s) , hexadecimalI t , hexadecimal (fromIntegral @Int64 @Word64 x) ] <> b <> intersperseText [ hexadecimal (fromIntegral @Int8 @Word8 r) , hexadecimalI u , hexadecimal (fromIntegral @Int32 @Word32 v) , hexadecimalI w ] go b (HexWord u v w x) = intersperseText [hexadecimal u, hexadecimal x] <> b <> intersperseText [hexadecimal v, hexadecimal w ] go b (AppendHexI x) = b <> toStrict (toLazyText (hexadecimalSI x)) go b (PrependHexI x) = toStrict (toLazyText (hexadecimalSI x)) <> b go b (AppendHexW x) = b <> toStrict (toLazyText (hexadecimalSW x)) go b (PrependHexW x) = toStrict (toLazyText (hexadecimalSW x)) <> b go b (DecInt r s t u v w x) = intersperseText [decimal s, decimal t, decimal x] <> b <> intersperseText [decimal r, decimal u, decimal v, decimal w] go b (DecWord r s t u v w x) = intersperseText [decimal s, decimal t, decimal x] <> b <> intersperseText [decimal r, decimal u, decimal v, decimal w] go b (AppendDecW x) = b <> toStrict (toLazyText (decimal x)) go b (PrependDecW x) = toStrict (toLazyText (decimal x)) <> b go b (AppendDecI x) = b <> toStrict (toLazyText (decimal x)) go b (PrependDecI x) = toStrict (toLazyText (decimal x)) <> b go b (AppendDecI30 x) = b <> toStrict (toLazyText (decimal x)) go b (PrependDecI30 x) = toStrict (toLazyText (decimal x)) <> b go b (AppendDecInteger x) = b <> toStrict (toLazyText (decimal x)) go b (PrependDecInteger x) = toStrict (toLazyText (decimal x)) <> b go b (AppendDouble x) = b <> toStrict (toLazyText (realFloat x)) go b (PrependDouble x) = toStrict (toLazyText (realFloat x)) <> b go b (AppendSpaces n) = b <> T.replicate (fromIntegral n) (T.singleton ' ') go b (PrependSpaces n) = T.replicate (fromIntegral n) (T.singleton ' ') <> b hexadecimalSI (SomeIntN x) = hexadecimalI x hexadecimalI ∷ (KnownNat n) ⇒ IntN n → TB.Builder hexadecimalI x = if x >= 0 then hexadecimal x else hexadecimal (fromIntegral @_ @Word64 x .&. (shiftL 1 (intSize x) - 1)) hexadecimalSW (SomeWordN x) = hexadecimalW x hexadecimalW ∷ (KnownNat n) ⇒ WordN n → TB.Builder hexadecimalW x = if x >= 0 then hexadecimal x else hexadecimal (fromIntegral @_ @Word64 x .&. (shiftL 1 (intSize x) - 1)) intersperseText ∷ [TB.Builder] → Text intersperseText bs = toStrict (toLazyText (mconcat (intersperse (TB.singleton ';') bs))) interpretOnBuffer ∷ [Action] → Buffer ⊸ Buffer interpretOnBuffer xs z = foldlIntoBuffer go z xs where go ∷ Buffer ⊸ Action → Buffer go b (AppendText x) = b |> x go b (PrependText x) = x <| b go b (AppendChar x) = b |>. x go b (PrependChar x) = x .<| b go b (AppendChars n x) = appendChars n x b go b (PrependChars n x) = prependChars n x b go b (JustifyLeft n x) = justifyLeft n x b go b (JustifyRight n x) = justifyRight n x b go b (Center n x) = center n x b go b (HexInt r s t u v w x) = s &<| ";"# #<| t &<| ";"# #<| x &<| (b |>& r |># ";"# |>& u |># ";"# |>& v |># ";"# |>& w) go b (HexWord u v w x) = u &<| ";"# #<| x &<| (b |>& v |># ";"# |>& w) go b (AppendHexI x) = case x of {SomeIntN i → b |>& i} go b (PrependHexI x) = case x of {SomeIntN i → i &<| b} go b (AppendHexW x) = case x of {SomeWordN i → b |>& i} go b (PrependHexW x) = case x of {SomeWordN i → i &<| b} go b (DecInt r s t u v w x) = s $<| ";"# #<| t $<| ";"# #<| x $<| (b |>$ r |># ";"# |>$ u |># ";"# |>$ v |># ";"# |>$ w) go b (DecWord r s t u v w x) = s $<| ";"# #<| t $<| ";"# #<| x $<| (b |>$ r |># ";"# |>$ u |># ";"# |>$ v |># ";"# |>$ w) go b (AppendDecW x) = b |>$ x go b (PrependDecW x) = x $<| b go b (AppendDecI x) = b |>$ x go b (PrependDecI x) = x $<| b go b (AppendDecI30 x) = b |>$ x go b (PrependDecI30 x) = x $<| b go b (AppendDecInteger x) = b |>$$ x go b (PrependDecInteger x) = x $$<| b go b (AppendDouble x) = b |>% x go b (PrependDouble x) = x %<| b go b (AppendSpaces n) = b |>… n go b (PrependSpaces n) = n …<| b main ∷ IO () main = defaultMain $ testGroup "All" [ testProperty "sequence of actions" prop1 , testProperty "two sequences of actions" prop2 , testProperty "append addr#" prop3 , testProperty "prepend addr#" prop4 , testProperty "bytestring builder" prop5 , testProperty "CSE 1" prop6 , testProperty "CSE 2" prop7 , testProperty "unbounded integers" prop8 ] prop1 ∷ [Action] → Property prop1 acts = interpretOnText acts mempty === runBuffer (\b → interpretOnBuffer acts b) prop2 ∷ [Action] → [Action] → Property prop2 acts1 acts2 = interpretOnText acts1 mempty <> interpretOnText acts2 mempty === runBuffer (\b → go (dupBuffer b)) where go ∷ (# Buffer, Buffer #) ⊸ Buffer go (# b1, b2 #) = interpretOnBuffer acts1 b1 >< interpretOnBuffer acts2 b2 prop3 :: [Action] → Property prop3 acts = runBuffer f1 === runBuffer f2 where addr# = "foo"# f1, f2 :: Buffer ⊸ Buffer f1 = \b → interpretOnBuffer acts b |># addr# f2 = \b → interpretOnBuffer acts b |> T.pack "foo" prop4 :: [Action] → Property prop4 acts = runBuffer f1 === runBuffer f2 where addr# = "foo"# f1, f2 :: Buffer ⊸ Buffer f1 = \b → addr# #<| interpretOnBuffer acts b f2 = \b → T.pack "foo" <| interpretOnBuffer acts b prop5 ∷ [Action] → Property prop5 acts = T.encodeUtf8 (interpretOnText acts mempty) === runBufferBS (\b → interpretOnBuffer acts b) prop6 :: Property prop6 = T.pack "_a_b" === runBuffer (\buf -> buf |>. '_' |>. 'a' |> runBuffer (\buf' -> buf' |>. '_' |>. 'b')) prop7 :: Property prop7 = let !x = runBuffer (\buf -> (buf |>. '_' |>. 'a') |>… 5) !y = runBuffer (\buf -> (buf |>. '_' |>. 'b') |>… 5) in (x, y) === (T.pack "_a ", T.pack "_b ") prop8 ∷ Property prop8 = conjoin [ check 0 , check 1e18 , check 1e19 , check 1e20 , check 1e50 , check 1e100 , check (10 ^ (400 ∷ Word)) , check (10 ^ (600 ∷ Word)) , check (10 ^ (1000 ∷ Word)) , check (toInteger @Word maxBound) , check (toInteger @Word maxBound + 1) , check (negate (toInteger @Word maxBound)) , check (negate (toInteger @Word maxBound + 1)) , check (toInteger @Word maxBound ^ (2 ∷ Word)) , check (toInteger @Word maxBound ^ (20 ∷ Word)) , check (toInteger @Word maxBound ^ (40 ∷ Word)) ] where check ∷ Integer → Property check i = decimalText i === runBuffer (i $$<|) .&&. decimalText i === runBuffer (|>$$ i) decimalText = toStrict . toLazyText . decimal -------------------------------------------------------------------------------- -- IntN -------------------------------------------------------------------------------- newtype IntN (n ∷ Natural) = IntN' {unIntN ∷ Int64} deriving stock (Eq, Ord) deriving newtype (Enum, Real, Integral) instance (KnownNat n) ⇒ Show (IntN n) where showsPrec p (IntN x) = showParen (p > 10) (\s → mconcat ["IntN @", show (natVal (Proxy @n)), " ", show x, s]) pattern IntN ∷ forall n. (KnownNat n) => Int64 → IntN n pattern IntN x ← IntN' x where IntN x = IntN' x' where -- If the nth bit is 1, then interpret the value as negative and fill the -- bits from nth position with 1s. Otherwise clear them to 0s. size = intSize (Proxy @n) x' = if testBit x (size - 1) then x .|. m1 else x .&. m2 m1 = complement ((1 `shiftL` (size - 1)) - 1) m2 = (1 `shiftL` size) - 1 {-# COMPLETE IntN #-} intSize ∷ forall p n. (KnownNat n) => p n → Int intSize _ = fromInteger (natVal (Proxy @n)) instance (KnownNat n) => Arbitrary (IntN n) where arbitrary = IntN <$> chooseBoundedIntegral (unIntN @n minBound, unIntN @n maxBound) shrink = shrinkIntegral instance (KnownNat n) => Bounded (IntN n) where minBound = IntN (negate (1 `shiftL` (intSize (Proxy @n) - 1))) maxBound = IntN ((1 `shiftL` (intSize (Proxy @n) - 1)) - 1) instance (KnownNat n) => Num (IntN n) where IntN x + IntN y = IntN (x + y) IntN x * IntN y = IntN (x * y) abs (IntN x) = IntN (abs x) signum = undefined negate (IntN x) = IntN (negate x) fromInteger x = IntN (fromInteger x) instance (KnownNat n) => Bits (IntN n) where IntN a .&. IntN b = IntN (a .&. b) IntN a .|. IntN b = IntN (a .|. b) xor = undefined complement (IntN x) = IntN (complement x) shift (IntN x) i = IntN (shift x i) rotate = undefined bitSize = const (intSize (Proxy @n)) bitSizeMaybe = const (Just (intSize (Proxy @n))) isSigned = const True testBit (IntN x) = testBit x bit = bitDefault popCount = undefined instance (KnownNat n) => FiniteBits (IntN n) where finiteBitSize = const (intSize (Proxy @n)) data SomeIntN = forall n. (KnownNat n) ⇒ SomeIntN (IntN n) instance Eq SomeIntN where SomeIntN (IntN @n1 i1) == SomeIntN (IntN @n2 i2) = case sameNat (Proxy @n1) (Proxy @n2) of Just _ → i1 == i2 Nothing → False instance Ord SomeIntN where SomeIntN (IntN @n1 i1) `compare` SomeIntN (IntN @n2 i2) = case cmpNat (Proxy @n1) (Proxy @n2) of LTI → LT EQI → compare i1 i2 GTI → GT instance Show SomeIntN where show (SomeIntN i) = show i instance Arbitrary SomeIntN where arbitrary = do s <- chooseInt (8, 64) case someNatVal (toInteger s) of Just (SomeNat (Proxy ∷ Proxy n)) → SomeIntN <$> arbitraryBoundedIntegral @(IntN n) Nothing → error "impossible" shrink (SomeIntN i) = SomeIntN <$> shrinkIntegral i -------------------------------------------------------------------------------- -- WordN -------------------------------------------------------------------------------- newtype WordN (n ∷ Natural) = WordN' { unWordN :: Word64 } deriving stock (Eq, Ord) deriving newtype (Enum, Real, Integral) instance (KnownNat n) ⇒ Show (WordN n) where showsPrec p (WordN x) = showParen (p > 10) (\s → mconcat ["WordN @", show (natVal (Proxy @n)), " ", show x, s]) pattern WordN ∷ forall n. (KnownNat n) => Word64 → WordN n pattern WordN x ← WordN' x where WordN x = WordN' (x .&. ((1 `shiftL` intSize (Proxy @n)) - 1)) {-# COMPLETE WordN #-} instance (KnownNat n) => Arbitrary (WordN n) where arbitrary = WordN <$> chooseBoundedIntegral (unWordN @n minBound, unWordN @n maxBound) shrink = shrinkIntegral instance (KnownNat n) => Bounded (WordN n) where minBound = WordN' 0 maxBound = WordN ((1 `shiftL` intSize (Proxy @n)) - 1) instance (KnownNat n) => Num (WordN n) where WordN x + WordN y = WordN (x + y) WordN x * WordN y = WordN (x * y) abs = id signum = undefined negate (WordN x) = WordN (negate x) fromInteger x = WordN (fromInteger x) instance (KnownNat n) => Bits (WordN n) where WordN a .&. WordN b = WordN (a .&. b) WordN a .|. WordN b = WordN (a .|. b) xor = undefined complement (WordN x) = WordN (complement x) shift (WordN x) i = WordN (shift x i) rotate = undefined bitSize = const (intSize (Proxy @n)) bitSizeMaybe = const (Just (intSize (Proxy @n))) isSigned = const False testBit (WordN x) = testBit x bit = bitDefault popCount = undefined instance (KnownNat n) => FiniteBits (WordN n) where finiteBitSize = const (intSize (Proxy @n)) data SomeWordN = forall n. (KnownNat n) ⇒ SomeWordN (WordN n) instance Eq SomeWordN where SomeWordN (WordN @n1 i1) == SomeWordN (WordN @n2 i2) = case sameNat (Proxy @n1) (Proxy @n2) of Just _ → i1 == i2 Nothing → False instance Ord SomeWordN where SomeWordN (WordN @n1 i1) `compare` SomeWordN (WordN @n2 i2) = case cmpNat (Proxy @n1) (Proxy @n2) of LTI → LT EQI → compare i1 i2 GTI → GT instance Show SomeWordN where show (SomeWordN i) = show i instance Arbitrary SomeWordN where arbitrary = do s <- chooseInt (8, 64) case someNatVal (toInteger s) of Just (SomeNat (Proxy ∷ Proxy n)) → SomeWordN <$> arbitraryBoundedIntegral @(WordN n) Nothing → error "impossible" shrink (SomeWordN i) = SomeWordN <$> shrinkIntegral i text-builder-linear-0.1.3/text-builder-linear.cabal0000644000000000000000000000611507346545000020433 0ustar0000000000000000cabal-version: 2.4 name: text-builder-linear version: 0.1.3 license: BSD-3-Clause license-file: LICENSE copyright: 2022 Andrew Lelechenko maintainer: Andrew Lelechenko author: Andrew Lelechenko tested-with: ghc ==9.2.8 ghc ==9.4.8 ghc ==9.6.6 ghc ==9.8.2 ghc ==9.10.1 homepage: https://github.com/Bodigrim/linear-builder synopsis: Builder for Text and ByteString based on linear types description: Strict Text and ByteString builder, which hides mutable buffer behind linear types and takes amortized linear time. category: Text extra-doc-files: changelog.md README.md source-repository head type: git location: git://github.com/Bodigrim/linear-builder.git library exposed-modules: Data.Text.Builder.Linear Data.Text.Builder.Linear.Buffer Data.Text.Builder.Linear.Core hs-source-dirs: src other-modules: Data.Text.Builder.Linear.Array Data.Text.Builder.Linear.Char Data.Text.Builder.Linear.Dec.Bounded Data.Text.Builder.Linear.Dec.Unbounded Data.Text.Builder.Linear.Double Data.Text.Builder.Linear.Hex Data.Text.Builder.Linear.Internal default-language: GHC2021 default-extensions: LinearTypes MagicHash NumDecimals UnboxedTuples UnicodeSyntax UnliftedDatatypes ViewPatterns ghc-options: -Wall -O2 -fexpose-all-unfoldings build-depends: base >=4.16 && <5, text >=2.0 && <2.2, bytestring >=0.11 && <0.13, ghc-bignum >=1.1 && < 2.0, quote-quot >=0.2.1 && <0.3 test-suite linear-builder-tests type: exitcode-stdio-1.0 main-is: Main.hs hs-source-dirs: test default-language: GHC2021 default-extensions: DerivingStrategies LinearTypes MagicHash NumDecimals PatternSynonyms UnboxedTuples UnicodeSyntax ghc-options: -Wall -Wno-orphans -threaded -rtsopts "-with-rtsopts -N" build-depends: base, text, text-builder-linear, tasty >=1.4 && <1.6, tasty-quickcheck >=0.10 && <0.12 benchmark linear-builder-bench type: exitcode-stdio-1.0 main-is: Main.hs hs-source-dirs: bench other-modules: BenchChar BenchDecimal BenchDecimalUnbounded BenchDouble BenchHexadecimal BenchText default-language: GHC2021 default-extensions: CPP LinearTypes NumDecimals UnicodeSyntax ghc-options: -Wall -rtsopts -O2 -fproc-alignment=64 build-depends: base, bytestring, text, text-builder-linear, -- NOTE: The following packages are optional, but are not required that -- often. While they could be guarded by a flag, we prefer keeping -- the Hackage page simple. Just uncomment these lines when needed. -- bytestring-strict-builder >= 0.4.5 && < 0.5, -- text-builder >= 0.6.7 && < 0.7, tasty, tasty-bench >=0.4 && <0.5