cborg-0.2.10.0/0000755000000000000000000000000007346545000011215 5ustar0000000000000000cborg-0.2.10.0/ChangeLog.md0000644000000000000000000000526107346545000013372 0ustar0000000000000000# Revision history for cborg ## 0.2.10.0 * Return `TypeUInt64` and `TypeNInt64` from `tokenType` when appropriate, fixing [#324](https://github.com/well-typed/cborg/issues/324) * Don't rely on `MonadFail Gen` * Support for GHC 9.8 ## 0.2.9.0 * Fix build with `base >= 4.17` on platforms without unaligned memory operations. * Fix `Eq`, `Ord`, `Show` and `IsList` instances for `SlicedByteArray` when offset is present * Fix `toBuilder` and `encodeByteArray` for `SlicedByteArray` when offset is present ## 0.2.8.0 -- 2022-09-24 * Support GHC 9.4 * Fix compatibility with primitive 0.7.4.0 * Drop GHC 8.0 and 8.2 support * Support aeson 2.1 ## 0.2.3.1 -- 2020-05-10 * Bounds updates for GHC 8.10 * `Decoder` is now a `newtype`. ## 0.2.2.1 -- 2019-12-29 * Testsuite updates for GHC 8.8 ## 0.2.2.0 -- 2019-07-31 * Add peekByteOffset for the current ByteOffset in the input byte sequence. By keeping track of the byte offsets before and after decoding asubterm (a pattern captured by decodeWithByteSpan) and if the overall input data is retained then this is enables later retrieving the span of bytes for the subterm. * Add encodePreEncoded function. This allows pre-encoded CBOR data to be included into an Encoding. This is useful in cases where one has known-valid encoded CBOR data, e.g. on disk, that you want to include into a larger CBOR data stream. This makes it possible in such cases to avoid decoding and re-encoding. * Improved test suite property coverage. We now have property coverering most parts of a commuting diagram, which gives more confidence about what are the right properties to test and what is enough. * Improved test coverage for decoding non-canonical terms * Fix a bug in the checks in the canonical decoding of float16 NaNs. There are multiple bit representations of NaNs and while we were checking this correctly for float32 and float64 we were not checking this correctly for the float16 encoding. * Improved test coverage for special float values. We now have pretty comprehensive coverage of round-tripping properties for special float values, +/- infinity and non-canonical NaNs. * Improved the structure of the test suite * Use new GHC primitives castWord{32ToFloat,64ToDouble} rather than home grown * Support GHC built with integer-simple * Support GHC 8.8 ## 0.2.1.0 -- 2018-10-11 * Bounds bumps and GHC 8.6 compatibility ## 0.2.0.0 -- 2017-11-30 * Improved robustness of non-UTF-8 strings * Add encoders and decoders for `ByteArray` * Add decoding variants that check for canonical encodings * Expose `Codec.CBOR.Read.deserialiseFromBytesWithSize` ## 0.1.0.0 -- 2017-06-28 * First version. Released on an unsuspecting world. cborg-0.2.10.0/LICENSE.txt0000644000000000000000000000310607346545000013040 0ustar0000000000000000Copyright (c) 2015-2017 Duncan Coutts, 2015-2017 Well-Typed LLP, 2015 IRIS Connect Ltd. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of Duncan Coutts nor the names of other contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. cborg-0.2.10.0/Setup.hs0000644000000000000000000000005607346545000012652 0ustar0000000000000000import Distribution.Simple main = defaultMain cborg-0.2.10.0/cborg.cabal0000644000000000000000000001211007346545000013270 0ustar0000000000000000name: cborg version: 0.2.10.0 synopsis: Concise Binary Object Representation (CBOR) license: BSD3 license-file: LICENSE.txt author: Duncan Coutts maintainer: duncan@community.haskell.org, ben@smart-cactus.org bug-reports: https://github.com/well-typed/cborg/issues copyright: 2015-2019 Duncan Coutts, 2015-2019 Well-Typed LLP, 2015 IRIS Connect Ltd category: Codec build-type: Simple cabal-version: >= 1.10 tested-with: GHC == 8.4.4, GHC == 8.6.5, GHC == 8.8.3, GHC == 8.10.7, GHC == 9.0.1, GHC == 9.2.2, GHC == 9.4.2, GHC == 9.6.1 extra-source-files: ChangeLog.md description: This package provides an efficient implementation of the Concise Binary Object Representation (CBOR), as specified by [RFC 7049](https://tools.ietf.org/html/rfc7049). . If you are looking for a library for serialisation of Haskell values, have a look at the [serialise](/package/serialise) package, which is built upon this library. . An implementation of the standard bijection between CBOR and JSON is provided by the [cborg-json](/package/cborg-json) package. Also see [cbor-tool](/package/cbor-tool) for a convenient command-line utility for working with CBOR data. . This package was formerly known as @binary-serialise-cbor@. extra-source-files: src/cbits/cbor.h tests/test-vectors/appendix_a.json tests/test-vectors/README.md source-repository head type: git location: https://github.com/well-typed/cborg.git -------------------------------------------------------------------------------- -- Flags flag optimize-gmp default: True manual: False description: Use optimized code paths for integer-gmp -------------------------------------------------------------------------------- -- Library library default-language: Haskell2010 ghc-options: -Wall include-dirs: src/cbits hs-source-dirs: src exposed-modules: Codec.CBOR Codec.CBOR.Decoding Codec.CBOR.Encoding Codec.CBOR.FlatTerm Codec.CBOR.Magic Codec.CBOR.Pretty Codec.CBOR.Read Codec.CBOR.Write Codec.CBOR.Term Codec.CBOR.ByteArray Codec.CBOR.ByteArray.Sliced other-modules: Codec.CBOR.ByteArray.Internal other-extensions: CPP, ForeignFunctionInterface, MagicHash, UnboxedTuples, BangPatterns, DeriveDataTypeable, RankNTypes build-depends: array >= 0.4 && < 0.6, base >= 4.11 && < 4.20, bytestring >= 0.10.4 && < 0.13, containers >= 0.5 && < 0.8, deepseq >= 1.0 && < 1.6, ghc-prim >= 0.3.1.0 && < 0.12, half >= 0.2.2.3 && < 0.4, primitive >= 0.5 && < 0.10, text >= 1.1 && < 1.3 || >= 2.0 && <2.2 if flag(optimize-gmp) cpp-options: -DOPTIMIZE_GMP if impl(ghc >= 9.0) cpp-options: -DHAVE_GHC_BIGNUM build-depends: ghc-bignum >= 1.0 && < 2.0 else build-depends: integer-gmp >= 1.0 && < 2.0 if impl(ghc >= 8.0) ghc-options: -Wcompat -Wnoncanonical-monad-instances else build-depends: -- provide/emulate `Control.Monad.Fail` and `Data.Semigroups` API for pre-GHC8 fail == 4.9.*, semigroups >= 0.18 && < 0.21, -- the `PS` pattern synonym in bytestring 0.11 is unavailable with GHC < 8.0 bytestring < 0.11 test-suite tests type: exitcode-stdio-1.0 hs-source-dirs: tests main-is: Main.hs default-language: Haskell2010 ghc-options: -Wall -fno-warn-orphans -threaded -rtsopts "-with-rtsopts=-N2" other-modules: Tests.UnitTests Tests.Properties Tests.Boundary Tests.ByteOffset Tests.Canonical Tests.PreEncoded Tests.Regress Tests.Regress.Issue160 Tests.Regress.Issue162 Tests.Regress.FlatTerm Tests.Reference Tests.Reference.Implementation Tests.Reference.Generators Tests.Reference.TestVectors Tests.Term Tests.UTF8 Tests.Util build-depends: array >= 0.4 && < 0.6, base >= 4.11 && < 4.20, base-orphans, bytestring >= 0.10.4 && < 0.13, text >= 1.1 && < 2.2, primitive >= 0.5 && < 0.10, cborg, aeson >= 0.7 && < 2.3, base64-bytestring >= 1.0 && < 1.3, base16-bytestring >= 1.0 && < 1.1, deepseq >= 1.0 && < 1.6, half >= 0.2.2.3 && < 0.4, QuickCheck >= 2.9 && < 2.15, random, scientific >= 0.3 && < 0.4, tasty >= 0.11 && < 1.6, tasty-hunit >= 0.9 && < 0.11, tasty-quickcheck >= 0.8 && < 0.11, vector >= 0.10 && < 0.14 if !impl(ghc >= 8.0) build-depends: fail >= 4.9.0.0 && < 4.10 cborg-0.2.10.0/src/Codec/0000755000000000000000000000000007346545000013021 5ustar0000000000000000cborg-0.2.10.0/src/Codec/CBOR.hs0000644000000000000000000000640607346545000014110 0ustar0000000000000000{-# OPTIONS_GHC -fno-warn-unused-imports #-} -- | -- Module : Codec.CBOR -- Copyright : (c) Duncan Coutts 2015-2017 -- License : BSD3-style (see LICENSE.txt) -- -- Maintainer : duncan@community.haskell.org -- Stability : experimental -- Portability : non-portable (GHC extensions) -- -- A library for working with CBOR. -- module Codec.CBOR ( -- $intro -- * Library Structure -- $structure ) where -- used by Haddocks import Codec.CBOR.Decoding (Decoder) import Codec.CBOR.Encoding (Encoding) import Codec.CBOR.FlatTerm (FlatTerm) import Codec.CBOR.Term (Term, encodeTerm, decodeTerm) {- $intro The @cborg@ library is a low-level parsing and encoding library for the Compact Binary Object Representation (CBOR) defined in RFC 7049. CBOR is a language-agnostic, extensible, and size- and computation-efficient encoding for arbitrary data, with a well-defined bijection to the ubiquitous JSON format and a precisely specified canonical form. Note, however, that @cborg@ does not itself aim to be a serialisation library; it merely serves as the substrate on which such a library might be built. See the [serialise](/package/serialise) library if you are looking for convenient serialisation of Haskell values. Instead, @cborg@ targets cases where precise control over the CBOR object structure is needed such as when working with externally-specified CBOR formats. -} {- $structure The library is split into a number of modules, * Decoding * "Codec.CBOR.Decoding" defines the machinery for decoding primitive CBOR terms into Haskell values. In particular, the 'Decoder' type and associated decoders, @ data 'Decoder' s a -- for, e.g., safe in-place mutation during decoding liftST :: ST s a -> 'Decoder' s a -- primitive decoders decodeWord :: 'Decoder' s Word decodeBytes :: 'Decoder' s ByteString -- et cetera @ * "Codec.CBOR.Read" defines the low-level wire-format decoder, e.g. @ 'Codec.CBOR.Read.deserialiseFromBytes' :: 'Decoder' a -> ByteString -> Either String (ByteString, a) @ * Encoding * "Codec.CBOR.Encoding" defines the 'Encoding' type, which is in essence difference-list of CBOR tokens and is used to construct CBOR encodings. @ data 'Encoding' instance Monoid 'Encoding' encodeWord :: Word -> Encoding encodeBytes :: ByteString -> Encoding -- et cetera @ * "Codec.CBOR.Write" defines the low-level wire-format encoder, e.g. @ 'Codec.CBOR.Write.toBuilder' :: 'Encoding' a -> Data.ByteString.Builder.Builder @ * Capturing arbitrary terms * "Codec.CBOR.Term" provides the 'Term' type, which provides a type for capturing arbitrary CBOR terms. 'Term's can be encoded and decoded with, @ data 'Term' = TInt Int | TBytes ByteString -- et cetera 'encodeTerm' :: 'Term' -> 'Encoding' 'decodeTerm' :: 'Decoder' 'Term' @ * Debugging * "Codec.CBOR.FlatTerm" contains the 'FlatTerm' type, which provides a concrete AST for capturing primitive CBOR wire encodings. This can be useful when testing decoders and encoders. -} cborg-0.2.10.0/src/Codec/CBOR/0000755000000000000000000000000007346545000013546 5ustar0000000000000000cborg-0.2.10.0/src/Codec/CBOR/ByteArray.hs0000644000000000000000000000522607346545000016011 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE MagicHash #-} {-# LANGUAGE RankNTypes #-} {-# LANGUAGE TypeFamilies #-} {-# LANGUAGE ScopedTypeVariables #-} -- | -- Module : Codec.CBOR.ByteArray -- Copyright : (c) Ben Gamari 2017-2018 -- License : BSD3-style (see LICENSE.txt) -- -- Maintainer : duncan@community.haskell.org -- Stability : experimental -- Portability : non-portable (GHC extensions) -- -- A ByteArray with more instances than 'Data.Primitive.ByteArray.ByteArray'. -- Some day when these instances are reliably available from @primitive@ we can -- likely replace this with 'Data.Primitive.ByteArray.ByteArray'. -- module Codec.CBOR.ByteArray ( -- * Simple byte arrays ByteArray(..) , sizeofByteArray -- * Conversions , fromShortByteString , toShortByteString , fromByteString , toBuilder , toSliced ) where import Data.Char (ord) import Data.Word import GHC.Exts (IsList(..), IsString(..)) import qualified Data.Primitive.ByteArray as Prim import qualified Data.ByteString as BS import qualified Data.ByteString.Short as BSS import qualified Data.ByteString.Short.Internal as BSS import qualified Data.ByteString.Builder as BSB import qualified Codec.CBOR.ByteArray.Sliced as Sliced import Codec.CBOR.ByteArray.Internal newtype ByteArray = BA {unBA :: Prim.ByteArray} sizeofByteArray :: ByteArray -> Int {-# INLINE sizeofByteArray #-} sizeofByteArray (BA ba) = Prim.sizeofByteArray ba fromShortByteString :: BSS.ShortByteString -> ByteArray fromShortByteString (BSS.SBS ba) = BA (Prim.ByteArray ba) toShortByteString :: ByteArray -> BSS.ShortByteString toShortByteString (BA (Prim.ByteArray ba)) = BSS.SBS ba fromByteString :: BS.ByteString -> ByteArray fromByteString = fromShortByteString . BSS.toShort toBuilder :: ByteArray -> BSB.Builder toBuilder = Sliced.toBuilder . toSliced toSliced :: ByteArray -> Sliced.SlicedByteArray toSliced ba@(BA arr) = Sliced.SBA arr 0 (sizeofByteArray ba) instance Show ByteArray where showsPrec _ = shows . toSliced instance Eq ByteArray where ba1 == ba2 = toSliced ba1 == toSliced ba2 instance Ord ByteArray where ba1 `compare` ba2 = toSliced ba1 `compare` toSliced ba2 instance IsString ByteArray where fromString = fromList . map checkedOrd where checkedOrd c | c > '\xff' = error "IsString(Codec.CBOR.ByteArray): Non-ASCII character" | otherwise = fromIntegral $ ord c instance IsList ByteArray where type Item ByteArray = Word8 fromList xs = fromListN (Prelude.length xs) xs fromListN n xs = let arr = mkByteArray n xs in BA arr toList ba@(BA arr) = foldrByteArray (:) [] 0 (sizeofByteArray ba) arr cborg-0.2.10.0/src/Codec/CBOR/ByteArray/0000755000000000000000000000000007346545000015450 5ustar0000000000000000cborg-0.2.10.0/src/Codec/CBOR/ByteArray/Internal.hs0000644000000000000000000000447407346545000017571 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE CPP #-} {-# LANGUAGE MagicHash #-} {-# LANGUAGE RankNTypes #-} {-# LANGUAGE UnboxedTuples #-} {-# LANGUAGE ScopedTypeVariables #-} -- | -- Module : Codec.CBOR.ByteArray.Internal -- Copyright : (c) Ben Gamari 2017-2018 -- License : BSD3-style (see LICENSE.txt) -- -- Maintainer : duncan@community.haskell.org -- Stability : experimental -- Portability : non-portable (GHC extensions) -- -- Various bytearray utilities -- module Codec.CBOR.ByteArray.Internal ( foldrByteArray , copyToAddr , isTrue# , sameByteArray , mkByteArray , isByteArrayPinned , touch ) where import Control.Monad.ST import Control.Monad import GHC.IO (IO(..)) import GHC.Exts import GHC.Word import qualified Data.Primitive.ByteArray as Prim foldrByteArray :: (Word8 -> a -> a) -> a -> Int -- ^ offset -> Int -- ^ length -> Prim.ByteArray -- ^ array -> a foldrByteArray f z off0 len ba = go off0 where len' = len + off0 go !off | off >= len' = z | otherwise = let x = Prim.indexByteArray ba off in f x (go (off+1)) copyToAddr :: Prim.ByteArray -> Int -> Ptr a -> Int -> IO () copyToAddr (Prim.ByteArray ba) (I# off) (Ptr addr) (I# len) = IO (\s -> case copyByteArrayToAddr# ba off addr len s of s' -> (# s', () #)) sameByteArray :: Prim.ByteArray -> Prim.ByteArray -> Bool sameByteArray (Prim.ByteArray ba1#) (Prim.ByteArray ba2#) = case reallyUnsafePtrEquality# (unsafeCoerce# ba1# :: ()) (unsafeCoerce# ba2# :: ()) of r -> isTrue# r -- | @mkByteArray n xs@ forms a 'Prim.ByteArray' with contents @xs@. Note that -- @n@ must be the precise length of @xs@. mkByteArray :: Int -> [Word8] -> Prim.ByteArray mkByteArray n xs = runST $ do arr <- Prim.newByteArray n zipWithM_ (Prim.writeByteArray arr) [0..n-1] (take n $ xs ++ repeat 0) Prim.unsafeFreezeByteArray arr -- | A conservative estimate of pinned-ness. isByteArrayPinned :: Prim.ByteArray -> Bool isByteArrayPinned (Prim.ByteArray _ba) = #if __GLASGOW_HASKELL__ > 800 case isByteArrayPinned# _ba of 0# -> False _ -> True #else False #endif touch :: a -> IO () touch x = IO $ \s -> case touch# x s of s' -> (# s', () #) cborg-0.2.10.0/src/Codec/CBOR/ByteArray/Sliced.hs0000644000000000000000000001202707346545000017211 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE MagicHash #-} {-# LANGUAGE CPP #-} {-# LANGUAGE TypeFamilies #-} -- | -- Module : Codec.CBOR.ByteArray.Sliced -- Copyright : (c) Ben Gamari 2017-2018 -- License : BSD3-style (see LICENSE.txt) -- -- Maintainer : duncan@community.haskell.org -- Stability : experimental -- Portability : non-portable (GHC extensions) -- -- A ByteArray with more instances than 'Data.Primitive.ByteArray.ByteArray'. -- Some day when these instances are reliably available from @primitive@ we can -- likely replace this with 'Data.Primitive.ByteArray.ByteArray'. -- module Codec.CBOR.ByteArray.Sliced ( SlicedByteArray(..) -- * Conversions , sizeofSlicedByteArray , fromShortByteString , fromByteString , fromByteArray , toByteString , toBuilder ) where import GHC.Exts import Data.Char (chr, ord) import Data.Word import Foreign.Ptr import Control.Monad.ST import System.IO.Unsafe import qualified Data.Primitive.ByteArray as Prim #if !MIN_VERSION_primitive(0,7,0) import Data.Primitive.Types (Addr(..)) #endif import qualified Data.ByteString as BS import qualified Data.ByteString.Unsafe as BS import qualified Data.ByteString.Short as BSS import qualified Data.ByteString.Short.Internal as BSS import qualified Data.ByteString.Builder as BSB import qualified Data.ByteString.Builder.Internal as BSB import Codec.CBOR.ByteArray.Internal data SlicedByteArray = SBA {unSBA :: !Prim.ByteArray, offset :: !Int, length :: !Int} fromShortByteString :: BSS.ShortByteString -> SlicedByteArray fromShortByteString (BSS.SBS ba) = fromByteArray (Prim.ByteArray ba) fromByteString :: BS.ByteString -> SlicedByteArray fromByteString = fromShortByteString . BSS.toShort fromByteArray :: Prim.ByteArray -> SlicedByteArray fromByteArray ba = SBA ba 0 (Prim.sizeofByteArray ba) sizeofSlicedByteArray :: SlicedByteArray -> Int sizeofSlicedByteArray (SBA _ _ len) = len -- | Note that this may require a copy. toByteString :: SlicedByteArray -> BS.ByteString toByteString sba = unsafePerformIO $ BS.unsafePackCStringFinalizer ptr (sizeofSlicedByteArray sba) (touch pinned) where pinned = toPinned sba #if MIN_VERSION_primitive(0,7,0) !(Ptr addr#) = Prim.byteArrayContents pinned #else !(Addr addr#) = Prim.byteArrayContents pinned #endif ptr = Ptr addr# toPinned :: SlicedByteArray -> Prim.ByteArray toPinned (SBA ba off len) | isByteArrayPinned ba = ba | otherwise = runST $ do ba' <- Prim.newPinnedByteArray len Prim.copyByteArray ba' 0 ba off len Prim.unsafeFreezeByteArray ba' toBuilder :: SlicedByteArray -> BSB.Builder toBuilder = \(SBA ba off len) -> BSB.builder (go ba off (len + off)) where go ba !ip !ipe !k (BSB.BufferRange op ope) | inpRemaining <= outRemaining = do copyToAddr ba ip op inpRemaining let !br' = BSB.BufferRange (op `plusPtr` inpRemaining) ope k br' | otherwise = do copyToAddr ba ip op outRemaining let !ip' = ip + outRemaining return $ BSB.bufferFull 1 ope (go ba ip' ipe k) where outRemaining = ope `minusPtr` op inpRemaining = ipe - ip instance IsString SlicedByteArray where fromString = fromList . map checkedOrd where checkedOrd c | c > '\xff' = error "IsString(Codec.CBOR.ByteArray.Sliced): Non-ASCII character" | otherwise = fromIntegral $ ord c instance IsList SlicedByteArray where type Item SlicedByteArray = Word8 fromList xs = fromListN (Prelude.length xs) xs -- Note that we make no attempt to behave sensibly if @n /= length xs@. -- The class definition allows this. fromListN n xs = let arr = mkByteArray n xs in SBA arr 0 n toList (SBA arr off len) = foldrByteArray (:) [] off len arr instance Show SlicedByteArray where showsPrec _ = shows . map (chr . fromIntegral) . toList instance Eq SlicedByteArray where SBA arr1 off1 len1 == SBA arr2 off2 len2 | len1 /= len2 = False | sameByteArray arr1 arr2 , off1 == off2 , len1 == len2 = True | otherwise = let (!) :: Prim.ByteArray -> Int -> Word8 (!) = Prim.indexByteArray -- len1 and len2 are known to be equal at this point len1' = len1 + off1 go i1 i2 | i1 == len1' = True | (arr1 ! i1) == (arr2 ! i2) = go (i1+1) (i2+1) | otherwise = False in go off1 off2 instance Ord SlicedByteArray where SBA arr1 off1 len1 `compare` SBA arr2 off2 len2 | sameByteArray arr1 arr2 , off1 == off2 , len1 == len2 = EQ | otherwise = let (!) :: Prim.ByteArray -> Int -> Word8 (!) = Prim.indexByteArray len1' = len1 + off1 len2' = len2 + off2 go i1 i2 | i1 == len1' && i2 == len2' = EQ | i1 == len1' || i2 == len2' = len1 `compare` len2 | EQ <- o = go (i1+1) (i2+1) | otherwise = o where o = (arr1 ! i1) `compare` (arr2 ! i2) in go off1 off2 cborg-0.2.10.0/src/Codec/CBOR/Decoding.hs0000644000000000000000000010443707346545000015627 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE MagicHash #-} {-# LANGUAGE BangPatterns #-} {-# LANGUAGE RankNTypes #-} -- | -- Module : Codec.CBOR.Decoding -- Copyright : (c) Duncan Coutts 2015-2017 -- License : BSD3-style (see LICENSE.txt) -- -- Maintainer : duncan@community.haskell.org -- Stability : experimental -- Portability : non-portable (GHC extensions) -- -- High level API for decoding values that were encoded with the -- "Codec.CBOR.Encoding" module, using a 'Monad' -- based interface. -- module Codec.CBOR.Decoding ( -- * Decode primitive operations Decoder , DecodeAction(..) , liftST , getDecodeAction -- ** Read input tokens , decodeWord -- :: Decoder s Word , decodeWord8 -- :: Decoder s Word8 , decodeWord16 -- :: Decoder s Word16 , decodeWord32 -- :: Decoder s Word32 , decodeWord64 -- :: Decoder s Word64 , decodeNegWord -- :: Decoder s Word , decodeNegWord64 -- :: Decoder s Word64 , decodeInt -- :: Decoder s Int , decodeInt8 -- :: Decoder s Int8 , decodeInt16 -- :: Decoder s Int16 , decodeInt32 -- :: Decoder s Int32 , decodeInt64 -- :: Decoder s Int64 , decodeInteger -- :: Decoder s Integer , decodeFloat -- :: Decoder s Float , decodeDouble -- :: Decoder s Double , decodeBytes -- :: Decoder s ByteString , decodeBytesIndef -- :: Decoder s () , decodeByteArray -- :: Decoder s ByteArray , decodeString -- :: Decoder s Text , decodeStringIndef -- :: Decoder s () , decodeUtf8ByteArray -- :: Decoder s ByteArray , decodeListLen -- :: Decoder s Int , decodeListLenIndef -- :: Decoder s () , decodeMapLen -- :: Decoder s Int , decodeMapLenIndef -- :: Decoder s () , decodeTag -- :: Decoder s Word , decodeTag64 -- :: Decoder s Word64 , decodeBool -- :: Decoder s Bool , decodeNull -- :: Decoder s () , decodeSimple -- :: Decoder s Word8 -- ** Specialised Read input token operations , decodeWordOf -- :: Word -> Decoder s () , decodeListLenOf -- :: Int -> Decoder s () -- ** Branching operations --, decodeBytesOrIndef --, decodeStringOrIndef , decodeListLenOrIndef -- :: Decoder s (Maybe Int) , decodeMapLenOrIndef -- :: Decoder s (Maybe Int) , decodeBreakOr -- :: Decoder s Bool -- ** Inspecting the token type , peekTokenType -- :: Decoder s TokenType , TokenType(..) -- ** Special operations , peekAvailable -- :: Decoder s Int , ByteOffset , peekByteOffset -- :: Decoder s ByteOffset , decodeWithByteSpan -- ** Canonical CBOR -- $canonical , decodeWordCanonical -- :: Decoder s Word , decodeWord8Canonical -- :: Decoder s Word8 , decodeWord16Canonical -- :: Decoder s Word16 , decodeWord32Canonical -- :: Decoder s Word32 , decodeWord64Canonical -- :: Decoder s Word64 , decodeNegWordCanonical -- :: Decoder s Word , decodeNegWord64Canonical -- :: Decoder s Word64 , decodeIntCanonical -- :: Decoder s Int , decodeInt8Canonical -- :: Decoder s Int8 , decodeInt16Canonical -- :: Decoder s Int16 , decodeInt32Canonical -- :: Decoder s Int32 , decodeInt64Canonical -- :: Decoder s Int64 , decodeBytesCanonical -- :: Decoder s ByteString , decodeByteArrayCanonical -- :: Decoder s ByteArray , decodeStringCanonical -- :: Decoder s Text , decodeUtf8ByteArrayCanonical -- :: Decoder s ByteArray , decodeListLenCanonical -- :: Decoder s Int , decodeMapLenCanonical -- :: Decoder s Int , decodeTagCanonical -- :: Decoder s Word , decodeTag64Canonical -- :: Decoder s Word64 , decodeIntegerCanonical -- :: Decoder s Integer , decodeFloat16Canonical -- :: Decoder s Float , decodeFloatCanonical -- :: Decoder s Float , decodeDoubleCanonical -- :: Decoder s Double , decodeSimpleCanonical -- :: Decoder s Word8 , decodeWordCanonicalOf -- :: Word -> Decoder s () , decodeListLenCanonicalOf -- :: Int -> Decoder s () -- * Sequence operations , decodeSequenceLenIndef -- :: ... , decodeSequenceLenN -- :: ... ) where #include "cbor.h" import GHC.Exts import GHC.Word import GHC.Int import Data.Text (Text) import Data.ByteString (ByteString) import Control.Applicative import Control.Monad.ST import qualified Control.Monad.Fail as Fail import Codec.CBOR.ByteArray (ByteArray) import Prelude hiding (decodeFloat) -- | A continuation-based decoder, used for decoding values that were -- previously encoded using the "Codec.CBOR.Encoding" -- module. As 'Decoder' has a 'Monad' instance, you can easily -- write 'Decoder's monadically for building your deserialisation -- logic. -- -- @since 0.2.0.0 newtype Decoder s a = Decoder { runDecoder :: forall r. (a -> ST s (DecodeAction s r)) -> ST s (DecodeAction s r) } -- | An action, representing a step for a decoder to taken and a -- continuation to invoke with the expected value. -- -- @since 0.2.0.0 data DecodeAction s a = ConsumeWord (Word# -> ST s (DecodeAction s a)) | ConsumeWord8 (Word# -> ST s (DecodeAction s a)) | ConsumeWord16 (Word# -> ST s (DecodeAction s a)) | ConsumeWord32 (Word# -> ST s (DecodeAction s a)) | ConsumeNegWord (Word# -> ST s (DecodeAction s a)) | ConsumeInt (Int# -> ST s (DecodeAction s a)) | ConsumeInt8 (Int# -> ST s (DecodeAction s a)) | ConsumeInt16 (Int# -> ST s (DecodeAction s a)) | ConsumeInt32 (Int# -> ST s (DecodeAction s a)) | ConsumeListLen (Int# -> ST s (DecodeAction s a)) | ConsumeMapLen (Int# -> ST s (DecodeAction s a)) | ConsumeTag (Word# -> ST s (DecodeAction s a)) -- 64bit variants for 32bit machines #if defined(ARCH_32bit) | ConsumeWord64 (Word64# -> ST s (DecodeAction s a)) | ConsumeNegWord64 (Word64# -> ST s (DecodeAction s a)) | ConsumeInt64 (Int64# -> ST s (DecodeAction s a)) | ConsumeListLen64 (Int64# -> ST s (DecodeAction s a)) | ConsumeMapLen64 (Int64# -> ST s (DecodeAction s a)) | ConsumeTag64 (Word64# -> ST s (DecodeAction s a)) #endif | ConsumeInteger (Integer -> ST s (DecodeAction s a)) | ConsumeFloat (Float# -> ST s (DecodeAction s a)) | ConsumeDouble (Double# -> ST s (DecodeAction s a)) | ConsumeBytes (ByteString-> ST s (DecodeAction s a)) | ConsumeByteArray (ByteArray -> ST s (DecodeAction s a)) | ConsumeString (Text -> ST s (DecodeAction s a)) | ConsumeUtf8ByteArray (ByteArray -> ST s (DecodeAction s a)) | ConsumeBool (Bool -> ST s (DecodeAction s a)) | ConsumeSimple (Word# -> ST s (DecodeAction s a)) | ConsumeBytesIndef (ST s (DecodeAction s a)) | ConsumeStringIndef (ST s (DecodeAction s a)) | ConsumeListLenIndef (ST s (DecodeAction s a)) | ConsumeMapLenIndef (ST s (DecodeAction s a)) | ConsumeNull (ST s (DecodeAction s a)) | ConsumeListLenOrIndef (Int# -> ST s (DecodeAction s a)) | ConsumeMapLenOrIndef (Int# -> ST s (DecodeAction s a)) | ConsumeBreakOr (Bool -> ST s (DecodeAction s a)) | PeekTokenType (TokenType -> ST s (DecodeAction s a)) | PeekAvailable (Int# -> ST s (DecodeAction s a)) #if defined(ARCH_32bit) | PeekByteOffset (Int64# -> ST s (DecodeAction s a)) #else | PeekByteOffset (Int# -> ST s (DecodeAction s a)) #endif -- All the canonical variants | ConsumeWordCanonical (Word# -> ST s (DecodeAction s a)) | ConsumeWord8Canonical (Word# -> ST s (DecodeAction s a)) | ConsumeWord16Canonical (Word# -> ST s (DecodeAction s a)) | ConsumeWord32Canonical (Word# -> ST s (DecodeAction s a)) | ConsumeNegWordCanonical (Word# -> ST s (DecodeAction s a)) | ConsumeIntCanonical (Int# -> ST s (DecodeAction s a)) | ConsumeInt8Canonical (Int# -> ST s (DecodeAction s a)) | ConsumeInt16Canonical (Int# -> ST s (DecodeAction s a)) | ConsumeInt32Canonical (Int# -> ST s (DecodeAction s a)) | ConsumeListLenCanonical (Int# -> ST s (DecodeAction s a)) | ConsumeMapLenCanonical (Int# -> ST s (DecodeAction s a)) | ConsumeTagCanonical (Word# -> ST s (DecodeAction s a)) #if defined(ARCH_32bit) | ConsumeWord64Canonical (Word64# -> ST s (DecodeAction s a)) | ConsumeNegWord64Canonical (Word64# -> ST s (DecodeAction s a)) | ConsumeInt64Canonical (Int64# -> ST s (DecodeAction s a)) | ConsumeListLen64Canonical (Int64# -> ST s (DecodeAction s a)) | ConsumeMapLen64Canonical (Int64# -> ST s (DecodeAction s a)) | ConsumeTag64Canonical (Word64# -> ST s (DecodeAction s a)) #endif | ConsumeIntegerCanonical (Integer -> ST s (DecodeAction s a)) | ConsumeFloat16Canonical (Float# -> ST s (DecodeAction s a)) | ConsumeFloatCanonical (Float# -> ST s (DecodeAction s a)) | ConsumeDoubleCanonical (Double# -> ST s (DecodeAction s a)) | ConsumeBytesCanonical (ByteString-> ST s (DecodeAction s a)) | ConsumeByteArrayCanonical (ByteArray -> ST s (DecodeAction s a)) | ConsumeStringCanonical (Text -> ST s (DecodeAction s a)) | ConsumeUtf8ByteArrayCanonical (ByteArray -> ST s (DecodeAction s a)) | ConsumeSimpleCanonical (Word# -> ST s (DecodeAction s a)) | Fail String | Done a -- | The type of a token, which a decoder can ask for at -- an arbitrary time. -- -- @since 0.2.0.0 data TokenType = TypeUInt | TypeUInt64 | TypeNInt | TypeNInt64 | TypeInteger | TypeFloat16 | TypeFloat32 | TypeFloat64 | TypeBytes | TypeBytesIndef | TypeString | TypeStringIndef | TypeListLen | TypeListLen64 | TypeListLenIndef | TypeMapLen | TypeMapLen64 | TypeMapLenIndef | TypeTag | TypeTag64 | TypeBool | TypeNull | TypeSimple | TypeBreak | TypeInvalid deriving (Eq, Ord, Enum, Bounded, Show) -- | @since 0.2.0.0 instance Functor (Decoder s) where {-# INLINE fmap #-} fmap f = \d -> Decoder $ \k -> runDecoder d (k . f) -- | @since 0.2.0.0 instance Applicative (Decoder s) where {-# INLINE pure #-} pure = \x -> Decoder $ \k -> k x {-# INLINE (<*>) #-} (<*>) = \df dx -> Decoder $ \k -> runDecoder df (\f -> runDecoder dx (\x -> k (f x))) {-# INLINE (*>) #-} (*>) = \dm dn -> Decoder $ \k -> runDecoder dm (\_ -> runDecoder dn k) -- | @since 0.2.0.0 instance Monad (Decoder s) where return = pure {-# INLINE (>>=) #-} (>>=) = \dm f -> Decoder $ \k -> runDecoder dm (\m -> runDecoder (f m) k) {-# INLINE (>>) #-} (>>) = (*>) #if !MIN_VERSION_base(4,13,0) fail = Fail.fail #endif -- | @since 0.2.0.0 instance Fail.MonadFail (Decoder s) where fail msg = Decoder $ \_ -> return (Fail msg) -- | Lift an @ST@ action into a @Decoder@. Useful for, e.g., leveraging -- in-place mutation to efficiently build a deserialised value. -- -- @since 0.2.0.0 liftST :: ST s a -> Decoder s a liftST m = Decoder $ \k -> m >>= k -- | Given a 'Decoder', give us the 'DecodeAction' -- -- @since 0.2.0.0 getDecodeAction :: Decoder s a -> ST s (DecodeAction s a) getDecodeAction (Decoder k) = k (\x -> return (Done x)) -- Compatibility Shims toInt8 :: Int# -> Int8 toInt16 :: Int# -> Int16 toInt32 :: Int# -> Int32 toInt64 :: Int# -> Int64 toWord8 :: Word# -> Word8 toWord16 :: Word# -> Word16 toWord32 :: Word# -> Word32 toWord64 :: Word# -> Word64 #if MIN_VERSION_ghc_prim(0,8,0) toInt8 n = I8# (intToInt8# n) toInt16 n = I16# (intToInt16# n) toInt32 n = I32# (intToInt32# n) toWord8 n = W8# (wordToWord8# n) toWord16 n = W16# (wordToWord16# n) toWord32 n = W32# (wordToWord32# n) #if WORD_SIZE_IN_BITS == 64 #if MIN_VERSION_base(4,17,0) toInt64 n = I64# (intToInt64# n) toWord64 n = W64# (wordToWord64# n) #else toInt64 n = I64# n toWord64 n = W64# n #endif #else toInt64 n = I64# (intToInt64# n) toWord64 n = W64# (wordToWord64# n) #endif #else toInt8 n = I8# n toInt16 n = I16# n toInt32 n = I32# n toInt64 n = I64# n toWord8 n = W8# n toWord16 n = W16# n toWord32 n = W32# n toWord64 n = W64# n #endif -- $canonical -- -- -- -- In general in CBOR there can be multiple representations for the same value, -- for example the integer @0@ can be represented in 8, 16, 32 or 64 bits. This -- library always encoded values in the shortest representation but on -- decoding allows any valid encoding. For some applications it is useful or -- important to only decode the canonical encoding. The decoder primitives here -- are to allow applications to implement canonical decoding. -- -- It is important to note that achieving a canonical representation is /not/ -- simply about using these primitives. For example consider a typical CBOR -- encoding of a Haskell @Set@ data type. This will be encoded as a CBOR list -- of the set elements. A typical implementation might be: -- -- > encodeSet = encodeList . Set.toList -- > decodeSet = fmap Set.fromList . decodeList -- -- This /does not/ enforce a canonical encoding. The decoder above will allow -- set elements in any order. The use of @Set.fromList@ forgets the order. -- To enforce that the decoder only accepts the canonical encoding it will -- have to check that the elements in the list are /strictly/ increasing. -- Similar issues arise in many other data types, wherever there is redundancy -- in the external representation. -- -- The decoder primitives in this section are not much more expensive than -- their normal counterparts. If checking the canonical encoding property is -- critical then a technique that is more expensive but easier to implement and -- test is to decode normally, re-encode and check the serialised bytes are the -- same. --------------------------------------- -- Read input tokens of various types -- -- | Decode a 'Word'. -- -- @since 0.2.0.0 decodeWord :: Decoder s Word decodeWord = Decoder (\k -> return (ConsumeWord (\w# -> k (W# w#)))) {-# INLINE decodeWord #-} -- | Decode a 'Word8'. -- -- @since 0.2.0.0 decodeWord8 :: Decoder s Word8 decodeWord8 = Decoder (\k -> return (ConsumeWord8 (\w# -> k (toWord8 w#)))) {-# INLINE decodeWord8 #-} -- | Decode a 'Word16'. -- -- @since 0.2.0.0 decodeWord16 :: Decoder s Word16 decodeWord16 = Decoder (\k -> return (ConsumeWord16 (\w# -> k (toWord16 w#)))) {-# INLINE decodeWord16 #-} -- | Decode a 'Word32'. -- -- @since 0.2.0.0 decodeWord32 :: Decoder s Word32 decodeWord32 = Decoder (\k -> return (ConsumeWord32 (\w# -> k (toWord32 w#)))) {-# INLINE decodeWord32 #-} -- | Decode a 'Word64'. -- -- @since 0.2.0.0 decodeWord64 :: Decoder s Word64 {-# INLINE decodeWord64 #-} decodeWord64 = #if defined(ARCH_64bit) Decoder (\k -> return (ConsumeWord (\w# -> k (toWord64 w#)))) #else Decoder (\k -> return (ConsumeWord64 (\w64# -> k (toWord64 w64#)))) #endif -- | Decode a negative 'Word'. -- -- @since 0.2.0.0 decodeNegWord :: Decoder s Word decodeNegWord = Decoder (\k -> return (ConsumeNegWord (\w# -> k (W# w#)))) {-# INLINE decodeNegWord #-} -- | Decode a negative 'Word64'. -- -- @since 0.2.0.0 decodeNegWord64 :: Decoder s Word64 {-# INLINE decodeNegWord64 #-} decodeNegWord64 = #if defined(ARCH_64bit) Decoder (\k -> return (ConsumeNegWord (\w# -> k (toWord64 w#)))) #else Decoder (\k -> return (ConsumeNegWord64 (\w64# -> k (toWord64 w64#)))) #endif -- | Decode an 'Int'. -- -- @since 0.2.0.0 decodeInt :: Decoder s Int decodeInt = Decoder (\k -> return (ConsumeInt (\n# -> k (I# n#)))) {-# INLINE decodeInt #-} -- | Decode an 'Int8'. -- -- @since 0.2.0.0 decodeInt8 :: Decoder s Int8 decodeInt8 = Decoder (\k -> return (ConsumeInt8 (\w# -> k (toInt8 w#)))) {-# INLINE decodeInt8 #-} -- | Decode an 'Int16'. -- -- @since 0.2.0.0 decodeInt16 :: Decoder s Int16 decodeInt16 = Decoder (\k -> return (ConsumeInt16 (\w# -> k (toInt16 w#)))) {-# INLINE decodeInt16 #-} -- | Decode an 'Int32'. -- -- @since 0.2.0.0 decodeInt32 :: Decoder s Int32 decodeInt32 = Decoder (\k -> return (ConsumeInt32 (\w# -> k (toInt32 w#)))) {-# INLINE decodeInt32 #-} -- | Decode an 'Int64'. -- -- @since 0.2.0.0 decodeInt64 :: Decoder s Int64 {-# INLINE decodeInt64 #-} decodeInt64 = #if defined(ARCH_64bit) Decoder (\k -> return (ConsumeInt (\n# -> k (toInt64 n#)))) #else Decoder (\k -> return (ConsumeInt64 (\n64# -> k (toInt64 n64#)))) #endif -- | Decode canonical representation of a 'Word'. -- -- @since 0.2.0.0 decodeWordCanonical :: Decoder s Word decodeWordCanonical = Decoder (\k -> return (ConsumeWordCanonical (\w# -> k (W# w#)))) {-# INLINE decodeWordCanonical #-} -- | Decode canonical representation of a 'Word8'. -- -- @since 0.2.0.0 decodeWord8Canonical :: Decoder s Word8 decodeWord8Canonical = Decoder (\k -> return (ConsumeWord8Canonical (\w# -> k (toWord8 w#)))) {-# INLINE decodeWord8Canonical #-} -- | Decode canonical representation of a 'Word16'. -- -- @since 0.2.0.0 decodeWord16Canonical :: Decoder s Word16 decodeWord16Canonical = Decoder (\k -> return (ConsumeWord16Canonical (\w# -> k (toWord16 w#)))) {-# INLINE decodeWord16Canonical #-} -- | Decode canonical representation of a 'Word32'. -- -- @since 0.2.0.0 decodeWord32Canonical :: Decoder s Word32 decodeWord32Canonical = Decoder (\k -> return (ConsumeWord32Canonical (\w# -> k (toWord32 w#)))) {-# INLINE decodeWord32Canonical #-} -- | Decode canonical representation of a 'Word64'. -- -- @since 0.2.0.0 decodeWord64Canonical :: Decoder s Word64 {-# INLINE decodeWord64Canonical #-} decodeWord64Canonical = #if defined(ARCH_64bit) Decoder (\k -> return (ConsumeWordCanonical (\w# -> k (toWord64 w#)))) #else Decoder (\k -> return (ConsumeWord64Canonical (\w64# -> k (toWord64 w64#)))) #endif -- | Decode canonical representation of a negative 'Word'. -- -- @since 0.2.0.0 decodeNegWordCanonical :: Decoder s Word decodeNegWordCanonical = Decoder (\k -> return (ConsumeNegWordCanonical (\w# -> k (W# w#)))) {-# INLINE decodeNegWordCanonical #-} -- | Decode canonical representation of a negative 'Word64'. -- -- @since 0.2.0.0 decodeNegWord64Canonical :: Decoder s Word64 {-# INLINE decodeNegWord64Canonical #-} decodeNegWord64Canonical = #if defined(ARCH_64bit) Decoder (\k -> return (ConsumeNegWordCanonical (\w# -> k (toWord64 w#)))) #else Decoder (\k -> return (ConsumeNegWord64Canonical (\w64# -> k (toWord64 w64#)))) #endif -- | Decode canonical representation of an 'Int'. -- -- @since 0.2.0.0 decodeIntCanonical :: Decoder s Int decodeIntCanonical = Decoder (\k -> return (ConsumeIntCanonical (\n# -> k (I# n#)))) {-# INLINE decodeIntCanonical #-} -- | Decode canonical representation of an 'Int8'. -- -- @since 0.2.0.0 decodeInt8Canonical :: Decoder s Int8 decodeInt8Canonical = Decoder (\k -> return (ConsumeInt8Canonical (\w# -> k (toInt8 w#)))) {-# INLINE decodeInt8Canonical #-} -- | Decode canonical representation of an 'Int16'. -- -- @since 0.2.0.0 decodeInt16Canonical :: Decoder s Int16 decodeInt16Canonical = Decoder (\k -> return (ConsumeInt16Canonical (\w# -> k (toInt16 w#)))) {-# INLINE decodeInt16Canonical #-} -- | Decode canonical representation of an 'Int32'. -- -- @since 0.2.0.0 decodeInt32Canonical :: Decoder s Int32 decodeInt32Canonical = Decoder (\k -> return (ConsumeInt32Canonical (\w# -> k (toInt32 w#)))) {-# INLINE decodeInt32Canonical #-} -- | Decode canonical representation of an 'Int64'. -- -- @since 0.2.0.0 decodeInt64Canonical :: Decoder s Int64 {-# INLINE decodeInt64Canonical #-} decodeInt64Canonical = #if defined(ARCH_64bit) Decoder (\k -> return (ConsumeIntCanonical (\n# -> k (toInt64 n#)))) #else Decoder (\k -> return (ConsumeInt64Canonical (\n64# -> k (toInt64 n64#)))) #endif -- | Decode an 'Integer'. -- -- @since 0.2.0.0 decodeInteger :: Decoder s Integer decodeInteger = Decoder (\k -> return (ConsumeInteger (\n -> k n))) {-# INLINE decodeInteger #-} -- | Decode a 'Float'. -- -- @since 0.2.0.0 decodeFloat :: Decoder s Float decodeFloat = Decoder (\k -> return (ConsumeFloat (\f# -> k (F# f#)))) {-# INLINE decodeFloat #-} -- | Decode a 'Double'. -- -- @since 0.2.0.0 decodeDouble :: Decoder s Double decodeDouble = Decoder (\k -> return (ConsumeDouble (\f# -> k (D# f#)))) {-# INLINE decodeDouble #-} -- | Decode a string of bytes as a 'ByteString'. -- -- @since 0.2.0.0 decodeBytes :: Decoder s ByteString decodeBytes = Decoder (\k -> return (ConsumeBytes (\bs -> k bs))) {-# INLINE decodeBytes #-} -- | Decode canonical representation of a string of bytes as a 'ByteString'. -- -- @since 0.2.1.0 decodeBytesCanonical :: Decoder s ByteString decodeBytesCanonical = Decoder (\k -> return (ConsumeBytesCanonical (\bs -> k bs))) {-# INLINE decodeBytesCanonical #-} -- | Decode a token marking the beginning of an indefinite length -- set of bytes. -- -- @since 0.2.0.0 decodeBytesIndef :: Decoder s () decodeBytesIndef = Decoder (\k -> return (ConsumeBytesIndef (k ()))) {-# INLINE decodeBytesIndef #-} -- | Decode a string of bytes as a 'ByteArray'. -- -- Also note that this will eagerly copy the content out of the input -- to ensure that the input does not leak in the event that the 'ByteArray' is -- live but not forced. -- -- @since 0.2.0.0 decodeByteArray :: Decoder s ByteArray decodeByteArray = Decoder (\k -> return (ConsumeByteArray k)) {-# INLINE decodeByteArray #-} -- | Decode canonical representation of a string of bytes as a 'ByteArray'. -- -- Also note that this will eagerly copy the content out of the input -- to ensure that the input does not leak in the event that the 'ByteArray' is -- live but not forced. -- -- @since 0.2.1.0 decodeByteArrayCanonical :: Decoder s ByteArray decodeByteArrayCanonical = Decoder (\k -> return (ConsumeByteArrayCanonical k)) {-# INLINE decodeByteArrayCanonical #-} -- | Decode a textual string as a piece of 'Text'. -- -- @since 0.2.0.0 decodeString :: Decoder s Text decodeString = Decoder (\k -> return (ConsumeString (\str -> k str))) {-# INLINE decodeString #-} -- | Decode canonical representation of a textual string as a piece of 'Text'. -- -- @since 0.2.1.0 decodeStringCanonical :: Decoder s Text decodeStringCanonical = Decoder (\k -> return (ConsumeStringCanonical (\str -> k str))) {-# INLINE decodeStringCanonical #-} -- | Decode a token marking the beginning of an indefinite length -- string. -- -- @since 0.2.0.0 decodeStringIndef :: Decoder s () decodeStringIndef = Decoder (\k -> return (ConsumeStringIndef (k ()))) {-# INLINE decodeStringIndef #-} -- | Decode a textual string as UTF-8 encoded 'ByteArray'. Note that -- the result is not validated to be well-formed UTF-8. -- -- Also note that this will eagerly copy the content out of the input -- to ensure that the input does not leak in the event that the 'ByteArray' is -- live but not forced. -- -- @since 0.2.0.0 decodeUtf8ByteArray :: Decoder s ByteArray decodeUtf8ByteArray = Decoder (\k -> return (ConsumeUtf8ByteArray k)) {-# INLINE decodeUtf8ByteArray #-} -- | Decode canonical representation of a textual string as UTF-8 encoded -- 'ByteArray'. Note that the result is not validated to be well-formed UTF-8. -- -- Also note that this will eagerly copy the content out of the input -- to ensure that the input does not leak in the event that the 'ByteArray' is -- live but not forced. -- -- @since 0.2.1.0 decodeUtf8ByteArrayCanonical :: Decoder s ByteArray decodeUtf8ByteArrayCanonical = Decoder (\k -> return (ConsumeUtf8ByteArrayCanonical k)) {-# INLINE decodeUtf8ByteArrayCanonical #-} -- | Decode the length of a list. -- -- @since 0.2.0.0 decodeListLen :: Decoder s Int decodeListLen = Decoder (\k -> return (ConsumeListLen (\n# -> k (I# n#)))) {-# INLINE decodeListLen #-} -- | Decode canonical representation of the length of a list. -- -- @since 0.2.0.0 decodeListLenCanonical :: Decoder s Int decodeListLenCanonical = Decoder (\k -> return (ConsumeListLenCanonical (\n# -> k (I# n#)))) {-# INLINE decodeListLenCanonical #-} -- | Decode a token marking the beginning of a list of indefinite -- length. -- -- @since 0.2.0.0 decodeListLenIndef :: Decoder s () decodeListLenIndef = Decoder (\k -> return (ConsumeListLenIndef (k ()))) {-# INLINE decodeListLenIndef #-} -- | Decode the length of a map. -- -- @since 0.2.0.0 decodeMapLen :: Decoder s Int decodeMapLen = Decoder (\k -> return (ConsumeMapLen (\n# -> k (I# n#)))) {-# INLINE decodeMapLen #-} -- | Decode canonical representation of the length of a map. -- -- @since 0.2.0.0 decodeMapLenCanonical :: Decoder s Int decodeMapLenCanonical = Decoder (\k -> return (ConsumeMapLenCanonical (\n# -> k (I# n#)))) {-# INLINE decodeMapLenCanonical #-} -- | Decode a token marking the beginning of a map of indefinite -- length. -- -- @since 0.2.0.0 decodeMapLenIndef :: Decoder s () decodeMapLenIndef = Decoder (\k -> return (ConsumeMapLenIndef (k ()))) {-# INLINE decodeMapLenIndef #-} -- | Decode an arbitrary tag and return it as a 'Word'. -- -- @since 0.2.0.0 decodeTag :: Decoder s Word decodeTag = Decoder (\k -> return (ConsumeTag (\w# -> k (W# w#)))) {-# INLINE decodeTag #-} -- | Decode an arbitrary 64-bit tag and return it as a 'Word64'. -- -- @since 0.2.0.0 decodeTag64 :: Decoder s Word64 {-# INLINE decodeTag64 #-} decodeTag64 = #if defined(ARCH_64bit) Decoder (\k -> return (ConsumeTag (\w# -> k (W64# #if MIN_VERSION_base(4,17,0) (wordToWord64# w#) #else w# #endif )))) #else Decoder (\k -> return (ConsumeTag64 (\w64# -> k (W64# w64#)))) #endif -- | Decode canonical representation of an arbitrary tag and return it as a -- 'Word'. -- -- @since 0.2.0.0 decodeTagCanonical :: Decoder s Word decodeTagCanonical = Decoder (\k -> return (ConsumeTagCanonical (\w# -> k (W# w#)))) {-# INLINE decodeTagCanonical #-} -- | Decode canonical representation of an arbitrary 64-bit tag and return it as -- a 'Word64'. -- -- @since 0.2.0.0 decodeTag64Canonical :: Decoder s Word64 {-# INLINE decodeTag64Canonical #-} decodeTag64Canonical = #if defined(ARCH_64bit) Decoder (\k -> return (ConsumeTagCanonical (\w# -> k (W64# #if MIN_VERSION_base(4,17,0) (wordToWord64# w#) #else w# #endif )))) #else Decoder (\k -> return (ConsumeTag64Canonical (\w64# -> k (W64# w64#)))) #endif -- | Decode a bool. -- -- @since 0.2.0.0 decodeBool :: Decoder s Bool decodeBool = Decoder (\k -> return (ConsumeBool (\b -> k b))) {-# INLINE decodeBool #-} -- | Decode a nullary value, and return a unit value. -- -- @since 0.2.0.0 decodeNull :: Decoder s () decodeNull = Decoder (\k -> return (ConsumeNull (k ()))) {-# INLINE decodeNull #-} -- | Decode a 'simple' CBOR value and give back a 'Word8'. You -- probably don't ever need to use this. -- -- @since 0.2.0.0 decodeSimple :: Decoder s Word8 decodeSimple = Decoder (\k -> return (ConsumeSimple (\w# -> k (toWord8 w#)))) {-# INLINE decodeSimple #-} -- | Decode canonical representation of an 'Integer'. -- -- @since 0.2.0.0 decodeIntegerCanonical :: Decoder s Integer decodeIntegerCanonical = Decoder (\k -> return (ConsumeIntegerCanonical (\n -> k n))) {-# INLINE decodeIntegerCanonical #-} -- | Decode canonical representation of a half-precision 'Float'. -- -- @since 0.2.0.0 decodeFloat16Canonical :: Decoder s Float decodeFloat16Canonical = Decoder (\k -> return (ConsumeFloat16Canonical (\f# -> k (F# f#)))) {-# INLINE decodeFloat16Canonical #-} -- | Decode canonical representation of a 'Float'. -- -- @since 0.2.0.0 decodeFloatCanonical :: Decoder s Float decodeFloatCanonical = Decoder (\k -> return (ConsumeFloatCanonical (\f# -> k (F# f#)))) {-# INLINE decodeFloatCanonical #-} -- | Decode canonical representation of a 'Double'. -- -- @since 0.2.0.0 decodeDoubleCanonical :: Decoder s Double decodeDoubleCanonical = Decoder (\k -> return (ConsumeDoubleCanonical (\f# -> k (D# f#)))) {-# INLINE decodeDoubleCanonical #-} -- | Decode canonical representation of a 'simple' CBOR value and give back a -- 'Word8'. You probably don't ever need to use this. -- -- @since 0.2.0.0 decodeSimpleCanonical :: Decoder s Word8 decodeSimpleCanonical = Decoder (\k -> return (ConsumeSimpleCanonical (\w# -> k (toWord8 w#)))) {-# INLINE decodeSimpleCanonical #-} -------------------------------------------------------------- -- Specialised read operations: expect a token with a specific value -- -- | Attempt to decode a word with 'decodeWord', and ensure the word -- is exactly as expected, or fail. -- -- @since 0.2.0.0 decodeWordOf :: Word -- ^ Expected value of the decoded word -> Decoder s () decodeWordOf = decodeWordOfHelper decodeWord {-# INLINE decodeWordOf #-} -- | Attempt to decode a list length using 'decodeListLen', and -- ensure it is exactly the specified length, or fail. -- -- @since 0.2.0.0 decodeListLenOf :: Int -> Decoder s () decodeListLenOf = decodeListLenOfHelper decodeListLen {-# INLINE decodeListLenOf #-} -- | Attempt to decode canonical representation of a word with 'decodeWordCanonical', -- and ensure the word is exactly as expected, or fail. -- -- @since 0.2.0.0 decodeWordCanonicalOf :: Word -- ^ Expected value of the decoded word -> Decoder s () decodeWordCanonicalOf = decodeWordOfHelper decodeWordCanonical {-# INLINE decodeWordCanonicalOf #-} -- | Attempt to decode canonical representation of a list length using -- 'decodeListLenCanonical', and ensure it is exactly the specified length, or -- fail. -- -- @since 0.2.0.0 decodeListLenCanonicalOf :: Int -> Decoder s () decodeListLenCanonicalOf = decodeListLenOfHelper decodeListLenCanonical {-# INLINE decodeListLenCanonicalOf #-} decodeListLenOfHelper :: (Show a, Eq a, Fail.MonadFail m) => m a -> a -> m () decodeListLenOfHelper decodeFun = \len -> do len' <- decodeFun if len == len' then return () else fail $ "expected list of length " ++ show len {-# INLINE decodeListLenOfHelper #-} decodeWordOfHelper :: (Show a, Eq a, Fail.MonadFail m) => m a -> a -> m () decodeWordOfHelper decodeFun = \n -> do n' <- decodeFun if n == n' then return () else fail $ "expected word " ++ show n {-# INLINE decodeWordOfHelper #-} -------------------------------------------------------------- -- Branching operations -- | Attempt to decode a token for the length of a finite, known list, -- or an indefinite list. If 'Nothing' is returned, then an -- indefinite length list occurs afterwords. If @'Just' x@ is -- returned, then a list of length @x@ is encoded. -- -- @since 0.2.0.0 decodeListLenOrIndef :: Decoder s (Maybe Int) decodeListLenOrIndef = Decoder (\k -> return (ConsumeListLenOrIndef (\n# -> if I# n# >= 0 then k (Just (I# n#)) else k Nothing))) {-# INLINE decodeListLenOrIndef #-} -- | Attempt to decode a token for the length of a finite, known map, -- or an indefinite map. If 'Nothing' is returned, then an -- indefinite length map occurs afterwords. If @'Just' x@ is returned, -- then a map of length @x@ is encoded. -- -- @since 0.2.0.0 decodeMapLenOrIndef :: Decoder s (Maybe Int) decodeMapLenOrIndef = Decoder (\k -> return (ConsumeMapLenOrIndef (\n# -> if I# n# >= 0 then k (Just (I# n#)) else k Nothing))) {-# INLINE decodeMapLenOrIndef #-} -- | Attempt to decode a @Break@ token, and if that was -- successful, return 'True'. If the token was of any -- other type, return 'False'. -- -- @since 0.2.0.0 decodeBreakOr :: Decoder s Bool decodeBreakOr = Decoder (\k -> return (ConsumeBreakOr (\b -> k b))) {-# INLINE decodeBreakOr #-} -------------------------------------------------------------- -- Special operations -- | Peek at the current token we're about to decode, and return a -- 'TokenType' specifying what it is. -- -- @since 0.2.0.0 peekTokenType :: Decoder s TokenType peekTokenType = Decoder (\k -> return (PeekTokenType (\tk -> k tk))) {-# INLINE peekTokenType #-} -- | Peek and return the length of the current buffer that we're -- running our decoder on. -- -- @since 0.2.0.0 peekAvailable :: Decoder s Int peekAvailable = Decoder (\k -> return (PeekAvailable (\len# -> k (I# len#)))) {-# INLINE peekAvailable #-} -- | A 0-based offset within the overall byte sequence that makes up the -- input to the 'Decoder'. -- -- This is an 'Int64' since 'Decoder' is incremental and can decode more data -- than fits in memory at once. This is also compatible with the result type -- of 'Data.ByteString.Lazy.length'. -- type ByteOffset = Int64 -- | Get the current 'ByteOffset' in the input byte sequence of the 'Decoder'. -- -- The 'Decoder' does not provide any facility to get at the input data -- directly (since that is tricky with an incremental decoder). The next best -- is this primitive which can be used to keep track of the offset within the -- input bytes that makes up the encoded form of a term. -- -- By keeping track of the byte offsets before and after decoding a subterm -- (a pattern captured by 'decodeWithByteSpan') and if the overall input data -- is retained then this is enables later retrieving the span of bytes for the -- subterm. -- -- @since 0.2.2.0 peekByteOffset :: Decoder s ByteOffset peekByteOffset = Decoder (\k -> return (PeekByteOffset (\off# -> k (I64# #if MIN_VERSION_base(4,17,0) (intToInt64# off#) #else off# #endif )))) {-# INLINE peekByteOffset #-} -- | This captures the pattern of getting the byte offsets before and after -- decoding a subterm. -- -- > !before <- peekByteOffset -- > x <- decode -- > !after <- peekByteOffset -- decodeWithByteSpan :: Decoder s a -> Decoder s (a, ByteOffset, ByteOffset) decodeWithByteSpan da = do !before <- peekByteOffset x <- da !after <- peekByteOffset return (x, before, after) {- expectExactly :: Word -> Decoder (Word :#: s) s expectExactly n = expectExactly_ n done expectAtLeast :: Word -> Decoder (Word :#: s) (Word :#: s) expectAtLeast n = expectAtLeast_ n done ignoreTrailingTerms :: Decoder (a :*: Word :#: s) (a :*: s) ignoreTrailingTerms = IgnoreTerms done -} ------------------------------------------------------------------------------ -- Special combinations for sequences -- -- | Decode an indefinite sequence length. -- -- @since 0.2.0.0 decodeSequenceLenIndef :: (r -> a -> r) -> r -> (r -> r') -> Decoder s a -> Decoder s r' decodeSequenceLenIndef f z g get = go z where go !acc = do stop <- decodeBreakOr if stop then return $! g acc else do !x <- get; go (f acc x) {-# INLINE decodeSequenceLenIndef #-} -- | Decode a sequence length. -- -- @since 0.2.0.0 decodeSequenceLenN :: (r -> a -> r) -> r -> (r -> r') -> Int -> Decoder s a -> Decoder s r' decodeSequenceLenN f z g c get = go z c where go !acc 0 = return $! g acc go !acc n = do !x <- get; go (f acc x) (n-1) {-# INLINE decodeSequenceLenN #-} cborg-0.2.10.0/src/Codec/CBOR/Encoding.hs0000644000000000000000000002621607346545000015637 0ustar0000000000000000{-# LANGUAGE CPP #-} -- | -- Module : Codec.CBOR.Encoding -- Copyright : (c) Duncan Coutts 2015-2017 -- License : BSD3-style (see LICENSE.txt) -- -- Maintainer : duncan@community.haskell.org -- Stability : experimental -- Portability : non-portable (GHC extensions) -- -- High level API for encoding values, for later serialization into -- CBOR binary format, using a 'Monoid' based interface. -- module Codec.CBOR.Encoding ( -- * Encoding implementation Encoding(..) -- :: * , Tokens(..) -- :: * -- * 'Encoding' API for serialisation , encodeWord -- :: Word -> Encoding , encodeWord8 -- :: Word8 -> Encoding , encodeWord16 -- :: Word16 -> Encoding , encodeWord32 -- :: Word32 -> Encoding , encodeWord64 -- :: Word64 -> Encoding , encodeInt -- :: Int -> Encoding , encodeInt8 -- :: Int8 -> Encoding , encodeInt16 -- :: Int16 -> Encoding , encodeInt32 -- :: Int32 -> Encoding , encodeInt64 -- :: Int64 -> Encoding , encodeInteger -- :: Integer -> Encoding , encodeBytes -- :: B.ByteString -> Encoding , encodeBytesIndef -- :: Encoding , encodeByteArray -- :: ByteArray -> Encoding , encodeString -- :: T.Text -> Encoding , encodeStringIndef -- :: Encoding , encodeUtf8ByteArray -- :: ByteArray -> Encoding , encodeListLen -- :: Word -> Encoding , encodeListLenIndef -- :: Encoding , encodeMapLen -- :: Word -> Encoding , encodeMapLenIndef -- :: Encoding , encodeBreak -- :: Encoding , encodeTag -- :: Word -> Encoding , encodeTag64 -- :: Word64 -> Encoding , encodeBool -- :: Bool -> Encoding , encodeUndef -- :: Encoding , encodeNull -- :: Encoding , encodeSimple -- :: Word8 -> Encoding , encodeFloat16 -- :: Float -> Encoding , encodeFloat -- :: Float -> Encoding , encodeDouble -- :: Double -> Encoding , encodePreEncoded -- :: B.ByteString -> Encoding ) where #include "cbor.h" import Data.Int import Data.Word import Data.Semigroup import qualified Data.ByteString as B import qualified Data.Text as T import Codec.CBOR.ByteArray.Sliced (SlicedByteArray) import Prelude hiding (encodeFloat) import {-# SOURCE #-} qualified Codec.CBOR.FlatTerm as FlatTerm -- | An intermediate form used during serialisation, specified as a -- 'Monoid'. It supports efficient concatenation, and is equivalent -- to a specialised 'Data.Monoid.Endo' 'Tokens' type. -- -- It is used for the stage in serialisation where we flatten out the -- Haskell data structure but it is independent of any specific -- external binary or text format. -- -- Traditionally, to build any arbitrary 'Encoding' value, you specify -- larger structures from smaller ones and append the small ones together -- using 'Data.Monoid.mconcat'. -- -- @since 0.2.0.0 newtype Encoding = Encoding (Tokens -> Tokens) instance Show Encoding where show = show . FlatTerm.toFlatTerm -- | A flattened representation of a term, which is independent -- of any underlying binary representation, but which we later -- serialise into CBOR format. -- -- @since 0.2.0.0 data Tokens = -- Positive and negative integers (type 0,1) TkWord {-# UNPACK #-} !Word Tokens | TkWord64 {-# UNPACK #-} !Word64 Tokens -- convenience for either positive or negative | TkInt {-# UNPACK #-} !Int Tokens | TkInt64 {-# UNPACK #-} !Int64 Tokens -- Bytes and string (type 2,3) | TkBytes {-# UNPACK #-} !B.ByteString Tokens | TkBytesBegin Tokens | TkByteArray {-# UNPACK #-} !SlicedByteArray Tokens | TkString {-# UNPACK #-} !T.Text Tokens | TkUtf8ByteArray {-# UNPACK #-} !SlicedByteArray Tokens | TkStringBegin Tokens -- Structures (type 4,5) | TkListLen {-# UNPACK #-} !Word Tokens | TkListBegin Tokens | TkMapLen {-# UNPACK #-} !Word Tokens | TkMapBegin Tokens -- Tagged values (type 6) | TkTag {-# UNPACK #-} !Word Tokens | TkTag64 {-# UNPACK #-} !Word64 Tokens | TkInteger !Integer Tokens -- Simple and floats (type 7) | TkNull Tokens | TkUndef Tokens | TkBool !Bool Tokens | TkSimple {-# UNPACK #-} !Word8 Tokens | TkFloat16 {-# UNPACK #-} !Float Tokens | TkFloat32 {-# UNPACK #-} !Float Tokens | TkFloat64 {-# UNPACK #-} !Double Tokens | TkBreak Tokens -- Special | TkEncoded {-# UNPACK #-} !B.ByteString Tokens | TkEnd deriving (Show,Eq) -- | @since 0.2.0.0 instance Semigroup Encoding where Encoding b1 <> Encoding b2 = Encoding (\ts -> b1 (b2 ts)) {-# INLINE (<>) #-} -- | @since 0.2.0.0 instance Monoid Encoding where mempty = Encoding (\ts -> ts) {-# INLINE mempty #-} mappend = (<>) {-# INLINE mappend #-} mconcat = foldr (<>) mempty {-# INLINE mconcat #-} -- | Encode a 'Word' in a flattened format. -- -- @since 0.2.0.0 encodeWord :: Word -> Encoding encodeWord = Encoding . TkWord -- | Encode a 'Word8' in a flattened format. -- -- @since 0.2.0.0 encodeWord8 :: Word8 -> Encoding encodeWord8 = Encoding . TkWord . fromIntegral -- | Encode a 'Word16' in a flattened format. -- -- @since 0.2.0.0 encodeWord16 :: Word16 -> Encoding encodeWord16 = Encoding . TkWord . fromIntegral -- | Encode a 'Word32' in a flattened format. -- -- @since 0.2.0.0 encodeWord32 :: Word32 -> Encoding encodeWord32 = Encoding . TkWord . fromIntegral -- | Encode a 'Word64' in a flattened format. -- -- @since 0.2.0.0 encodeWord64 :: Word64 -> Encoding encodeWord64 = Encoding . TkWord64 -- | Encode an 'Int' in a flattened format. -- -- @since 0.2.0.0 encodeInt :: Int -> Encoding encodeInt = Encoding . TkInt -- | Encode an 'Int8' in a flattened format. -- -- @since 0.2.0.0 encodeInt8 :: Int8 -> Encoding encodeInt8 = Encoding . TkInt . fromIntegral -- | Encode an 'Int16' in a flattened format. -- -- @since 0.2.0.0 encodeInt16 :: Int16 -> Encoding encodeInt16 = Encoding . TkInt . fromIntegral -- | Encode an 'Int32' in a flattened format. -- -- @since 0.2.0.0 encodeInt32 :: Int32 -> Encoding encodeInt32 = Encoding . TkInt . fromIntegral -- | Encode an @'Int64' in a flattened format. -- -- @since 0.2.0.0 encodeInt64 :: Int64 -> Encoding encodeInt64 = Encoding . TkInt64 -- | Encode an arbitrarily large @'Integer' in a -- flattened format. -- -- @since 0.2.0.0 encodeInteger :: Integer -> Encoding encodeInteger n = Encoding (TkInteger n) -- | Encode an arbitrary strict 'B.ByteString' in -- a flattened format. -- -- @since 0.2.0.0 encodeBytes :: B.ByteString -> Encoding encodeBytes = Encoding . TkBytes -- | Encode a bytestring in a flattened format. -- -- @since 0.2.0.0 encodeByteArray :: SlicedByteArray -> Encoding encodeByteArray = Encoding . TkByteArray -- | Encode a token specifying the beginning of a string of bytes of -- indefinite length. In reality, this specifies a stream of many -- occurrences of `encodeBytes`, each specifying a single chunk of the -- overall string. After all the bytes desired have been encoded, you -- should follow it with a break token (see 'encodeBreak'). -- -- @since 0.2.0.0 encodeBytesIndef :: Encoding encodeBytesIndef = Encoding TkBytesBegin -- | Encode a 'T.Text' in a flattened format. -- -- @since 0.2.0.0 encodeString :: T.Text -> Encoding encodeString = Encoding . TkString -- | Encode the beginning of an indefinite string. -- -- @since 0.2.0.0 encodeStringIndef :: Encoding encodeStringIndef = Encoding TkStringBegin -- | Encode a UTF-8 string in a flattened format. Note that the contents -- is not validated to be well-formed UTF-8. -- -- @since 0.2.0.0 encodeUtf8ByteArray :: SlicedByteArray -> Encoding encodeUtf8ByteArray = Encoding . TkUtf8ByteArray -- | Encode the length of a list, used to indicate that the following -- tokens represent the list values. -- -- @since 0.2.0.0 encodeListLen :: Word -> Encoding encodeListLen = Encoding . TkListLen -- | Encode a token specifying that this is the beginning of an -- indefinite list of unknown size. Tokens representing the list are -- expected afterwords, followed by a break token (see -- 'encodeBreak') when the list has ended. -- -- @since 0.2.0.0 encodeListLenIndef :: Encoding encodeListLenIndef = Encoding TkListBegin -- | Encode the length of a Map, used to indicate that -- the following tokens represent the map values. -- -- @since 0.2.0.0 encodeMapLen :: Word -> Encoding encodeMapLen = Encoding . TkMapLen -- | Encode a token specifying that this is the beginning of an -- indefinite map of unknown size. Tokens representing the map are -- expected afterwords, followed by a break token (see -- 'encodeBreak') when the map has ended. -- -- @since 0.2.0.0 encodeMapLenIndef :: Encoding encodeMapLenIndef = Encoding TkMapBegin -- | Encode a \'break\', used to specify the end of indefinite -- length objects like maps or lists. -- -- @since 0.2.0.0 encodeBreak :: Encoding encodeBreak = Encoding TkBreak -- | Encode an arbitrary 'Word' tag. -- -- @since 0.2.0.0 encodeTag :: Word -> Encoding encodeTag = Encoding . TkTag -- | Encode an arbitrary 64-bit 'Word64' tag. -- -- @since 0.2.0.0 encodeTag64 :: Word64 -> Encoding encodeTag64 = Encoding . TkTag64 -- | Encode a 'Bool'. -- -- @since 0.2.0.0 encodeBool :: Bool -> Encoding encodeBool b = Encoding (TkBool b) -- | Encode an @Undef@ value. -- -- @since 0.2.0.0 encodeUndef :: Encoding encodeUndef = Encoding TkUndef -- | Encode a @Null@ value. -- -- @since 0.2.0.0 encodeNull :: Encoding encodeNull = Encoding TkNull -- | Encode a \'simple\' CBOR token that can be represented with an -- 8-bit word. You probably don't ever need this. -- -- @since 0.2.0.0 encodeSimple :: Word8 -> Encoding encodeSimple = Encoding . TkSimple -- | Encode a small 16-bit 'Float' in a flattened format. -- -- @since 0.2.0.0 encodeFloat16 :: Float -> Encoding encodeFloat16 = Encoding . TkFloat16 -- | Encode a full precision 'Float' in a flattened format. -- -- @since 0.2.0.0 encodeFloat :: Float -> Encoding encodeFloat = Encoding . TkFloat32 -- | Encode a 'Double' in a flattened format. -- -- @since 0.2.0.0 encodeDouble :: Double -> Encoding encodeDouble = Encoding . TkFloat64 -- | Include pre-encoded valid CBOR data into the 'Encoding'. -- -- The data is included into the output as-is without any additional wrapper. -- -- This should be used with care. The data /must/ be a valid CBOR encoding, but -- this is /not/ checked. -- -- This is useful when you have CBOR data that you know is already valid, e.g. -- previously validated and stored on disk, and you wish to include it without -- having to decode and re-encode it. -- -- @since 0.2.2.0 encodePreEncoded :: B.ByteString -> Encoding encodePreEncoded = Encoding . TkEncoded cborg-0.2.10.0/src/Codec/CBOR/Encoding.hs-boot0000644000000000000000000000013507346545000016570 0ustar0000000000000000module Codec.CBOR.Encoding where newtype Encoding = Encoding (Tokens -> Tokens) data Tokenscborg-0.2.10.0/src/Codec/CBOR/FlatTerm.hs0000644000000000000000000010675707346545000015640 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE MagicHash #-} {-# LANGUAGE RankNTypes #-} {-# LANGUAGE BangPatterns #-} -- | -- Module : Codec.CBOR.FlatTerm -- Copyright : (c) Duncan Coutts 2015-2017 -- License : BSD3-style (see LICENSE.txt) -- -- Maintainer : duncan@community.haskell.org -- Stability : experimental -- Portability : non-portable (GHC extensions) -- -- A simpler form than CBOR for writing out 'Enc.Encoding' values that allows -- easier verification and testing. While this library primarily focuses -- on taking 'Enc.Encoding' values (independent of any underlying format) -- and serializing them into CBOR format, this module offers an alternative -- format called 'FlatTerm' for serializing 'Enc.Encoding' values. -- -- The 'FlatTerm' form is very simple and internally mirrors the original -- 'Encoding' type very carefully. The intention here is that once you -- have 'Enc.Encoding' and 'Dec.Decoding' values for your types, you can -- round-trip values through 'FlatTerm' to catch bugs more easily and with -- a smaller amount of code to look through. -- -- For that reason, this module is primarily useful for client libraries, -- and even then, only for their test suites to offer a simpler form for -- doing encoding tests and catching problems in an encoder and decoder. -- module Codec.CBOR.FlatTerm ( -- * Types FlatTerm -- :: * , TermToken(..) -- :: * -- * Functions , toFlatTerm -- :: Encoding -> FlatTerm , fromFlatTerm -- :: Decoder s a -> FlatTerm -> Either String a , validFlatTerm -- :: FlatTerm -> Bool , decodeTermToken -- Decoder s TermToken ) where #include "cbor.h" import Codec.CBOR.Encoding (Encoding(..)) import qualified Codec.CBOR.Encoding as Enc import Codec.CBOR.Decoding as Dec import qualified Codec.CBOR.Read as Read import qualified Codec.CBOR.ByteArray as BA import qualified Codec.CBOR.ByteArray.Sliced as BAS import Data.Int #if defined(ARCH_32bit) import GHC.Int (Int64(I64#)) import GHC.Word (Word64(W64#)) import GHC.Exts (Word64#, Int64#) #endif #if MIN_VERSION_ghc_prim(0,8,0) import GHC.Exts (word8ToWord#) #endif import GHC.Word (Word(W#), Word8(W8#)) import GHC.Exts (Int(I#), Int#, Word#, Float#, Double#) import GHC.Float (Float(F#), Double(D#), float2Double) import Data.Word import Data.Text (Text) import qualified Data.Text.Encoding as TE import Data.ByteString (ByteString) import qualified Data.ByteString as BS import Control.Monad.ST import qualified Control.Monad.ST.Lazy as ST.Lazy import Prelude hiding (encodeFloat, decodeFloat) -------------------------------------------------------------------------------- -- | A \"flat\" representation of an 'Enc.Encoding' value, -- useful for round-tripping and writing tests. -- -- @since 0.2.0.0 type FlatTerm = [TermToken] -- | A concrete encoding of 'Enc.Encoding' values, one -- which mirrors the original 'Enc.Encoding' type closely. -- -- @since 0.2.0.0 data TermToken = TkInt {-# UNPACK #-} !Int | TkInteger !Integer | TkBytes {-# UNPACK #-} !ByteString | TkBytesBegin | TkString {-# UNPACK #-} !Text | TkStringBegin | TkListLen {-# UNPACK #-} !Word | TkListBegin | TkMapLen {-# UNPACK #-} !Word | TkMapBegin | TkBreak | TkTag {-# UNPACK #-} !Word64 | TkBool !Bool | TkNull | TkSimple {-# UNPACK #-} !Word8 | TkFloat16 {-# UNPACK #-} !Float | TkFloat32 {-# UNPACK #-} !Float | TkFloat64 {-# UNPACK #-} !Double deriving (Eq, Ord, Show) -------------------------------------------------------------------------------- -- | Convert an arbitrary 'Enc.Encoding' into a 'FlatTerm'. -- -- @since 0.2.0.0 toFlatTerm :: Encoding -- ^ The input 'Enc.Encoding'. -> FlatTerm -- ^ The resulting 'FlatTerm'. toFlatTerm (Encoding tb) = convFlatTerm (tb Enc.TkEnd) convFlatTerm :: Enc.Tokens -> FlatTerm convFlatTerm (Enc.TkWord w ts) | w <= maxInt = TkInt (fromIntegral w) : convFlatTerm ts | otherwise = TkInteger (fromIntegral w) : convFlatTerm ts convFlatTerm (Enc.TkWord64 w ts) | w <= maxInt = TkInt (fromIntegral w) : convFlatTerm ts | otherwise = TkInteger (fromIntegral w) : convFlatTerm ts convFlatTerm (Enc.TkInt n ts) = TkInt n : convFlatTerm ts convFlatTerm (Enc.TkInt64 n ts) | n <= maxInt && n >= minInt = TkInt (fromIntegral n) : convFlatTerm ts | otherwise = TkInteger (fromIntegral n) : convFlatTerm ts convFlatTerm (Enc.TkInteger n ts) | n <= maxInt && n >= minInt = TkInt (fromIntegral n) : convFlatTerm ts | otherwise = TkInteger n : convFlatTerm ts convFlatTerm (Enc.TkBytes bs ts) = TkBytes bs : convFlatTerm ts convFlatTerm (Enc.TkBytesBegin ts) = TkBytesBegin : convFlatTerm ts convFlatTerm (Enc.TkByteArray a ts) = TkBytes (BAS.toByteString a) : convFlatTerm ts convFlatTerm (Enc.TkString st ts) = TkString st : convFlatTerm ts convFlatTerm (Enc.TkStringBegin ts) = TkStringBegin : convFlatTerm ts convFlatTerm (Enc.TkUtf8ByteArray a ts) = TkString (TE.decodeUtf8 $ BAS.toByteString a) : convFlatTerm ts convFlatTerm (Enc.TkListLen n ts) = TkListLen n : convFlatTerm ts convFlatTerm (Enc.TkListBegin ts) = TkListBegin : convFlatTerm ts convFlatTerm (Enc.TkMapLen n ts) = TkMapLen n : convFlatTerm ts convFlatTerm (Enc.TkMapBegin ts) = TkMapBegin : convFlatTerm ts convFlatTerm (Enc.TkTag n ts) = TkTag (fromIntegral n) : convFlatTerm ts convFlatTerm (Enc.TkTag64 n ts) = TkTag n : convFlatTerm ts convFlatTerm (Enc.TkBool b ts) = TkBool b : convFlatTerm ts convFlatTerm (Enc.TkNull ts) = TkNull : convFlatTerm ts convFlatTerm (Enc.TkUndef ts) = TkSimple 23 : convFlatTerm ts convFlatTerm (Enc.TkSimple n ts) = TkSimple n : convFlatTerm ts convFlatTerm (Enc.TkFloat16 f ts) = TkFloat16 f : convFlatTerm ts convFlatTerm (Enc.TkFloat32 f ts) = TkFloat32 f : convFlatTerm ts convFlatTerm (Enc.TkFloat64 f ts) = TkFloat64 f : convFlatTerm ts convFlatTerm (Enc.TkBreak ts) = TkBreak : convFlatTerm ts convFlatTerm (Enc.TkEncoded bs ts) = decodePreEncoded bs ++ convFlatTerm ts convFlatTerm Enc.TkEnd = [] -------------------------------------------------------------------------------- decodePreEncoded :: BS.ByteString -> FlatTerm decodePreEncoded bs0 = ST.Lazy.runST (provideInput bs0) where provideInput :: BS.ByteString -> ST.Lazy.ST s FlatTerm provideInput bs | BS.null bs = return [] | otherwise = do next <- ST.Lazy.strictToLazyST $ do -- This will always be a 'Partial' here because decodeTermToken -- always starts by requesting initial input. Only decoders that -- fail or return a value without looking at their input can give -- a different initial result. result <- Read.deserialiseIncremental decodeTermToken let k = case result of Read.Partial a -> a _ -> error "Failed to get a Partial" k (Just bs) collectOutput next collectOutput :: Read.IDecode s TermToken -> ST.Lazy.ST s FlatTerm collectOutput (Read.Fail _ _ err) = #if MIN_VERSION_base(4,17,0) error #else fail #endif $ "toFlatTerm: encodePreEncoded " ++ "used with invalid CBOR: " ++ show err collectOutput (Read.Partial k) = ST.Lazy.strictToLazyST (k Nothing) >>= collectOutput collectOutput (Read.Done bs' _ x) = do xs <- provideInput bs' return (x : xs) decodeTermToken :: Decoder s TermToken decodeTermToken = do tkty <- peekTokenType case tkty of TypeUInt -> do w <- decodeWord return $! fromWord w where fromWord :: Word -> TermToken fromWord w | w <= fromIntegral (maxBound :: Int) = TkInt (fromIntegral w) | otherwise = TkInteger (fromIntegral w) TypeUInt64 -> do w <- decodeWord64 return $! fromWord64 w where fromWord64 w | w <= fromIntegral (maxBound :: Int) = TkInt (fromIntegral w) | otherwise = TkInteger (fromIntegral w) TypeNInt -> do w <- decodeNegWord return $! fromNegWord w where fromNegWord w | w <= fromIntegral (maxBound :: Int) = TkInt (-1 - fromIntegral w) | otherwise = TkInteger (-1 - fromIntegral w) TypeNInt64 -> do w <- decodeNegWord64 return $! fromNegWord64 w where fromNegWord64 w | w <= fromIntegral (maxBound :: Int) = TkInt (-1 - fromIntegral w) | otherwise = TkInteger (-1 - fromIntegral w) TypeInteger -> do !x <- decodeInteger return (TkInteger x) TypeFloat16 -> do !x <- decodeFloat return (TkFloat16 x) TypeFloat32 -> do !x <- decodeFloat return (TkFloat32 x) TypeFloat64 -> do !x <- decodeDouble return (TkFloat64 x) TypeBytes -> do !x <- decodeBytes return (TkBytes x) TypeBytesIndef -> do decodeBytesIndef return TkBytesBegin TypeString -> do !x <- decodeString return (TkString x) TypeStringIndef -> do decodeStringIndef return TkStringBegin TypeListLen -> do !x <- decodeListLen return $! TkListLen (fromIntegral x) TypeListLen64 -> do !x <- decodeListLen return $! TkListLen (fromIntegral x) TypeListLenIndef -> do decodeListLenIndef return TkListBegin TypeMapLen -> do !x <- decodeMapLen return $! TkMapLen (fromIntegral x) TypeMapLen64 -> do !x <- decodeMapLen return $! TkMapLen (fromIntegral x) TypeMapLenIndef -> do decodeMapLenIndef return TkMapBegin TypeTag -> do !x <- decodeTag return $! TkTag (fromIntegral x) TypeTag64 -> do !x <- decodeTag64 return $! TkTag (fromIntegral x) TypeBool -> do !x <- decodeBool return (TkBool x) TypeNull -> do decodeNull return TkNull TypeSimple -> do !x <- decodeSimple return (TkSimple x) TypeBreak -> do _ <- decodeBreakOr return TkBreak TypeInvalid -> fail "invalid token encoding" -------------------------------------------------------------------------------- -- | Given a 'Dec.Decoder', decode a 'FlatTerm' back into -- an ordinary value, or return an error. -- -- @since 0.2.0.0 fromFlatTerm :: (forall s. Decoder s a) -- ^ A 'Dec.Decoder' for a serialised value. -> FlatTerm -- ^ The serialised 'FlatTerm'. -> Either String a -- ^ The deserialised value, or an error. fromFlatTerm decoder ft = runST (getDecodeAction decoder >>= go ft) where go :: FlatTerm -> DecodeAction s a -> ST s (Either String a) go (TkInt n : ts) (ConsumeWord k) | n >= 0 = k (unW# (fromIntegral n)) >>= go ts go (TkInteger n : ts) (ConsumeWord k) | n >= 0 = k (unW# (fromIntegral n)) >>= go ts go (TkInt n : ts) (ConsumeWord8 k) | n >= 0 && n <= maxWord8 = k (unW# (fromIntegral n)) >>= go ts go (TkInteger n : ts) (ConsumeWord8 k) | n >= 0 && n <= maxWord8 = k (unW# (fromIntegral n)) >>= go ts go (TkInt n : ts) (ConsumeWord16 k) | n >= 0 && n <= maxWord16 = k (unW# (fromIntegral n)) >>= go ts go (TkInteger n : ts) (ConsumeWord16 k) | n >= 0 && n <= maxWord16 = k (unW# (fromIntegral n)) >>= go ts go (TkInt n : ts) (ConsumeWord32 k) -- NOTE: we have to be very careful about this branch -- on 32 bit machines, because maxBound :: Int < maxBound :: Word32 | intIsValidWord32 n = k (unW# (fromIntegral n)) >>= go ts go (TkInteger n : ts) (ConsumeWord32 k) | n >= 0 && n <= maxWord32 = k (unW# (fromIntegral n)) >>= go ts go (TkInt n : ts) (ConsumeNegWord k) | n < 0 = k (unW# (fromIntegral (-1-n))) >>= go ts go (TkInteger n : ts) (ConsumeNegWord k) | n < 0 = k (unW# (fromIntegral (-1-n))) >>= go ts go (TkInt n : ts) (ConsumeInt k) = k (unI# n) >>= go ts go (TkInteger n : ts) (ConsumeInt k) | n <= maxInt = k (unI# (fromIntegral n)) >>= go ts go (TkInt n : ts) (ConsumeInt8 k) | n >= minInt8 && n <= maxInt8 = k (unI# n) >>= go ts go (TkInteger n : ts) (ConsumeInt8 k) | n >= minInt8 && n <= maxInt8 = k (unI# (fromIntegral n)) >>= go ts go (TkInt n : ts) (ConsumeInt16 k) | n >= minInt16 && n <= maxInt16 = k (unI# n) >>= go ts go (TkInteger n : ts) (ConsumeInt16 k) | n >= minInt16 && n <= maxInt16 = k (unI# (fromIntegral n)) >>= go ts go (TkInt n : ts) (ConsumeInt32 k) | n >= minInt32 && n <= maxInt32 = k (unI# n) >>= go ts go (TkInteger n : ts) (ConsumeInt32 k) | n >= minInt32 && n <= maxInt32 = k (unI# (fromIntegral n)) >>= go ts go (TkInt n : ts) (ConsumeInteger k) = k (fromIntegral n) >>= go ts go (TkInteger n : ts) (ConsumeInteger k) = k n >>= go ts go (TkListLen n : ts) (ConsumeListLen k) | n <= maxInt = k (unI# (fromIntegral n)) >>= go ts go (TkMapLen n : ts) (ConsumeMapLen k) | n <= maxInt = k (unI# (fromIntegral n)) >>= go ts go (TkTag n : ts) (ConsumeTag k) | n <= maxWord = k (unW# (fromIntegral n)) >>= go ts go (TkInt n : ts) (ConsumeWordCanonical k) | n >= 0 = k (unW# (fromIntegral n)) >>= go ts go (TkInteger n : ts) (ConsumeWordCanonical k) | n >= 0 = k (unW# (fromIntegral n)) >>= go ts go (TkInt n : ts) (ConsumeWord8Canonical k) | n >= 0 && n <= maxWord8 = k (unW# (fromIntegral n)) >>= go ts go (TkInteger n : ts) (ConsumeWord8Canonical k) | n >= 0 && n <= maxWord8 = k (unW# (fromIntegral n)) >>= go ts go (TkInt n : ts) (ConsumeWord16Canonical k) | n >= 0 && n <= maxWord16 = k (unW# (fromIntegral n)) >>= go ts go (TkInteger n : ts) (ConsumeWord16Canonical k) | n >= 0 && n <= maxWord16 = k (unW# (fromIntegral n)) >>= go ts go (TkInt n : ts) (ConsumeWord32Canonical k) -- NOTE: we have to be very careful about this branch -- on 32 bit machines, because maxBound :: Int < maxBound :: Word32 | intIsValidWord32 n = k (unW# (fromIntegral n)) >>= go ts go (TkInteger n : ts) (ConsumeWord32Canonical k) | n >= 0 && n <= maxWord32 = k (unW# (fromIntegral n)) >>= go ts go (TkInt n : ts) (ConsumeNegWordCanonical k) | n < 0 = k (unW# (fromIntegral (-1-n))) >>= go ts go (TkInteger n : ts) (ConsumeNegWordCanonical k) | n < 0 = k (unW# (fromIntegral (-1-n))) >>= go ts go (TkInt n : ts) (ConsumeIntCanonical k) = k (unI# n) >>= go ts go (TkInteger n : ts) (ConsumeInt k) | n <= maxInt = k (unI# (fromIntegral n)) >>= go ts go (TkInt n : ts) (ConsumeInt8Canonical k) | n >= minInt8 && n <= maxInt8 = k (unI# n) >>= go ts go (TkInteger n : ts) (ConsumeInt8Canonical k) | n >= minInt8 && n <= maxInt8 = k (unI# (fromIntegral n)) >>= go ts go (TkInt n : ts) (ConsumeInt16Canonical k) | n >= minInt16 && n <= maxInt16 = k (unI# n) >>= go ts go (TkInteger n : ts) (ConsumeInt16Canonical k) | n >= minInt16 && n <= maxInt16 = k (unI# (fromIntegral n)) >>= go ts go (TkInt n : ts) (ConsumeInt32Canonical k) | n >= minInt32 && n <= maxInt32 = k (unI# n) >>= go ts go (TkInteger n : ts) (ConsumeInt32Canonical k) | n >= minInt32 && n <= maxInt32 = k (unI# (fromIntegral n)) >>= go ts go (TkInt n : ts) (ConsumeIntegerCanonical k) = k (fromIntegral n) >>= go ts go (TkInteger n : ts) (ConsumeIntegerCanonical k) = k n >>= go ts go (TkListLen n : ts) (ConsumeListLenCanonical k) | n <= maxInt = k (unI# (fromIntegral n)) >>= go ts go (TkMapLen n : ts) (ConsumeMapLenCanonical k) | n <= maxInt = k (unI# (fromIntegral n)) >>= go ts go (TkTag n : ts) (ConsumeTagCanonical k) | n <= maxWord = k (unW# (fromIntegral n)) >>= go ts #if defined(ARCH_32bit) -- 64bit variants for 32bit machines go (TkInt n : ts) (ConsumeWord64 k) | n >= 0 = k (unW64# (fromIntegral n)) >>= go ts go (TkInteger n : ts) (ConsumeWord64 k) | n >= 0 = k (unW64# (fromIntegral n)) >>= go ts go (TkInt n : ts) (ConsumeNegWord64 k) | n < 0 = k (unW64# (fromIntegral (-1-n))) >>= go ts go (TkInteger n : ts) (ConsumeNegWord64 k) | n < 0 = k (unW64# (fromIntegral (-1-n))) >>= go ts go (TkInt n : ts) (ConsumeInt64 k) = k (unI64# (fromIntegral n)) >>= go ts go (TkInteger n : ts) (ConsumeInt64 k) = k (unI64# (fromIntegral n)) >>= go ts go (TkTag n : ts) (ConsumeTag64 k) = k (unW64# n) >>= go ts go (TkInt n : ts) (ConsumeWord64Canonical k) | n >= 0 = k (unW64# (fromIntegral n)) >>= go ts go (TkInteger n : ts) (ConsumeWord64Canonical k) | n >= 0 = k (unW64# (fromIntegral n)) >>= go ts go (TkInt n : ts) (ConsumeNegWord64Canonical k) | n < 0 = k (unW64# (fromIntegral (-1-n))) >>= go ts go (TkInteger n : ts) (ConsumeNegWord64Canonical k) | n < 0 = k (unW64# (fromIntegral (-1-n))) >>= go ts go (TkInt n : ts) (ConsumeInt64Canonical k) = k (unI64# (fromIntegral n)) >>= go ts go (TkInteger n : ts) (ConsumeInt64Canonical k) = k (unI64# (fromIntegral n)) >>= go ts go (TkTag n : ts) (ConsumeTag64Canonical k) = k (unW64# n) >>= go ts -- TODO FIXME (aseipp/dcoutts): are these going to be utilized? -- see fallthrough case below if/when fixed. go ts (ConsumeListLen64 _) = unexpected "decodeListLen64" ts go ts (ConsumeMapLen64 _) = unexpected "decodeMapLen64" ts go ts (ConsumeListLen64Canonical _) = unexpected "decodeListLen64Canonical" ts go ts (ConsumeMapLen64Canonical _) = unexpected "decodeMapLen64Canonical" ts #endif go (TkFloat16 f : ts) (ConsumeFloat k) = k (unF# f) >>= go ts go (TkFloat32 f : ts) (ConsumeFloat k) = k (unF# f) >>= go ts go (TkFloat16 f : ts) (ConsumeDouble k) = k (unD# (float2Double f)) >>= go ts go (TkFloat32 f : ts) (ConsumeDouble k) = k (unD# (float2Double f)) >>= go ts go (TkFloat64 f : ts) (ConsumeDouble k) = k (unD# f) >>= go ts go (TkBytes bs : ts) (ConsumeBytes k) = k bs >>= go ts go (TkBytes bs : ts) (ConsumeByteArray k) = k (BA.fromByteString bs) >>= go ts go (TkString st : ts) (ConsumeString k) = k st >>= go ts go (TkString st : ts) (ConsumeUtf8ByteArray k) = k (BA.fromByteString $ TE.encodeUtf8 st) >>= go ts go (TkBool b : ts) (ConsumeBool k) = k b >>= go ts go (TkSimple n : ts) (ConsumeSimple k) = k (unW8# n) >>= go ts go (TkFloat16 f : ts) (ConsumeFloat16Canonical k) = k (unF# f) >>= go ts go (TkFloat32 f : ts) (ConsumeFloatCanonical k) = k (unF# f) >>= go ts go (TkFloat64 f : ts) (ConsumeDoubleCanonical k) = k (unD# f) >>= go ts go (TkBytes bs : ts) (ConsumeBytesCanonical k) = k bs >>= go ts go (TkBytes bs : ts) (ConsumeByteArrayCanonical k) = k (BA.fromByteString bs) >>= go ts go (TkString st : ts) (ConsumeStringCanonical k) = k st >>= go ts go (TkString st : ts) (ConsumeUtf8ByteArrayCanonical k) = k (BA.fromByteString $ TE.encodeUtf8 st) >>= go ts go (TkSimple n : ts) (ConsumeSimpleCanonical k) = k (unW8# n) >>= go ts go (TkBytesBegin : ts) (ConsumeBytesIndef da) = da >>= go ts go (TkStringBegin : ts) (ConsumeStringIndef da) = da >>= go ts go (TkListBegin : ts) (ConsumeListLenIndef da) = da >>= go ts go (TkMapBegin : ts) (ConsumeMapLenIndef da) = da >>= go ts go (TkNull : ts) (ConsumeNull da) = da >>= go ts go (TkListLen n : ts) (ConsumeListLenOrIndef k) | n <= maxInt = k (unI# (fromIntegral n)) >>= go ts go (TkListBegin : ts) (ConsumeListLenOrIndef k) = k (-1#) >>= go ts go (TkMapLen n : ts) (ConsumeMapLenOrIndef k) | n <= maxInt = k (unI# (fromIntegral n)) >>= go ts go (TkMapBegin : ts) (ConsumeMapLenOrIndef k) = k (-1#) >>= go ts go (TkBreak : ts) (ConsumeBreakOr k) = k True >>= go ts go ts@(_ : _ ) (ConsumeBreakOr k) = k False >>= go ts go ts@(tk:_) (PeekTokenType k) = k (tokenTypeOf tk) >>= go ts go ts (PeekTokenType _) = unexpected "peekTokenType" ts -- We don't have real bytes so we have to give these two operations -- different interpretations: remaining tokens and just 0 for offsets. go ts (PeekAvailable k) = k (unI# (length ts)) >>= go ts #if defined(ARCH_32bit) go ts (PeekByteOffset k)= k (unI64# 0) >>= go ts #else go ts (PeekByteOffset k)= k 0# >>= go ts #endif go _ (Fail msg) = return $ Left msg go [] (Done x) = return $ Right x go ts (Done _) = return $ Left ("trailing tokens: " ++ show (take 5 ts)) ---------------------------------------------------------------------------- -- Fallthrough cases: unhandled token/DecodeAction combinations go ts (ConsumeWord _) = unexpected "decodeWord" ts go ts (ConsumeWord8 _) = unexpected "decodeWord8" ts go ts (ConsumeWord16 _) = unexpected "decodeWord16" ts go ts (ConsumeWord32 _) = unexpected "decodeWord32" ts go ts (ConsumeNegWord _) = unexpected "decodeNegWord" ts go ts (ConsumeInt _) = unexpected "decodeInt" ts go ts (ConsumeInt8 _) = unexpected "decodeInt8" ts go ts (ConsumeInt16 _) = unexpected "decodeInt16" ts go ts (ConsumeInt32 _) = unexpected "decodeInt32" ts go ts (ConsumeInteger _) = unexpected "decodeInteger" ts go ts (ConsumeListLen _) = unexpected "decodeListLen" ts go ts (ConsumeMapLen _) = unexpected "decodeMapLen" ts go ts (ConsumeTag _) = unexpected "decodeTag" ts go ts (ConsumeWordCanonical _) = unexpected "decodeWordCanonical" ts go ts (ConsumeWord8Canonical _) = unexpected "decodeWord8Canonical" ts go ts (ConsumeWord16Canonical _) = unexpected "decodeWord16Canonical" ts go ts (ConsumeWord32Canonical _) = unexpected "decodeWord32Canonical" ts go ts (ConsumeNegWordCanonical _) = unexpected "decodeNegWordCanonical" ts go ts (ConsumeIntCanonical _) = unexpected "decodeIntCanonical" ts go ts (ConsumeInt8Canonical _) = unexpected "decodeInt8Canonical" ts go ts (ConsumeInt16Canonical _) = unexpected "decodeInt16Canonical" ts go ts (ConsumeInt32Canonical _) = unexpected "decodeInt32Canonical" ts go ts (ConsumeIntegerCanonical _) = unexpected "decodeIntegerCanonical" ts go ts (ConsumeListLenCanonical _) = unexpected "decodeListLenCanonical" ts go ts (ConsumeMapLenCanonical _) = unexpected "decodeMapLenCanonical" ts go ts (ConsumeTagCanonical _) = unexpected "decodeTagCanonical" ts go ts (ConsumeFloat _) = unexpected "decodeFloat" ts go ts (ConsumeDouble _) = unexpected "decodeDouble" ts go ts (ConsumeBytes _) = unexpected "decodeBytes" ts go ts (ConsumeByteArray _) = unexpected "decodeByteArray" ts go ts (ConsumeString _) = unexpected "decodeString" ts go ts (ConsumeUtf8ByteArray _) = unexpected "decodeUtf8ByteArray" ts go ts (ConsumeBool _) = unexpected "decodeBool" ts go ts (ConsumeSimple _) = unexpected "decodeSimple" ts go ts (ConsumeFloat16Canonical _) = unexpected "decodeFloat16Canonical" ts go ts (ConsumeFloatCanonical _) = unexpected "decodeFloatCanonical" ts go ts (ConsumeDoubleCanonical _) = unexpected "decodeDoubleCanonical" ts go ts (ConsumeBytesCanonical _) = unexpected "decodeBytesCanonical" ts go ts (ConsumeByteArrayCanonical _) = unexpected "decodeByteArrayCanonical" ts go ts (ConsumeStringCanonical _) = unexpected "decodeStringCanonical" ts go ts (ConsumeUtf8ByteArrayCanonical _) = unexpected "decodeUtf8ByteArrayCanonical" ts go ts (ConsumeSimpleCanonical _) = unexpected "decodeSimpleCanonical" ts #if defined(ARCH_32bit) -- 64bit variants for 32bit machines go ts (ConsumeWord64 _) = unexpected "decodeWord64" ts go ts (ConsumeNegWord64 _) = unexpected "decodeNegWord64" ts go ts (ConsumeInt64 _) = unexpected "decodeInt64" ts go ts (ConsumeTag64 _) = unexpected "decodeTag64" ts --go ts (ConsumeListLen64 _) = unexpected "decodeListLen64" ts --go ts (ConsumeMapLen64 _) = unexpected "decodeMapLen64" ts go ts (ConsumeWord64Canonical _) = unexpected "decodeWord64Canonical" ts go ts (ConsumeNegWord64Canonical _) = unexpected "decodeNegWord64Canonical" ts go ts (ConsumeInt64Canonical _) = unexpected "decodeInt64Canonical" ts go ts (ConsumeTag64Canonical _) = unexpected "decodeTag64Canonical" ts --go ts (ConsumeListLen64Canonical _) = unexpected "decodeListLen64Canonical" ts --go ts (ConsumeMapLen64Canonical _) = unexpected "decodeMapLen64Canonical" ts #endif go ts (ConsumeBytesIndef _) = unexpected "decodeBytesIndef" ts go ts (ConsumeStringIndef _) = unexpected "decodeStringIndef" ts go ts (ConsumeListLenIndef _) = unexpected "decodeListLenIndef" ts go ts (ConsumeMapLenIndef _) = unexpected "decodeMapLenIndef" ts go ts (ConsumeNull _) = unexpected "decodeNull" ts go ts (ConsumeListLenOrIndef _) = unexpected "decodeListLenOrIndef" ts go ts (ConsumeMapLenOrIndef _) = unexpected "decodeMapLenOrIndef" ts go ts (ConsumeBreakOr _) = unexpected "decodeBreakOr" ts unexpected name [] = return $ Left $ name ++ ": unexpected end of input" unexpected name (tok:_) = return $ Left $ name ++ ": unexpected token " ++ show tok -- | Map a 'TermToken' to the underlying CBOR 'TokenType' tokenTypeOf :: TermToken -> TokenType tokenTypeOf (TkInt n) | n >= 0 = TypeUInt | otherwise = TypeNInt tokenTypeOf (TkInteger n) -- See https://github.com/well-typed/cborg/issues/324 | 0 <= n && n <= 0xffffffffffffffff = TypeUInt64 -- 0xffffffffffffffff == 2^64 - 1 | -0xffffffffffffffff <= n && n < 0 = TypeNInt64 | otherwise = TypeInteger tokenTypeOf TkBytes{} = TypeBytes tokenTypeOf TkBytesBegin{} = TypeBytesIndef tokenTypeOf TkString{} = TypeString tokenTypeOf TkStringBegin{} = TypeStringIndef tokenTypeOf TkListLen{} = TypeListLen tokenTypeOf TkListBegin{} = TypeListLenIndef tokenTypeOf TkMapLen{} = TypeMapLen tokenTypeOf TkMapBegin{} = TypeMapLenIndef tokenTypeOf TkTag{} = TypeTag tokenTypeOf TkBool{} = TypeBool tokenTypeOf TkNull = TypeNull tokenTypeOf TkBreak = TypeBreak tokenTypeOf TkSimple{} = TypeSimple tokenTypeOf TkFloat16{} = TypeFloat16 tokenTypeOf TkFloat32{} = TypeFloat32 tokenTypeOf TkFloat64{} = TypeFloat64 -------------------------------------------------------------------------------- -- | Ensure a 'FlatTerm' is internally consistent and was created in a valid -- manner. -- -- @since 0.2.0.0 validFlatTerm :: FlatTerm -- ^ The input 'FlatTerm' -> Bool -- ^ 'True' if valid, 'False' otherwise. validFlatTerm ts = either (const False) (const True) $ do ts' <- validateTerm TopLevelSingle ts case ts' of [] -> return () _ -> Left "trailing data" -- | A data type used for tracking the position we're at -- as we traverse a 'FlatTerm' and make sure it's valid. data Loc = TopLevelSingle | TopLevelSequence | InString Int Loc | InBytes Int Loc | InListN Int Int Loc | InList Int Loc | InMapNKey Int Int Loc | InMapNVal Int Int Loc | InMapKey Int Loc | InMapVal Int Loc | InTagged Word64 Loc deriving Show -- | Validate an arbitrary 'FlatTerm' at an arbitrary location. validateTerm :: Loc -> FlatTerm -> Either String FlatTerm validateTerm _loc (TkInt _ : ts) = return ts validateTerm _loc (TkInteger _ : ts) = return ts validateTerm _loc (TkBytes _ : ts) = return ts validateTerm loc (TkBytesBegin : ts) = validateBytes loc 0 ts validateTerm _loc (TkString _ : ts) = return ts validateTerm loc (TkStringBegin : ts) = validateString loc 0 ts validateTerm loc (TkListLen len : ts) | len <= maxInt = validateListN loc 0 (fromIntegral len) ts | otherwise = Left "list len too long (> max int)" validateTerm loc (TkListBegin : ts) = validateList loc 0 ts validateTerm loc (TkMapLen len : ts) | len <= maxInt = validateMapN loc 0 (fromIntegral len) ts | otherwise = Left "map len too long (> max int)" validateTerm loc (TkMapBegin : ts) = validateMap loc 0 ts validateTerm loc (TkTag w : ts) = validateTerm (InTagged w loc) ts validateTerm _loc (TkBool _ : ts) = return ts validateTerm _loc (TkNull : ts) = return ts validateTerm loc (TkBreak : _) = unexpectedToken TkBreak loc validateTerm _loc (TkSimple _ : ts) = return ts validateTerm _loc (TkFloat16 _ : ts) = return ts validateTerm _loc (TkFloat32 _ : ts) = return ts validateTerm _loc (TkFloat64 _ : ts) = return ts validateTerm loc [] = unexpectedEof loc unexpectedToken :: TermToken -> Loc -> Either String a unexpectedToken tok loc = Left $ "unexpected token " ++ show tok ++ ", in context " ++ show loc unexpectedEof :: Loc -> Either String a unexpectedEof loc = Left $ "unexpected end of input in context " ++ show loc validateBytes :: Loc -> Int -> [TermToken] -> Either String [TermToken] validateBytes _ _ (TkBreak : ts) = return ts validateBytes ploc i (TkBytes _ : ts) = validateBytes ploc (i+1) ts validateBytes ploc i (tok : _) = unexpectedToken tok (InBytes i ploc) validateBytes ploc i [] = unexpectedEof (InBytes i ploc) validateString :: Loc -> Int -> [TermToken] -> Either String [TermToken] validateString _ _ (TkBreak : ts) = return ts validateString ploc i (TkString _ : ts) = validateString ploc (i+1) ts validateString ploc i (tok : _) = unexpectedToken tok (InString i ploc) validateString ploc i [] = unexpectedEof (InString i ploc) validateListN :: Loc -> Int -> Int -> [TermToken] -> Either String [TermToken] validateListN _ i len ts | i == len = return ts validateListN ploc i len ts = do ts' <- validateTerm (InListN i len ploc) ts validateListN ploc (i+1) len ts' validateList :: Loc -> Int -> [TermToken] -> Either String [TermToken] validateList _ _ (TkBreak : ts) = return ts validateList ploc i ts = do ts' <- validateTerm (InList i ploc) ts validateList ploc (i+1) ts' validateMapN :: Loc -> Int -> Int -> [TermToken] -> Either String [TermToken] validateMapN _ i len ts | i == len = return ts validateMapN ploc i len ts = do ts' <- validateTerm (InMapNKey i len ploc) ts ts'' <- validateTerm (InMapNVal i len ploc) ts' validateMapN ploc (i+1) len ts'' validateMap :: Loc -> Int -> [TermToken] -> Either String [TermToken] validateMap _ _ (TkBreak : ts) = return ts validateMap ploc i ts = do ts' <- validateTerm (InMapKey i ploc) ts ts'' <- validateTerm (InMapVal i ploc) ts' validateMap ploc (i+1) ts'' -------------------------------------------------------------------------------- -- Utilities maxInt, minInt, maxWord :: Num n => n maxInt = fromIntegral (maxBound :: Int) minInt = fromIntegral (minBound :: Int) maxWord = fromIntegral (maxBound :: Word) maxInt8, minInt8, maxWord8 :: Num n => n maxInt8 = fromIntegral (maxBound :: Int8) minInt8 = fromIntegral (minBound :: Int8) maxWord8 = fromIntegral (maxBound :: Word8) maxInt16, minInt16, maxWord16 :: Num n => n maxInt16 = fromIntegral (maxBound :: Int16) minInt16 = fromIntegral (minBound :: Int16) maxWord16 = fromIntegral (maxBound :: Word16) maxInt32, minInt32, maxWord32 :: Num n => n maxInt32 = fromIntegral (maxBound :: Int32) minInt32 = fromIntegral (minBound :: Int32) maxWord32 = fromIntegral (maxBound :: Word32) -- | Do a careful check to ensure an 'Int' is in the -- range of a 'Word32'. intIsValidWord32 :: Int -> Bool intIsValidWord32 n = b1 && b2 where -- NOTE: this first comparison must use Int for -- the check, not Word32, in case a negative value -- is given. Otherwise this check would fail due to -- overflow. b1 = n >= 0 -- NOTE: we must convert n to Word32, otherwise, -- maxWord32 is inferred as Int, and because -- the maxBound of Word32 is greater than Int, -- it overflows and this check fails. b2 = (fromIntegral n :: Word32) <= maxWord32 unI# :: Int -> Int# unI# (I# i#) = i# unW# :: Word -> Word# unW# (W# w#) = w# unW8# :: Word8 -> Word# #if MIN_VERSION_ghc_prim(0,8,0) unW8# (W8# w#) = word8ToWord# w# #else unW8# (W8# w#) = w# #endif unF# :: Float -> Float# unF# (F# f#) = f# unD# :: Double -> Double# unD# (D# f#) = f# #if defined(ARCH_32bit) unW64# :: Word64 -> Word64# unW64# (W64# w#) = w# unI64# :: Int64 -> Int64# unI64# (I64# i#) = i# #endif cborg-0.2.10.0/src/Codec/CBOR/FlatTerm.hs-boot0000644000000000000000000000026407346545000016563 0ustar0000000000000000module Codec.CBOR.FlatTerm where import {-# SOURCE #-} Codec.CBOR.Encoding type FlatTerm = [TermToken] data TermToken instance Show TermToken toFlatTerm :: Encoding -> FlatTermcborg-0.2.10.0/src/Codec/CBOR/Magic.hs0000644000000000000000000005465507346545000015141 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE CPP #-} {-# LANGUAGE MagicHash #-} {-# LANGUAGE UnboxedTuples #-} {-# LANGUAGE ForeignFunctionInterface #-} {-# LANGUAGE ScopedTypeVariables #-} #include "cbor.h" -- | -- Module : Codec.CBOR.Magic -- Copyright : (c) Duncan Coutts 2015-2017 -- License : BSD3-style (see LICENSE.txt) -- -- Maintainer : duncan@community.haskell.org -- Stability : experimental -- Portability : non-portable (GHC extensions) -- -- An internal module for doing magical, low-level, and unholy things -- in the name of efficiency. -- module Codec.CBOR.Magic ( -- * Word utilities grabWord8 -- :: Ptr () -> Word , grabWord16 -- :: Ptr () -> Word , grabWord32 -- :: Ptr () -> Word , grabWord64 -- :: Ptr () -> Word64 -- * 'ByteString' utilities , eatTailWord8 -- :: ByteString -> Word , eatTailWord16 -- :: ByteString -> Word , eatTailWord32 -- :: ByteString -> Word , eatTailWord64 -- :: ByteString -> Word64 -- * Half-floats , wordToFloat16 -- :: Word -> Float , floatToWord16 -- :: Float -> Word16 -- * Float\/Word conversion , wordToFloat32 -- :: Word -> Float , wordToFloat64 -- :: Word64 -> Double -- * Int and Word explicit conversions , word8ToWord -- :: Word8 -> Word , word16ToWord -- :: Word16 -> Word , word32ToWord -- :: Word32 -> Word , word64ToWord -- :: Word64 -> Word -- int*ToInt conversions are missing because they are not needed. , word8ToInt -- :: Word8 -> Int , word16ToInt -- :: Word16 -> Int , word32ToInt -- :: Word32 -> Int , word64ToInt -- :: Word64 -> Int , intToWord -- :: Int -> Word , intToInt64 -- :: Int -> Int64 , intToWord64 -- :: Int -> Word64 , int64ToWord64 -- :: Int64 -> Word64 #if defined(ARCH_32bit) , word8ToInt64 -- :: Word8 -> Int64 , word16ToInt64 -- :: Word16 -> Int64 , word32ToInt64 -- :: Word32 -> Int64 , word64ToInt64 -- :: Word64 -> Maybe Int64 , word8ToWord64 -- :: Word8 -> Word64 , word16ToWord64 -- :: Word16 -> Word64 , word32ToWord64 -- :: Word32 -> Word64 #endif -- * 'Integer' utilities , nintegerFromBytes -- :: ByteString -> Integer , uintegerFromBytes -- :: ByteString -> Integer -- * Simple mutable counters , Counter -- :: * -> * , newCounter -- :: Int -> ST s (Counter s) , readCounter -- :: Counter s -> ST s Int , writeCounter -- :: Counter s -> Int -> ST s () , incCounter -- :: Counter s -> ST s () , decCounter -- :: Counter s -> ST s () -- * Array support , copyByteStringToByteArray , copyByteArrayToByteString ) where import GHC.Exts import GHC.ST (ST(ST)) import GHC.IO (IO(IO), unsafeDupablePerformIO) import GHC.Word import GHC.Int #if MIN_VERSION_base(4,11,0) import GHC.Float (castWord32ToFloat, castWord64ToDouble) #endif import Foreign.Ptr #if defined(OPTIMIZE_GMP) #if defined(HAVE_GHC_BIGNUM) import qualified GHC.Num.Integer as BigNum #else import qualified GHC.Integer.GMP.Internals as Gmp #endif #endif import Data.ByteString (ByteString) import qualified Data.ByteString as BS import qualified Data.ByteString.Internal as BS import qualified Data.ByteString.Unsafe as BS import Data.Primitive.ByteArray as Prim hiding (copyByteArrayToPtr, copyPtrToMutableByteArray) import Foreign.ForeignPtr (withForeignPtr) import Foreign.C (CUShort) import qualified Numeric.Half as Half #if !defined(HAVE_BYTESWAP_PRIMOPS) || !defined(MEM_UNALIGNED_OPS) || !defined(OPTIMIZE_GMP) import Data.Bits ((.|.), unsafeShiftL) #endif #if defined(ARCH_32bit) import GHC.IntWord64 (wordToWord64#, word64ToWord#, intToInt64#, int64ToInt#, leWord64#, ltWord64#, word64ToInt64#) #endif -------------------------------------------------------------------------------- -- | Grab a 8-bit 'Word' given a 'Ptr' to some address. grabWord8 :: Ptr () -> Word8 {-# INLINE grabWord8 #-} -- | Grab a 16-bit 'Word' given a 'Ptr' to some address. grabWord16 :: Ptr () -> Word16 {-# INLINE grabWord16 #-} -- | Grab a 32-bit 'Word' given a 'Ptr' to some address. grabWord32 :: Ptr () -> Word32 {-# INLINE grabWord32 #-} -- | Grab a 64-bit 'Word64' given a 'Ptr' to some address. grabWord64 :: Ptr () -> Word64 {-# INLINE grabWord64 #-} -- -- Machine-dependent implementation -- -- 8-bit word case is always the same... grabWord8 (Ptr ip#) = W8# (indexWord8OffAddr# ip# 0#) -- ... but the remaining cases arent #if defined(HAVE_BYTESWAP_PRIMOPS) && \ defined(MEM_UNALIGNED_OPS) && \ !defined(WORDS_BIGENDIAN) -- On x86 machines with GHC 7.10, we have byteswap primitives -- available to make this conversion very fast. #if MIN_VERSION_ghc_prim(0,8,0) grabWord16 (Ptr ip#) = W16# (wordToWord16# (byteSwap16# (word16ToWord# (indexWord16OffAddr# ip# 0#)))) grabWord32 (Ptr ip#) = W32# (wordToWord32# (byteSwap32# (word32ToWord# (indexWord32OffAddr# ip# 0#)))) #else grabWord16 (Ptr ip#) = W16# (narrow16Word# (byteSwap16# (indexWord16OffAddr# ip# 0#))) grabWord32 (Ptr ip#) = W32# (narrow32Word# (byteSwap32# (indexWord32OffAddr# ip# 0#))) #endif #if defined(ARCH_64bit) #if MIN_VERSION_base(4,17,0) grabWord64 (Ptr ip#) = W64# (wordToWord64# (byteSwap# (word64ToWord# (indexWord64OffAddr# ip# 0#)))) #else grabWord64 (Ptr ip#) = W64# (byteSwap# (indexWord64OffAddr# ip# 0#)) #endif #else grabWord64 (Ptr ip#) = W64# (byteSwap64# (word64ToWord# (indexWord64OffAddr# ip# 0#))) #endif #elif defined(MEM_UNALIGNED_OPS) && \ defined(WORDS_BIGENDIAN) -- In some theoretical future-verse where there are unaligned memory -- accesses on the machine, but it is also big-endian, we need to be -- able to decode these numbers efficiently, still. grabWord16 (Ptr ip#) = W16# (indexWord16OffAddr# ip# 0#) grabWord32 (Ptr ip#) = W32# (indexWord32OffAddr# ip# 0#) grabWord64 (Ptr ip#) = W64# (indexWord64OffAddr# ip# 0#) #else -- Otherwise, we fall back to the much slower, inefficient case -- of writing out each of the 8 bits of the output word at -- a time. grabWord16 (Ptr ip#) = case indexWord8OffAddr# ip# 0# of w0# -> case indexWord8OffAddr# ip# 1# of w1# -> w16 w0# `unsafeShiftL` 8 .|. w16 w1# where #if MIN_VERSION_ghc_prim(0,8,0) w16 w# = W16# (wordToWord16# (word8ToWord# w#)) #else w16 w# = W16# w# #endif grabWord32 (Ptr ip#) = case indexWord8OffAddr# ip# 0# of w0# -> case indexWord8OffAddr# ip# 1# of w1# -> case indexWord8OffAddr# ip# 2# of w2# -> case indexWord8OffAddr# ip# 3# of w3# -> w32 w0# `unsafeShiftL` 24 .|. w32 w1# `unsafeShiftL` 16 .|. w32 w2# `unsafeShiftL` 8 .|. w32 w3# where #if MIN_VERSION_ghc_prim(0,8,0) w32 w# = W32# (wordToWord32# (word8ToWord# w#)) #else w32 w# = W32# w# #endif grabWord64 (Ptr ip#) = case indexWord8OffAddr# ip# 0# of w0# -> case indexWord8OffAddr# ip# 1# of w1# -> case indexWord8OffAddr# ip# 2# of w2# -> case indexWord8OffAddr# ip# 3# of w3# -> case indexWord8OffAddr# ip# 4# of w4# -> case indexWord8OffAddr# ip# 5# of w5# -> case indexWord8OffAddr# ip# 6# of w6# -> case indexWord8OffAddr# ip# 7# of w7# -> w64 w0# `unsafeShiftL` 56 .|. w64 w1# `unsafeShiftL` 48 .|. w64 w2# `unsafeShiftL` 40 .|. w64 w3# `unsafeShiftL` 32 .|. w64 w4# `unsafeShiftL` 24 .|. w64 w5# `unsafeShiftL` 16 .|. w64 w6# `unsafeShiftL` 8 .|. w64 w7# where #if MIN_VERSION_ghc_prim(0,8,0) toWord :: Word8# -> Word# toWord w# = word8ToWord# w# #else toWord :: Word# -> Word# toWord w# = w# #endif #if WORD_SIZE_IN_BITS == 64 #if MIN_VERSION_base(4,17,0) -- case taken from Codec.CBOR.Decoding w64 w# = W64# (wordToWord64# (toWord w#)) #else w64 w# = W64# (toWord w#) #endif #else w64 w# = W64# (wordToWord64# (toWord w#)) #endif #endif -------------------------------------------------------------------------------- -- ByteString shennanigans -- | Take the tail of a 'ByteString' (i.e. drop the first byte) and read the -- resulting byte(s) as an 8-bit word value. The input 'ByteString' MUST be at -- least 2 bytes long: one byte to drop from the front, and one to read as a -- 'Word' value. This is not checked, and failure to ensure this will result -- in undefined behavior. eatTailWord8 :: ByteString -> Word8 eatTailWord8 xs = withBsPtr grabWord8 (BS.unsafeTail xs) {-# INLINE eatTailWord8 #-} -- | Take the tail of a 'ByteString' (i.e. drop the first byte) and read the -- resulting byte(s) as a 16-bit word value. The input 'ByteString' MUST be at -- least 3 bytes long: one byte to drop from the front, and two to read as a -- 16-bit 'Word' value. This is not checked, and failure to ensure this will -- result in undefined behavior. eatTailWord16 :: ByteString -> Word16 eatTailWord16 xs = withBsPtr grabWord16 (BS.unsafeTail xs) {-# INLINE eatTailWord16 #-} -- | Take the tail of a 'ByteString' (i.e. drop the first byte) and read the -- resulting byte(s) as a 32-bit word value. The input 'ByteString' MUST be at -- least 5 bytes long: one byte to drop from the front, and four to read as a -- 32-bit 'Word' value. This is not checked, and failure to ensure this will -- result in undefined behavior. eatTailWord32 :: ByteString -> Word32 eatTailWord32 xs = withBsPtr grabWord32 (BS.unsafeTail xs) {-# INLINE eatTailWord32 #-} -- | Take the tail of a 'ByteString' (i.e. drop the first byte) and read the -- resulting byte(s) as a 64-bit word value. The input 'ByteString' MUST be at -- least 9 bytes long: one byte to drop from the front, and eight to read as a -- 64-bit 'Word64' value. This is not checked, and failure to ensure this will -- result in undefined behavior. eatTailWord64 :: ByteString -> Word64 eatTailWord64 xs = withBsPtr grabWord64 (BS.unsafeTail xs) {-# INLINE eatTailWord64 #-} -- | Unsafely take a 'Ptr' to a 'ByteString' and do unholy things -- with it. withBsPtr :: (Ptr b -> a) -> ByteString -> a withBsPtr f (BS.PS x off _) = unsafeDupablePerformIO $ withForeignPtr x $ \(Ptr addr#) -> return $! (f (Ptr addr# `plusPtr` off)) {-# INLINE withBsPtr #-} -------------------------------------------------------------------------------- -- Half floats -- | Convert a 'Word16' to a half-sized 'Float'. wordToFloat16 :: Word16 -> Float wordToFloat16 = \x -> Half.fromHalf (Half.Half (cast x)) where cast :: Word16 -> CUShort cast = fromIntegral {-# INLINE wordToFloat16 #-} -- | Convert a half-sized 'Float' to a 'Word'. floatToWord16 :: Float -> Word16 floatToWord16 = \x -> cast (Half.getHalf (Half.toHalf x)) where cast :: CUShort -> Word16 cast = fromIntegral {-# INLINE floatToWord16 #-} -------------------------------------------------------------------------------- -- Casting words to floats -- We have to go via a word rather than reading directly from memory because of -- endian issues. A little endian machine cannot read a big-endian float direct -- from memory, so we read a word, bswap it and then convert to float. -- -- Prior to base 4.11, there are no primops for casting word <-> float, see -- https://ghc.haskell.org/trac/ghc/ticket/4092 -- -- In our fallback implementation, we're avoiding doing the extra indirection -- (and closure allocation) of the runSTRep stuff, but we have to be very -- careful here, we cannot allow the "constant" newByteArray# 8# realWorld# to -- be floated out and shared and aliased across multiple concurrent calls. -- So we do manual worker/wrapper with the worker not being inlined. -- | Cast a 'Word32' to a 'Float'. wordToFloat32 :: Word32 -> Float #if MIN_VERSION_base(4,11,0) wordToFloat32 = GHC.Float.castWord32ToFloat #else wordToFloat32 (W32# w#) = F# (wordToFloat32# w#) {-# INLINE wordToFloat32 #-} -- | Cast an unboxed word to an unboxed float. wordToFloat32# :: Word# -> Float# wordToFloat32# w# = case newByteArray# 4# realWorld# of (# s', mba# #) -> case writeWord32Array# mba# 0# w# s' of s'' -> case readFloatArray# mba# 0# s'' of (# _, f# #) -> f# {-# NOINLINE wordToFloat32# #-} #endif -- | Cast a 'Word64' to a 'Float'. wordToFloat64 :: Word64 -> Double #if MIN_VERSION_base(4,11,0) wordToFloat64 = GHC.Float.castWord64ToDouble #else wordToFloat64 (W64# w#) = D# (wordToFloat64# w#) {-# INLINE wordToFloat64 #-} -- | Cast an unboxed word to an unboxed double. #if defined(ARCH_64bit) wordToFloat64# :: Word# -> Double# #else wordToFloat64# :: Word64# -> Double# #endif wordToFloat64# w# = case newByteArray# 8# realWorld# of (# s', mba# #) -> case writeWord64Array# mba# 0# w# s' of s'' -> case readDoubleArray# mba# 0# s'' of (# _, f# #) -> f# {-# NOINLINE wordToFloat64# #-} #endif -------------------------------------------------------------------------------- -- Casting words and ints word8ToWord :: Word8 -> Word word16ToWord :: Word16 -> Word word32ToWord :: Word32 -> Word #if defined(ARCH_64bit) word64ToWord :: Word64 -> Word #else word64ToWord :: Word64 -> Maybe Word #endif word8ToInt :: Word8 -> Int word16ToInt :: Word16 -> Int #if defined(ARCH_64bit) word32ToInt :: Word32 -> Int #else word32ToInt :: Word32 -> Maybe Int #endif word64ToInt :: Word64 -> Maybe Int #if defined(ARCH_32bit) word8ToInt64 :: Word8 -> Int64 word16ToInt64 :: Word16 -> Int64 word32ToInt64 :: Word32 -> Int64 word64ToInt64 :: Word64 -> Maybe Int64 word8ToWord64 :: Word8 -> Word64 word16ToWord64 :: Word16 -> Word64 word32ToWord64 :: Word32 -> Word64 #endif intToInt64 :: Int -> Int64 intToInt64 = fromIntegral {-# INLINE intToInt64 #-} intToWord :: Int -> Word intToWord = fromIntegral {-# INLINE intToWord #-} intToWord64 :: Int -> Word64 intToWord64 = fromIntegral {-# INLINE intToWord64 #-} int64ToWord64 :: Int64 -> Word64 int64ToWord64 = fromIntegral {-# INLINE int64ToWord64 #-} #if MIN_VERSION_ghc_prim(0,8,0) word8ToWord (W8# w#) = W# (word8ToWord# w#) word16ToWord (W16# w#) = W# (word16ToWord# w#) word32ToWord (W32# w#) = W# (word32ToWord# w#) #if defined(ARCH_64bit) #if MIN_VERSION_base(4,17,0) word64ToWord (W64# w#) = W# (word64ToWord# w#) #else word64ToWord (W64# w#) = W# w# #endif #else word64ToWord (W64# w64#) = case isTrue# (w64# `leWord64#` wordToWord64# 0xffffffff##) of True -> Just (W# (word64ToWord# w64#)) False -> Nothing #endif #else word8ToWord (W8# w#) = W# w# word16ToWord (W16# w#) = W# w# word32ToWord (W32# w#) = W# w# #if defined(ARCH_64bit) word64ToWord (W64# w#) = W# w# #else word64ToWord (W64# w64#) = case isTrue# (w64# `leWord64#` wordToWord64# 0xffffffff##) of True -> Just (W# (word64ToWord# w64#)) False -> Nothing #endif #endif {-# INLINE word8ToWord #-} {-# INLINE word16ToWord #-} {-# INLINE word32ToWord #-} {-# INLINE word64ToWord #-} #if MIN_VERSION_ghc_prim(0,8,0) word8ToInt (W8# w#) = I# (word2Int# (word8ToWord# w#)) word16ToInt (W16# w#) = I# (word2Int# (word16ToWord# w#)) #if defined(ARCH_64bit) word32ToInt (W32# w#) = I# (word2Int# (word32ToWord# w#)) #else word32ToInt (W32# w#) = case isTrue# (w# `ltWord#` 0x80000000##) of True -> Just (I# (word2Int# (word32ToWord# w#))) False -> Nothing #endif #else word8ToInt (W8# w#) = I# (word2Int# w#) word16ToInt (W16# w#) = I# (word2Int# w#) #if defined(ARCH_64bit) word32ToInt (W32# w#) = I# (word2Int# w#) #else word32ToInt (W32# w#) = case isTrue# (w# `ltWord#` 0x80000000##) of True -> Just (I# (word2Int# w#)) False -> Nothing #endif #endif #if defined(ARCH_64bit) word64ToInt (W64# w#) = #if MIN_VERSION_base(4,17,0) case isTrue# (word64ToWord# w# `ltWord#` 0x8000000000000000##) of #else case isTrue# (w# `ltWord#` 0x8000000000000000##) of #endif True -> #if MIN_VERSION_base(4,17,0) Just (I# (word2Int# (word64ToWord# w#))) #else Just (I# (word2Int# w#)) #endif False -> Nothing #else word64ToInt (W64# w#) = case isTrue# (w# `ltWord64#` wordToWord64# 0x80000000##) of True -> Just (I# (int64ToInt# (word64ToInt64# w#))) False -> Nothing #endif {-# INLINE word8ToInt #-} {-# INLINE word16ToInt #-} {-# INLINE word32ToInt #-} {-# INLINE word64ToInt #-} #if defined(ARCH_32bit) word8ToInt64 (W8# w#) = I64# (intToInt64# (word2Int# w#)) word16ToInt64 (W16# w#) = I64# (intToInt64# (word2Int# w#)) word32ToInt64 (W32# w#) = I64# (word64ToInt64# (wordToWord64# w#)) word64ToInt64 (W64# w#) = case isTrue# (w# `ltWord64#` uncheckedShiftL64# (wordToWord64# 1##) 63#) of True -> Just (I64# (word64ToInt64# w#)) False -> Nothing word8ToWord64 (W8# w#) = W64# (wordToWord64# w#) word16ToWord64 (W16# w#) = W64# (wordToWord64# w#) word32ToWord64 (W32# w#) = W64# (wordToWord64# w#) {-# INLINE word8ToInt64 #-} {-# INLINE word16ToInt64 #-} {-# INLINE word32ToInt64 #-} {-# INLINE word64ToInt64 #-} {-# INLINE word8ToWord64 #-} {-# INLINE word16ToWord64 #-} {-# INLINE word32ToWord64 #-} #endif -------------------------------------------------------------------------------- -- Integer utilities -- | Create a negative 'Integer' out of a raw 'BS.ByteString'. nintegerFromBytes :: BS.ByteString -> Integer nintegerFromBytes bs = -1 - uintegerFromBytes bs -- | Create an 'Integer' out of a raw 'BS.ByteString'. uintegerFromBytes :: BS.ByteString -> Integer #if defined(OPTIMIZE_GMP) uintegerFromBytes (BS.PS fp (I# off#) (I# len#)) = -- This should be safe since we're simply reading from ByteString (which is -- immutable) and GMP allocates a new memory for the Integer, i.e., there is -- no mutation involved. unsafeDupablePerformIO $ withForeignPtr fp $ \(Ptr addr#) -> let addrOff# = addr# `plusAddr#` off# -- The last parmaeter (`1#`) tells the import function to use big -- endian encoding. in #if defined(HAVE_GHC_BIGNUM) BigNum.integerFromAddr (int2Word# len#) addrOff# 1# #else Gmp.importIntegerFromAddr addrOff# (int2Word# len#) 1# #endif #else uintegerFromBytes bs = case BS.uncons bs of Nothing -> 0 Just (w0, ws0) -> go (fromIntegral w0) ws0 where go !acc ws = case BS.uncons ws of Nothing -> acc Just (w, ws') -> go (acc `unsafeShiftL` 8 + fromIntegral w) ws' #endif -------------------------------------------------------------------------------- -- Mutable counters -- | An efficient, mutable counter. Designed to be used inside -- 'ST' or other primitive monads, hence it carries an abstract -- rank-2 @s@ type parameter. data Counter s = Counter (MutableByteArray# s) -- | Create a new counter with a starting 'Int' value. newCounter :: Int -> ST s (Counter s) newCounter (I# n#) = ST (\s -> case newByteArray# 8# s of (# s', mba# #) -> case writeIntArray# mba# 0# n# s' of s'' -> (# s'', Counter mba# #)) {-# INLINE newCounter #-} -- | Read the current value of a 'Counter'. readCounter :: Counter s -> ST s Int readCounter (Counter mba#) = ST (\s -> case readIntArray# mba# 0# s of (# s', n# #) -> (# s', I# n# #)) {-# INLINE readCounter #-} -- | Write a new value into the 'Counter'. writeCounter :: Counter s -> Int -> ST s () writeCounter (Counter mba#) (I# n#) = ST (\s -> case writeIntArray# mba# 0# n# s of s' -> (# s', () #)) {-# INLINE writeCounter #-} -- | Increment a 'Counter' by one. incCounter :: Counter s -> ST s () incCounter c = do x <- readCounter c writeCounter c (x+1) {-# INLINE incCounter #-} -- | Decrement a 'Counter' by one. decCounter :: Counter s -> ST s () decCounter c = do x <- readCounter c writeCounter c (x-1) {-# INLINE decCounter #-} -------------------------------------------------------------------------------- -- Array support -- | Copy a 'BS.ByteString' and create a primitive 'Prim.ByteArray' from it. copyByteStringToByteArray :: BS.ByteString -> Prim.ByteArray copyByteStringToByteArray (BS.PS fp off len) = unsafeDupablePerformIO $ withForeignPtr fp $ \ptr -> do mba <- Prim.newByteArray len copyPtrToMutableByteArray (ptr `plusPtr` off) mba 0 len Prim.unsafeFreezeByteArray mba -- TODO FIXME: can do better here: can do non-copying for larger pinned arrays -- or copy directly into the builder buffer -- | Copy a 'Prim.ByteArray' at a certain offset and length into a -- 'BS.ByteString'. copyByteArrayToByteString :: Prim.ByteArray -- ^ 'Prim.ByteArray' to copy from. -> Int -- ^ Offset into the 'Prim.ByteArray' to start with. -> Int -- ^ Length of the data to copy. -> BS.ByteString copyByteArrayToByteString ba off len = unsafeDupablePerformIO $ do fp <- BS.mallocByteString len withForeignPtr fp $ \ptr -> do copyByteArrayToPtr ba off ptr len return (BS.PS fp 0 len) -- | Copy the data pointed to by a 'Ptr' into a @'MutableByteArray'. copyPtrToMutableByteArray :: Ptr a -- ^ 'Ptr' to buffer to copy from. -> MutableByteArray RealWorld -- ^ 'MutableByteArray' to copy into. -> Int -- ^ Offset to start copying from. -> Int -- ^ Length of the data to copy. -> IO () copyPtrToMutableByteArray (Ptr addr#) (MutableByteArray mba#) (I# off#) (I# len#) = IO (\s -> case copyAddrToByteArray# addr# mba# off# len# s of s' -> (# s', () #)) -- | Copy a 'ByteArray' into a 'Ptr' with a given offset and length. copyByteArrayToPtr :: ByteArray -- ^ 'ByteArray' to copy. -> Int -- ^ Offset into the 'ByteArray' of where to start copying. -> Ptr a -- ^ Pointer to destination buffer. -> Int -- ^ Length of the data to copy into the destination buffer. -> IO () copyByteArrayToPtr (ByteArray ba#) (I# off#) (Ptr addr#) (I# len#) = IO (\s -> case copyByteArrayToAddr# ba# off# addr# len# s of s' -> (# s', () #)) cborg-0.2.10.0/src/Codec/CBOR/Pretty.hs0000644000000000000000000002762107346545000015401 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE DeriveFunctor #-} {-# LANGUAGE MagicHash #-} {-# LANGUAGE RankNTypes #-} {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE StandaloneDeriving #-} {-# LANGUAGE UnboxedTuples #-} -- | -- Module : Codec.CBOR.Pretty -- Copyright : (c) Duncan Coutts 2015-2017 -- License : BSD3-style (see LICENSE.txt) -- -- Maintainer : duncan@community.haskell.org -- Stability : experimental -- Portability : non-portable (GHC extensions) -- -- Pretty printing tools for debugging and analysis. -- module Codec.CBOR.Pretty ( prettyHexEnc -- :: Encoding -> String ) where #include "cbor.h" import Data.Word import qualified Data.ByteString as S import qualified Data.Text as T import Codec.CBOR.ByteArray.Sliced import Codec.CBOR.Encoding import Codec.CBOR.Write import qualified Control.Monad.Fail as Fail import Control.Monad (replicateM_) import GHC.Int (Int64) import Numeric #if !MIN_VERSION_base(4,8,0) import Control.Applicative #endif -------------------------------------------------------------------------------- newtype PP a = PP (Tokens -> Int -> ShowS -> Either String (Tokens,Int,ShowS,a)) -- | Pretty prints an 'Encoding' in an annotated, hexadecimal format -- that maps CBOR values to their types. The output format is similar -- to the format used on http://cbor.me/. -- -- For example, with the term: -- -- @ -- 'Prelude.putStrLn' . 'prettyHexEnc' . 'Codec.CBOR.encode' $ -- ( True -- , [1,2,3::Int] -- , ('Data.Map.fromList' [(\"Hello\",True),(\"World\",False)], "This is a long string which wraps") -- ) -- @ -- -- You get: -- -- @ -- 83 # list(3) -- f5 # bool(true) -- 9f # list(*) -- 01 # int(1) -- 02 # int(2) -- 03 # int(3) -- ff # break -- 82 # list(2) -- a2 # map(2) -- 65 48 65 6c 6c 6f # text(\"Hello\") -- f5 # bool(true) -- 65 57 6f 72 6c 64 # text(\"World\") -- f4 # bool(false) -- 78 21 54 68 69 73 20 69 73 20 61 20 6c 6f 6e 67 -- 20 73 74 72 69 6e 67 20 77 68 69 63 68 20 77 72 -- 61 70 73 # text("This is a long string which wraps") -- @ -- -- @since 0.2.0.0 prettyHexEnc :: Encoding -> String prettyHexEnc e = case runPP pprint e of Left s -> s Right (TkEnd,_,ss,_) -> ss "" Right (toks,_,ss,_) -> ss $ "\nprettyEnc: Not all input was consumed (this is probably a problem with the pretty printing code). Tokens left: " ++ show toks runPP :: PP a -> Encoding -> Either String (Tokens, Int, ShowS, a) runPP (PP f) (Encoding enc) = f (enc TkEnd) 0 id deriving instance Functor PP instance Applicative PP where pure a = PP (\toks ind ss -> Right (toks, ind, ss, a)) (PP f) <*> (PP x) = PP $ \toks ind ss -> case f toks ind ss of Left s -> Left s Right (toks', ind',ss',f') -> case x toks' ind' ss' of Left s -> Left s Right (toks'', ind'', ss'', x') -> Right (toks'', ind'', ss'', f' x') instance Monad PP where (PP f) >>= g = PP $ \toks ind ss -> case f toks ind ss of Left s -> Left s Right (toks', ind', ss', x) -> let PP g' = g x in g' toks' ind' ss' return = pure #if !MIN_VERSION_base(4,13,0) fail = Fail.fail #endif instance Fail.MonadFail PP where fail s = PP $ \_ _ _ -> Left s indent :: PP () indent = PP (\toks ind ss -> Right (toks,ind,ss . (replicate ind ' ' ++),())) nl :: PP () nl = PP (\toks ind ss -> Right (toks,ind,ss . ('\n':), ())) inc :: Int -> PP () inc i = PP (\toks ind ss -> Right (toks,ind+i,ss,())) dec :: Int -> PP () dec i = inc (-i) getTerm :: PP Tokens getTerm = PP $ \toks ind ss -> case unconsToken toks of Just (tk,rest) -> Right (rest,ind,ss,tk) Nothing -> Left "getTok: Unexpected end of input" peekTerm :: PP Tokens peekTerm = PP $ \toks ind ss -> case unconsToken toks of Just (tk,_) -> Right (toks,ind,ss,tk) Nothing -> Left "peekTerm: Unexpected end of input" appShowS :: ShowS -> PP () appShowS s = PP $ \toks ind ss -> Right (toks,ind,ss . s,()) str :: String -> PP () str = appShowS . showString shown :: Show a => a -> PP () shown = appShowS . shows parens :: PP a -> PP a parens pp = str "(" *> pp <* str ")" indef :: PP () -> PP () indef pp = do tk <- peekTerm case tk of TkBreak TkEnd -> dec 3 >> pprint _ -> pp >> indef pp pprint :: PP () pprint = do nl term <- getTerm hexRep term str " " case term of TkInt i TkEnd -> ppTkInt i TkInt _ _ -> termFailure term TkInt64 i TkEnd -> ppTkInt64 i TkInt64 _ _ -> termFailure term TkInteger i TkEnd -> ppTkInteger i TkInteger _ _ -> termFailure term TkWord64 w TkEnd -> ppTkWord64 w TkWord64 _ _ -> termFailure term TkWord w TkEnd -> ppTkWord w TkWord _ _ -> termFailure term TkBytes bs TkEnd -> ppTkBytes bs TkBytes _ _ -> termFailure term TkBytesBegin TkEnd -> ppTkBytesBegin TkBytesBegin _ -> termFailure term TkByteArray ba TkEnd -> ppTkByteArray ba TkByteArray _ _ -> termFailure term TkUtf8ByteArray ba TkEnd -> ppTkUtf8ByteArray ba TkUtf8ByteArray _ _ -> termFailure term TkString t TkEnd -> ppTkString t TkString _ _ -> termFailure term TkStringBegin TkEnd -> ppTkStringBegin TkStringBegin _ -> termFailure term TkListLen w TkEnd -> ppTkListLen w TkListLen _ _ -> termFailure term TkListBegin TkEnd -> ppTkListBegin TkListBegin _ -> termFailure term TkMapLen w TkEnd -> ppTkMapLen w TkMapLen _ _ -> termFailure term TkMapBegin TkEnd -> ppTkMapBegin TkMapBegin _ -> termFailure term TkBreak TkEnd -> ppTkBreak TkBreak _ -> termFailure term TkTag w TkEnd -> ppTkTag w TkTag _ _ -> termFailure term TkTag64 w TkEnd -> ppTkTag64 w TkTag64 _ _ -> termFailure term TkBool b TkEnd -> ppTkBool b TkBool _ _ -> termFailure term TkNull TkEnd -> ppTkNull TkNull _ -> termFailure term TkUndef TkEnd -> ppTkUndef TkUndef _ -> termFailure term TkSimple w TkEnd -> ppTkSimple w TkSimple _ _ -> termFailure term TkFloat16 f TkEnd -> ppTkFloat16 f TkFloat16 _ _ -> termFailure term TkFloat32 f TkEnd -> ppTkFloat32 f TkFloat32 _ _ -> termFailure term TkFloat64 f TkEnd -> ppTkFloat64 f TkFloat64 _ _ -> termFailure term TkEncoded _ TkEnd -> ppTkEncoded TkEncoded _ _ -> termFailure term TkEnd -> str "# End of input" where termFailure t = fail $ unwords ["pprint: Unexpected token:", show t] ppTkInt :: Int -> PP () ppTkInt i = str "# int" >> parens (shown i) ppTkInt64 :: Int64 -> PP () ppTkInt64 i = str "# int" >> parens (shown i) ppTkInteger :: Integer -> PP () ppTkInteger i = str "# integer" >> parens (shown i) ppTkWord64 :: Word64 -> PP () ppTkWord64 w = str "# word" >> parens (shown w) ppTkWord :: Word -> PP () ppTkWord w = str "# word" >> parens (shown w) ppTkByteArray :: SlicedByteArray -> PP () ppTkByteArray bs = str "# bytes" >> parens (shown $ sizeofSlicedByteArray bs) ppTkUtf8ByteArray :: SlicedByteArray -> PP () ppTkUtf8ByteArray bs = str "# text" >> parens (shown $ sizeofSlicedByteArray bs) ppTkBytes :: S.ByteString -> PP () ppTkBytes bs = str "# bytes" >> parens (shown (S.length bs)) ppTkBytesBegin :: PP () ppTkBytesBegin = str "# bytes(*)" >> inc 3 >> indef pprint ppTkString :: T.Text -> PP () ppTkString t = str "# text" >> parens (shown t) ppTkStringBegin:: PP () ppTkStringBegin = str "# text(*)" >> inc 3 >> indef pprint ppTkEncoded :: PP () ppTkEncoded = str "# pre-encoded CBOR term" ppTkListLen :: Word -> PP () ppTkListLen n = do str "# list" parens (shown n) inc 3 replicateM_ (fromIntegral n) pprint dec 3 ppTkListBegin :: PP () ppTkListBegin = str "# list(*)" >> inc 3 >> indef pprint ppMapPairs :: PP () ppMapPairs = do nl inc 3 indent str " # key" pprint dec 3 -- str " [end map key]" nl inc 3 indent str " # value" pprint dec 3 -- str " [end map value]" ppTkMapLen :: Word -> PP () ppTkMapLen w = do str "# map" parens (shown w) -- inc 3 replicateM_ (fromIntegral w) ppMapPairs -- dec 3 ppTkMapBegin :: PP () ppTkMapBegin = str "# map(*)" >> inc 3 >> indef ppMapPairs ppTkBreak :: PP () ppTkBreak = str "# break" ppTkTag :: Word -> PP () ppTkTag w = do str "# tag" parens (shown w) inc 3 pprint dec 3 ppTkTag64 :: Word64 -> PP () ppTkTag64 w = do str "# tag" parens (shown w) inc 3 pprint dec 3 ppTkBool :: Bool -> PP () ppTkBool True = str "# bool" >> parens (str "true") ppTkBool False = str "# bool" >> parens (str "false") ppTkNull :: PP () ppTkNull = str "# null" ppTkUndef :: PP () ppTkUndef = str "# undefined" ppTkSimple :: Word8 -> PP () ppTkSimple w = str "# simple" >> parens (shown w) ppTkFloat16 :: Float -> PP () ppTkFloat16 f = str "# float16" >> parens (shown f) ppTkFloat32 :: Float -> PP () ppTkFloat32 f = str "# float32" >> parens (shown f) ppTkFloat64 :: Double -> PP () ppTkFloat64 f = str "# float64" >> parens (shown f) unconsToken :: Tokens -> Maybe (Tokens, Tokens) unconsToken TkEnd = Nothing unconsToken (TkWord w tks) = Just (TkWord w TkEnd,tks) unconsToken (TkWord64 w tks) = Just (TkWord64 w TkEnd,tks) unconsToken (TkInt i tks) = Just (TkInt i TkEnd,tks) unconsToken (TkInt64 i tks) = Just (TkInt64 i TkEnd,tks) unconsToken (TkBytes bs tks) = Just (TkBytes bs TkEnd,tks) unconsToken (TkBytesBegin tks) = Just (TkBytesBegin TkEnd,tks) unconsToken (TkByteArray a tks) = Just (TkByteArray a TkEnd,tks) unconsToken (TkString t tks) = Just (TkString t TkEnd,tks) unconsToken (TkStringBegin tks) = Just (TkStringBegin TkEnd,tks) unconsToken (TkUtf8ByteArray a tks) = Just (TkUtf8ByteArray a TkEnd,tks) unconsToken (TkListLen len tks) = Just (TkListLen len TkEnd,tks) unconsToken (TkListBegin tks) = Just (TkListBegin TkEnd,tks) unconsToken (TkMapLen len tks) = Just (TkMapLen len TkEnd,tks) unconsToken (TkMapBegin tks) = Just (TkMapBegin TkEnd,tks) unconsToken (TkTag w tks) = Just (TkTag w TkEnd,tks) unconsToken (TkTag64 w64 tks) = Just (TkTag64 w64 TkEnd,tks) unconsToken (TkInteger i tks) = Just (TkInteger i TkEnd,tks) unconsToken (TkNull tks) = Just (TkNull TkEnd,tks) unconsToken (TkUndef tks) = Just (TkUndef TkEnd,tks) unconsToken (TkBool b tks) = Just (TkBool b TkEnd,tks) unconsToken (TkSimple w8 tks) = Just (TkSimple w8 TkEnd,tks) unconsToken (TkFloat16 f16 tks) = Just (TkFloat16 f16 TkEnd,tks) unconsToken (TkFloat32 f32 tks) = Just (TkFloat32 f32 TkEnd,tks) unconsToken (TkFloat64 f64 tks) = Just (TkFloat64 f64 TkEnd,tks) unconsToken (TkEncoded bs tks) = Just (TkEncoded bs TkEnd,tks) unconsToken (TkBreak tks) = Just (TkBreak TkEnd,tks) hexRep :: Tokens -> PP () hexRep tk = go . toStrictByteString . Encoding $ const tk where go bs | S.length bs > 16 = case S.splitAt 16 bs of (h,t) -> indent >> appShowS (hexBS h) >> nl >> go t | otherwise = indent >> appShowS (hexBS bs) hexBS :: S.ByteString -> ShowS hexBS = foldr (.) id . map (\n -> ((if n < 16 then ('0':) else id) . showHex n . (' ':))) . S.unpack cborg-0.2.10.0/src/Codec/CBOR/Read.hs0000644000000000000000000031776507346545000015000 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE CPP #-} {-# LANGUAGE DeriveDataTypeable #-} {-# LANGUAGE FlexibleContexts #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE FunctionalDependencies #-} {-# LANGUAGE MagicHash #-} {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE RankNTypes #-} #if __GLASGOW_HASKELL__ < 900 -- Bump up from the default 1.5, otherwise our decoder fast path is no good. -- We went over the threshold when we switched to using ST. -- -- However, this flag is not supported on GHC 9.0 and later and eye-balling the -- Core suggests that the new inlining heuristics don't require it. {-# OPTIONS_GHC -funfolding-keeness-factor=2.0 #-} #endif -- | -- Module : Codec.CBOR.Read -- Copyright : (c) Duncan Coutts 2015-2017 -- License : BSD3-style (see LICENSE.txt) -- -- Maintainer : duncan@community.haskell.org -- Stability : experimental -- Portability : non-portable (GHC extensions) -- -- Tools for reading values in a CBOR-encoded format -- back into ordinary values. -- module Codec.CBOR.Read ( deserialiseFromBytes -- :: Decoder a -> ByteString -> Either String (ByteString, a) , deserialiseFromBytesWithSize -- :: Decoder a -> ByteString -> Either String (ByteString, ByteOffset, a) , deserialiseIncremental -- :: Decoder a -> ST s (IDecode s a) , DeserialiseFailure(..) , IDecode(..) , ByteOffset ) where #include "cbor.h" #if !MIN_VERSION_base(4,8,0) import Control.Applicative #endif import GHC.Int import Control.DeepSeq import Control.Monad (ap) import Control.Monad.ST import Data.Array.IArray import Data.Array.Unboxed import qualified Data.Array.Base as A import Data.Monoid import Data.Bits import Data.ByteString (ByteString) import qualified Data.ByteString as BS import qualified Data.ByteString.Unsafe as BS import qualified Data.ByteString.Lazy as LBS import qualified Data.ByteString.Lazy.Internal as LBS import qualified Data.Text as T import qualified Data.Text.Encoding as T import Data.Word import GHC.Word #if defined(ARCH_32bit) import GHC.IntWord64 #endif import GHC.Exts import GHC.Float (float2Double) import Data.Typeable import Control.Exception -- We do all numeric conversions explicitly to be careful about overflows. import Prelude hiding (fromIntegral) import qualified Codec.CBOR.ByteArray as BA import Codec.CBOR.Decoding hiding (DecodeAction(Done, Fail)) import Codec.CBOR.Decoding (DecodeAction) import qualified Codec.CBOR.Decoding as D import Codec.CBOR.Magic -------------------------------------------------------------------------------- -- | An exception type that may be returned (by pure functions) or -- thrown (by IO actions) that fail to deserialise a given input. -- -- @since 0.2.0.0 data DeserialiseFailure = DeserialiseFailure ByteOffset String deriving (Eq, Show, Typeable) instance NFData DeserialiseFailure where rnf (DeserialiseFailure offset msg) = rnf offset `seq` rnf msg `seq` () instance Exception DeserialiseFailure where #if MIN_VERSION_base(4,8,0) displayException (DeserialiseFailure off msg) = "Codec.CBOR: deserialising failed at offset " ++ show off ++ " : " ++ msg #endif -- | An Incremental decoder, used to represent the result of -- attempting to run a decoder over a given input, and return a value -- of type @a@. data IDecode s a = -- | The decoder has consumed the available input and needs more -- to continue. Provide 'Just' if more input is available and -- 'Nothing' otherwise, and you will get a new 'IDecode'. Partial (Maybe BS.ByteString -> ST s (IDecode s a)) -- | The decoder has successfully finished. Except for the output -- value you also get any unused input as well as the number of -- bytes consumed. | Done !BS.ByteString {-# UNPACK #-} !ByteOffset a -- | The decoder ran into an error. The decoder either used -- 'fail' or was not provided enough input. Contains any -- unconsumed input, the number of bytes consumed, and a -- 'DeserialiseFailure' exception describing the reason why the -- failure occurred. | Fail !BS.ByteString {-# UNPACK #-} !ByteOffset DeserialiseFailure -- | Given a 'Decoder' and some 'LBS.ByteString' representing -- an encoded CBOR value, return 'Either' the decoded CBOR value -- or an error. In addition to the decoded value return any remaining input -- content. -- -- @since 0.2.0.0 deserialiseFromBytes :: (forall s. Decoder s a) -> LBS.ByteString -> Either DeserialiseFailure (LBS.ByteString, a) deserialiseFromBytes d lbs = fmap f $ runIDecode (deserialiseIncremental d) lbs where f (rest, _, x) = (rest, x) -- | Given a 'Decoder' and some 'LBS.ByteString' representing -- an encoded CBOR value, return 'Either' the decoded CBOR value -- or an error. In addition to the decoded value return any remaining input -- content and the number of bytes consumed. -- -- @since 0.2.0.0 deserialiseFromBytesWithSize :: (forall s. Decoder s a) -> LBS.ByteString -> Either DeserialiseFailure (LBS.ByteString, ByteOffset, a) deserialiseFromBytesWithSize d lbs = runIDecode (deserialiseIncremental d) lbs runIDecode :: (forall s. ST s (IDecode s a)) -> LBS.ByteString -> Either DeserialiseFailure (LBS.ByteString, ByteOffset, a) runIDecode d lbs = runST (go lbs =<< d) where go :: LBS.ByteString -> IDecode s a -> ST s (Either DeserialiseFailure (LBS.ByteString, ByteOffset, a)) go _ (Fail _ _ err) = return (Left err) go lbs' (Done bs off x) = let rest | BS.null bs = lbs' | otherwise = LBS.Chunk bs lbs' in return (Right (rest, off, x)) go LBS.Empty (Partial k) = k Nothing >>= go LBS.Empty go (LBS.Chunk bs lbs') (Partial k) = k (Just bs) >>= go lbs' -- | Run a 'Decoder' incrementally, returning a continuation -- representing the result of the incremental decode. -- -- @since 0.2.0.0 deserialiseIncremental :: Decoder s a -> ST s (IDecode s a) deserialiseIncremental decoder = do da <- getDecodeAction decoder runIncrementalDecoder (runDecodeAction da) ---------------------------------------------- -- A monad for building incremental decoders -- newtype IncrementalDecoder s a = IncrementalDecoder { unIncrementalDecoder :: forall r. (a -> ST s (IDecode s r)) -> ST s (IDecode s r) } instance Functor (IncrementalDecoder s) where fmap f a = a >>= return . f instance Applicative (IncrementalDecoder s) where pure x = IncrementalDecoder $ \k -> k x (<*>) = ap instance Monad (IncrementalDecoder s) where return = pure {-# INLINE (>>=) #-} m >>= f = IncrementalDecoder $ \k -> unIncrementalDecoder m $ \x -> unIncrementalDecoder (f x) k runIncrementalDecoder :: IncrementalDecoder s (ByteString, ByteOffset, a) -> ST s (IDecode s a) runIncrementalDecoder (IncrementalDecoder f) = f (\(trailing, off, x) -> return $ Done trailing off x) decodeFail :: ByteString -> ByteOffset -> String -> IncrementalDecoder s a decodeFail trailing off msg = IncrementalDecoder $ \_ -> return $ Fail trailing off exn where exn = DeserialiseFailure off msg needChunk :: IncrementalDecoder s (Maybe ByteString) needChunk = IncrementalDecoder $ \k -> return $ Partial $ \mbs -> k mbs lift :: ST s a -> IncrementalDecoder s a lift action = IncrementalDecoder (\k -> action >>= k) -------------------------------------------- -- The main decoder -- -- The top level entry point runDecodeAction :: DecodeAction s a -> IncrementalDecoder s (ByteString, ByteOffset, a) runDecodeAction (D.Fail msg) = decodeFail BS.empty 0 msg runDecodeAction (D.Done x) = return (BS.empty, 0, x) runDecodeAction (D.PeekAvailable k) = lift (k 0#) >>= runDecodeAction runDecodeAction da = do mbs <- needChunk case mbs of Nothing -> decodeFail BS.empty 0 "end of input" Just bs -> go_slow da bs 0 -- The decoder is split into a fast path and a slow path. The fast path is -- used for a single input chunk. It decodes as far as it can, reading only -- whole tokens that fit within the input chunk. When it cannot read any -- further it returns control to the slow path. The slow path fixes up all the -- complicated corner cases with tokens that span chunk boundaries, gets more -- input and then goes back into the fast path. -- -- The idea is that chunks are usually large, and we can use simpler and -- faster code if we don't make it deal with the general case of tokens that -- span chunk boundaries. -- These are all the ways in which the fast path can finish, and return -- control to the slow path. In particular there are three different cases -- of tokens spanning a chunk boundary. -- data SlowPath s a = FastDone {-# UNPACK #-} !ByteString a | SlowConsumeTokenBytes {-# UNPACK #-} !ByteString (ByteString -> ST s (DecodeAction s a)) {-# UNPACK #-} !Int | SlowConsumeTokenByteArray {-# UNPACK #-} !ByteString (BA.ByteArray -> ST s (DecodeAction s a)) {-# UNPACK #-} !Int | SlowConsumeTokenString {-# UNPACK #-} !ByteString (T.Text -> ST s (DecodeAction s a)) {-# UNPACK #-} !Int | SlowConsumeTokenUtf8ByteArray {-# UNPACK #-} !ByteString (BA.ByteArray -> ST s (DecodeAction s a)) {-# UNPACK #-} !Int #if defined(ARCH_32bit) | SlowPeekByteOffset {-# UNPACK #-} !ByteString (Int64# -> ST s (DecodeAction s a)) #else | SlowPeekByteOffset {-# UNPACK #-} !ByteString (Int# -> ST s (DecodeAction s a)) #endif | SlowDecodeAction {-# UNPACK #-} !ByteString (DecodeAction s a) | SlowFail {-# UNPACK #-} !ByteString String -- The main fast path. The fast path itself is actually split into two parts -- the main version 'go_fast' and a version used when we are near the end of -- the chunk, 'go_fast_end'. -- -- This version can then do fewer tests when we're not near the end of the -- chunk, in particular we just check if there's enough input buffer space -- left for the largest possible fixed-size token (8+1 bytes). -- go_fast :: ByteString -> DecodeAction s a -> ST s (SlowPath s a) go_fast !bs da | BS.length bs < 9 = go_fast_end bs da go_fast !bs da@(ConsumeWord k) = case tryConsumeWord (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (W# w#) -> k w# >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeWord8 k) = case tryConsumeWord (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (W# w#) -> case gtWord# w# 0xff## of 0# -> k w# >>= go_fast (BS.unsafeDrop sz bs) _ -> go_fast_end bs da go_fast !bs da@(ConsumeWord16 k) = case tryConsumeWord (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (W# w#) -> case gtWord# w# 0xffff## of 0# -> k w# >>= go_fast (BS.unsafeDrop sz bs) _ -> go_fast_end bs da go_fast !bs da@(ConsumeWord32 k) = case tryConsumeWord (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (W# w#) -> #if defined(ARCH_32bit) k w# >>= go_fast (BS.unsafeDrop sz bs) #else case gtWord# w# 0xffffffff## of 0# -> k w# >>= go_fast (BS.unsafeDrop sz bs) _ -> go_fast_end bs da #endif go_fast !bs da@(ConsumeNegWord k) = case tryConsumeNegWord (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (W# w#) -> k w# >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeInt k) = case tryConsumeInt (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (I# n#) -> k n# >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeInt8 k) = case tryConsumeInt (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (I# n#) -> case (n# ># 0x7f#) `orI#` (n# <# -0x80#) of 0# -> k n# >>= go_fast (BS.unsafeDrop sz bs) _ -> go_fast_end bs da go_fast !bs da@(ConsumeInt16 k) = case tryConsumeInt (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (I# n#) -> case (n# ># 0x7fff#) `orI#` (n# <# -0x8000#) of 0# -> k n# >>= go_fast (BS.unsafeDrop sz bs) _ -> go_fast_end bs da go_fast !bs da@(ConsumeInt32 k) = case tryConsumeInt (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (I# n#) -> #if defined(ARCH_32bit) k n# >>= go_fast (BS.unsafeDrop sz bs) #else case (n# ># 0x7fffffff#) `orI#` (n# <# -0x80000000#) of 0# -> k n# >>= go_fast (BS.unsafeDrop sz bs) _ -> go_fast_end bs da #endif go_fast !bs da@(ConsumeListLen k) = case tryConsumeListLen (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (I# n#) -> k n# >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeMapLen k) = case tryConsumeMapLen (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (I# n#) -> k n# >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeTag k) = case tryConsumeTag (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (W# w#) -> k w# >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeWordCanonical k) = case tryConsumeWord (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz w@(W# w#) | isWordCanonical sz w -> k w# >>= go_fast (BS.unsafeDrop sz bs) | otherwise -> go_fast_end bs da go_fast !bs da@(ConsumeWord8Canonical k) = case tryConsumeWord (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz w@(W# w#) -> case gtWord# w# 0xff## of 0# | isWordCanonical sz w -> k w# >>= go_fast (BS.unsafeDrop sz bs) _ -> go_fast_end bs da go_fast !bs da@(ConsumeWord16Canonical k) = case tryConsumeWord (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz w@(W# w#) -> case gtWord# w# 0xffff## of 0# | isWordCanonical sz w -> k w# >>= go_fast (BS.unsafeDrop sz bs) _ -> go_fast_end bs da go_fast !bs da@(ConsumeWord32Canonical k) = case tryConsumeWord (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz w@(W# w#) -> case w_out_of_range w# of 0# | isWordCanonical sz w -> k w# >>= go_fast (BS.unsafeDrop sz bs) _ -> go_fast_end bs da where w_out_of_range :: Word# -> Int# w_out_of_range _w# = #if defined(ARCH_32bit) 0# #else gtWord# _w# 0xffffffff## #endif go_fast !bs da@(ConsumeNegWordCanonical k) = case tryConsumeNegWord (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz w@(W# w#) | isWordCanonical sz w -> k w# >>= go_fast (BS.unsafeDrop sz bs) | otherwise -> go_fast_end bs da go_fast !bs da@(ConsumeIntCanonical k) = case tryConsumeInt (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz n@(I# n#) | isIntCanonical sz n -> k n# >>= go_fast (BS.unsafeDrop sz bs) | otherwise -> go_fast_end bs da go_fast !bs da@(ConsumeInt8Canonical k) = case tryConsumeInt (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz n@(I# n#) -> case (n# ># 0x7f#) `orI#` (n# <# -0x80#) of 0# | isIntCanonical sz n -> k n# >>= go_fast (BS.unsafeDrop sz bs) _ -> go_fast_end bs da go_fast !bs da@(ConsumeInt16Canonical k) = case tryConsumeInt (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz n@(I# n#) -> case (n# ># 0x7fff#) `orI#` (n# <# -0x8000#) of 0# | isIntCanonical sz n -> k n# >>= go_fast (BS.unsafeDrop sz bs) _ -> go_fast_end bs da go_fast !bs da@(ConsumeInt32Canonical k) = case tryConsumeInt (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz n@(I# n#) -> case n_out_of_range n# of 0# | isIntCanonical sz n -> k n# >>= go_fast (BS.unsafeDrop sz bs) _ -> go_fast_end bs da where n_out_of_range :: Int# -> Int# n_out_of_range _n# = #if defined(ARCH_32bit) 0# #else (_n# ># 0x7fffffff#) `orI#` (_n# <# -0x80000000#) #endif go_fast !bs da@(ConsumeListLenCanonical k) = case tryConsumeListLen (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz n@(I# n#) -- List length can't be negative, cast it to Word. | isWordCanonical sz (intToWord n) -> k n# >>= go_fast (BS.unsafeDrop sz bs) | otherwise -> go_fast_end bs da go_fast !bs da@(ConsumeMapLenCanonical k) = case tryConsumeMapLen (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz n@(I# n#) -- Map length can't be negative, cast it to Word. | isWordCanonical sz (intToWord n) -> k n# >>= go_fast (BS.unsafeDrop sz bs) | otherwise -> go_fast_end bs da go_fast !bs da@(ConsumeTagCanonical k) = case tryConsumeTag (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz w@(W# w#) | isWordCanonical sz w -> k w# >>= go_fast (BS.unsafeDrop sz bs) | otherwise -> go_fast_end bs da #if defined(ARCH_32bit) go_fast !bs da@(ConsumeWord64 k) = case tryConsumeWord64 (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (W64# w#) -> k w# >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeNegWord64 k) = case tryConsumeNegWord64 (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (W64# w#) -> k w# >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeInt64 k) = case tryConsumeInt64 (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (I64# i#) -> k i# >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeListLen64 k) = case tryConsumeListLen64 (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (I64# i#) -> k i# >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeMapLen64 k) = case tryConsumeMapLen64 (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (I64# i#) -> k i# >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeTag64 k) = case tryConsumeTag64 (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (W64# w#) -> k w# >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeWord64Canonical k) = case tryConsumeWord64 (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz w@(W64# w#) | isWord64Canonical sz w -> k w# >>= go_fast (BS.unsafeDrop sz bs) | otherwise -> go_fast_end bs da go_fast !bs da@(ConsumeNegWord64Canonical k) = case tryConsumeNegWord64 (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz w@(W64# w#) | isWord64Canonical sz w -> k w# >>= go_fast (BS.unsafeDrop sz bs) | otherwise -> go_fast_end bs da go_fast !bs da@(ConsumeInt64Canonical k) = case tryConsumeInt64 (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz i@(I64# i#) | isInt64Canonical sz i -> k i# >>= go_fast (BS.unsafeDrop sz bs) | otherwise -> go_fast_end bs da go_fast !bs da@(ConsumeListLen64Canonical k) = case tryConsumeListLen64 (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz i@(I64# i#) -- List length can't be negative, cast it to Word64#. | isWord64Canonical sz (int64ToWord64 i) -> k i# >>= go_fast (BS.unsafeDrop sz bs) | otherwise -> go_fast_end bs da go_fast !bs da@(ConsumeMapLen64Canonical k) = case tryConsumeMapLen64 (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz i@(I64# i#) -- Map length can't be negative, cast it to Word64#. | isWord64Canonical sz (int64ToWord64 i) -> k i# >>= go_fast (BS.unsafeDrop sz bs) | otherwise -> go_fast_end bs da go_fast !bs da@(ConsumeTag64Canonical k) = case tryConsumeTag64 (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz w@(W64# w#) | isWord64Canonical sz w -> k w# >>= go_fast (BS.unsafeDrop sz bs) | otherwise -> go_fast_end bs da #endif go_fast !bs da@(ConsumeInteger k) = case tryConsumeInteger (BS.unsafeHead bs) bs of DecodedToken sz (BigIntToken _ n) -> k n >>= go_fast (BS.unsafeDrop sz bs) _ -> go_fast_end bs da go_fast !bs da@(ConsumeFloat k) = case tryConsumeFloat (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (F# f#) -> k f# >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeDouble k) = case tryConsumeDouble (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (D# f#) -> k f# >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeBytes k) = case tryConsumeBytes (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (Fits _ bstr) -> k bstr >>= go_fast (BS.unsafeDrop sz bs) DecodedToken sz (TooLong _ len) -> return $! SlowConsumeTokenBytes (BS.unsafeDrop sz bs) k len go_fast !bs da@(ConsumeByteArray k) = case tryConsumeBytes (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (Fits _ str) -> k (BA.fromByteString str) >>= go_fast (BS.unsafeDrop sz bs) DecodedToken sz (TooLong _ len) -> return $! SlowConsumeTokenByteArray (BS.unsafeDrop sz bs) k len go_fast !bs da@(ConsumeString k) = case tryConsumeString (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (Fits _ str) -> case T.decodeUtf8' str of Right t -> k t >>= go_fast (BS.unsafeDrop sz bs) Left _e -> return $! SlowFail bs "invalid UTF8" DecodedToken sz (TooLong _ len) -> return $! SlowConsumeTokenString (BS.unsafeDrop sz bs) k len go_fast !bs da@(ConsumeUtf8ByteArray k) = case tryConsumeString (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (Fits _ str) -> k (BA.fromByteString str) >>= go_fast (BS.unsafeDrop sz bs) DecodedToken sz (TooLong _ len) -> return $! SlowConsumeTokenUtf8ByteArray (BS.unsafeDrop sz bs) k len go_fast !bs da@(ConsumeBool k) = case tryConsumeBool (BS.unsafeHead bs) of DecodeFailure -> go_fast_end bs da DecodedToken sz b -> k b >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeSimple k) = case tryConsumeSimple (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (W# w#) -> k w# >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeIntegerCanonical k) = case tryConsumeInteger (BS.unsafeHead bs) bs of DecodedToken sz (BigIntToken True n) -> k n >>= go_fast (BS.unsafeDrop sz bs) _ -> go_fast_end bs da go_fast !bs da@(ConsumeFloat16Canonical k) = case tryConsumeFloat (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz f@(F# f#) | isFloat16Canonical sz bs f -> k f# >>= go_fast (BS.unsafeDrop sz bs) | otherwise -> go_fast_end bs da go_fast !bs da@(ConsumeFloatCanonical k) = case tryConsumeFloat (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz f@(F# f#) | isFloatCanonical sz bs f -> k f# >>= go_fast (BS.unsafeDrop sz bs) | otherwise -> go_fast_end bs da go_fast !bs da@(ConsumeDoubleCanonical k) = case tryConsumeDouble (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz f@(D# f#) | isDoubleCanonical sz bs f -> k f# >>= go_fast (BS.unsafeDrop sz bs) | otherwise -> go_fast_end bs da go_fast !bs da@(ConsumeBytesCanonical k) = case tryConsumeBytes (BS.unsafeHead bs) bs of DecodedToken sz (Fits True bstr) -> k bstr >>= go_fast (BS.unsafeDrop sz bs) DecodedToken sz (TooLong True len) -> return $! SlowConsumeTokenBytes (BS.unsafeDrop sz bs) k len _ -> go_fast_end bs da go_fast !bs da@(ConsumeByteArrayCanonical k) = case tryConsumeBytes (BS.unsafeHead bs) bs of DecodedToken sz (Fits True str) -> k (BA.fromByteString str) >>= go_fast (BS.unsafeDrop sz bs) DecodedToken sz (TooLong True len) -> return $! SlowConsumeTokenByteArray (BS.unsafeDrop sz bs) k len _ -> go_fast_end bs da go_fast !bs da@(ConsumeStringCanonical k) = case tryConsumeString (BS.unsafeHead bs) bs of DecodedToken sz (Fits True str) -> case T.decodeUtf8' str of Right t -> k t >>= go_fast (BS.unsafeDrop sz bs) Left _e -> return $! SlowFail bs "invalid UTF8" DecodedToken sz (TooLong True len) -> return $! SlowConsumeTokenString (BS.unsafeDrop sz bs) k len _ -> go_fast_end bs da go_fast !bs da@(ConsumeUtf8ByteArrayCanonical k) = case tryConsumeString (BS.unsafeHead bs) bs of DecodedToken sz (Fits True str) -> k (BA.fromByteString str) >>= go_fast (BS.unsafeDrop sz bs) DecodedToken sz (TooLong True len) -> return $! SlowConsumeTokenUtf8ByteArray (BS.unsafeDrop sz bs) k len _ -> go_fast_end bs da go_fast !bs da@(ConsumeSimpleCanonical k) = case tryConsumeSimple (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (W# w#) | isSimpleCanonical sz w# -> k w# >>= go_fast (BS.unsafeDrop sz bs) | otherwise -> go_fast_end bs da go_fast !bs da@(ConsumeBytesIndef k) = case tryConsumeBytesIndef (BS.unsafeHead bs) of DecodeFailure -> go_fast_end bs da DecodedToken sz _ -> k >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeStringIndef k) = case tryConsumeStringIndef (BS.unsafeHead bs) of DecodeFailure -> go_fast_end bs da DecodedToken sz _ -> k >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeListLenIndef k) = case tryConsumeListLenIndef (BS.unsafeHead bs) of DecodeFailure -> go_fast_end bs da DecodedToken sz _ -> k >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeMapLenIndef k) = case tryConsumeMapLenIndef (BS.unsafeHead bs) of DecodeFailure -> go_fast_end bs da DecodedToken sz _ -> k >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeNull k) = case tryConsumeNull (BS.unsafeHead bs) of DecodeFailure -> go_fast_end bs da DecodedToken sz _ -> k >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeListLenOrIndef k) = case tryConsumeListLenOrIndef (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (I# n#) -> k n# >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs da@(ConsumeMapLenOrIndef k) = case tryConsumeMapLenOrIndef (BS.unsafeHead bs) bs of DecodeFailure -> go_fast_end bs da DecodedToken sz (I# n#) -> k n# >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs (ConsumeBreakOr k) = case tryConsumeBreakOr (BS.unsafeHead bs) of DecodeFailure -> k False >>= go_fast bs DecodedToken sz _ -> k True >>= go_fast (BS.unsafeDrop sz bs) go_fast !bs (PeekTokenType k) = let !hdr = BS.unsafeHead bs !tkty = decodeTokenTypeTable `A.unsafeAt` word8ToInt hdr in k tkty >>= go_fast bs go_fast !bs (PeekAvailable k) = k (case BS.length bs of I# len# -> len#) >>= go_fast bs go_fast !bs da@PeekByteOffset{} = go_fast_end bs da go_fast !bs da@D.Fail{} = go_fast_end bs da go_fast !bs da@D.Done{} = go_fast_end bs da -- This variant of the fast path has to do a few more checks because we're -- near the end of the chunk. The guarantee we provide here is that we will -- decode any tokens where the whole token fits within the input buffer. So -- if we return with input buffer space still unconsumed (and we're not done -- or failed) then there's one remaining token that spans the end of the -- input chunk (the slow path fixup code relies on this guarantee). -- go_fast_end :: ByteString -> DecodeAction s a -> ST s (SlowPath s a) -- these three cases don't need any input go_fast_end !bs (D.Fail msg) = return $! SlowFail bs msg go_fast_end !bs (D.Done x) = return $! FastDone bs x go_fast_end !bs (PeekAvailable k) = k (case BS.length bs of I# len# -> len#) >>= go_fast_end bs go_fast_end !bs (PeekByteOffset k) = return $! SlowPeekByteOffset bs k -- the next two cases only need the 1 byte token header go_fast_end !bs da | BS.null bs = return $! SlowDecodeAction bs da go_fast_end !bs (ConsumeBreakOr k) = case tryConsumeBreakOr (BS.unsafeHead bs) of DecodeFailure -> k False >>= go_fast_end bs DecodedToken sz _ -> k True >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (PeekTokenType k) = let !hdr = BS.unsafeHead bs !tkty = decodeTokenTypeTable `A.unsafeAt` word8ToInt hdr in k tkty >>= go_fast_end bs -- all the remaining cases have to decode the current token go_fast_end !bs da | let !hdr = BS.unsafeHead bs , BS.length bs < tokenSize hdr = return $! SlowDecodeAction bs da go_fast_end !bs (ConsumeWord k) = case tryConsumeWord (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected word" DecodedToken sz (W# w#) -> k w# >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeWord8 k) = case tryConsumeWord (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected word8" DecodedToken sz (W# w#) -> case gtWord# w# 0xff## of 0# -> k w# >>= go_fast_end (BS.unsafeDrop sz bs) _ -> return $! SlowFail bs "expected word8" go_fast_end !bs (ConsumeWord16 k) = case tryConsumeWord (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected word16" DecodedToken sz (W# w#) -> case gtWord# w# 0xffff## of 0# -> k w# >>= go_fast_end (BS.unsafeDrop sz bs) _ -> return $! SlowFail bs "expected word16" go_fast_end !bs (ConsumeWord32 k) = case tryConsumeWord (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected word32" DecodedToken sz (W# w#) -> #if defined(ARCH_32bit) k w# >>= go_fast_end (BS.unsafeDrop sz bs) #else case gtWord# w# 0xffffffff## of 0# -> k w# >>= go_fast_end (BS.unsafeDrop sz bs) _ -> return $! SlowFail bs "expected word32" #endif go_fast_end !bs (ConsumeNegWord k) = case tryConsumeNegWord (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected negative int" DecodedToken sz (W# w#) -> k w# >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeInt k) = case tryConsumeInt (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected int" DecodedToken sz (I# n#) -> k n# >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeInt8 k) = case tryConsumeInt (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected int8" DecodedToken sz (I# n#) -> case (n# ># 0x7f#) `orI#` (n# <# -0x80#) of 0# -> k n# >>= go_fast_end (BS.unsafeDrop sz bs) _ -> return $! SlowFail bs "expected int8" go_fast_end !bs (ConsumeInt16 k) = case tryConsumeInt (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected int16" DecodedToken sz (I# n#) -> case (n# ># 0x7fff#) `orI#` (n# <# -0x8000#) of 0# -> k n# >>= go_fast_end (BS.unsafeDrop sz bs) _ -> return $! SlowFail bs "expected int16" go_fast_end !bs (ConsumeInt32 k) = case tryConsumeInt (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected int32" DecodedToken sz (I# n#) -> #if defined(ARCH_32bit) k n# >>= go_fast_end (BS.unsafeDrop sz bs) #else case (n# ># 0x7fffffff#) `orI#` (n# <# -0x80000000#) of 0# -> k n# >>= go_fast_end (BS.unsafeDrop sz bs) _ -> return $! SlowFail bs "expected int32" #endif go_fast_end !bs (ConsumeListLen k) = case tryConsumeListLen (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected list len" DecodedToken sz (I# n#) -> k n# >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeMapLen k) = case tryConsumeMapLen (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected map len" DecodedToken sz (I# n#) -> k n# >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeTag k) = case tryConsumeTag (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected tag" DecodedToken sz (W# w#) -> k w# >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeWordCanonical k) = case tryConsumeWord (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected word" DecodedToken sz w@(W# w#) | isWordCanonical sz w -> k w# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical word" go_fast_end !bs (ConsumeWord8Canonical k) = case tryConsumeWord (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected word8" DecodedToken sz w@(W# w#) -> case gtWord# w# 0xff## of 0# | isWordCanonical sz w -> k w# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical word8" _ -> return $! SlowFail bs "expected word8" go_fast_end !bs (ConsumeWord16Canonical k) = case tryConsumeWord (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected word16" DecodedToken sz w@(W# w#) -> case gtWord# w# 0xffff## of 0# | isWordCanonical sz w -> k w# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical word16" _ -> return $! SlowFail bs "expected word16" go_fast_end !bs (ConsumeWord32Canonical k) = case tryConsumeWord (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected word32" DecodedToken sz w@(W# w#) -> case w_out_of_range w# of 0# | isWordCanonical sz w -> k w# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical word32" _ -> return $! SlowFail bs "expected word32" where w_out_of_range :: Word# -> Int# w_out_of_range _w# = #if defined(ARCH_32bit) 0# #else gtWord# _w# 0xffffffff## #endif go_fast_end !bs (ConsumeNegWordCanonical k) = case tryConsumeNegWord (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected negative int" DecodedToken sz w@(W# w#) | isWordCanonical sz w -> k w# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical negative int" go_fast_end !bs (ConsumeIntCanonical k) = case tryConsumeInt (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected int" DecodedToken sz n@(I# n#) | isIntCanonical sz n -> k n# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical int" go_fast_end !bs (ConsumeInt8Canonical k) = case tryConsumeInt (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected int8" DecodedToken sz n@(I# n#) -> case (n# ># 0x7f#) `orI#` (n# <# -0x80#) of 0# | isIntCanonical sz n -> k n# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical int8" _ -> return $! SlowFail bs "expected int8" go_fast_end !bs (ConsumeInt16Canonical k) = case tryConsumeInt (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected int16" DecodedToken sz n@(I# n#) -> case (n# ># 0x7fff#) `orI#` (n# <# -0x8000#) of 0# | isIntCanonical sz n -> k n# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical int16" _ -> return $! SlowFail bs "expected int16" go_fast_end !bs (ConsumeInt32Canonical k) = case tryConsumeInt (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected int32" DecodedToken sz n@(I# n#) -> case n_out_of_range n# of 0# | isIntCanonical sz n -> k n# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical int32" _ -> return $! SlowFail bs "expected int32" where n_out_of_range :: Int# -> Int# n_out_of_range _n# = #if defined(ARCH_32bit) 0# #else (_n# ># 0x7fffffff#) `orI#` (_n# <# -0x80000000#) #endif go_fast_end !bs (ConsumeListLenCanonical k) = case tryConsumeListLen (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected list len" DecodedToken sz n@(I# n#) -- List length can't be negative, cast it to Word#. | isWordCanonical sz (intToWord n) -> k n# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical list len" go_fast_end !bs (ConsumeMapLenCanonical k) = case tryConsumeMapLen (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected map len" DecodedToken sz n@(I# n#) -- Map length can't be negative, cast it to Word#. | isWordCanonical sz (intToWord n) -> k n# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical map len" go_fast_end !bs (ConsumeTagCanonical k) = case tryConsumeTag (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected tag" DecodedToken sz w@(W# w#) | isWordCanonical sz w -> k w# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical tag" #if defined(ARCH_32bit) go_fast_end !bs (ConsumeWord64 k) = case tryConsumeWord64 (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected word64" DecodedToken sz (W64# w#) -> k w# >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeNegWord64 k) = case tryConsumeNegWord64 (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected negative int" DecodedToken sz (W64# w#) -> k w# >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeInt64 k) = case tryConsumeInt64 (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected int64" DecodedToken sz (I64# i#) -> k i# >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeListLen64 k) = case tryConsumeListLen64 (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected list len 64" DecodedToken sz (I64# i#) -> k i# >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeMapLen64 k) = case tryConsumeMapLen64 (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected map len 64" DecodedToken sz (I64# i#) -> k i# >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeTag64 k) = case tryConsumeTag64 (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected tag64" DecodedToken sz (W64# w#) -> k w# >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeWord64Canonical k) = case tryConsumeWord64 (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected word64" DecodedToken sz w@(W64# w#) | isWord64Canonical sz w -> k w# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical word64" go_fast_end !bs (ConsumeNegWord64Canonical k) = case tryConsumeNegWord64 (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected negative int" DecodedToken sz w@(W64# w#) | isWord64Canonical sz w -> k w# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical negative int" go_fast_end !bs (ConsumeInt64Canonical k) = case tryConsumeInt64 (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected int64" DecodedToken sz i@(I64# i#) | isInt64Canonical sz i -> k i# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical int64" go_fast_end !bs (ConsumeListLen64Canonical k) = case tryConsumeListLen64 (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected list len 64" DecodedToken sz i@(I64# i#) -- List length can't be negative, cast it to Word64#. | isWord64Canonical sz (int64ToWord64 i) -> k i# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical list len 64" go_fast_end !bs (ConsumeMapLen64Canonical k) = case tryConsumeMapLen64 (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected map len 64" DecodedToken sz i@(I64# i#) -- Map length can't be negative, cast it to Word64#. | isWord64Canonical sz (int64ToWord64 i) -> k i# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical map len 64" go_fast_end !bs (ConsumeTag64Canonical k) = case tryConsumeTag64 (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected tag64" DecodedToken sz w@(W64# w#) | isWord64Canonical sz w -> k w# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical tag64" #endif go_fast_end !bs (ConsumeInteger k) = case tryConsumeInteger (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected integer" DecodedToken sz (BigIntToken _ n) -> k n >>= go_fast_end (BS.unsafeDrop sz bs) DecodedToken sz (BigUIntNeedBody _ len) -> return $! SlowConsumeTokenBytes (BS.unsafeDrop sz bs) (adjustContBigUIntNeedBody k) len DecodedToken sz (BigNIntNeedBody _ len) -> return $! SlowConsumeTokenBytes (BS.unsafeDrop sz bs) (adjustContBigNIntNeedBody k) len DecodedToken sz BigUIntNeedHeader -> return $! SlowDecodeAction (BS.unsafeDrop sz bs) (adjustContBigUIntNeedHeader k) DecodedToken sz BigNIntNeedHeader -> return $! SlowDecodeAction (BS.unsafeDrop sz bs) (adjustContBigNIntNeedHeader k) go_fast_end !bs (ConsumeFloat k) = case tryConsumeFloat (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected float" DecodedToken sz (F# f#) -> k f# >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeDouble k) = case tryConsumeDouble (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected double" DecodedToken sz (D# f#) -> k f# >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeBytes k) = case tryConsumeBytes (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected bytes" DecodedToken sz (Fits _ bstr) -> k bstr >>= go_fast_end (BS.unsafeDrop sz bs) DecodedToken sz (TooLong _ len) -> return $! SlowConsumeTokenBytes (BS.unsafeDrop sz bs) k len go_fast_end !bs (ConsumeByteArray k) = case tryConsumeBytes (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected string" DecodedToken sz (Fits _ str) -> (k $! BA.fromByteString str) >>= go_fast_end (BS.unsafeDrop sz bs) DecodedToken sz (TooLong _ len) -> return $! SlowConsumeTokenByteArray (BS.unsafeDrop sz bs) k len go_fast_end !bs (ConsumeString k) = case tryConsumeString (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected string" DecodedToken sz (Fits _ str) -> case T.decodeUtf8' str of Right t -> k t >>= go_fast_end (BS.unsafeDrop sz bs) Left _e -> return $! SlowFail bs "invalid UTF8" DecodedToken sz (TooLong _ len) -> return $! SlowConsumeTokenString (BS.unsafeDrop sz bs) k len go_fast_end !bs (ConsumeUtf8ByteArray k) = case tryConsumeString (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected string" DecodedToken sz (Fits _ str) -> (k $! BA.fromByteString str) >>= go_fast_end (BS.unsafeDrop sz bs) DecodedToken sz (TooLong _ len) -> return $! SlowConsumeTokenUtf8ByteArray (BS.unsafeDrop sz bs) k len go_fast_end !bs (ConsumeBool k) = case tryConsumeBool (BS.unsafeHead bs) of DecodeFailure -> return $! SlowFail bs "expected bool" DecodedToken sz b -> k b >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeSimple k) = case tryConsumeSimple (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected simple" DecodedToken sz (W# w#) -> k w# >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeIntegerCanonical k) = case tryConsumeInteger (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected integer" DecodedToken sz (BigIntToken True n) -> k n >>= go_fast_end (BS.unsafeDrop sz bs) DecodedToken sz (BigUIntNeedBody True len) -> return $! SlowConsumeTokenBytes (BS.unsafeDrop sz bs) (adjustContCanonicalBigUIntNeedBody k) len DecodedToken sz (BigNIntNeedBody True len) -> return $! SlowConsumeTokenBytes (BS.unsafeDrop sz bs) (adjustContCanonicalBigNIntNeedBody k) len DecodedToken sz BigUIntNeedHeader -> return $! SlowDecodeAction (BS.unsafeDrop sz bs) (adjustContCanonicalBigUIntNeedHeader k) DecodedToken sz BigNIntNeedHeader -> return $! SlowDecodeAction (BS.unsafeDrop sz bs) (adjustContCanonicalBigNIntNeedHeader k) _ -> return $! SlowFail bs "non-canonical integer" go_fast_end !bs (ConsumeFloat16Canonical k) = case tryConsumeFloat (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected float" DecodedToken sz f@(F# f#) | isFloat16Canonical sz bs f -> k f# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical float16" go_fast_end !bs (ConsumeFloatCanonical k) = case tryConsumeFloat (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected float" DecodedToken sz f@(F# f#) | isFloatCanonical sz bs f -> k f# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical float" go_fast_end !bs (ConsumeDoubleCanonical k) = case tryConsumeDouble (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected double" DecodedToken sz f@(D# f#) | isDoubleCanonical sz bs f -> k f# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical double" go_fast_end !bs (ConsumeBytesCanonical k) = case tryConsumeBytes (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected bytes" DecodedToken sz token -> case token of Fits True bstr -> k bstr >>= go_fast_end (BS.unsafeDrop sz bs) TooLong True len -> return $! SlowConsumeTokenBytes (BS.unsafeDrop sz bs) k len _ -> return $! SlowFail bs "non-canonical length prefix" go_fast_end !bs (ConsumeByteArrayCanonical k) = case tryConsumeBytes (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected string" DecodedToken sz token -> case token of Fits True str -> (k $! BA.fromByteString str) >>= go_fast_end (BS.unsafeDrop sz bs) TooLong True len -> return $! SlowConsumeTokenByteArray (BS.unsafeDrop sz bs) k len _ -> return $! SlowFail bs "non-canonical length prefix" go_fast_end !bs (ConsumeStringCanonical k) = case tryConsumeString (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected string" DecodedToken sz token -> case token of Fits True str -> case T.decodeUtf8' str of Right t -> k t >>= go_fast_end (BS.unsafeDrop sz bs) Left _e -> return $! SlowFail bs "invalid UTF8" TooLong True len -> return $! SlowConsumeTokenString (BS.unsafeDrop sz bs) k len _ -> return $! SlowFail bs "non-canonical length prefix" go_fast_end !bs (ConsumeUtf8ByteArrayCanonical k) = case tryConsumeString (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected string" DecodedToken sz token -> case token of Fits True str -> (k $! BA.fromByteString str) >>= go_fast_end (BS.unsafeDrop sz bs) TooLong True len -> return $! SlowConsumeTokenUtf8ByteArray (BS.unsafeDrop sz bs) k len _ -> return $! SlowFail bs "non-canonical length prefix" go_fast_end !bs (ConsumeSimpleCanonical k) = case tryConsumeSimple (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected simple" DecodedToken sz (W# w#) | isSimpleCanonical sz w# -> k w# >>= go_fast_end (BS.unsafeDrop sz bs) | otherwise -> return $! SlowFail bs "non-canonical simple" go_fast_end !bs (ConsumeBytesIndef k) = case tryConsumeBytesIndef (BS.unsafeHead bs) of DecodeFailure -> return $! SlowFail bs "expected bytes start" DecodedToken sz _ -> k >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeStringIndef k) = case tryConsumeStringIndef (BS.unsafeHead bs) of DecodeFailure -> return $! SlowFail bs "expected string start" DecodedToken sz _ -> k >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeListLenIndef k) = case tryConsumeListLenIndef (BS.unsafeHead bs) of DecodeFailure -> return $! SlowFail bs "expected list start" DecodedToken sz _ -> k >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeMapLenIndef k) = case tryConsumeMapLenIndef (BS.unsafeHead bs) of DecodeFailure -> return $! SlowFail bs "expected map start" DecodedToken sz _ -> k >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeNull k) = case tryConsumeNull (BS.unsafeHead bs) of DecodeFailure -> return $! SlowFail bs "expected null" DecodedToken sz _ -> k >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeListLenOrIndef k) = case tryConsumeListLenOrIndef (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected list len or indef" DecodedToken sz (I# n#) -> k n# >>= go_fast_end (BS.unsafeDrop sz bs) go_fast_end !bs (ConsumeMapLenOrIndef k) = case tryConsumeMapLenOrIndef (BS.unsafeHead bs) bs of DecodeFailure -> return $! SlowFail bs "expected map len or indef" DecodedToken sz (I# n#) -> k n# >>= go_fast_end (BS.unsafeDrop sz bs) -- The slow path starts off by running the fast path on the current chunk -- then looking at where it finished, fixing up the chunk boundary issues, -- getting more input and going around again. -- -- The offset here is the offset after of all data consumed so far, -- so not including the current chunk. -- go_slow :: DecodeAction s a -> ByteString -> ByteOffset -> IncrementalDecoder s (ByteString, ByteOffset, a) go_slow da bs !offset = do slowpath <- lift $ go_fast bs da case slowpath of FastDone bs' x -> return (bs', offset', x) where !offset' = offset + intToInt64 (BS.length bs - BS.length bs') SlowConsumeTokenBytes bs' k len -> do (bstr, bs'') <- getTokenVarLen len bs' offset' lift (k bstr) >>= \daz -> go_slow daz bs'' (offset' + intToInt64 len) where !offset' = offset + intToInt64 (BS.length bs - BS.length bs') SlowConsumeTokenByteArray bs' k len -> do (bstr, bs'') <- getTokenVarLen len bs' offset' let !str = BA.fromByteString bstr lift (k str) >>= \daz -> go_slow daz bs'' (offset' + intToInt64 len) where !offset' = offset + intToInt64 (BS.length bs - BS.length bs') SlowConsumeTokenString bs' k len -> do (bstr, bs'') <- getTokenVarLen len bs' offset' case T.decodeUtf8' bstr of Right str -> lift (k str) >>= \daz -> go_slow daz bs'' (offset' + intToInt64 len) Left _e -> decodeFail bs' offset' "invalid UTF8" where !offset' = offset + intToInt64 (BS.length bs - BS.length bs') SlowConsumeTokenUtf8ByteArray bs' k len -> do (bstr, bs'') <- getTokenVarLen len bs' offset' let !str = BA.fromByteString bstr lift (k str) >>= \daz -> go_slow daz bs'' (offset' + intToInt64 len) where !offset' = offset + intToInt64 (BS.length bs - BS.length bs') -- we didn't have enough input in the buffer SlowDecodeAction bs' da' | BS.null bs' -> do -- in this case we're exactly out of input -- so we can get more input and carry on mbs <- needChunk case mbs of Nothing -> decodeFail bs' offset' "end of input" Just bs'' -> go_slow da' bs'' offset' where !offset' = offset + intToInt64 (BS.length bs - BS.length bs') SlowDecodeAction bs' da' -> -- of course we should only end up here when we really are out of -- input, otherwise go_fast_end could have continued assert (BS.length bs' < tokenSize (BS.head bs')) $ go_slow_fixup da' bs' offset' where !offset' = offset + intToInt64 (BS.length bs - BS.length bs') SlowPeekByteOffset bs' k -> lift #if MIN_VERSION_base(4,17,0) (k (int64ToInt# off#)) #else (k off#) #endif >>= \daz -> go_slow daz bs' offset' where !offset'@(I64# off#) = offset + intToInt64 (BS.length bs - BS.length bs') SlowFail bs' msg -> decodeFail bs' offset' msg where !offset' = offset + intToInt64 (BS.length bs - BS.length bs') -- The complicated case is when a token spans a chunk boundary. -- -- Our goal is to get enough input so that go_fast_end can consume exactly one -- token without need for further fixups. -- go_slow_fixup :: DecodeAction s a -> ByteString -> ByteOffset -> IncrementalDecoder s (ByteString, ByteOffset, a) go_slow_fixup da !bs !offset = do let !hdr = BS.head bs !sz = tokenSize hdr mbs <- needChunk case mbs of Nothing -> decodeFail bs offset "end of input" Just bs' -- We have enough input now, try reading one final token | BS.length bs + BS.length bs' >= sz -> go_slow_overlapped da sz bs bs' offset -- We still don't have enough input, get more | otherwise -> go_slow_fixup da (bs <> bs') offset -- We've now got more input, but we have one token that spanned the old and -- new input buffers, so we have to decode that one before carrying on go_slow_overlapped :: DecodeAction s a -> Int -> ByteString -> ByteString -> ByteOffset -> IncrementalDecoder s (ByteString, ByteOffset, a) go_slow_overlapped da sz bs_cur bs_next !offset = -- we have: -- sz the size of the pending input token -- bs_cur the tail end of the previous input buffer -- bs_next the next input chunk -- we know the old buffer is too small, but the combo is enough assert (BS.length bs_cur < sz) $ assert (BS.length bs_cur + BS.length bs_next >= sz) $ -- we make: -- bs_tok a buffer containing only the pending input token -- bs' the tail of the next input chunk, -- which will become the next input buffer let bs_tok = bs_cur <> BS.unsafeTake (sz - BS.length bs_cur) bs_next bs' = BS.unsafeDrop (sz - BS.length bs_cur) bs_next offset' = offset + intToInt64 sz in -- so the token chunk should be exactly the right size assert (BS.length bs_tok == sz) $ -- and overall we shouldn't loose any input assert (BS.length bs_cur + BS.length bs_next == sz + BS.length bs') $ do -- so now we can run the fast path to consume just this one token slowpath <- lift $ go_fast_end bs_tok da case slowpath of -- typically we'll fall out of the fast path having -- consumed exactly one token, now with no trailing data SlowDecodeAction bs_empty da' -> assert (BS.null bs_empty) $ go_slow da' bs' offset' -- but the other possibilities can happen too FastDone bs_empty x -> assert (BS.null bs_empty) $ return (bs', offset', x) SlowConsumeTokenBytes bs_empty k len -> assert (BS.null bs_empty) $ do (bstr, bs'') <- getTokenShortOrVarLen bs' offset' len lift (k bstr) >>= \daz -> go_slow daz bs'' (offset' + intToInt64 len) SlowConsumeTokenByteArray bs_empty k len -> assert (BS.null bs_empty) $ do (bstr, bs'') <- getTokenShortOrVarLen bs' offset' len let !ba = BA.fromByteString bstr lift (k ba) >>= \daz -> go_slow daz bs'' (offset' + intToInt64 len) SlowConsumeTokenString bs_empty k len -> assert (BS.null bs_empty) $ do (bstr, bs'') <- getTokenShortOrVarLen bs' offset' len case T.decodeUtf8' bstr of Right str -> lift (k str) >>= \daz -> go_slow daz bs'' (offset' + intToInt64 len) Left _e -> decodeFail bs' offset' "invalid UTF8" SlowConsumeTokenUtf8ByteArray bs_empty k len -> assert (BS.null bs_empty) $ do (bstr, bs'') <- getTokenShortOrVarLen bs' offset' len let !ba = BA.fromByteString bstr lift (k ba) >>= \daz -> go_slow daz bs'' (offset' + intToInt64 len) SlowPeekByteOffset bs_empty k -> assert (BS.null bs_empty) $ do lift #if MIN_VERSION_base(4,17,0) (k (int64ToInt# off#)) #else (k off#) #endif >>= \daz -> go_slow daz bs' offset' where !(I64# off#) = offset' SlowFail bs_unconsumed msg -> decodeFail (bs_unconsumed <> bs') offset'' msg where !offset'' = offset + intToInt64 (sz - BS.length bs_unconsumed) where {-# INLINE getTokenShortOrVarLen #-} getTokenShortOrVarLen :: BS.ByteString -> ByteOffset -> Int -> IncrementalDecoder s (ByteString, ByteString) getTokenShortOrVarLen bs' offset' len | BS.length bs' < len = getTokenVarLen len bs' offset' | otherwise = let !bstr = BS.take len bs' !bs'' = BS.drop len bs' in return (bstr, bs'') -- TODO FIXME: we can do slightly better here. If we're returning a -- lazy string (String, lazy Text, lazy ByteString) then we don't have -- to strictify here and if we're returning a strict string perhaps we -- can still stream the utf8 validation/converstion -- TODO FIXME: also consider sharing or not sharing here, and possibly -- rechunking. getTokenVarLen :: Int -> ByteString -> ByteOffset -> IncrementalDecoder s (ByteString, ByteString) getTokenVarLen len bs offset = assert (len > BS.length bs) $ do mbs <- needChunk case mbs of Nothing -> decodeFail BS.empty offset "end of input" Just bs' | let n = len - BS.length bs , BS.length bs' >= n -> let !tok = bs <> BS.unsafeTake n bs' in return (tok, BS.drop n bs') | otherwise -> getTokenVarLenSlow [bs',bs] (len - (BS.length bs + BS.length bs')) offset getTokenVarLenSlow :: [ByteString] -> Int -> ByteOffset -> IncrementalDecoder s (ByteString, ByteString) getTokenVarLenSlow bss n offset = do mbs <- needChunk case mbs of Nothing -> decodeFail BS.empty offset "end of input" Just bs | BS.length bs >= n -> let !tok = BS.concat (reverse (BS.unsafeTake n bs : bss)) in return (tok, BS.drop n bs) | otherwise -> getTokenVarLenSlow (bs:bss) (n - BS.length bs) offset tokenSize :: Word8 -> Int tokenSize hdr = word8ToInt $ decodeTableSz `A.unsafeAt` (word8ToInt hdr .&. 0x1f) decodeTableSz :: UArray Word8 Word8 decodeTableSz = array (0, 0x1f) $ [ (encodeHeader 0 n, 1) | n <- [0..0x1f] ] ++ [ (encodeHeader 0 n, s) | (n, s) <- zip [24..27] [2,3,5,9] ] decodeTokenTypeTable :: Array Word8 TokenType decodeTokenTypeTable = array (minBound, maxBound) $ [ (encodeHeader 0 n, TypeUInt) | n <- [0..26] ] ++ [ (encodeHeader 0 27, TypeUInt64) , (encodeHeader 0 31, TypeInvalid) ] ++ [ (encodeHeader 1 n, TypeNInt) | n <- [0..26] ] ++ [ (encodeHeader 1 27, TypeNInt64) , (encodeHeader 1 31, TypeInvalid) ] ++ [ (encodeHeader 2 n, TypeBytes) | n <- [0..27] ] ++ [ (encodeHeader 2 31, TypeBytesIndef) ] ++ [ (encodeHeader 3 n, TypeString) | n <- [0..27] ] ++ [ (encodeHeader 3 31, TypeStringIndef) ] ++ [ (encodeHeader 4 n, TypeListLen) | n <- [0..26] ] ++ [ (encodeHeader 4 27, TypeListLen64) , (encodeHeader 4 31, TypeListLenIndef) ] ++ [ (encodeHeader 5 n, TypeMapLen) | n <- [0..26] ] ++ [ (encodeHeader 5 27, TypeMapLen64) , (encodeHeader 5 31, TypeMapLenIndef) ] ++ [ (encodeHeader 6 n, TypeTag) | n <- 0:1:[4..26] ] ++ [ (encodeHeader 6 2, TypeInteger) , (encodeHeader 6 3, TypeInteger) , (encodeHeader 6 27, TypeTag64) , (encodeHeader 6 31, TypeInvalid) ] ++ [ (encodeHeader 7 n, TypeSimple) | n <- [0..19] ] ++ [ (encodeHeader 7 20, TypeBool) , (encodeHeader 7 21, TypeBool) , (encodeHeader 7 22, TypeNull) , (encodeHeader 7 23, TypeSimple) , (encodeHeader 7 24, TypeSimple) , (encodeHeader 7 25, TypeFloat16) , (encodeHeader 7 26, TypeFloat32) , (encodeHeader 7 27, TypeFloat64) , (encodeHeader 7 31, TypeBreak) ] ++ [ (encodeHeader mt n, TypeInvalid) | mt <- [0..7], n <- [28..30] ] encodeHeader :: Word8 -> Word8 -> Word8 encodeHeader mt ai = mt `shiftL` 5 .|. ai data DecodedToken a = DecodedToken !Int !a | DecodeFailure deriving Show -- TODO add classification for DecodeFailure -- | Note that canonicity information is calculated lazily. This way we don't -- need to concern ourselves with two distinct paths, while according to -- benchmarks it doesn't affect performance in the non-canonical case. data LongToken a = Fits Bool {- canonical? -} !a | TooLong Bool {- canonical? -} !Int deriving Show -- Canoncal NaN floats: -- -- In these float/double canonical tests we check NaNs are canonical too. -- There are lots of bit values representing NaN, for each of the flat types. -- The rule from CBOR RFC 7049, section 3.9 is that the canonical NaN is the -- CBOR term f97e00 which is the canonical half-float representation. We do -- this by testing for the size being 3 (since tryConsumeFloat/Double only -- return 3 when the header byte is 0xf9) and the 16 bytes being 0x7e00. {-# INLINE isFloat16Canonical #-} isFloat16Canonical :: Int -> BS.ByteString -> Float -> Bool isFloat16Canonical sz bs f | sz /= 3 = False | isNaN f = eatTailWord16 bs == 0x7e00 | otherwise = True {-# INLINE isFloatCanonical #-} isFloatCanonical :: Int -> BS.ByteString -> Float -> Bool isFloatCanonical sz bs f | isNaN f = sz == 3 && eatTailWord16 bs == 0x7e00 | otherwise = sz == 5 {-# INLINE isDoubleCanonical #-} isDoubleCanonical :: Int -> BS.ByteString -> Double -> Bool isDoubleCanonical sz bs f | isNaN f = sz == 3 && eatTailWord16 bs == 0x7e00 | otherwise = sz == 9 {-# INLINE isWordCanonical #-} isWordCanonical :: Int -> Word -> Bool isWordCanonical sz !w | sz == 2 = w > 0x17 | sz == 3 = w > 0xff | sz == 5 = w > 0xffff | sz == 9 = w > 0xffffffff | otherwise = True {-# INLINE isIntCanonical #-} isIntCanonical :: Int -> Int -> Bool isIntCanonical sz i | i < 0 = isWordCanonical sz (complement w) | otherwise = isWordCanonical sz w where w = intToWord i #if defined(ARCH_32bit) {-# INLINE isWord64Canonical #-} isWord64Canonical :: Int -> Word64 -> Bool isWord64Canonical sz w | sz == 2 = w > 0x17) | sz == 3 = w > 0xff) | sz == 5 = w > 0xffff) | sz == 9 = w > 0xffffffff) | otherwise = True {-# INLINE isInt64Canonical #-} isInt64Canonical :: Int -> Int64# -> Bool isInt64Canonical sz i# | isTrue# (i# `ltInt64#` intToInt64# 0#) = isWord64Canonical sz (not64# w#) | otherwise = isWord64Canonical sz w# where w# = int64ToWord64# i# #endif {-# INLINE isSimpleCanonical #-} isSimpleCanonical :: Int -> Word# -> Bool isSimpleCanonical 2 w# = isTrue# (w# `gtWord#` 0x17##) isSimpleCanonical _ _ = True -- only size 1 and 2 are possible here -- TODO FIXME: check with 7.10 and file ticket: -- a case analysis against 0x00 .. 0xff :: Word8 turns into a huge chain -- of >= tests. It could use a jump table, or at least it could use a binary -- division. Whereas for Int or Word it does the right thing. {-# INLINE tryConsumeWord #-} tryConsumeWord :: Word8 -> ByteString -> DecodedToken Word tryConsumeWord hdr !bs = case word8ToWord hdr of -- Positive integers (type 0) 0x00 -> DecodedToken 1 0 0x01 -> DecodedToken 1 1 0x02 -> DecodedToken 1 2 0x03 -> DecodedToken 1 3 0x04 -> DecodedToken 1 4 0x05 -> DecodedToken 1 5 0x06 -> DecodedToken 1 6 0x07 -> DecodedToken 1 7 0x08 -> DecodedToken 1 8 0x09 -> DecodedToken 1 9 0x0a -> DecodedToken 1 10 0x0b -> DecodedToken 1 11 0x0c -> DecodedToken 1 12 0x0d -> DecodedToken 1 13 0x0e -> DecodedToken 1 14 0x0f -> DecodedToken 1 15 0x10 -> DecodedToken 1 16 0x11 -> DecodedToken 1 17 0x12 -> DecodedToken 1 18 0x13 -> DecodedToken 1 19 0x14 -> DecodedToken 1 20 0x15 -> DecodedToken 1 21 0x16 -> DecodedToken 1 22 0x17 -> DecodedToken 1 23 0x18 -> DecodedToken 2 $! word8ToWord (eatTailWord8 bs) 0x19 -> DecodedToken 3 $! word16ToWord (eatTailWord16 bs) 0x1a -> DecodedToken 5 $! word32ToWord (eatTailWord32 bs) #if defined(ARCH_64bit) 0x1b -> DecodedToken 9 $! word64ToWord (eatTailWord64 bs) #else 0x1b -> case word64ToWord (eatTailWord64 bs) of Just n -> DecodedToken 9 n Nothing -> DecodeFailure #endif _ -> DecodeFailure {-# INLINE tryConsumeNegWord #-} tryConsumeNegWord :: Word8 -> ByteString -> DecodedToken Word tryConsumeNegWord hdr !bs = case word8ToWord hdr of -- Positive integers (type 0) 0x20 -> DecodedToken 1 0 0x21 -> DecodedToken 1 1 0x22 -> DecodedToken 1 2 0x23 -> DecodedToken 1 3 0x24 -> DecodedToken 1 4 0x25 -> DecodedToken 1 5 0x26 -> DecodedToken 1 6 0x27 -> DecodedToken 1 7 0x28 -> DecodedToken 1 8 0x29 -> DecodedToken 1 9 0x2a -> DecodedToken 1 10 0x2b -> DecodedToken 1 11 0x2c -> DecodedToken 1 12 0x2d -> DecodedToken 1 13 0x2e -> DecodedToken 1 14 0x2f -> DecodedToken 1 15 0x30 -> DecodedToken 1 16 0x31 -> DecodedToken 1 17 0x32 -> DecodedToken 1 18 0x33 -> DecodedToken 1 19 0x34 -> DecodedToken 1 20 0x35 -> DecodedToken 1 21 0x36 -> DecodedToken 1 22 0x37 -> DecodedToken 1 23 0x38 -> DecodedToken 2 $! (word8ToWord (eatTailWord8 bs)) 0x39 -> DecodedToken 3 $! (word16ToWord (eatTailWord16 bs)) 0x3a -> DecodedToken 5 $! (word32ToWord (eatTailWord32 bs)) #if defined(ARCH_64bit) 0x3b -> DecodedToken 9 $! (word64ToWord (eatTailWord64 bs)) #else 0x3b -> case word64ToWord (eatTailWord64 bs) of Just n -> DecodedToken 9 n Nothing -> DecodeFailure #endif _ -> DecodeFailure {-# INLINE tryConsumeInt #-} tryConsumeInt :: Word8 -> ByteString -> DecodedToken Int tryConsumeInt hdr !bs = case word8ToWord hdr of -- Positive integers (type 0) 0x00 -> DecodedToken 1 0 0x01 -> DecodedToken 1 1 0x02 -> DecodedToken 1 2 0x03 -> DecodedToken 1 3 0x04 -> DecodedToken 1 4 0x05 -> DecodedToken 1 5 0x06 -> DecodedToken 1 6 0x07 -> DecodedToken 1 7 0x08 -> DecodedToken 1 8 0x09 -> DecodedToken 1 9 0x0a -> DecodedToken 1 10 0x0b -> DecodedToken 1 11 0x0c -> DecodedToken 1 12 0x0d -> DecodedToken 1 13 0x0e -> DecodedToken 1 14 0x0f -> DecodedToken 1 15 0x10 -> DecodedToken 1 16 0x11 -> DecodedToken 1 17 0x12 -> DecodedToken 1 18 0x13 -> DecodedToken 1 19 0x14 -> DecodedToken 1 20 0x15 -> DecodedToken 1 21 0x16 -> DecodedToken 1 22 0x17 -> DecodedToken 1 23 0x18 -> DecodedToken 2 $! (word8ToInt (eatTailWord8 bs)) 0x19 -> DecodedToken 3 $! (word16ToInt (eatTailWord16 bs)) #if defined(ARCH_64bit) 0x1a -> DecodedToken 5 $! (word32ToInt (eatTailWord32 bs)) #else 0x1a -> case word32ToInt (eatTailWord32 bs) of Just n -> DecodedToken 5 n Nothing -> DecodeFailure #endif 0x1b -> case word64ToInt (eatTailWord64 bs) of Just n -> DecodedToken 9 n Nothing -> DecodeFailure -- Negative integers (type 1) 0x20 -> DecodedToken 1 (-1) 0x21 -> DecodedToken 1 (-2) 0x22 -> DecodedToken 1 (-3) 0x23 -> DecodedToken 1 (-4) 0x24 -> DecodedToken 1 (-5) 0x25 -> DecodedToken 1 (-6) 0x26 -> DecodedToken 1 (-7) 0x27 -> DecodedToken 1 (-8) 0x28 -> DecodedToken 1 (-9) 0x29 -> DecodedToken 1 (-10) 0x2a -> DecodedToken 1 (-11) 0x2b -> DecodedToken 1 (-12) 0x2c -> DecodedToken 1 (-13) 0x2d -> DecodedToken 1 (-14) 0x2e -> DecodedToken 1 (-15) 0x2f -> DecodedToken 1 (-16) 0x30 -> DecodedToken 1 (-17) 0x31 -> DecodedToken 1 (-18) 0x32 -> DecodedToken 1 (-19) 0x33 -> DecodedToken 1 (-20) 0x34 -> DecodedToken 1 (-21) 0x35 -> DecodedToken 1 (-22) 0x36 -> DecodedToken 1 (-23) 0x37 -> DecodedToken 1 (-24) 0x38 -> DecodedToken 2 $! (-1 - word8ToInt (eatTailWord8 bs)) 0x39 -> DecodedToken 3 $! (-1 - word16ToInt (eatTailWord16 bs)) #if defined(ARCH_64bit) 0x3a -> DecodedToken 5 $! (-1 - word32ToInt (eatTailWord32 bs)) #else 0x3a -> case word32ToInt (eatTailWord32 bs) of Just n -> DecodedToken 5 (-1 - n) Nothing -> DecodeFailure #endif 0x3b -> case word64ToInt (eatTailWord64 bs) of Just n -> DecodedToken 9 (-1 - n) Nothing -> DecodeFailure _ -> DecodeFailure {-# INLINE tryConsumeInteger #-} tryConsumeInteger :: Word8 -> ByteString -> DecodedToken (BigIntToken Integer) tryConsumeInteger hdr !bs = case word8ToWord hdr of -- Positive integers (type 0) 0x00 -> DecodedToken 1 (BigIntToken True 0) 0x01 -> DecodedToken 1 (BigIntToken True 1) 0x02 -> DecodedToken 1 (BigIntToken True 2) 0x03 -> DecodedToken 1 (BigIntToken True 3) 0x04 -> DecodedToken 1 (BigIntToken True 4) 0x05 -> DecodedToken 1 (BigIntToken True 5) 0x06 -> DecodedToken 1 (BigIntToken True 6) 0x07 -> DecodedToken 1 (BigIntToken True 7) 0x08 -> DecodedToken 1 (BigIntToken True 8) 0x09 -> DecodedToken 1 (BigIntToken True 9) 0x0a -> DecodedToken 1 (BigIntToken True 10) 0x0b -> DecodedToken 1 (BigIntToken True 11) 0x0c -> DecodedToken 1 (BigIntToken True 12) 0x0d -> DecodedToken 1 (BigIntToken True 13) 0x0e -> DecodedToken 1 (BigIntToken True 14) 0x0f -> DecodedToken 1 (BigIntToken True 15) 0x10 -> DecodedToken 1 (BigIntToken True 16) 0x11 -> DecodedToken 1 (BigIntToken True 17) 0x12 -> DecodedToken 1 (BigIntToken True 18) 0x13 -> DecodedToken 1 (BigIntToken True 19) 0x14 -> DecodedToken 1 (BigIntToken True 20) 0x15 -> DecodedToken 1 (BigIntToken True 21) 0x16 -> DecodedToken 1 (BigIntToken True 22) 0x17 -> DecodedToken 1 (BigIntToken True 23) 0x18 -> let !w = eatTailWord8 bs sz = 2 in DecodedToken sz (BigIntToken (isWordCanonical sz (word8ToWord w)) $! toInteger w) 0x19 -> let !w = eatTailWord16 bs sz = 3 in DecodedToken sz (BigIntToken (isWordCanonical sz (word16ToWord w)) $! toInteger w) 0x1a -> let !w = eatTailWord32 bs sz = 5 in DecodedToken sz (BigIntToken (isWordCanonical sz (word32ToWord w)) $! toInteger w) 0x1b -> let !w = eatTailWord64 bs sz = 9 #if defined(ARCH_32bit) in DecodedToken sz (BigIntToken (isWord64Canonical sz (word64ToWord w)) $! toInteger w) #else in DecodedToken sz (BigIntToken (isWordCanonical sz (word64ToWord w)) $! toInteger w) #endif -- Negative integers (type 1) 0x20 -> DecodedToken 1 (BigIntToken True (-1)) 0x21 -> DecodedToken 1 (BigIntToken True (-2)) 0x22 -> DecodedToken 1 (BigIntToken True (-3)) 0x23 -> DecodedToken 1 (BigIntToken True (-4)) 0x24 -> DecodedToken 1 (BigIntToken True (-5)) 0x25 -> DecodedToken 1 (BigIntToken True (-6)) 0x26 -> DecodedToken 1 (BigIntToken True (-7)) 0x27 -> DecodedToken 1 (BigIntToken True (-8)) 0x28 -> DecodedToken 1 (BigIntToken True (-9)) 0x29 -> DecodedToken 1 (BigIntToken True (-10)) 0x2a -> DecodedToken 1 (BigIntToken True (-11)) 0x2b -> DecodedToken 1 (BigIntToken True (-12)) 0x2c -> DecodedToken 1 (BigIntToken True (-13)) 0x2d -> DecodedToken 1 (BigIntToken True (-14)) 0x2e -> DecodedToken 1 (BigIntToken True (-15)) 0x2f -> DecodedToken 1 (BigIntToken True (-16)) 0x30 -> DecodedToken 1 (BigIntToken True (-17)) 0x31 -> DecodedToken 1 (BigIntToken True (-18)) 0x32 -> DecodedToken 1 (BigIntToken True (-19)) 0x33 -> DecodedToken 1 (BigIntToken True (-20)) 0x34 -> DecodedToken 1 (BigIntToken True (-21)) 0x35 -> DecodedToken 1 (BigIntToken True (-22)) 0x36 -> DecodedToken 1 (BigIntToken True (-23)) 0x37 -> DecodedToken 1 (BigIntToken True (-24)) 0x38 -> let !w = eatTailWord8 bs sz = 2 in DecodedToken sz (BigIntToken (isWordCanonical sz (word8ToWord w)) $! (-1 - toInteger w)) 0x39 -> let !w = eatTailWord16 bs sz = 3 in DecodedToken sz (BigIntToken (isWordCanonical sz (word16ToWord w)) $! (-1 - toInteger w)) 0x3a -> let !w = eatTailWord32 bs sz = 5 in DecodedToken sz (BigIntToken (isWordCanonical sz (word32ToWord w)) $! (-1 - toInteger w)) 0x3b -> let !w = eatTailWord64 bs sz = 9 #if defined(ARCH_32bit) in DecodedToken sz (BigIntToken (isWord64Canonical sz (word64ToWord w)) $! (-1 - toInteger w)) #else in DecodedToken sz (BigIntToken (isWordCanonical sz (word64ToWord w)) $! (-1 - toInteger w)) #endif 0xc2 -> readBigUInt bs 0xc3 -> readBigNInt bs _ -> DecodeFailure {-# INLINE tryConsumeBytes #-} tryConsumeBytes :: Word8 -> ByteString -> DecodedToken (LongToken ByteString) tryConsumeBytes hdr !bs = case word8ToWord hdr of -- Bytes (type 2) 0x40 -> readBytesSmall 0 bs 0x41 -> readBytesSmall 1 bs 0x42 -> readBytesSmall 2 bs 0x43 -> readBytesSmall 3 bs 0x44 -> readBytesSmall 4 bs 0x45 -> readBytesSmall 5 bs 0x46 -> readBytesSmall 6 bs 0x47 -> readBytesSmall 7 bs 0x48 -> readBytesSmall 8 bs 0x49 -> readBytesSmall 9 bs 0x4a -> readBytesSmall 10 bs 0x4b -> readBytesSmall 11 bs 0x4c -> readBytesSmall 12 bs 0x4d -> readBytesSmall 13 bs 0x4e -> readBytesSmall 14 bs 0x4f -> readBytesSmall 15 bs 0x50 -> readBytesSmall 16 bs 0x51 -> readBytesSmall 17 bs 0x52 -> readBytesSmall 18 bs 0x53 -> readBytesSmall 19 bs 0x54 -> readBytesSmall 20 bs 0x55 -> readBytesSmall 21 bs 0x56 -> readBytesSmall 22 bs 0x57 -> readBytesSmall 23 bs 0x58 -> readBytes8 bs 0x59 -> readBytes16 bs 0x5a -> readBytes32 bs 0x5b -> readBytes64 bs _ -> DecodeFailure {-# INLINE tryConsumeString #-} tryConsumeString :: Word8 -> ByteString -> DecodedToken (LongToken ByteString) tryConsumeString hdr !bs = case word8ToWord hdr of -- Strings (type 3) 0x60 -> readBytesSmall 0 bs 0x61 -> readBytesSmall 1 bs 0x62 -> readBytesSmall 2 bs 0x63 -> readBytesSmall 3 bs 0x64 -> readBytesSmall 4 bs 0x65 -> readBytesSmall 5 bs 0x66 -> readBytesSmall 6 bs 0x67 -> readBytesSmall 7 bs 0x68 -> readBytesSmall 8 bs 0x69 -> readBytesSmall 9 bs 0x6a -> readBytesSmall 10 bs 0x6b -> readBytesSmall 11 bs 0x6c -> readBytesSmall 12 bs 0x6d -> readBytesSmall 13 bs 0x6e -> readBytesSmall 14 bs 0x6f -> readBytesSmall 15 bs 0x70 -> readBytesSmall 16 bs 0x71 -> readBytesSmall 17 bs 0x72 -> readBytesSmall 18 bs 0x73 -> readBytesSmall 19 bs 0x74 -> readBytesSmall 20 bs 0x75 -> readBytesSmall 21 bs 0x76 -> readBytesSmall 22 bs 0x77 -> readBytesSmall 23 bs 0x78 -> readBytes8 bs 0x79 -> readBytes16 bs 0x7a -> readBytes32 bs 0x7b -> readBytes64 bs _ -> DecodeFailure {-# INLINE tryConsumeListLen #-} tryConsumeListLen :: Word8 -> ByteString -> DecodedToken Int tryConsumeListLen hdr !bs = case word8ToWord hdr of -- List structures (type 4) 0x80 -> DecodedToken 1 0 0x81 -> DecodedToken 1 1 0x82 -> DecodedToken 1 2 0x83 -> DecodedToken 1 3 0x84 -> DecodedToken 1 4 0x85 -> DecodedToken 1 5 0x86 -> DecodedToken 1 6 0x87 -> DecodedToken 1 7 0x88 -> DecodedToken 1 8 0x89 -> DecodedToken 1 9 0x8a -> DecodedToken 1 10 0x8b -> DecodedToken 1 11 0x8c -> DecodedToken 1 12 0x8d -> DecodedToken 1 13 0x8e -> DecodedToken 1 14 0x8f -> DecodedToken 1 15 0x90 -> DecodedToken 1 16 0x91 -> DecodedToken 1 17 0x92 -> DecodedToken 1 18 0x93 -> DecodedToken 1 19 0x94 -> DecodedToken 1 20 0x95 -> DecodedToken 1 21 0x96 -> DecodedToken 1 22 0x97 -> DecodedToken 1 23 0x98 -> DecodedToken 2 (word8ToInt (eatTailWord8 bs)) 0x99 -> DecodedToken 3 (word16ToInt (eatTailWord16 bs)) #if defined(ARCH_64bit) 0x9a -> DecodedToken 5 (word32ToInt (eatTailWord32 bs)) #else 0x9a -> case word32ToInt (eatTailWord32 bs) of Just n -> DecodedToken 5 n Nothing -> DecodeFailure #endif 0x9b -> case word64ToInt (eatTailWord64 bs) of Just n -> DecodedToken 9 n Nothing -> DecodeFailure _ -> DecodeFailure {-# INLINE tryConsumeMapLen #-} tryConsumeMapLen :: Word8 -> ByteString -> DecodedToken Int tryConsumeMapLen hdr !bs = case word8ToWord hdr of -- Map structures (type 5) 0xa0 -> DecodedToken 1 0 0xa1 -> DecodedToken 1 1 0xa2 -> DecodedToken 1 2 0xa3 -> DecodedToken 1 3 0xa4 -> DecodedToken 1 4 0xa5 -> DecodedToken 1 5 0xa6 -> DecodedToken 1 6 0xa7 -> DecodedToken 1 7 0xa8 -> DecodedToken 1 8 0xa9 -> DecodedToken 1 9 0xaa -> DecodedToken 1 10 0xab -> DecodedToken 1 11 0xac -> DecodedToken 1 12 0xad -> DecodedToken 1 13 0xae -> DecodedToken 1 14 0xaf -> DecodedToken 1 15 0xb0 -> DecodedToken 1 16 0xb1 -> DecodedToken 1 17 0xb2 -> DecodedToken 1 18 0xb3 -> DecodedToken 1 19 0xb4 -> DecodedToken 1 20 0xb5 -> DecodedToken 1 21 0xb6 -> DecodedToken 1 22 0xb7 -> DecodedToken 1 23 0xb8 -> DecodedToken 2 $! (word8ToInt (eatTailWord8 bs)) 0xb9 -> DecodedToken 3 $! (word16ToInt (eatTailWord16 bs)) #if defined(ARCH_64bit) 0xba -> DecodedToken 5 $! (word32ToInt (eatTailWord32 bs)) #else 0xba -> case word32ToInt (eatTailWord32 bs) of Just n -> DecodedToken 5 n Nothing -> DecodeFailure #endif 0xbb -> case word64ToInt (eatTailWord64 bs) of Just n -> DecodedToken 9 n Nothing -> DecodeFailure _ -> DecodeFailure {-# INLINE tryConsumeListLenIndef #-} tryConsumeListLenIndef :: Word8 -> DecodedToken () tryConsumeListLenIndef hdr = case word8ToWord hdr of 0x9f -> DecodedToken 1 () _ -> DecodeFailure {-# INLINE tryConsumeMapLenIndef #-} tryConsumeMapLenIndef :: Word8 -> DecodedToken () tryConsumeMapLenIndef hdr = case word8ToWord hdr of 0xbf -> DecodedToken 1 () _ -> DecodeFailure {-# INLINE tryConsumeListLenOrIndef #-} tryConsumeListLenOrIndef :: Word8 -> ByteString -> DecodedToken Int tryConsumeListLenOrIndef hdr !bs = case word8ToWord hdr of -- List structures (type 4) 0x80 -> DecodedToken 1 0 0x81 -> DecodedToken 1 1 0x82 -> DecodedToken 1 2 0x83 -> DecodedToken 1 3 0x84 -> DecodedToken 1 4 0x85 -> DecodedToken 1 5 0x86 -> DecodedToken 1 6 0x87 -> DecodedToken 1 7 0x88 -> DecodedToken 1 8 0x89 -> DecodedToken 1 9 0x8a -> DecodedToken 1 10 0x8b -> DecodedToken 1 11 0x8c -> DecodedToken 1 12 0x8d -> DecodedToken 1 13 0x8e -> DecodedToken 1 14 0x8f -> DecodedToken 1 15 0x90 -> DecodedToken 1 16 0x91 -> DecodedToken 1 17 0x92 -> DecodedToken 1 18 0x93 -> DecodedToken 1 19 0x94 -> DecodedToken 1 20 0x95 -> DecodedToken 1 21 0x96 -> DecodedToken 1 22 0x97 -> DecodedToken 1 23 0x98 -> DecodedToken 2 $! (word8ToInt (eatTailWord8 bs)) 0x99 -> DecodedToken 3 $! (word16ToInt (eatTailWord16 bs)) #if defined(ARCH_64bit) 0x9a -> DecodedToken 5 $! (word32ToInt (eatTailWord32 bs)) #else 0x9a -> case word32ToInt (eatTailWord32 bs) of Just n -> DecodedToken 5 n Nothing -> DecodeFailure #endif 0x9b -> case word64ToInt (eatTailWord64 bs) of Just n -> DecodedToken 9 n Nothing -> DecodeFailure 0x9f -> DecodedToken 1 (-1) -- indefinite length _ -> DecodeFailure {-# INLINE tryConsumeMapLenOrIndef #-} tryConsumeMapLenOrIndef :: Word8 -> ByteString -> DecodedToken Int tryConsumeMapLenOrIndef hdr !bs = case word8ToWord hdr of -- Map structures (type 5) 0xa0 -> DecodedToken 1 0 0xa1 -> DecodedToken 1 1 0xa2 -> DecodedToken 1 2 0xa3 -> DecodedToken 1 3 0xa4 -> DecodedToken 1 4 0xa5 -> DecodedToken 1 5 0xa6 -> DecodedToken 1 6 0xa7 -> DecodedToken 1 7 0xa8 -> DecodedToken 1 8 0xa9 -> DecodedToken 1 9 0xaa -> DecodedToken 1 10 0xab -> DecodedToken 1 11 0xac -> DecodedToken 1 12 0xad -> DecodedToken 1 13 0xae -> DecodedToken 1 14 0xaf -> DecodedToken 1 15 0xb0 -> DecodedToken 1 16 0xb1 -> DecodedToken 1 17 0xb2 -> DecodedToken 1 18 0xb3 -> DecodedToken 1 19 0xb4 -> DecodedToken 1 20 0xb5 -> DecodedToken 1 21 0xb6 -> DecodedToken 1 22 0xb7 -> DecodedToken 1 23 0xb8 -> DecodedToken 2 $! (word8ToInt (eatTailWord8 bs)) 0xb9 -> DecodedToken 3 $! (word16ToInt (eatTailWord16 bs)) #if defined(ARCH_64bit) 0xba -> DecodedToken 5 $! (word32ToInt (eatTailWord32 bs)) #else 0xba -> case word32ToInt (eatTailWord32 bs) of Just n -> DecodedToken 5 n Nothing -> DecodeFailure #endif 0xbb -> case word64ToInt (eatTailWord64 bs) of Just n -> DecodedToken 9 n Nothing -> DecodeFailure 0xbf -> DecodedToken 1 (-1) -- indefinite length _ -> DecodeFailure {-# INLINE tryConsumeTag #-} tryConsumeTag :: Word8 -> ByteString -> DecodedToken Word tryConsumeTag hdr !bs = case word8ToWord hdr of -- Tagged values (type 6) 0xc0 -> DecodedToken 1 0 0xc1 -> DecodedToken 1 1 0xc2 -> DecodedToken 1 2 0xc3 -> DecodedToken 1 3 0xc4 -> DecodedToken 1 4 0xc5 -> DecodedToken 1 5 0xc6 -> DecodedToken 1 6 0xc7 -> DecodedToken 1 7 0xc8 -> DecodedToken 1 8 0xc9 -> DecodedToken 1 9 0xca -> DecodedToken 1 10 0xcb -> DecodedToken 1 11 0xcc -> DecodedToken 1 12 0xcd -> DecodedToken 1 13 0xce -> DecodedToken 1 14 0xcf -> DecodedToken 1 15 0xd0 -> DecodedToken 1 16 0xd1 -> DecodedToken 1 17 0xd2 -> DecodedToken 1 18 0xd3 -> DecodedToken 1 19 0xd4 -> DecodedToken 1 20 0xd5 -> DecodedToken 1 21 0xd6 -> DecodedToken 1 22 0xd7 -> DecodedToken 1 23 0xd8 -> DecodedToken 2 $! (word8ToWord (eatTailWord8 bs)) 0xd9 -> DecodedToken 3 $! (word16ToWord (eatTailWord16 bs)) 0xda -> DecodedToken 5 $! (word32ToWord (eatTailWord32 bs)) #if defined(ARCH_64bit) 0xdb -> DecodedToken 9 $! (word64ToWord (eatTailWord64 bs)) #else 0xdb -> case word64ToWord (eatTailWord64 bs) of Just n -> DecodedToken 9 n Nothing -> DecodeFailure #endif _ -> DecodeFailure -- -- 64-on-32 bit code paths -- #if defined(ARCH_32bit) tryConsumeWord64 :: Word8 -> ByteString -> DecodedToken Word64 tryConsumeWord64 hdr !bs = case word8ToWord hdr of -- Positive integers (type 0) 0x00 -> DecodedToken 1 0 0x01 -> DecodedToken 1 1 0x02 -> DecodedToken 1 2 0x03 -> DecodedToken 1 3 0x04 -> DecodedToken 1 4 0x05 -> DecodedToken 1 5 0x06 -> DecodedToken 1 6 0x07 -> DecodedToken 1 7 0x08 -> DecodedToken 1 8 0x09 -> DecodedToken 1 9 0x0a -> DecodedToken 1 10 0x0b -> DecodedToken 1 11 0x0c -> DecodedToken 1 12 0x0d -> DecodedToken 1 13 0x0e -> DecodedToken 1 14 0x0f -> DecodedToken 1 15 0x10 -> DecodedToken 1 16 0x11 -> DecodedToken 1 17 0x12 -> DecodedToken 1 18 0x13 -> DecodedToken 1 19 0x14 -> DecodedToken 1 20 0x15 -> DecodedToken 1 21 0x16 -> DecodedToken 1 22 0x17 -> DecodedToken 1 23 0x18 -> DecodedToken 2 $! (word8ToWord64 (eatTailWord8 bs)) 0x19 -> DecodedToken 3 $! (word16ToWord64 (eatTailWord16 bs)) 0x1a -> DecodedToken 5 $! (word32ToWord64 (eatTailWord32 bs)) 0x1b -> DecodedToken 9 $! (eatTailWord64 bs) _ -> DecodeFailure {-# INLINE tryConsumeWord64 #-} tryConsumeNegWord64 :: Word8 -> ByteString -> DecodedToken Word64 tryConsumeNegWord64 hdr !bs = case word8ToWord hdr of -- Positive integers (type 0) 0x20 -> DecodedToken 1 0 0x21 -> DecodedToken 1 1 0x22 -> DecodedToken 1 2 0x23 -> DecodedToken 1 3 0x24 -> DecodedToken 1 4 0x25 -> DecodedToken 1 5 0x26 -> DecodedToken 1 6 0x27 -> DecodedToken 1 7 0x28 -> DecodedToken 1 8 0x29 -> DecodedToken 1 9 0x2a -> DecodedToken 1 10 0x2b -> DecodedToken 1 11 0x2c -> DecodedToken 1 12 0x2d -> DecodedToken 1 13 0x2e -> DecodedToken 1 14 0x2f -> DecodedToken 1 15 0x30 -> DecodedToken 1 16 0x31 -> DecodedToken 1 17 0x32 -> DecodedToken 1 18 0x33 -> DecodedToken 1 19 0x34 -> DecodedToken 1 20 0x35 -> DecodedToken 1 21 0x36 -> DecodedToken 1 22 0x37 -> DecodedToken 1 23 0x38 -> DecodedToken 2 $! (word8ToWord64 (eatTailWord8 bs)) 0x39 -> DecodedToken 3 $! (word16ToWord64 (eatTailWord16 bs)) 0x3a -> DecodedToken 5 $! (word32ToWord64 (eatTailWord32 bs)) 0x3b -> DecodedToken 9 $! (eatTailWord64 bs) _ -> DecodeFailure {-# INLINE tryConsumeNegWord64 #-} tryConsumeInt64 :: Word8 -> ByteString -> DecodedToken Int64 tryConsumeInt64 hdr !bs = case word8ToWord hdr of -- Positive integers (type 0) 0x00 -> DecodedToken 1 0 0x01 -> DecodedToken 1 1 0x02 -> DecodedToken 1 2 0x03 -> DecodedToken 1 3 0x04 -> DecodedToken 1 4 0x05 -> DecodedToken 1 5 0x06 -> DecodedToken 1 6 0x07 -> DecodedToken 1 7 0x08 -> DecodedToken 1 8 0x09 -> DecodedToken 1 9 0x0a -> DecodedToken 1 10 0x0b -> DecodedToken 1 11 0x0c -> DecodedToken 1 12 0x0d -> DecodedToken 1 13 0x0e -> DecodedToken 1 14 0x0f -> DecodedToken 1 15 0x10 -> DecodedToken 1 16 0x11 -> DecodedToken 1 17 0x12 -> DecodedToken 1 18 0x13 -> DecodedToken 1 19 0x14 -> DecodedToken 1 20 0x15 -> DecodedToken 1 21 0x16 -> DecodedToken 1 22 0x17 -> DecodedToken 1 23 0x18 -> DecodedToken 2 $! (word8ToInt64 (eatTailWord8 bs)) 0x19 -> DecodedToken 3 $! (word16ToInt64 (eatTailWord16 bs)) 0x1a -> DecodedToken 5 $! (word32ToInt64 (eatTailWord32 bs)) 0x1b -> case word64ToInt64 (eatTailWord64 bs) of Just n -> DecodedToken 9 n Nothing -> DecodeFailure -- Negative integers (type 1) 0x20 -> DecodedToken 1 (-1) 0x21 -> DecodedToken 1 (-2) 0x22 -> DecodedToken 1 (-3) 0x23 -> DecodedToken 1 (-4) 0x24 -> DecodedToken 1 (-5) 0x25 -> DecodedToken 1 (-6) 0x26 -> DecodedToken 1 (-7) 0x27 -> DecodedToken 1 (-8) 0x28 -> DecodedToken 1 (-9) 0x29 -> DecodedToken 1 (-10) 0x2a -> DecodedToken 1 (-11) 0x2b -> DecodedToken 1 (-12) 0x2c -> DecodedToken 1 (-13) 0x2d -> DecodedToken 1 (-14) 0x2e -> DecodedToken 1 (-15) 0x2f -> DecodedToken 1 (-16) 0x30 -> DecodedToken 1 (-17) 0x31 -> DecodedToken 1 (-18) 0x32 -> DecodedToken 1 (-19) 0x33 -> DecodedToken 1 (-20) 0x34 -> DecodedToken 1 (-21) 0x35 -> DecodedToken 1 (-22) 0x36 -> DecodedToken 1 (-23) 0x37 -> DecodedToken 1 (-24) 0x38 -> DecodedToken 2 $! (-1 - word8ToInt64 (eatTailWord8 bs)) 0x39 -> DecodedToken 3 $! (-1 - word16ToInt64 (eatTailWord16 bs)) 0x3a -> DecodedToken 5 $! (-1 - word32ToInt64 (eatTailWord32 bs)) 0x3b -> case word64ToInt64 (eatTailWord64 bs) of Just n -> DecodedToken 9 (-1 - n) Nothing -> DecodeFailure _ -> DecodeFailure {-# INLINE tryConsumeInt64 #-} tryConsumeListLen64 :: Word8 -> ByteString -> DecodedToken Int64 tryConsumeListLen64 hdr !bs = case word8ToWord hdr of -- List structures (type 4) 0x80 -> DecodedToken 1 0 0x81 -> DecodedToken 1 1 0x82 -> DecodedToken 1 2 0x83 -> DecodedToken 1 3 0x84 -> DecodedToken 1 4 0x85 -> DecodedToken 1 5 0x86 -> DecodedToken 1 6 0x87 -> DecodedToken 1 7 0x88 -> DecodedToken 1 8 0x89 -> DecodedToken 1 9 0x8a -> DecodedToken 1 10 0x8b -> DecodedToken 1 11 0x8c -> DecodedToken 1 12 0x8d -> DecodedToken 1 13 0x8e -> DecodedToken 1 14 0x8f -> DecodedToken 1 15 0x90 -> DecodedToken 1 16 0x91 -> DecodedToken 1 17 0x92 -> DecodedToken 1 18 0x93 -> DecodedToken 1 19 0x94 -> DecodedToken 1 20 0x95 -> DecodedToken 1 21 0x96 -> DecodedToken 1 22 0x97 -> DecodedToken 1 23 0x98 -> DecodedToken 2 $! (word8ToInt64 (eatTailWord8 bs)) 0x99 -> DecodedToken 3 $! (word16ToInt64 (eatTailWord16 bs)) 0x9a -> DecodedToken 5 $! (word32ToInt64 (eatTailWord32 bs)) 0x9b -> case word64ToInt64 (eatTailWord64 bs) of Just n -> DecodedToken 9 n Nothing -> DecodeFailure _ -> DecodeFailure {-# INLINE tryConsumeListLen64 #-} tryConsumeMapLen64 :: Word8 -> ByteString -> DecodedToken Int64 tryConsumeMapLen64 hdr !bs = case word8ToWord hdr of -- Map structures (type 5) 0xa0 -> DecodedToken 1 0 0xa1 -> DecodedToken 1 1 0xa2 -> DecodedToken 1 2 0xa3 -> DecodedToken 1 3 0xa4 -> DecodedToken 1 4 0xa5 -> DecodedToken 1 5 0xa6 -> DecodedToken 1 6 0xa7 -> DecodedToken 1 7 0xa8 -> DecodedToken 1 8 0xa9 -> DecodedToken 1 9 0xaa -> DecodedToken 1 10 0xab -> DecodedToken 1 11 0xac -> DecodedToken 1 12 0xad -> DecodedToken 1 13 0xae -> DecodedToken 1 14 0xaf -> DecodedToken 1 15 0xb0 -> DecodedToken 1 16 0xb1 -> DecodedToken 1 17 0xb2 -> DecodedToken 1 18 0xb3 -> DecodedToken 1 19 0xb4 -> DecodedToken 1 20 0xb5 -> DecodedToken 1 21 0xb6 -> DecodedToken 1 22 0xb7 -> DecodedToken 1 23 0xb8 -> DecodedToken 2 $! (word8ToInt64 (eatTailWord8 bs)) 0xb9 -> DecodedToken 3 $! (word16ToInt64 (eatTailWord16 bs)) 0xba -> DecodedToken 5 $! (word32ToInt64 (eatTailWord32 bs)) 0xbb -> case word64ToInt64 (eatTailWord64 bs) of Just n -> DecodedToken 9 n Nothing -> DecodeFailure _ -> DecodeFailure {-# INLINE tryConsumeMapLen64 #-} tryConsumeTag64 :: Word8 -> ByteString -> DecodedToken Word64 tryConsumeTag64 hdr !bs = case word8ToWord hdr of -- Tagged values (type 6) 0xc0 -> DecodedToken 1 0 0xc1 -> DecodedToken 1 1 0xc2 -> DecodedToken 1 2 0xc3 -> DecodedToken 1 3 0xc4 -> DecodedToken 1 4 0xc5 -> DecodedToken 1 5 0xc6 -> DecodedToken 1 6 0xc7 -> DecodedToken 1 7 0xc8 -> DecodedToken 1 8 0xc9 -> DecodedToken 1 9 0xca -> DecodedToken 1 10 0xcb -> DecodedToken 1 11 0xcc -> DecodedToken 1 12 0xcd -> DecodedToken 1 13 0xce -> DecodedToken 1 14 0xcf -> DecodedToken 1 15 0xd0 -> DecodedToken 1 16 0xd1 -> DecodedToken 1 17 0xd2 -> DecodedToken 1 18 0xd3 -> DecodedToken 1 19 0xd4 -> DecodedToken 1 20 0xd5 -> DecodedToken 1 21 0xd6 -> DecodedToken 1 22 0xd7 -> DecodedToken 1 23 0xd8 -> DecodedToken 2 $! (word8ToWord64 (eatTailWord8 bs)) 0xd9 -> DecodedToken 3 $! (word16ToWord64 (eatTailWord16 bs)) 0xda -> DecodedToken 5 $! (word32ToWord64 (eatTailWord32 bs)) 0xdb -> DecodedToken 9 $! (eatTailWord64 bs) _ -> DecodeFailure {-# INLINE tryConsumeTag64 #-} #endif {-# INLINE tryConsumeFloat #-} tryConsumeFloat :: Word8 -> ByteString -> DecodedToken Float tryConsumeFloat hdr !bs = case word8ToWord hdr of 0xf9 -> DecodedToken 3 $! (wordToFloat16 (eatTailWord16 bs)) 0xfa -> DecodedToken 5 $! (wordToFloat32 (eatTailWord32 bs)) _ -> DecodeFailure {-# INLINE tryConsumeDouble #-} tryConsumeDouble :: Word8 -> ByteString -> DecodedToken Double tryConsumeDouble hdr !bs = case word8ToWord hdr of 0xf9 -> DecodedToken 3 $! (float2Double $ wordToFloat16 (eatTailWord16 bs)) 0xfa -> DecodedToken 5 $! (float2Double $ wordToFloat32 (eatTailWord32 bs)) 0xfb -> DecodedToken 9 $! (wordToFloat64 (eatTailWord64 bs)) _ -> DecodeFailure {-# INLINE tryConsumeBool #-} tryConsumeBool :: Word8 -> DecodedToken Bool tryConsumeBool hdr = case word8ToWord hdr of 0xf4 -> DecodedToken 1 False 0xf5 -> DecodedToken 1 True _ -> DecodeFailure {-# INLINE tryConsumeSimple #-} tryConsumeSimple :: Word8 -> ByteString -> DecodedToken Word tryConsumeSimple hdr !bs = case word8ToWord hdr of -- Simple and floats (type 7) 0xe0 -> DecodedToken 1 0 0xe1 -> DecodedToken 1 1 0xe2 -> DecodedToken 1 2 0xe3 -> DecodedToken 1 3 0xe4 -> DecodedToken 1 4 0xe5 -> DecodedToken 1 5 0xe6 -> DecodedToken 1 6 0xe7 -> DecodedToken 1 7 0xe8 -> DecodedToken 1 8 0xe9 -> DecodedToken 1 9 0xea -> DecodedToken 1 10 0xeb -> DecodedToken 1 11 0xec -> DecodedToken 1 12 0xed -> DecodedToken 1 13 0xee -> DecodedToken 1 14 0xef -> DecodedToken 1 15 0xf0 -> DecodedToken 1 16 0xf1 -> DecodedToken 1 17 0xf2 -> DecodedToken 1 18 0xf3 -> DecodedToken 1 19 0xf4 -> DecodedToken 1 20 0xf5 -> DecodedToken 1 21 0xf6 -> DecodedToken 1 22 0xf7 -> DecodedToken 1 23 0xf8 -> DecodedToken 2 $! (word8ToWord (eatTailWord8 bs)) _ -> DecodeFailure {-# INLINE tryConsumeBytesIndef #-} tryConsumeBytesIndef :: Word8 -> DecodedToken () tryConsumeBytesIndef hdr = case word8ToWord hdr of 0x5f -> DecodedToken 1 () _ -> DecodeFailure {-# INLINE tryConsumeStringIndef #-} tryConsumeStringIndef :: Word8 -> DecodedToken () tryConsumeStringIndef hdr = case word8ToWord hdr of 0x7f -> DecodedToken 1 () _ -> DecodeFailure {-# INLINE tryConsumeNull #-} tryConsumeNull :: Word8 -> DecodedToken () tryConsumeNull hdr = case word8ToWord hdr of 0xf6 -> DecodedToken 1 () _ -> DecodeFailure {-# INLINE tryConsumeBreakOr #-} tryConsumeBreakOr :: Word8 -> DecodedToken () tryConsumeBreakOr hdr = case word8ToWord hdr of 0xff -> DecodedToken 1 () _ -> DecodeFailure {-# INLINE readBytesSmall #-} readBytesSmall :: Int -> ByteString -> DecodedToken (LongToken ByteString) readBytesSmall n bs -- if n <= bound then ok return it all | n + hdrsz <= BS.length bs = DecodedToken (n+hdrsz) $ Fits True $ BS.unsafeTake n (BS.unsafeDrop hdrsz bs) -- if n > bound then slow path, multi-chunk | otherwise = DecodedToken hdrsz $ TooLong True n where hdrsz = 1 {-# INLINE readBytes8 #-} {-# INLINE readBytes16 #-} {-# INLINE readBytes32 #-} {-# INLINE readBytes64 #-} readBytes8, readBytes16, readBytes32, readBytes64 :: ByteString -> DecodedToken (LongToken ByteString) readBytes8 bs | n <= BS.length bs - hdrsz = DecodedToken (n+hdrsz) $ Fits lengthCanonical $ BS.unsafeTake n (BS.unsafeDrop hdrsz bs) -- if n > bound then slow path, multi-chunk | otherwise = DecodedToken hdrsz $ TooLong lengthCanonical n where hdrsz = 2 !n = word8ToInt (eatTailWord8 bs) lengthCanonical = isIntCanonical hdrsz n readBytes16 bs | n <= BS.length bs - hdrsz = DecodedToken (n+hdrsz) $ Fits lengthCanonical $ BS.unsafeTake n (BS.unsafeDrop hdrsz bs) -- if n > bound then slow path, multi-chunk | otherwise = DecodedToken hdrsz $ TooLong lengthCanonical n where hdrsz = 3 !n = word16ToInt (eatTailWord16 bs) lengthCanonical = isIntCanonical hdrsz n readBytes32 bs = case word32ToInt (eatTailWord32 bs) of #if defined(ARCH_32bit) Just n #else n #endif | n <= BS.length bs - hdrsz -> DecodedToken (n+hdrsz) $ Fits (isIntCanonical hdrsz n) $ BS.unsafeTake n (BS.unsafeDrop hdrsz bs) -- if n > bound then slow path, multi-chunk | otherwise -> DecodedToken hdrsz $ TooLong (isIntCanonical hdrsz n) n #if defined(ARCH_32bit) Nothing -> DecodeFailure #endif where hdrsz = 5 readBytes64 bs = case word64ToInt (eatTailWord64 bs) of Just n | n <= BS.length bs - hdrsz -> DecodedToken (n+hdrsz) $ Fits (isIntCanonical hdrsz n) $ BS.unsafeTake n (BS.unsafeDrop hdrsz bs) -- if n > bound then slow path, multi-chunk | otherwise -> DecodedToken hdrsz $ TooLong (isIntCanonical hdrsz n) n Nothing -> DecodeFailure where hdrsz = 9 ------------------------------------------------------------------------------ -- Reading big integers -- -- Big ints consist of two CBOR tokens: a tag token (2 for positive, 3 for -- negative) followed by a bytes token. Our usual invariant (for go_fast and -- go_fast_end) only guarantees that we've got enough space to decode the -- first token. So given that there's two tokens and the second is variable -- length then there are several points where we can discover we're out of -- input buffer space. -- -- In those cases we need to break out of the fast path but we must arrange -- things so that we can continue later once we've got more input buffer. -- -- In particular, we might run out of space when: -- 1. trying to decode the header of the second token (bytes); or -- 2. trying to read the bytes body -- --- The existing mechanisms we've got to drop out of the fast path are: -- * SlowDecodeAction to re-read a whole token -- * SlowConsumeTokenBytes to read the body of a bytes token -- -- Of course when we resume we need to convert the bytes into an integer. -- Rather than making new fast path return mechanisms we can reuse the -- existing ones, so long as we're prepared to allocate new continuation -- closures. This seems a reasonable price to pay to reduce complexity since -- decoding a big int across an input buffer boundary ought to be rare, and -- allocating a new continuation closure isn't that expensive. -- -- Note that canonicity information is calculated lazily. This way we don't need -- to concern ourselves with two distinct paths, while according to benchmarks -- it doesn't affect performance in the non-canonical case. data BigIntToken a = BigIntToken Bool {- canonical? -} Integer | BigUIntNeedBody Bool {- canonical? -} Int | BigNIntNeedBody Bool {- canonical? -} Int | BigUIntNeedHeader | BigNIntNeedHeader -- So when we have to break out because we can't read the whole bytes body -- in one go then we need to use SlowConsumeTokenBytes but we can adjust the -- continuation so that when we get the ByteString back we convert it to an -- Integer before calling the original continuation. adjustContBigUIntNeedBody, adjustContBigNIntNeedBody :: (Integer -> ST s (DecodeAction s a)) -> (ByteString -> ST s (DecodeAction s a)) adjustContBigUIntNeedBody k = \bs -> k $! uintegerFromBytes bs adjustContBigNIntNeedBody k = \bs -> k $! nintegerFromBytes bs adjustContCanonicalBigUIntNeedBody, adjustContCanonicalBigNIntNeedBody :: (Integer -> ST s (DecodeAction s a)) -> (ByteString -> ST s (DecodeAction s a)) adjustContCanonicalBigUIntNeedBody k = \bs -> if isBigIntRepCanonical bs then k $! uintegerFromBytes bs else pure $! D.Fail ("non-canonical integer") adjustContCanonicalBigNIntNeedBody k = \bs -> if isBigIntRepCanonical bs then k $! nintegerFromBytes bs else pure $! D.Fail ("non-canonical integer") -- And when we have to break out because we can't read the bytes token header -- in one go then we need to use SlowDecodeAction but we have to make two -- adjustments. When we resume we need to read a bytes token, not a big int. -- That is we don't want to re-read the tag token. Indeed we cannot even if we -- wanted to because the slow path code only guarantees to arrange for one -- complete token header in the input buffer. So we must pretend that we did -- in fact want to read a bytes token using ConsumeBytes, and then we can -- adjust the continuation for that in the same way as above. adjustContBigUIntNeedHeader, adjustContBigNIntNeedHeader :: (Integer -> ST s (DecodeAction s a)) -> DecodeAction s a adjustContBigUIntNeedHeader k = ConsumeBytes (\bs -> k $! uintegerFromBytes bs) adjustContBigNIntNeedHeader k = ConsumeBytes (\bs -> k $! nintegerFromBytes bs) adjustContCanonicalBigUIntNeedHeader, adjustContCanonicalBigNIntNeedHeader :: (Integer -> ST s (DecodeAction s a)) -> DecodeAction s a adjustContCanonicalBigUIntNeedHeader k = ConsumeBytesCanonical $ \bs -> if isBigIntRepCanonical bs then k $! uintegerFromBytes bs else pure $! D.Fail ("non-canonical integer") adjustContCanonicalBigNIntNeedHeader k = ConsumeBytesCanonical $ \bs -> if isBigIntRepCanonical bs then k $! nintegerFromBytes bs else pure $! D.Fail ("non-canonical integer") -- So finally when reading the input buffer we check if we have enough space -- to read the header of the bytes token and then try to read the bytes body, -- using the appropriate break-out codes as above. {-# INLINE readBigUInt #-} readBigUInt :: ByteString -> DecodedToken (BigIntToken a) readBigUInt bs | let bs' = BS.unsafeTail bs , not (BS.null bs') , let !hdr = BS.unsafeHead bs' , BS.length bs' >= tokenSize hdr = case tryConsumeBytes hdr bs' of DecodeFailure -> DecodeFailure DecodedToken sz (Fits canonical bstr) -> DecodedToken (1+sz) (BigIntToken (canonical && isBigIntRepCanonical bstr) (uintegerFromBytes bstr)) DecodedToken sz (TooLong canonical len) -> DecodedToken (1+sz) (BigUIntNeedBody canonical len) | otherwise = DecodedToken 1 BigUIntNeedHeader {-# INLINE readBigNInt #-} readBigNInt :: ByteString -> DecodedToken (BigIntToken a) readBigNInt bs | let bs' = BS.unsafeTail bs , not (BS.null bs') , let !hdr = BS.unsafeHead bs' , BS.length bs' >= tokenSize hdr = case tryConsumeBytes hdr bs' of DecodeFailure -> DecodeFailure DecodedToken sz (Fits canonical bstr) -> DecodedToken (1+sz) (BigIntToken (canonical && isBigIntRepCanonical bstr) (nintegerFromBytes bstr)) DecodedToken sz (TooLong canonical len) -> DecodedToken (1+sz) (BigNIntNeedBody canonical len) | otherwise = DecodedToken 1 BigNIntNeedHeader -- Binary representation of a big integer is canonical if it's at least 9 bytes -- long (as for smaller values the canonical representation is the same one as -- for Int) and the leading byte is not zero (meaning that it's the smallest -- representation for the number in question). isBigIntRepCanonical :: ByteString -> Bool isBigIntRepCanonical bstr = BS.length bstr > 8 && BS.unsafeHead bstr /= 0x00 cborg-0.2.10.0/src/Codec/CBOR/Term.hs0000644000000000000000000002265007346545000015016 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE BangPatterns #-} -- | -- Module : Codec.CBOR.Term -- Copyright : (c) Duncan Coutts 2015-2017 -- License : BSD3-style (see LICENSE.txt) -- -- Maintainer : duncan@community.haskell.org -- Stability : experimental -- Portability : non-portable (GHC extensions) -- -- This module provides an interface for decoding and encoding arbitrary -- CBOR values (ones that, for example, may not have been generated by this -- library). -- -- Using 'decodeTerm', you can decode an arbitrary CBOR value given to you -- into a 'Term', which represents a CBOR value as an AST. -- -- Similarly, if you wanted to encode some value into a CBOR value directly, -- you can wrap it in a 'Term' constructor and use 'encodeTerm'. This -- would be useful, as an example, if you needed to serialise some value into -- a CBOR term that is not compatible with that types 'Serialise' instance. -- -- Because this interface gives you the ability to decode or encode any -- arbitrary CBOR term, it can also be seen as an alternative interface to the -- 'Codec.CBOR.Encoding' and -- 'Codec.CBOR.Decoding' modules. -- module Codec.CBOR.Term ( Term(..) -- :: * , encodeTerm -- :: Term -> Encoding , decodeTerm -- :: Decoder Term ) where #include "cbor.h" import Codec.CBOR.Encoding hiding (Tokens(..)) import Codec.CBOR.Decoding import Data.Word import qualified Data.Text as T import qualified Data.Text.Lazy as LT import qualified Data.ByteString as BS import qualified Data.ByteString.Lazy as LBS import Data.Monoid import Control.Applicative import Prelude hiding (encodeFloat, decodeFloat) -------------------------------------------------------------------------------- -- Types -- | A general CBOR term, which can be used to serialise or deserialise -- arbitrary CBOR terms for interoperability or debugging. This type is -- essentially a direct reflection of the CBOR abstract syntax tree as a -- Haskell data type. -- -- The 'Term' type also comes with a 'Serialise' instance, so you can -- easily use @'decode' :: 'Decoder' 'Term'@ to directly decode any arbitrary -- CBOR value into Haskell with ease, and likewise with 'encode'. -- -- @since 0.2.0.0 data Term = TInt {-# UNPACK #-} !Int | TInteger !Integer | TBytes !BS.ByteString | TBytesI !LBS.ByteString | TString !T.Text | TStringI !LT.Text | TList ![Term] | TListI ![Term] | TMap ![(Term, Term)] | TMapI ![(Term, Term)] | TTagged {-# UNPACK #-} !Word64 !Term | TBool !Bool | TNull | TSimple {-# UNPACK #-} !Word8 | THalf {-# UNPACK #-} !Float | TFloat {-# UNPACK #-} !Float | TDouble {-# UNPACK #-} !Double deriving (Eq, Ord, Show, Read) -------------------------------------------------------------------------------- -- Main API -- | Encode an arbitrary 'Term' into an 'Encoding' for later serialization. -- -- @since 0.2.0.0 encodeTerm :: Term -> Encoding encodeTerm (TInt n) = encodeInt n encodeTerm (TInteger n) = encodeInteger n encodeTerm (TBytes bs) = encodeBytes bs encodeTerm (TString st) = encodeString st encodeTerm (TBytesI bss) = encodeBytesIndef <> mconcat [ encodeBytes bs | bs <- LBS.toChunks bss ] <> encodeBreak encodeTerm (TStringI sts) = encodeStringIndef <> mconcat [ encodeString str | str <- LT.toChunks sts ] <> encodeBreak encodeTerm (TList ts) = encodeListLen (fromIntegral $ length ts) <> mconcat [ encodeTerm t | t <- ts ] encodeTerm (TListI ts) = encodeListLenIndef <> mconcat [ encodeTerm t | t <- ts ] <> encodeBreak encodeTerm (TMap ts) = encodeMapLen (fromIntegral $ length ts) <> mconcat [ encodeTerm t <> encodeTerm t' | (t, t') <- ts ] encodeTerm (TMapI ts) = encodeMapLenIndef <> mconcat [ encodeTerm t <> encodeTerm t' | (t, t') <- ts ] <> encodeBreak encodeTerm (TTagged w t) = encodeTag64 w <> encodeTerm t encodeTerm (TBool b) = encodeBool b encodeTerm TNull = encodeNull encodeTerm (TSimple w) = encodeSimple w encodeTerm (THalf f) = encodeFloat16 f encodeTerm (TFloat f) = encodeFloat f encodeTerm (TDouble f) = encodeDouble f -- | Decode some arbitrary CBOR value into a 'Term'. -- -- @since 0.2.0.0 decodeTerm :: Decoder s Term decodeTerm = do tkty <- peekTokenType case tkty of TypeUInt -> do w <- decodeWord return $! fromWord w where fromWord :: Word -> Term fromWord w | w <= fromIntegral (maxBound :: Int) = TInt (fromIntegral w) | otherwise = TInteger (fromIntegral w) TypeUInt64 -> do w <- decodeWord64 return $! fromWord64 w where fromWord64 w | w <= fromIntegral (maxBound :: Int) = TInt (fromIntegral w) | otherwise = TInteger (fromIntegral w) TypeNInt -> do w <- decodeNegWord return $! fromNegWord w where fromNegWord w | w <= fromIntegral (maxBound :: Int) = TInt (-1 - fromIntegral w) | otherwise = TInteger (-1 - fromIntegral w) TypeNInt64 -> do w <- decodeNegWord64 return $! fromNegWord64 w where fromNegWord64 w | w <= fromIntegral (maxBound :: Int) = TInt (-1 - fromIntegral w) | otherwise = TInteger (-1 - fromIntegral w) TypeInteger -> do !x <- decodeInteger return (TInteger x) TypeFloat16 -> do !x <- decodeFloat return (THalf x) TypeFloat32 -> do !x <- decodeFloat return (TFloat x) TypeFloat64 -> do !x <- decodeDouble return (TDouble x) TypeBytes -> do !x <- decodeBytes return (TBytes x) TypeBytesIndef -> decodeBytesIndef >> decodeBytesIndefLen [] TypeString -> do !x <- decodeString return (TString x) TypeStringIndef -> decodeStringIndef >> decodeStringIndefLen [] TypeListLen -> decodeListLen >>= flip decodeListN [] TypeListLen64 -> decodeListLen >>= flip decodeListN [] TypeListLenIndef -> decodeListLenIndef >> decodeListIndefLen [] TypeMapLen -> decodeMapLen >>= flip decodeMapN [] TypeMapLen64 -> decodeMapLen >>= flip decodeMapN [] TypeMapLenIndef -> decodeMapLenIndef >> decodeMapIndefLen [] TypeTag -> do !x <- decodeTag64 !y <- decodeTerm return (TTagged x y) TypeTag64 -> do !x <- decodeTag64 !y <- decodeTerm return (TTagged x y) TypeBool -> do !x <- decodeBool return (TBool x) TypeNull -> TNull <$ decodeNull TypeSimple -> do !x <- decodeSimple return (TSimple x) TypeBreak -> fail "unexpected break" TypeInvalid -> fail "invalid token encoding" -------------------------------------------------------------------------------- -- Internal utilities decodeBytesIndefLen :: [BS.ByteString] -> Decoder s Term decodeBytesIndefLen acc = do stop <- decodeBreakOr if stop then return $! TBytesI (LBS.fromChunks (reverse acc)) else do !bs <- decodeBytes decodeBytesIndefLen (bs : acc) decodeStringIndefLen :: [T.Text] -> Decoder s Term decodeStringIndefLen acc = do stop <- decodeBreakOr if stop then return $! TStringI (LT.fromChunks (reverse acc)) else do !str <- decodeString decodeStringIndefLen (str : acc) decodeListN :: Int -> [Term] -> Decoder s Term decodeListN !n acc = case n of 0 -> return $! TList (reverse acc) _ -> do !t <- decodeTerm decodeListN (n-1) (t : acc) decodeListIndefLen :: [Term] -> Decoder s Term decodeListIndefLen acc = do stop <- decodeBreakOr if stop then return $! TListI (reverse acc) else do !tm <- decodeTerm decodeListIndefLen (tm : acc) decodeMapN :: Int -> [(Term, Term)] -> Decoder s Term decodeMapN !n acc = case n of 0 -> return $! TMap (reverse acc) _ -> do !tm <- decodeTerm !tm' <- decodeTerm decodeMapN (n-1) ((tm, tm') : acc) decodeMapIndefLen :: [(Term, Term)] -> Decoder s Term decodeMapIndefLen acc = do stop <- decodeBreakOr if stop then return $! TMapI (reverse acc) else do !tm <- decodeTerm !tm' <- decodeTerm decodeMapIndefLen ((tm, tm') : acc) cborg-0.2.10.0/src/Codec/CBOR/Write.hs0000644000000000000000000007006707346545000015206 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE CPP #-} {-# LANGUAGE DeriveFunctor #-} {-# LANGUAGE MagicHash #-} {-# LANGUAGE RankNTypes #-} {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE PatternSynonyms #-} #include "cbor.h" #if defined(OPTIMIZE_GMP) && defined(HAVE_GHC_BIGNUM) {-# LANGUAGE UnboxedSums #-} #endif -- | -- Module : Codec.CBOR.Write -- Copyright : (c) Duncan Coutts 2015-2017 -- License : BSD3-style (see LICENSE.txt) -- -- Maintainer : duncan@community.haskell.org -- Stability : experimental -- Portability : non-portable (GHC extensions) -- -- Functions for writing out CBOR 'Encoding' values in a variety of forms. -- module Codec.CBOR.Write ( toBuilder -- :: Encoding -> B.Builder , toLazyByteString -- :: Encoding -> L.ByteString , toStrictByteString -- :: Encoding -> S.ByteString ) where import Data.Bits import Data.Int #if ! MIN_VERSION_base(4,11,0) import Data.Monoid #endif import Data.Word import Foreign.Ptr import qualified Data.ByteString as S import qualified Data.ByteString.Builder as B import qualified Data.ByteString.Builder.Internal as BI import Data.ByteString.Builder.Prim (condB, (>$<), (>*<)) import qualified Data.ByteString.Builder.Prim as P import qualified Data.ByteString.Builder.Prim.Internal as PI import qualified Data.ByteString.Lazy as L import qualified Data.Text as T import qualified Data.Text.Encoding as T #if MIN_VERSION_text(2,0,0) import qualified Data.Text.Foreign as T #endif import Control.Exception.Base (assert) import GHC.Exts #if defined(OPTIMIZE_GMP) #if defined(HAVE_GHC_BIGNUM) import GHC.IO (IO(IO)) import qualified GHC.Num.Integer import qualified GHC.Num.BigNat as Gmp import qualified GHC.Num.BigNat import GHC.Num.BigNat (BigNat) #else import qualified GHC.Integer.GMP.Internals as Gmp import GHC.Integer.GMP.Internals (BigNat) #endif #endif #if __GLASGOW_HASKELL__ < 710 import GHC.Word #endif import qualified Codec.CBOR.ByteArray.Sliced as BAS import Codec.CBOR.Encoding import Codec.CBOR.Magic -------------------------------------------------------------------------------- -- | Turn an 'Encoding' into a lazy 'L.ByteString' in CBOR binary -- format. -- -- @since 0.2.0.0 toLazyByteString :: Encoding -- ^ The 'Encoding' of a CBOR value. -> L.ByteString -- ^ The encoded CBOR value. toLazyByteString = B.toLazyByteString . toBuilder -- | Turn an 'Encoding' into a strict 'S.ByteString' in CBOR binary -- format. -- -- @since 0.2.0.0 toStrictByteString :: Encoding -- ^ The 'Encoding' of a CBOR value. -> S.ByteString -- ^ The encoded value. toStrictByteString = L.toStrict . B.toLazyByteString . toBuilder -- | Turn an 'Encoding' into a 'L.ByteString' 'B.Builder' in CBOR -- binary format. -- -- @since 0.2.0.0 toBuilder :: Encoding -- ^ The 'Encoding' of a CBOR value. -> B.Builder -- ^ The encoded value as a 'B.Builder'. toBuilder = \(Encoding vs0) -> BI.builder (buildStep (vs0 TkEnd)) buildStep :: Tokens -> (BI.BufferRange -> IO (BI.BuildSignal a)) -> BI.BufferRange -> IO (BI.BuildSignal a) buildStep vs1 k (BI.BufferRange op0 ope0) = go vs1 op0 where go vs !op | op `plusPtr` bound <= ope0 = case vs of TkWord x vs' -> PI.runB wordMP x op >>= go vs' TkWord64 x vs' -> PI.runB word64MP x op >>= go vs' TkInt x vs' -> PI.runB intMP x op >>= go vs' TkInt64 x vs' -> PI.runB int64MP x op >>= go vs' TkBytes x vs' -> BI.runBuilderWith (bytesMP x) (buildStep vs' k) (BI.BufferRange op ope0) TkByteArray x vs' -> BI.runBuilderWith (byteArrayMP x) (buildStep vs' k) (BI.BufferRange op ope0) TkUtf8ByteArray x vs' -> BI.runBuilderWith (utf8ByteArrayMP x) (buildStep vs' k) (BI.BufferRange op ope0) TkString x vs' -> BI.runBuilderWith (stringMP x) (buildStep vs' k) (BI.BufferRange op ope0) TkBytesBegin vs' -> PI.runB bytesBeginMP () op >>= go vs' TkStringBegin vs'-> PI.runB stringBeginMP () op >>= go vs' TkListLen x vs' -> PI.runB arrayLenMP x op >>= go vs' TkListBegin vs' -> PI.runB arrayBeginMP () op >>= go vs' TkMapLen x vs' -> PI.runB mapLenMP x op >>= go vs' TkMapBegin vs' -> PI.runB mapBeginMP () op >>= go vs' TkTag x vs' -> PI.runB tagMP x op >>= go vs' TkTag64 x vs' -> PI.runB tag64MP x op >>= go vs' #if defined(OPTIMIZE_GMP) -- This code is specialized for GMP implementation of Integer. By -- looking directly at the constructors we can avoid some checks. -- S# hold an Int, so we can just use intMP. TkInteger (SmallInt i) vs' -> PI.runB intMP (I# i) op >>= go vs' -- PosBigInt is guaranteed to be > 0. TkInteger integer@(PosBigInt bigNat) vs' | integer <= fromIntegral (maxBound :: Word64) -> PI.runB word64MP (fromIntegral integer) op >>= go vs' | otherwise -> let buffer = BI.BufferRange op ope0 in BI.runBuilderWith (bigNatMP bigNat) (buildStep vs' k) buffer -- Jn# is guaranteed to be < 0. TkInteger integer@(NegBigInt bigNat) vs' | integer >= -1 - fromIntegral (maxBound :: Word64) -> PI.runB negInt64MP (fromIntegral (-1 - integer)) op >>= go vs' | otherwise -> let buffer = BI.BufferRange op ope0 in BI.runBuilderWith (negBigNatMP bigNat) (buildStep vs' k) buffer #else TkInteger x vs' | x >= 0 , x <= fromIntegral (maxBound :: Word64) -> PI.runB word64MP (fromIntegral x) op >>= go vs' | x < 0 , x >= -1 - fromIntegral (maxBound :: Word64) -> PI.runB negInt64MP (fromIntegral (-1 - x)) op >>= go vs' | otherwise -> BI.runBuilderWith (integerMP x) (buildStep vs' k) (BI.BufferRange op ope0) #endif TkBool False vs' -> PI.runB falseMP () op >>= go vs' TkBool True vs' -> PI.runB trueMP () op >>= go vs' TkNull vs' -> PI.runB nullMP () op >>= go vs' TkUndef vs' -> PI.runB undefMP () op >>= go vs' TkSimple w vs' -> PI.runB simpleMP w op >>= go vs' TkFloat16 f vs' -> PI.runB halfMP f op >>= go vs' TkFloat32 f vs' -> PI.runB floatMP f op >>= go vs' TkFloat64 f vs' -> PI.runB doubleMP f op >>= go vs' TkBreak vs' -> PI.runB breakMP () op >>= go vs' TkEncoded x vs' -> BI.runBuilderWith (B.byteString x) (buildStep vs' k) (BI.BufferRange op ope0) TkEnd -> k (BI.BufferRange op ope0) | otherwise = return $ BI.bufferFull bound op (buildStep vs k) -- The maximum size in bytes of the fixed-size encodings bound :: Int bound = 9 header :: P.BoundedPrim Word8 header = P.liftFixedToBounded P.word8 constHeader :: Word8 -> P.BoundedPrim () constHeader h = P.liftFixedToBounded (const h >$< P.word8) withHeader :: P.FixedPrim a -> P.BoundedPrim (Word8, a) withHeader p = P.liftFixedToBounded (P.word8 >*< p) withConstHeader :: Word8 -> P.FixedPrim a -> P.BoundedPrim a withConstHeader h p = P.liftFixedToBounded ((,) h >$< (P.word8 >*< p)) {- From RFC 7049: Major type 0: an unsigned integer. The 5-bit additional information is either the integer itself (for additional information values 0 through 23) or the length of additional data. Additional information 24 means the value is represented in an additional uint8_t, 25 means a uint16_t, 26 means a uint32_t, and 27 means a uint64_t. For example, the integer 10 is denoted as the one byte 0b000_01010 (major type 0, additional information 10). The integer 500 would be 0b000_11001 (major type 0, additional information 25) followed by the two bytes 0x01f4, which is 500 in decimal. -} {-# INLINE wordMP #-} wordMP :: P.BoundedPrim Word wordMP = condB (<= 0x17) (fromIntegral >$< header) $ condB (<= 0xff) (fromIntegral >$< withConstHeader 24 P.word8) $ condB (<= 0xffff) (fromIntegral >$< withConstHeader 25 P.word16BE) $ #if defined(ARCH_64bit) condB (<= 0xffffffff) (fromIntegral >$< withConstHeader 26 P.word32BE) $ (fromIntegral >$< withConstHeader 27 P.word64BE) #else (fromIntegral >$< withConstHeader 26 P.word32BE) #endif {-# INLINE word64MP #-} word64MP :: P.BoundedPrim Word64 word64MP = condB (<= 0x17) (fromIntegral >$< header) $ condB (<= 0xff) (fromIntegral >$< withConstHeader 24 P.word8) $ condB (<= 0xffff) (fromIntegral >$< withConstHeader 25 P.word16BE) $ condB (<= 0xffffffff) (fromIntegral >$< withConstHeader 26 P.word32BE) $ (fromIntegral >$< withConstHeader 27 P.word64BE) {- From RFC 7049: Major type 1: a negative integer. The encoding follows the rules for unsigned integers (major type 0), except that the value is then -1 minus the encoded unsigned integer. For example, the integer -500 would be 0b001_11001 (major type 1, additional information 25) followed by the two bytes 0x01f3, which is 499 in decimal. -} negInt64MP :: P.BoundedPrim Word64 negInt64MP = condB (<= 0x17) (fromIntegral . (0x20 +) >$< header) $ condB (<= 0xff) (fromIntegral >$< withConstHeader 0x38 P.word8) $ condB (<= 0xffff) (fromIntegral >$< withConstHeader 0x39 P.word16BE) $ condB (<= 0xffffffff) (fromIntegral >$< withConstHeader 0x3a P.word32BE) $ (fromIntegral >$< withConstHeader 0x3b P.word64BE) {- Major types 0 and 1 are designed in such a way that they can be encoded in C from a signed integer without actually doing an if-then- else for positive/negative (Figure 2). This uses the fact that (-1-n), the transformation for major type 1, is the same as ~n (bitwise complement) in C unsigned arithmetic; ~n can then be expressed as (-1)^n for the negative case, while 0^n leaves n unchanged for non-negative. The sign of a number can be converted to -1 for negative and 0 for non-negative (0 or positive) by arithmetic- shifting the number by one bit less than the bit length of the number (for example, by 63 for 64-bit numbers). void encode_sint(int64_t n) { uint64t ui = n >> 63; // extend sign to whole length mt = ui & 0x20; // extract major type ui ^= n; // complement negatives if (ui < 24) *p++ = mt + ui; else if (ui < 256) { *p++ = mt + 24; *p++ = ui; } else ... Figure 2: Pseudocode for Encoding a Signed Integer -} {-# INLINE intMP #-} intMP :: P.BoundedPrim Int intMP = prep >$< ( condB ((<= 0x17) . snd) (encIntSmall >$< header) $ condB ((<= 0xff) . snd) (encInt8 >$< withHeader P.word8) $ condB ((<= 0xffff) . snd) (encInt16 >$< withHeader P.word16BE) $ #if defined(ARCH_64bit) condB ((<= 0xffffffff) . snd) (encInt32 >$< withHeader P.word32BE) (encInt64 >$< withHeader P.word64BE) #else (encInt32 >$< withHeader P.word32BE) #endif ) where prep :: Int -> (Word8, Word) prep n = (mt, ui) where sign :: Word -- extend sign to whole length sign = fromIntegral (n `unsafeShiftR` intBits) #if MIN_VERSION_base(4,7,0) intBits = finiteBitSize (undefined :: Int) - 1 #else intBits = bitSize (undefined :: Int) - 1 #endif mt :: Word8 -- select major type mt = fromIntegral (sign .&. 0x20) ui :: Word -- complement negatives ui = fromIntegral n `xor` sign encIntSmall :: (Word8, Word) -> Word8 encIntSmall (mt, ui) = mt + fromIntegral ui encInt8 (mt, ui) = (mt + 24, fromIntegral ui) encInt16 (mt, ui) = (mt + 25, fromIntegral ui) encInt32 (mt, ui) = (mt + 26, fromIntegral ui) #if defined(ARCH_64bit) encInt64 (mt, ui) = (mt + 27, fromIntegral ui) #endif {-# INLINE int64MP #-} int64MP :: P.BoundedPrim Int64 int64MP = prep >$< ( condB ((<= 0x17) . snd) (encIntSmall >$< header) $ condB ((<= 0xff) . snd) (encInt8 >$< withHeader P.word8) $ condB ((<= 0xffff) . snd) (encInt16 >$< withHeader P.word16BE) $ condB ((<= 0xffffffff) . snd) (encInt32 >$< withHeader P.word32BE) (encInt64 >$< withHeader P.word64BE) ) where prep :: Int64 -> (Word8, Word64) prep n = (mt, ui) where sign :: Word64 -- extend sign to whole length sign = fromIntegral (n `unsafeShiftR` intBits) #if MIN_VERSION_base(4,7,0) intBits = finiteBitSize (undefined :: Int64) - 1 #else intBits = bitSize (undefined :: Int64) - 1 #endif mt :: Word8 -- select major type mt = fromIntegral (sign .&. 0x20) ui :: Word64 -- complement negatives ui = fromIntegral n `xor` sign encIntSmall (mt, ui) = mt + fromIntegral ui encInt8 (mt, ui) = (mt + 24, fromIntegral ui) encInt16 (mt, ui) = (mt + 25, fromIntegral ui) encInt32 (mt, ui) = (mt + 26, fromIntegral ui) encInt64 (mt, ui) = (mt + 27, fromIntegral ui) {- Major type 2: a byte string. The string's length in bytes is represented following the rules for positive integers (major type 0). For example, a byte string whose length is 5 would have an initial byte of 0b010_00101 (major type 2, additional information 5 for the length), followed by 5 bytes of binary content. A byte string whose length is 500 would have 3 initial bytes of 0b010_11001 (major type 2, additional information 25 to indicate a two-byte length) followed by the two bytes 0x01f4 for a length of 500, followed by 500 bytes of binary content. -} bytesMP :: S.ByteString -> B.Builder bytesMP bs = P.primBounded bytesLenMP (fromIntegral $ S.length bs) <> B.byteString bs bytesLenMP :: P.BoundedPrim Word bytesLenMP = condB (<= 0x17) (fromIntegral . (0x40 +) >$< header) $ condB (<= 0xff) (fromIntegral >$< withConstHeader 0x58 P.word8) $ condB (<= 0xffff) (fromIntegral >$< withConstHeader 0x59 P.word16BE) $ condB (<= 0xffffffff) (fromIntegral >$< withConstHeader 0x5a P.word32BE) $ (fromIntegral >$< withConstHeader 0x5b P.word64BE) byteArrayMP :: BAS.SlicedByteArray -> B.Builder byteArrayMP ba = P.primBounded bytesLenMP n <> BAS.toBuilder ba where n = fromIntegral $ BAS.sizeofSlicedByteArray ba bytesBeginMP :: P.BoundedPrim () bytesBeginMP = constHeader 0x5f {- Major type 3: a text string, specifically a string of Unicode characters that is encoded as UTF-8 [RFC3629]. The format of this type is identical to that of byte strings (major type 2), that is, as with major type 2, the length gives the number of bytes. This type is provided for systems that need to interpret or display human-readable text, and allows the differentiation between unstructured bytes and text that has a specified repertoire and encoding. In contrast to formats such as JSON, the Unicode characters in this type are never escaped. Thus, a newline character (U+000A) is always represented in a string as the byte 0x0a, and never as the bytes 0x5c6e (the characters "\" and "n") or as 0x5c7530303061 (the characters "\", "u", "0", "0", "0", and "a"). -} stringMP :: T.Text -> B.Builder stringMP t = #if MIN_VERSION_text(2,0,0) P.primBounded stringLenMP (fromIntegral $ T.lengthWord8 t) <> T.encodeUtf8Builder t #else P.primBounded stringLenMP (fromIntegral $ S.length bs) <> B.byteString bs where bs = T.encodeUtf8 t #endif stringLenMP :: P.BoundedPrim Word stringLenMP = condB (<= 0x17) (fromIntegral . (0x60 +) >$< header) $ condB (<= 0xff) (fromIntegral >$< withConstHeader 0x78 P.word8) $ condB (<= 0xffff) (fromIntegral >$< withConstHeader 0x79 P.word16BE) $ condB (<= 0xffffffff) (fromIntegral >$< withConstHeader 0x7a P.word32BE) $ (fromIntegral >$< withConstHeader 0x7b P.word64BE) stringBeginMP :: P.BoundedPrim () stringBeginMP = constHeader 0x7f utf8ByteArrayMP :: BAS.SlicedByteArray -> B.Builder utf8ByteArrayMP t = P.primBounded stringLenMP n <> BAS.toBuilder t where n = fromIntegral $ BAS.sizeofSlicedByteArray t {- Major type 4: an array of data items. Arrays are also called lists, sequences, or tuples. The array's length follows the rules for byte strings (major type 2), except that the length denotes the number of data items, not the length in bytes that the array takes up. Items in an array do not need to all be of the same type. For example, an array that contains 10 items of any type would have an initial byte of 0b100_01010 (major type of 4, additional information of 10 for the length) followed by the 10 remaining items. -} arrayLenMP :: P.BoundedPrim Word arrayLenMP = condB (<= 0x17) (fromIntegral . (0x80 +) >$< header) $ condB (<= 0xff) (fromIntegral >$< withConstHeader 0x98 P.word8) $ condB (<= 0xffff) (fromIntegral >$< withConstHeader 0x99 P.word16BE) $ condB (<= 0xffffffff) (fromIntegral >$< withConstHeader 0x9a P.word32BE) $ (fromIntegral >$< withConstHeader 0x9b P.word64BE) arrayBeginMP :: P.BoundedPrim () arrayBeginMP = constHeader 0x9f {- Major type 5: a map of pairs of data items. Maps are also called tables, dictionaries, hashes, or objects (in JSON). A map is comprised of pairs of data items, each pair consisting of a key that is immediately followed by a value. The map's length follows the rules for byte strings (major type 2), except that the length denotes the number of pairs, not the length in bytes that the map takes up. For example, a map that contains 9 pairs would have an initial byte of 0b101_01001 (major type of 5, additional information of 9 for the number of pairs) followed by the 18 remaining items. The first item is the first key, the second item is the first value, the third item is the second key, and so on. A map that has duplicate keys may be well-formed, but it is not valid, and thus it causes indeterminate decoding; see also Section 3.7. -} mapLenMP :: P.BoundedPrim Word mapLenMP = condB (<= 0x17) (fromIntegral . (0xa0 +) >$< header) $ condB (<= 0xff) (fromIntegral >$< withConstHeader 0xb8 P.word8) $ condB (<= 0xffff) (fromIntegral >$< withConstHeader 0xb9 P.word16BE) $ condB (<= 0xffffffff) (fromIntegral >$< withConstHeader 0xba P.word32BE) $ (fromIntegral >$< withConstHeader 0xbb P.word64BE) mapBeginMP :: P.BoundedPrim () mapBeginMP = constHeader 0xbf {- Major type 6: optional semantic tagging of other major types. In CBOR, a data item can optionally be preceded by a tag to give it additional semantics while retaining its structure. The tag is major type 6, and represents an integer number as indicated by the tag's integer value; the (sole) data item is carried as content data. The initial bytes of the tag follow the rules for positive integers (major type 0). -} tagMP :: P.BoundedPrim Word tagMP = condB (<= 0x17) (fromIntegral . (0xc0 +) >$< header) $ condB (<= 0xff) (fromIntegral >$< withConstHeader 0xd8 P.word8) $ condB (<= 0xffff) (fromIntegral >$< withConstHeader 0xd9 P.word16BE) $ #if defined(ARCH_64bit) condB (<= 0xffffffff) (fromIntegral >$< withConstHeader 0xda P.word32BE) $ (fromIntegral >$< withConstHeader 0xdb P.word64BE) #else (fromIntegral >$< withConstHeader 0xda P.word32BE) #endif tag64MP :: P.BoundedPrim Word64 tag64MP = condB (<= 0x17) (fromIntegral . (0xc0 +) >$< header) $ condB (<= 0xff) (fromIntegral >$< withConstHeader 0xd8 P.word8) $ condB (<= 0xffff) (fromIntegral >$< withConstHeader 0xd9 P.word16BE) $ condB (<= 0xffffffff) (fromIntegral >$< withConstHeader 0xda P.word32BE) $ (fromIntegral >$< withConstHeader 0xdb P.word64BE) {- Major type 7: floating-point numbers and simple data types that need no content, as well as the "break" stop code. Major type 7 is for two types of data: floating-point numbers and "simple values" that do not need any content. Each value of the 5-bit additional information in the initial byte has its own separate meaning, as defined in Table 1. Like the major types for integers, items of this major type do not carry content data; all the information is in the initial bytes. +-------------+--------------------------------------------------+ | 5-Bit Value | Semantics | +-------------+--------------------------------------------------+ | 0..23 | Simple value (value 0..23) | | | | | 24 | Simple value (value 32..255 in following byte) | | | | | 25 | IEEE 754 Half-Precision Float (16 bits follow) | | | | | 26 | IEEE 754 Single-Precision Float (32 bits follow) | | | | | 27 | IEEE 754 Double-Precision Float (64 bits follow) | | | | | 28-30 | (Unassigned) | | | | | 31 | "break" stop code for indefinite-length items | +-------------+--------------------------------------------------+ -} simpleMP :: P.BoundedPrim Word8 simpleMP = condB (<= 0x17) ((0xe0 +) >$< header) $ (withConstHeader 0xf8 P.word8) falseMP :: P.BoundedPrim () falseMP = constHeader 0xf4 trueMP :: P.BoundedPrim () trueMP = constHeader 0xf5 nullMP :: P.BoundedPrim () nullMP = constHeader 0xf6 undefMP :: P.BoundedPrim () undefMP = constHeader 0xf7 -- Canonical encoding of a NaN as per RFC 7049, section 3.9. canonicalNaN :: PI.BoundedPrim a canonicalNaN = P.liftFixedToBounded $ const (0xf9, (0x7e, 0x00)) >$< P.word8 >*< P.word8 >*< P.word8 halfMP :: P.BoundedPrim Float halfMP = condB isNaN canonicalNaN (floatToWord16 >$< withConstHeader 0xf9 P.word16BE) floatMP :: P.BoundedPrim Float floatMP = condB isNaN canonicalNaN (withConstHeader 0xfa P.floatBE) doubleMP :: P.BoundedPrim Double doubleMP = condB isNaN canonicalNaN (withConstHeader 0xfb P.doubleBE) breakMP :: P.BoundedPrim () breakMP = constHeader 0xff #if defined(OPTIMIZE_GMP) -- ---------------------------------------- -- -- Implementation optimized for integer-gmp -- -- ---------------------------------------- -- -- Below is where we try to abstract over the differences between the legacy -- integer-gmp interface and ghc-bignum, shipped in GHC >= 9.0. -- | Write the limbs of a 'BigNat' to the given address in big-endian byte -- ordering. exportBigNatToAddr :: BigNat -> Addr# -> IO Word #if defined(HAVE_GHC_BIGNUM) {-# COMPLETE SmallInt, PosBigInt, NegBigInt #-} pattern SmallInt :: Int# -> Integer pattern SmallInt n = GHC.Num.Integer.IS n pattern PosBigInt, NegBigInt :: GHC.Num.BigNat.BigNat# -> Integer pattern PosBigInt n = GHC.Num.Integer.IP n pattern NegBigInt n = GHC.Num.Integer.IN n bigNatSizeInBytes :: GHC.Num.BigNat.BigNat -> Word bigNatSizeInBytes bigNat = Gmp.bigNatSizeInBase 256 (GHC.Num.BigNat.unBigNat bigNat) bigNatMP :: GHC.Num.BigNat.BigNat# -> B.Builder bigNatMP n = P.primBounded header 0xc2 <> bigNatToBuilder (GHC.Num.BigNat.BN# n) negBigNatMP :: GHC.Num.BigNat.BigNat# -> B.Builder negBigNatMP n = -- If value `n` is stored in CBOR, it is interpreted as -1 - n. Since BigNat -- already represents n (note: it's unsigned), we simply decrement it to get -- the correct encoding. P.primBounded header 0xc3 <> bigNatToBuilder (subtractOneBigNat (GHC.Num.BigNat.BN# n)) where subtractOneBigNat (GHC.Num.BigNat.BN# nat) = case GHC.Num.BigNat.bigNatSubWord# nat 1## of (# | r #) -> GHC.Num.BigNat.BN# r (# (# #) | #) -> error "subtractOneBigNat: impossible" exportBigNatToAddr (GHC.Num.BigNat.BN# b) addr = IO $ \s -> -- The last parameter (`1#`) makes the export function use big endian encoding. case GHC.Num.BigNat.bigNatToAddr# b addr 1# s of (# s', w #) -> (# s', W# w #) #else /* HAVE_GHC_BIGNUM */ {-# COMPLETE SmallInt, PosBigInt, NegBigInt #-} pattern SmallInt :: Int# -> Integer pattern SmallInt n = Gmp.S# n pattern PosBigInt :: BigNat -> Integer pattern NegBigInt :: BigNat -> Integer pattern PosBigInt n = Gmp.Jp# n pattern NegBigInt n = Gmp.Jn# n bigNatSizeInBytes :: BigNat -> Word bigNatSizeInBytes bigNat = W# (Gmp.sizeInBaseBigNat bigNat 256#) bigNatMP :: BigNat -> B.Builder bigNatMP n = P.primBounded header 0xc2 <> bigNatToBuilder n negBigNatMP :: BigNat -> B.Builder negBigNatMP n = -- If value `n` is stored in CBOR, it is interpreted as -1 - n. Since BigNat -- already represents n (note: it's unsigned), we simply decrement it to get -- the correct encoding. P.primBounded header 0xc3 <> bigNatToBuilder (subtractOneBigNat n) where subtractOneBigNat m = Gmp.minusBigNatWord m (int2Word# 1#) exportBigNatToAddr bigNat addr# = -- The last parameter (`1#`) makes the export function use big endian encoding. Gmp.exportBigNatToAddr bigNat addr# 1# #endif /* HAVE_GHC_BIGNUM */ bigNatToBuilder :: BigNat -> B.Builder bigNatToBuilder = bigNatBuilder where bigNatBuilder :: BigNat -> B.Builder bigNatBuilder bigNat = let sizeW = bigNatSizeInBytes bigNat #if MIN_VERSION_bytestring(0,10,12) bounded = PI.boundedPrim (fromIntegral sizeW) (dumpBigNat sizeW) #else bounded = PI.boudedPrim (fromIntegral sizeW) (dumpBigNat sizeW) #endif in P.primBounded bytesLenMP sizeW <> P.primBounded bounded bigNat dumpBigNat :: Word -> BigNat -> Ptr a -> IO (Ptr a) dumpBigNat (W# sizeW#) bigNat ptr@(Ptr addr#) = do (W# written#) <- exportBigNatToAddr bigNat addr# let !newPtr = ptr `plusPtr` (I# (word2Int# written#)) sanity = isTrue# (sizeW# `eqWord#` written#) return $ assert sanity newPtr #else /* OPTIMIZE_GMP */ -- ---------------------- -- -- Generic implementation -- -- ---------------------- -- integerMP :: Integer -> B.Builder integerMP n | n >= 0 = P.primBounded header 0xc2 <> integerToBuilder n | otherwise = P.primBounded header 0xc3 <> integerToBuilder (-1 - n) integerToBuilder :: Integer -> B.Builder integerToBuilder n = bytesMP (integerToBytes n) integerToBytes :: Integer -> S.ByteString integerToBytes n0 | n0 == 0 = S.pack [0] | otherwise = S.pack (reverse (go n0)) where go n | n == 0 = [] | otherwise = narrow n : go (n `shiftR` 8) narrow :: Integer -> Word8 narrow = fromIntegral #endif /* OPTIMIZE_GMP */ cborg-0.2.10.0/src/cbits/0000755000000000000000000000000007346545000013110 5ustar0000000000000000cborg-0.2.10.0/src/cbits/cbor.h0000644000000000000000000000134207346545000014206 0ustar0000000000000000/* Needed GHC definitions */ #include "MachDeps.h" /* ** GHC 7.10 and above include efficient byte-swapping primitives, ** which are useful for efficient byte-mangling routines. */ #if __GLASGOW_HASKELL__ >= 710 #define HAVE_BYTESWAP_PRIMOPS #endif /* ** On Intel 32/64 bit machines, memory access to unaligned addresses ** is permitted (and generally efficient, too). With this in mind, ** some operations can be implemented more efficiently. */ #if i386_HOST_ARCH || x86_64_HOST_ARCH #define MEM_UNALIGNED_OPS #endif /* ** Establish the word-size of the machine, or fail. */ #if WORD_SIZE_IN_BITS == 64 #define ARCH_64bit #elif WORD_SIZE_IN_BITS == 32 #define ARCH_32bit #else #error expected WORD_SIZE_IN_BITS to be 32 or 64 #endif cborg-0.2.10.0/tests/0000755000000000000000000000000007346545000012357 5ustar0000000000000000cborg-0.2.10.0/tests/Main.hs0000644000000000000000000000146407346545000013604 0ustar0000000000000000module Main (main) where import Test.Tasty (TestTree, defaultMain, testGroup) import qualified Tests.Reference as Reference import qualified Tests.UnitTests as UnitTests import qualified Tests.Properties as Properties import qualified Tests.Boundary as Boundary import qualified Tests.ByteOffset as ByteOffset import qualified Tests.Canonical as Canonical import qualified Tests.Regress as Regress import qualified Tests.UTF8 as UTF8 import qualified Tests.PreEncoded as PreEncoded main :: IO () main = defaultMain tests tests :: TestTree tests = testGroup "CBOR" [ Reference.testTree , UnitTests.testTree , Properties.testTree , ByteOffset.testTree , Boundary.testTree , Canonical.testTree , Regress.testTree , UTF8.testTree , PreEncoded.testTree ] cborg-0.2.10.0/tests/Tests/0000755000000000000000000000000007346545000013461 5ustar0000000000000000cborg-0.2.10.0/tests/Tests/Boundary.hs0000644000000000000000000001471507346545000015610 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE FlexibleContexts #-} {-# LANGUAGE GeneralizedNewtypeDeriving #-} {-# LANGUAGE MultiWayIf #-} {-# LANGUAGE RankNTypes #-} {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE TypeFamilies #-} {-# LANGUAGE UndecidableInstances #-} module Tests.Boundary ( testTree -- :: TestTree ) where import qualified Data.ByteString as BS import qualified Data.ByteString.Lazy as BSL import Data.Either import Data.Int import Data.Word import qualified Data.Text as T import Codec.CBOR.Decoding import Codec.CBOR.Encoding import Codec.CBOR.Read import Codec.CBOR.Write import Tests.Util import Test.Tasty import Test.Tasty.QuickCheck #if !MIN_VERSION_base(4,8,0) import Control.Applicative #endif -- | CBOR can represent 64 bit negative and positive integers, hence we need -- wrapper for Integer to represent the whole range. newtype Int65 = Int65 Integer deriving (Eq, Ord, Enum, Num, Integral, Real, Show) instance Bounded Int65 where maxBound = Int65 (2^(64 :: Int) - 1) minBound = Int65 (-2^(64 :: Int)) instance Arbitrary Int65 where arbitrary = arbitraryBoundedIntegral encodeInt65 :: Int65 -> Encoding encodeInt65 (Int65 n) = encodeInteger n -- | Wrapper for bounded, integral type 'a' that potentially contains values -- outside of range of 'a'. newtype B a = B { unB :: BRep a } type family BRep a where BRep Word = Word64 BRep Word8 = Word64 BRep Word16 = Word64 BRep Word32 = Word64 BRep Word64 = Word64 BRep Int = Int65 BRep Int8 = Int64 BRep Int16 = Int64 BRep Int32 = Int64 BRep Int64 = Int65 instance Show (BRep a) => Show (B a) where showsPrec p = showsPrec p . unB instance (Arbitrary (BRep a), Num (BRep a), Bounded a, Integral a ) => Arbitrary (B a) where arbitrary = B <$> arbitraryWithBounds (undefined::a) -- | Check if deserialisation of values of type 'a' deals properly with the ones -- out of range, i.e. fails to decode them. boundaryTest :: forall a rep. (Bounded a, Integral a, Show a, rep ~ BRep a, Ord rep, Num rep) => (rep -> Encoding) -- ^ encode -> (forall s. Decoder s a) -- ^ decode -> B a -> Property boundaryTest enc dec a = if outsideRange then collect "outside" $ isLeft a' else collect "inside" $ isRight a' where a' = deserialiseFromBytes dec . toLazyByteString . enc $ unB a -- Note that this is always true for a ~ rep. outsideRange = unB a < fromIntegral (minBound :: a) || unB a > fromIntegral (maxBound :: a) mkBoundaryTest :: forall a rep. (Bounded a, Integral a, Show a, rep ~ BRep a, Arbitrary rep, Ord rep, Show rep, Num rep) => String -> (rep -> Encoding) -> (forall s. Decoder s a) -> (forall s. Decoder s a) -> [TestTree] mkBoundaryTest aName enc dec decCan = [ testProperty aName $ boundaryTest enc dec , testProperty (aName ++ " (canonical)") $ boundaryTest enc decCan ] ---------------------------------------- -- | Check if deserialisation of map/list length deals properly with the ones -- out of range, i.e. fails to decode them. lenBoundaryTest :: (Word -> Encoding) -> (forall s. Decoder s Int) -> Length -> Property lenBoundaryTest enc dec a = if outsideRange then collect "outside" $ isLeft a' else collect "inside" $ isRight a' where a' = deserialiseFromBytes dec . toLazyByteString . enc $ unLength a outsideRange = fromIntegral (unLength a) < (0::Int) mkLenBoundaryTest :: String -> (Word -> Encoding) -> (forall s. Decoder s Int) -> (forall s. Decoder s Int) -> [TestTree] mkLenBoundaryTest aName enc dec decCan = [ testProperty aName $ lenBoundaryTest enc dec , testProperty (aName ++ " (canonical)") $ lenBoundaryTest enc decCan ] ---------------------------------------- data StringLengthPrefix = StringLP Word BSL.ByteString deriving Show instance Arbitrary StringLengthPrefix where arbitrary = (\l -> StringLP (unLength l) (mkLengthPrefix True l)) <$> arbitrary data BytesLengthPrefix = BytesLP Word BSL.ByteString deriving Show instance Arbitrary BytesLengthPrefix where arbitrary = (\l -> BytesLP (unLength l) (mkLengthPrefix False l)) <$> arbitrary -- | Test that positive length prefixes of string/bytes are parsed successfully, -- whereas negative are not. stringBytesBoundaryTest :: [TestTree] stringBytesBoundaryTest = [ testProperty "String" $ \(StringLP w bs) -> case deserialiseFromBytes decodeString bs of Right (_rest, string) -> w == 0 && T.length string == 0 Left (DeserialiseFailure _ msg) -> if fromIntegral w < (0::Int) then msg == "expected string" else msg == "end of input" , testProperty "Bytes" $ \(BytesLP w bs) -> case deserialiseFromBytes decodeBytes bs of Right (_rest, bytes) -> w == 0 && BS.length bytes == 0 Left (DeserialiseFailure _ msg) -> if fromIntegral w < (0::Int) then msg == "expected bytes" else msg == "end of input" ] ---------------------------------------- testTree :: TestTree testTree = localOption (QuickCheckTests 1000) . testGroup "Boundary checks" $ concat [ mkBoundaryTest "Word" encodeWord64 decodeWord decodeWordCanonical , mkBoundaryTest "Word8" encodeWord64 decodeWord8 decodeWord8Canonical , mkBoundaryTest "Word16" encodeWord64 decodeWord16 decodeWord16Canonical , mkBoundaryTest "Word32" encodeWord64 decodeWord32 decodeWord32Canonical , mkBoundaryTest "Word64" encodeWord64 decodeWord64 decodeWord64Canonical , mkBoundaryTest "Int" encodeInt65 decodeInt decodeIntCanonical , mkBoundaryTest "Int8" encodeInt64 decodeInt8 decodeInt8Canonical , mkBoundaryTest "Int16" encodeInt64 decodeInt16 decodeInt16Canonical , mkBoundaryTest "Int32" encodeInt64 decodeInt32 decodeInt32Canonical , mkBoundaryTest "Int64" encodeInt65 decodeInt64 decodeInt64Canonical , mkLenBoundaryTest "ListLen" encodeListLen decodeListLen decodeListLenCanonical , mkLenBoundaryTest "MapLen" encodeMapLen decodeMapLen decodeMapLenCanonical , stringBytesBoundaryTest ] cborg-0.2.10.0/tests/Tests/ByteOffset.hs0000644000000000000000000004271107346545000016074 0ustar0000000000000000{-# LANGUAGE DeriveFunctor, BangPatterns #-} module Tests.ByteOffset (testTree) where import Data.Word import qualified Data.Text as T import qualified Data.Text.Lazy as LT import qualified Data.ByteString as BS import qualified Data.ByteString.Lazy as LBS import Control.Exception (throw) import Control.Applicative import Codec.CBOR.Decoding import Codec.CBOR.Read (deserialiseFromBytes) import Codec.CBOR.Write (toLazyByteString) import Codec.CBOR.Term (Term, encodeTerm, decodeTerm) import qualified Codec.CBOR.Term as Term (Term(..)) import Test.Tasty (TestTree, testGroup, localOption) import Test.Tasty.QuickCheck (testProperty, QuickCheckMaxSize(..)) import Test.QuickCheck hiding (subterms) import qualified Tests.Reference.Implementation as RefImpl import Tests.Reference.Generators (floatToWord, doubleToWord) import Tests.Term (eqTerm, canonicaliseTerm) import Tests.Util import Prelude hiding (encodeFloat, decodeFloat) -- | Like a 'Term', but with an annotation on top level terms and every -- subterm. This is used for tests of 'peekByteOffset' where we annotate a -- decoded term with the byte range it covers. -- data ATerm annotation = ATerm (TermF (ATerm annotation)) annotation deriving (Show, Eq, Functor) -- | Term one-level functor. -- data TermF t = TInt Int | TInteger Integer | TBytes BS.ByteString | TBytesI LBS.ByteString | TString T.Text | TStringI LT.Text | TList [t] | TListI [t] | TMap [(t, t)] | TMapI [(t, t)] | TTagged Word64 t | TBool Bool | TNull | TSimple Word8 | THalf Float | TFloat Float | TDouble Double deriving (Show, Eq, Functor) testTree :: TestTree testTree = testGroup "peekByteOffset" [ testGroup "ATerm framework" [ testProperty "isomorphic 1" prop_ATerm_isomorphic , testProperty "isomorphic 2" prop_ATerm_isomorphic2 , testProperty "isomorphic 3" prop_ATerm_isomorphic3 ] , testProperty "bytes deserialise" prop_peekByteOffset_deserialise , testProperty "bytes reserialise" prop_peekByteOffset_reserialise , testProperty "non-canonical encoding" prop_peekByteOffset_noncanonical , localOption (QuickCheckMaxSize 30) $ testProperty "same offsets with all 2-splits" prop_peekByteOffset_splits2 , localOption (QuickCheckMaxSize 20) $ testProperty "same offsets with all 3-splits" prop_peekByteOffset_splits3 ] -------------------------------------------------------------------------------- -- Properties of the framework -- -- | Basic property to check that 'ATerm' is isomorphic to the 'Term'. -- prop_ATerm_isomorphic :: Term -> Bool prop_ATerm_isomorphic t = t `eqTerm` (convertATermToTerm . convertTermToATerm) t -- | Variation on 'prop_ATerm_isomorphic', checking that serialising as a -- 'Term', deserialising as an 'ATerm' and converting back gives an equivalent -- term. -- prop_ATerm_isomorphic2 :: Term -> Bool prop_ATerm_isomorphic2 t = canonicaliseTerm t `eqTerm` (convertATermToTerm . deserialiseATerm . serialiseTerm) t -- | Variation on 'prop_ATerm_isomorphic2', but where we check the terms are -- equivalent as 'ATerm's. -- prop_ATerm_isomorphic3 :: Term -> Bool prop_ATerm_isomorphic3 t = (convertTermToATerm . canonicaliseTerm) t `eqATerm` (fmap (const ()) . deserialiseATerm . serialiseTerm) t -------------------------------------------------------------------------------- -- Properties of peekByteOffset -- -- | A key consistency property for terms annotated with their bytes: -- taking those bytes and deserialising them gives the corresponding term -- prop_ATerm_deserialise :: ATerm ByteSpan -> Bool prop_ATerm_deserialise t@(ATerm _ bs) = deserialiseTerm bs `eqTerm` convertATermToTerm t -- | For the case of canonical encodings it is also true for terms annotated -- with their bytes: taking the term and serialising it gives the bytes. -- -- Note this is /only/ expected to hold for canonical encodings. See -- 'prop_peekByteOffset_noncanonical' for a demonstration of this not holding -- for non-canonical encodings. -- prop_ATerm_reserialise :: ATerm ByteSpan -> Bool prop_ATerm_reserialise t@(ATerm _ bs) = serialiseTerm (convertATermToTerm t) == bs -- | For an 'ATerm' annotated with its bytes (obtained by decoding a term), -- 'prop_ATerm_deserialise' should be true for the whole term and all subterms. -- prop_peekByteOffset_deserialise :: Term -> Bool prop_peekByteOffset_deserialise t = all prop_ATerm_deserialise (subterms t') where t' = deserialiseATerm (serialiseTerm t) -- | For an 'ATerm' annotated with its bytes (obtained by decoding a canonical -- term), 'prop_ATerm_serialise' should be true for the whole term and all -- subterms. -- prop_peekByteOffset_reserialise :: Term -> Bool prop_peekByteOffset_reserialise t = all prop_ATerm_reserialise (subterms t') where t' = deserialiseATerm (serialiseTerm t) -- | For an 'ATerm' annotated with its bytes obtained by decoding a -- /non-canonical/ term, 'prop_ATerm_serialise' should not always hold. -- -- This is in some sense the essence of why we want 'peekByteOffset' in the -- first place: to get the bytes corresponding to a term we have to get the -- original input bytes since we cannot rely on re-serialising to recover the -- bytes (at least not without relying on and checking for canonical encodings). -- prop_peekByteOffset_noncanonical :: RefImpl.Term -> Property prop_peekByteOffset_noncanonical t = not (RefImpl.isCanonicalTerm t) ==> not (prop_ATerm_reserialise t') where t' = deserialiseATerm (RefImpl.serialise t) -- | The offsets we get when decoding a term should be the same irrespective of -- block boundaries in the input data stream. This checks the property for all -- possible 2-chunk splits of the input data. -- prop_peekByteOffset_splits2 :: Term -> Bool prop_peekByteOffset_splits2 t = and [ deserialiseATermOffsets lbs' `eqATerm` t' | lbs' <- splits2 lbs ] where lbs = serialiseTerm t t' = deserialiseATermOffsets lbs -- | The offsets we get when decoding a term should be the same irrespective of -- block boundaries in the input data stream. This checks the property for all -- possible 3-chunk splits of the input data. -- prop_peekByteOffset_splits3 :: Term -> Bool prop_peekByteOffset_splits3 t = and [ deserialiseATermOffsets lbs' `eqATerm` t' | lbs' <- splits3 lbs ] where lbs = serialiseTerm t t' = deserialiseATermOffsets lbs ------------------------------------------------------------------------------ subterms :: ATerm a -> [ATerm a] subterms at@(ATerm t0 _) = at : subtermsF t0 where subtermsF :: TermF (ATerm a) -> [ATerm a] subtermsF (TList ts) = concatMap subterms ts subtermsF (TListI ts) = concatMap subterms ts subtermsF (TMap ts) = [ t' | (x, y) <- ts , t' <- subterms x ++ subterms y ] subtermsF (TMapI ts) = [ t' | (x, y) <- ts , t' <- subterms x ++ subterms y ] subtermsF (TTagged _ t') = subterms t' subtermsF TInt {} = [] subtermsF TInteger{} = [] subtermsF TBytes {} = [] subtermsF TBytesI {} = [] subtermsF TString {} = [] subtermsF TStringI{} = [] subtermsF TBool {} = [] subtermsF TNull {} = [] subtermsF TSimple {} = [] subtermsF THalf {} = [] subtermsF TFloat {} = [] subtermsF TDouble {} = [] ------------------------------------------------------------------------------ serialiseTerm :: Term -> LBS.ByteString serialiseTerm = toLazyByteString . encodeTerm deserialiseTerm :: LBS.ByteString -> Term deserialiseTerm = either throw snd . deserialiseFromBytes decodeTerm -------------------------------------------------------------------------------- -- Decoding a term, annotated with its underlying bytes -- type Offsets = (ByteOffset, ByteOffset) type ByteSpan = LBS.ByteString deserialiseATermOffsets :: LBS.ByteString -> ATerm Offsets deserialiseATermOffsets = either throw snd . deserialiseFromBytes decodeATerm deserialiseATerm :: LBS.ByteString -> ATerm ByteSpan deserialiseATerm lbs = atermOffsetsToBytes lbs (deserialiseATermOffsets lbs) atermOffsetsToBytes :: LBS.ByteString -> ATerm Offsets -> ATerm ByteSpan atermOffsetsToBytes original = fmap (`slice` original) where slice :: (ByteOffset, ByteOffset) -> LBS.ByteString -> LBS.ByteString slice (n,m) = LBS.take (m-n) . LBS.drop n decodeATerm :: Decoder s (ATerm Offsets) decodeATerm = do start <- peekByteOffset t <- decodeTermFATerm end <- peekByteOffset return (ATerm t (start, end)) decodeTermFATerm :: Decoder s (TermF (ATerm Offsets)) decodeTermFATerm = do tkty <- peekTokenType case tkty of TypeUInt -> do w <- decodeWord return $! fromWord w where fromWord :: Word -> TermF (ATerm Offsets) fromWord w | w <= fromIntegral (maxBound :: Int) = TInt (fromIntegral w) | otherwise = TInteger (fromIntegral w) TypeUInt64 -> do w <- decodeWord64 return $! fromWord64 w where fromWord64 w | w <= fromIntegral (maxBound :: Int) = TInt (fromIntegral w) | otherwise = TInteger (fromIntegral w) TypeNInt -> do w <- decodeNegWord return $! fromNegWord w where fromNegWord w | w <= fromIntegral (maxBound :: Int) = TInt (-1 - fromIntegral w) | otherwise = TInteger (-1 - fromIntegral w) TypeNInt64 -> do w <- decodeNegWord64 return $! fromNegWord64 w where fromNegWord64 w | w <= fromIntegral (maxBound :: Int) = TInt (-1 - fromIntegral w) | otherwise = TInteger (-1 - fromIntegral w) TypeInteger -> do !x <- decodeInteger return (TInteger x) TypeFloat16 -> do !x <- decodeFloat return (THalf x) TypeFloat32 -> do !x <- decodeFloat return (TFloat x) TypeFloat64 -> do !x <- decodeDouble return (TDouble x) TypeBytes -> do !x <- decodeBytes return (TBytes x) TypeBytesIndef -> decodeBytesIndef >> decodeBytesIndefLen [] TypeString -> do !x <- decodeString return (TString x) TypeStringIndef -> decodeStringIndef >> decodeStringIndefLen [] TypeListLen -> decodeListLen >>= flip decodeListN [] TypeListLen64 -> decodeListLen >>= flip decodeListN [] TypeListLenIndef -> decodeListLenIndef >> decodeListIndefLen [] TypeMapLen -> decodeMapLen >>= flip decodeMapN [] TypeMapLen64 -> decodeMapLen >>= flip decodeMapN [] TypeMapLenIndef -> decodeMapLenIndef >> decodeMapIndefLen [] TypeTag -> do !x <- decodeTag64 !y <- decodeATerm return (TTagged x y) TypeTag64 -> do !x <- decodeTag64 !y <- decodeATerm return (TTagged x y) TypeBool -> do !x <- decodeBool return (TBool x) TypeNull -> TNull <$ decodeNull TypeSimple -> do !x <- decodeSimple return (TSimple x) TypeBreak -> fail "unexpected break" TypeInvalid -> fail "invalid token encoding" decodeBytesIndefLen :: [BS.ByteString] -> Decoder s (TermF (ATerm Offsets)) decodeBytesIndefLen acc = do stop <- decodeBreakOr if stop then return $! TBytesI (LBS.fromChunks (reverse acc)) else do !bs <- decodeBytes decodeBytesIndefLen (bs : acc) decodeStringIndefLen :: [T.Text] -> Decoder s (TermF (ATerm Offsets)) decodeStringIndefLen acc = do stop <- decodeBreakOr if stop then return $! TStringI (LT.fromChunks (reverse acc)) else do !str <- decodeString decodeStringIndefLen (str : acc) decodeListN :: Int -> [ATerm Offsets] -> Decoder s (TermF (ATerm Offsets)) decodeListN !n acc = case n of 0 -> return $! TList (reverse acc) _ -> do !t <- decodeATerm decodeListN (n-1) (t : acc) decodeListIndefLen :: [ATerm Offsets] -> Decoder s (TermF (ATerm Offsets)) decodeListIndefLen acc = do stop <- decodeBreakOr if stop then return $! TListI (reverse acc) else do !tm <- decodeATerm decodeListIndefLen (tm : acc) decodeMapN :: Int -> [(ATerm Offsets, ATerm Offsets)] -> Decoder s (TermF (ATerm Offsets)) decodeMapN !n acc = case n of 0 -> return $! TMap (reverse acc) _ -> do !tm <- decodeATerm !tm' <- decodeATerm decodeMapN (n-1) ((tm, tm') : acc) decodeMapIndefLen :: [(ATerm Offsets, ATerm Offsets)] -> Decoder s (TermF (ATerm Offsets)) decodeMapIndefLen acc = do stop <- decodeBreakOr if stop then return $! TMapI (reverse acc) else do !tm <- decodeATerm !tm' <- decodeATerm decodeMapIndefLen ((tm, tm') : acc) -------------------------------------------------------------------------------- -- Converting between terms and annotated terms convertTermToATerm :: Term -> ATerm () convertTermToATerm t = ATerm (convertTermToTermF t) () convertTermToTermF :: Term -> TermF (ATerm ()) convertTermToTermF (Term.TList ts) = TList (map convertTermToATerm ts) convertTermToTermF (Term.TListI ts) = TListI (map convertTermToATerm ts) convertTermToTermF (Term.TMap ts) = TMap [ ( convertTermToATerm x , convertTermToATerm y ) | (x, y) <- ts ] convertTermToTermF (Term.TMapI ts) = TMapI [ ( convertTermToATerm x , convertTermToATerm y ) | (x, y) <- ts ] convertTermToTermF (Term.TTagged x t) = TTagged x (convertTermToATerm t) convertTermToTermF (Term.TInt x) = TInt x convertTermToTermF (Term.TInteger x) = TInteger x convertTermToTermF (Term.TBytes x) = TBytes x convertTermToTermF (Term.TBytesI x) = TBytesI x convertTermToTermF (Term.TString x) = TString x convertTermToTermF (Term.TStringI x) = TStringI x convertTermToTermF (Term.TBool x) = TBool x convertTermToTermF Term.TNull = TNull convertTermToTermF (Term.TSimple x) = TSimple x convertTermToTermF (Term.THalf x) = THalf x convertTermToTermF (Term.TFloat x) = TFloat x convertTermToTermF (Term.TDouble x) = TDouble x convertATermToTerm :: ATerm a -> Term convertATermToTerm (ATerm t _ann) = convertTermFToTerm t convertTermFToTerm :: TermF (ATerm a) -> Term convertTermFToTerm (TList ts) = Term.TList (map convertATermToTerm ts) convertTermFToTerm (TListI ts) = Term.TListI (map convertATermToTerm ts) convertTermFToTerm (TMap ts) = Term.TMap [ ( convertATermToTerm x , convertATermToTerm y ) | (x, y) <- ts ] convertTermFToTerm (TMapI ts) = Term.TMapI [ ( convertATermToTerm x , convertATermToTerm y ) | (x, y) <- ts ] convertTermFToTerm (TTagged x t) = Term.TTagged x (convertATermToTerm t) convertTermFToTerm (TInt x) = Term.TInt x convertTermFToTerm (TInteger x) = Term.TInteger x convertTermFToTerm (TBytes x) = Term.TBytes x convertTermFToTerm (TBytesI x) = Term.TBytesI x convertTermFToTerm (TString x) = Term.TString x convertTermFToTerm (TStringI x) = Term.TStringI x convertTermFToTerm (TBool x) = Term.TBool x convertTermFToTerm TNull = Term.TNull convertTermFToTerm (TSimple x) = Term.TSimple x convertTermFToTerm (THalf x) = Term.THalf x convertTermFToTerm (TFloat x) = Term.TFloat x convertTermFToTerm (TDouble x) = Term.TDouble x -- NaNs are so annoying... eqATerm :: Eq a => ATerm a -> ATerm a -> Bool eqATerm (ATerm t1 ann1) (ATerm t2 ann2) = ann1 == ann2 && eqATermF t1 t2 eqATermF :: Eq a => TermF (ATerm a) -> TermF (ATerm a) -> Bool eqATermF (TList ts) (TList ts') = and (zipWith eqATerm ts ts') eqATermF (TListI ts) (TListI ts') = and (zipWith eqATerm ts ts') eqATermF (TMap ts) (TMap ts') = and (zipWith eqATermPair ts ts') eqATermF (TMapI ts) (TMapI ts') = and (zipWith eqATermPair ts ts') eqATermF (TTagged w t) (TTagged w' t') = w == w' && eqATerm t t' eqATermF (THalf f) (THalf f') = floatToWord f == floatToWord f' eqATermF (TFloat f) (TFloat f') = floatToWord f == floatToWord f' eqATermF (TDouble f) (TDouble f') = doubleToWord f == doubleToWord f' eqATermF a b = a == b eqATermPair :: (Eq a, Eq b) => (ATerm a, ATerm b) -> (ATerm a, ATerm b) -> Bool eqATermPair (a,b) (a',b') = eqATerm a a' && eqATerm b b' cborg-0.2.10.0/tests/Tests/Canonical.hs0000644000000000000000000001735307346545000015715 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE RankNTypes #-} module Tests.Canonical (testTree) where import Prelude hiding (decodeFloat, encodeFloat) #if !MIN_VERSION_base(4,8,0) import Control.Applicative #endif import qualified Data.ByteString.Lazy as LBS import Data.Proxy import Codec.CBOR.Read (deserialiseFromBytes) import Codec.CBOR.Decoding import Test.Tasty (TestTree, testGroup) import Test.Tasty.QuickCheck (testProperty) import Test.QuickCheck import Tests.Properties hiding (testTree) -- | This is a version of 'prop_decodeRefdecodeImp' but where we restrict the -- encoded input to non-canonical forms. These forms are covered by the -- original property. This property just ensures that we have good coverage of -- this case. -- prop_decode_nonCanonical :: forall t. Token t => t -> Property prop_decode_nonCanonical x = let enc = serialiseRef x y = deserialiseRef t enc y' = deserialiseImp t enc enc' = serialiseImp t y' isCanonical = enc == enc' in not isCanonical ==> -- This property holds without this pre-condition, as demonstrated by -- prop_decodeRefdecodeImp, but using it ensures we get good coverage -- of the non-canonical cases y' `eq` fromRef y where eq = eqImp t t = Proxy :: Proxy t -- | Check that the special checked canonical form decoder primitives work. -- -- We decode with the normal and canonical decoder, and check that they agree -- in the canonical cases, and that the canonical decoder rejects the -- non-canonical cases. -- -- We have a QC coverage check to make sure we are covering enough of both -- canonical and non-canonical cases. -- prop_decodeCanonical :: forall t. Token t => (forall s. Decoder s (Imp t)) -> t -> Property prop_decodeCanonical decodeCanonical x = classify isCanonical "canonical" $ case deserialiseFromBytes decodeCanonical enc of Left _failure -> not isCanonical Right (trailing, y') -> isCanonical && eqImp t y y' && LBS.null trailing where enc = serialiseRef x y = deserialiseImp t enc -- It is canonical if it re-encodes to the same bytes we decoded isCanonical = serialiseImp t y == enc t = Proxy :: Proxy t prop_decodeCanonical_Word :: TokWord -> Property prop_decodeCanonical_Word = prop_decodeCanonical decodeWordCanonical prop_decodeCanonical_Word8 :: TokWord8 -> Property prop_decodeCanonical_Word8 = prop_decodeCanonical decodeWord8Canonical prop_decodeCanonical_Word16 :: TokWord16 -> Property prop_decodeCanonical_Word16 = prop_decodeCanonical decodeWord16Canonical prop_decodeCanonical_Word32 :: TokWord32 -> Property prop_decodeCanonical_Word32 = prop_decodeCanonical decodeWord32Canonical prop_decodeCanonical_Word64 :: TokWord64 -> Property prop_decodeCanonical_Word64 = prop_decodeCanonical decodeWord64Canonical --prop_decodeCanonical_NegWord :: TokNegWord -> Property --prop_decodeCanonical_NegWord = prop_decodeCanonical decodeNegWordCanonical prop_decodeCanonical_Int :: TokInt -> Property prop_decodeCanonical_Int = prop_decodeCanonical decodeIntCanonical prop_decodeCanonical_Int8 :: TokInt8 -> Property prop_decodeCanonical_Int8 = prop_decodeCanonical decodeInt8Canonical prop_decodeCanonical_Int16 :: TokInt16 -> Property prop_decodeCanonical_Int16 = prop_decodeCanonical decodeInt16Canonical prop_decodeCanonical_Int32 :: TokInt32 -> Property prop_decodeCanonical_Int32 = prop_decodeCanonical decodeInt32Canonical prop_decodeCanonical_Int64 :: TokInt64 -> Property prop_decodeCanonical_Int64 = prop_decodeCanonical decodeInt64Canonical prop_decodeCanonical_Integer :: TokInteger -> Property prop_decodeCanonical_Integer = prop_decodeCanonical decodeIntegerCanonical prop_decodeCanonical_Half :: TokHalf -> Property prop_decodeCanonical_Half = prop_decodeCanonical decodeFloat16Canonical prop_decodeCanonical_Float :: TokFloat -> Property prop_decodeCanonical_Float = prop_decodeCanonical decodeFloatCanonical prop_decodeCanonical_Double :: TokDouble -> Property prop_decodeCanonical_Double = prop_decodeCanonical decodeDoubleCanonical prop_decodeCanonical_Tag :: TokTag -> Property prop_decodeCanonical_Tag = prop_decodeCanonical decodeTagCanonical prop_decodeCanonical_Tag64 :: TokTag64 -> Property prop_decodeCanonical_Tag64 = prop_decodeCanonical decodeTag64Canonical prop_decodeCanonical_Simple :: Simple -> Property prop_decodeCanonical_Simple = prop_decodeCanonical decodeSimpleCanonical {- , decodeNegWordCanonical -- :: Decoder s Word , decodeNegWord64Canonical -- :: Decoder s Word64 , decodeBytesCanonical -- :: Decoder s ByteString , decodeByteArrayCanonical -- :: Decoder s ByteArray , decodeStringCanonical -- :: Decoder s Text , decodeUtf8ByteArrayCanonical -- :: Decoder s ByteArray , decodeListLenCanonical -- :: Decoder s Int , decodeMapLenCanonical -- :: Decoder s Int -} -------------------------------------------------------------------------------- -- TestTree API testTree :: TestTree testTree = testGroup "properties" [ testGroup "decode non-canonical encoding" [ testProperty "Word8" (prop_decode_nonCanonical :: TokWord8 -> Property) , testProperty "Word16" (prop_decode_nonCanonical :: TokWord16 -> Property) , testProperty "Word32" (prop_decode_nonCanonical :: TokWord32 -> Property) , testProperty "Word64" (prop_decode_nonCanonical :: TokWord64 -> Property) , testProperty "Word" (prop_decode_nonCanonical :: TokWord -> Property) -- , testProperty "NegWord" (prop_decode_nonCanonical :: TokNegWord -> Property) , testProperty "Int8" (prop_decode_nonCanonical :: TokInt8 -> Property) , testProperty "Int16" (prop_decode_nonCanonical :: TokInt16 -> Property) , testProperty "Int32" (prop_decode_nonCanonical :: TokInt32 -> Property) , testProperty "Int64" (prop_decode_nonCanonical :: TokInt64 -> Property) , testProperty "Int" (prop_decode_nonCanonical :: TokInt -> Property) , testProperty "Integer" (prop_decode_nonCanonical :: TokInteger -> Property) , testProperty "Half" (prop_decode_nonCanonical :: TokHalf -> Property) , testProperty "Float" (prop_decode_nonCanonical :: TokFloat -> Property) , testProperty "Double" (prop_decode_nonCanonical :: TokDouble -> Property) , testProperty "Tag" (prop_decode_nonCanonical :: TokTag -> Property) , testProperty "Tag64" (prop_decode_nonCanonical :: TokTag64 -> Property) , testProperty "Simple" (prop_decode_nonCanonical :: Simple -> Property) , testProperty "Term" (prop_decode_nonCanonical :: Term -> Property) ] , testGroup "canonical decoding" [ testProperty "Word" prop_decodeCanonical_Word , testProperty "Word8" prop_decodeCanonical_Word8 , testProperty "Word16" prop_decodeCanonical_Word16 , testProperty "Word32" prop_decodeCanonical_Word32 , testProperty "Word64" prop_decodeCanonical_Word64 , testProperty "Int" prop_decodeCanonical_Int , testProperty "Int8" prop_decodeCanonical_Int8 , testProperty "Int16" prop_decodeCanonical_Int16 , testProperty "Int32" prop_decodeCanonical_Int32 , testProperty "Int64" prop_decodeCanonical_Int64 , testProperty "Integer" prop_decodeCanonical_Integer , testProperty "Half" prop_decodeCanonical_Half , testProperty "Float" prop_decodeCanonical_Float , testProperty "Double" prop_decodeCanonical_Double , testProperty "Tag" prop_decodeCanonical_Tag , testProperty "Tag64" prop_decodeCanonical_Tag64 , testProperty "Simple" prop_decodeCanonical_Simple ] ] cborg-0.2.10.0/tests/Tests/PreEncoded.hs0000644000000000000000000000621107346545000016025 0ustar0000000000000000module Tests.PreEncoded ( testTree ) where import Data.Monoid (Monoid(mconcat)) import Codec.CBOR.Term (Term, encodeTerm) import Codec.CBOR.FlatTerm (FlatTerm, toFlatTerm, TermToken(..)) import Codec.CBOR.Write (toStrictByteString, toLazyByteString) import Codec.CBOR.Encoding (Encoding, encodePreEncoded) import Test.Tasty (TestTree, testGroup) import Test.Tasty.QuickCheck (testProperty) import Tests.Term () -- instance Arbitrary Term import Tests.Reference.Generators (canonicalNaN, floatToWord, doubleToWord) -- | Use 'encodePreEncoded' but with a serialised term as the bytes. -- encodePreEncoded' :: Term -> Encoding encodePreEncoded' = encodePreEncoded . toStrictByteString . encodeTerm prop_preEncodedTerm_sameBytes :: Term -> Bool prop_preEncodedTerm_sameBytes t = sameBytes (encodeTerm t) (encodePreEncoded' t) prop_preEncodedTerm_sameTokens :: Term -> Bool prop_preEncodedTerm_sameTokens t = sameTokens (encodeTerm t) (encodePreEncoded' t) prop_preEncodedTerms_sameBytes :: [(Term, Bool)] -> Bool prop_preEncodedTerms_sameBytes ts = sameBytes (mconcat [ encodeTerm t | (t, _) <- ts ]) (mconcat [ if pre then encodePreEncoded' t else encodeTerm t | (t, pre) <- ts ]) prop_preEncodedTerms_sameTokens :: [(Term, Bool)] -> Bool prop_preEncodedTerms_sameTokens ts = sameTokens (mconcat [ encodeTerm t | (t, _) <- ts ]) (mconcat [ if pre then encodePreEncoded' t else encodeTerm t | (t, pre) <- ts ]) sameBytes :: Encoding -> Encoding -> Bool sameBytes e1 e2 = toLazyByteString e1 == toLazyByteString e2 sameTokens :: Encoding -> Encoding -> Bool sameTokens e1 e2 = canonicaliseFlatTerm (toFlatTerm e1) `eqFlatTerm` canonicaliseFlatTerm (toFlatTerm e2) canonicaliseFlatTerm :: FlatTerm -> FlatTerm canonicaliseFlatTerm = map canonicaliseTermToken canonicaliseTermToken :: TermToken -> TermToken canonicaliseTermToken (TkFloat16 f) | isNaN f = TkFloat16 canonicalNaN canonicaliseTermToken (TkFloat32 f) | isNaN f = TkFloat16 canonicalNaN canonicaliseTermToken (TkFloat64 f) | isNaN f = TkFloat16 canonicalNaN canonicaliseTermToken x = x eqFlatTerm :: FlatTerm -> FlatTerm -> Bool eqFlatTerm x y = and (zipWith eqTermToken x y) -- NaNs strike again! eqTermToken :: TermToken -> TermToken -> Bool eqTermToken (TkFloat16 x) (TkFloat16 y) = floatToWord x == floatToWord y eqTermToken (TkFloat32 x) (TkFloat32 y) = floatToWord x == floatToWord y eqTermToken (TkFloat64 x) (TkFloat64 y) = doubleToWord x == doubleToWord y eqTermToken x y = x == y -------------------------------------------------------------------------------- -- TestTree API testTree :: TestTree testTree = testGroup "pre-encoded" [ testProperty "single term, same bytes" prop_preEncodedTerm_sameBytes , testProperty "single term, same tokens" prop_preEncodedTerm_sameTokens , testProperty "list terms, same bytes" prop_preEncodedTerms_sameBytes , testProperty "list terms, same tokens" prop_preEncodedTerms_sameTokens ] cborg-0.2.10.0/tests/Tests/Properties.hs0000644000000000000000000013676507346545000016173 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE TypeFamilies #-} {-# LANGUAGE FlexibleContexts #-} {-# LANGUAGE DefaultSignatures #-} {-# OPTIONS_GHC -fno-warn-orphans #-} module Tests.Properties ( testTree -- * Token type class and derived functions , Token(..) , serialiseRef , serialiseImp , deserialiseRef , deserialiseImp -- * Various test token types , TokInt , TokInt8 , TokInt16 , TokInt32 , TokInt64 , TokInteger , TokWord , TokWord8 , TokWord16 , TokWord32 , TokWord64 , TokHalf , TokFloat , TokDouble , TokTag , TokTag64 , Ref.Simple , Ref.Term ) where import Prelude hiding (decodeFloat, encodeFloat) import qualified Data.ByteString as BS import qualified Data.ByteString.Lazy as LBS import Data.Word import Data.Int import Data.Bits (complement) import qualified Numeric.Half as Half import Data.Function (on) import Data.Proxy import Data.Kind (Type) import GHC.Exts import Codec.CBOR.ByteArray import qualified Codec.CBOR.ByteArray.Sliced as Sliced import Codec.CBOR.Term import Codec.CBOR.Read import Codec.CBOR.Write import Codec.CBOR.Decoding import Codec.CBOR.Encoding import Test.Tasty (TestTree, testGroup, localOption) import Test.Tasty.QuickCheck (testProperty, QuickCheckMaxSize(..)) import Test.QuickCheck import Test.QuickCheck.Gen (Gen (MkGen)) import qualified Tests.Reference.Implementation as Ref import Tests.Reference.Implementation (UInt(..), lengthUInt) import Tests.Reference.Generators import Tests.Term ( fromRefTerm, toRefTerm, eqTerm, canonicaliseTerm ) import Tests.Util #if !MIN_VERSION_base(4,8,0) import Control.Applicative #endif #if MIN_VERSION_bytestring(0,11,1) import qualified Data.ByteString.Short as SBS #else import qualified Data.ByteString.Short.Internal as SBS #endif import qualified Data.Primitive.ByteArray as Prim (ByteArray (..)) import System.Random.Stateful hiding (genByteString, genShortByteString) -- | The CBOR implementation and its reference implementation satisfy all the -- properties implied in the following commuting diagram. -- -- The properties in this module exercise various paths throguh this diagram, -- and do so for various different types. -- -- > canon id -- > Ref──────────▶Ref───────────▶Ref -- > │ ▲ ╲ (ref) ╱ │ -- > │ │ ╲enc dec╱ │ -- > │ │ ╲ ╱ │ -- > │from to│ ▶Enc▶ │from -- > │ │ ╱ ╲ │ -- > │ │ ╱enc dec╲ │ -- > ▼ │ ╱ (imp) ╲ ▼ -- > Imp──────────▶Imp───────────▶Imp -- > id canon -- -- Key -- -- * Imp: Implementation token type -- * Ref: Reference token type -- * Enc: Encoding (ie bytes) -- * canon: canonicaliseRef or canonicaliseImp -- * enc: encodeRef or encodeImp -- * dec: decodeRef or decodeImp -- -- We capture these types and arrows with a type class and an associated type. -- class (Eq t, Show t) => Token t where type Imp t :: Type encodeImp :: Proxy t -> Imp t -> Encoding encodeRef :: Ref.Encoder t decodeImp :: forall s. Proxy t -> Decoder s (Imp t) decodeRef :: Proxy t -> Ref.Decoder t canonicaliseImp :: Proxy t -> Imp t -> Imp t canonicaliseRef :: t -> t eqImp :: Proxy t -> Imp t -> Imp t -> Bool toRef :: Proxy t -> Imp t -> t fromRef :: t -> Imp t -- defaults canonicaliseImp _ = id canonicaliseRef = toRef t . fromRef where t = Proxy :: Proxy t default eqImp :: Eq (Imp t) => Proxy t -> Imp t -> Imp t -> Bool eqImp _ = (==) -- A few derived utils serialiseRef :: forall t. Token t => t -> LBS.ByteString serialiseRef = LBS.pack . encodeRef serialiseImp :: forall t. Token t => Proxy t -> Imp t -> LBS.ByteString serialiseImp _ = toLazyByteString . encodeImp t where t = Proxy :: Proxy t deserialiseRef :: forall t. Token t => Proxy t -> LBS.ByteString -> t deserialiseRef _ bytes = case Ref.runDecoder (decodeRef t) (LBS.unpack bytes) of Just (x, trailing) | null trailing -> x | otherwise -> error "deserialiseRef: trailing bytes" Nothing -> error "deserialiseRef: decode failure" where t = Proxy :: Proxy t deserialiseImp :: forall t. Token t => Proxy t -> LBS.ByteString -> Imp t deserialiseImp _ bytes = case deserialiseFromBytes (decodeImp t) bytes of Right (trailing, x) | LBS.null trailing -> x | otherwise -> error "deserialiseImp: trailing data" Left _failure -> error "deserialiseImp: decode failure" where t = Proxy :: Proxy t -------------------------------------------------------------------------------- -- Properties -- -- | The property corresponding to the following part of the commuting diagram. -- -- > canon -- > Ref──────────▶Ref . . . . . ▷. -- > │ ▲ . . . -- > │ │ . . . -- > │ │ . . . -- > │from to│ ▷ ▷ . -- > │ │ . . . -- > │ │ . . . -- > ▼ │ . . ▽ -- > Imp──────────▶Imp . . . . . ▷. -- > id -- -- > to . id . from = canon_ref -- prop_fromRefToRef :: Token t => Proxy t -> t -> Bool prop_fromRefToRef _ x = (toRef t . fromRef) x == canonicaliseRef x where t = Proxy :: Proxy t -- | The property corresponding to the following part of the commuting diagram. -- -- > id -- > . . . . . .▷Ref───────────▶Ref -- > . ▲ . . │ -- > . │ . . │ -- > . │ . . │ -- > . to│ ▷ ▷ │from -- > . │ . . │ -- > . │ . . │ -- > ▽ │ . . ▼ -- > . . . . . .▶Imp───────────▶Imp -- > canon -- -- > from . id . to = canon_imp -- prop_toRefFromRef :: forall t. Token t => Proxy t -> Imp t -> Bool prop_toRefFromRef _ x = (fromRef . toRef t) x `eq` canonicaliseImp t x where eq = eqImp t t = Proxy :: Proxy t -- | The property corresponding to the following part of the commuting diagram. -- -- This is a round trip property, with the reference implementation of the -- encoder and decoder. -- -- > id -- > . . . . . .▷Ref───────────▶Ref -- > . △ ╲ ╱ . -- > . . ╲enc dec╱ . -- > . . ╲ ╱ . -- > . . ▶Enc▶ . -- > . . . . . -- > . . . . . -- > ▽ . . . ▽ -- > . . . . . . ▷.. . . . . . ▷. -- -- > dec_ref . enc_ref = id -- prop_encodeRefdecodeRef :: forall t. Token t => Proxy t -> t -> Bool prop_encodeRefdecodeRef _ x = (deserialiseRef t . serialiseRef) x == x where t = Proxy :: Proxy t -- | The property corresponding to the following part of the commuting diagram. -- -- This is a round trip property, with the production implementation of the -- encoder and decoder. -- -- > . . . . . . ▷. . . . . . .▷. -- > . △ . . . -- > . . . . . -- > . . . . . -- > . . ▶Enc▶ . -- > . . ╱ ╲ . -- > . . ╱enc dec╲ . -- > ▽ . ╱ ╲ ▽ -- > . . . . . .▷Imp───────────▶Imp -- > canon -- -- > dec_imp . enc_imp = canon_imp -- prop_encodeImpdecodeImp :: forall t. Token t => Proxy t -> Imp t -> Bool prop_encodeImpdecodeImp _ x = (deserialiseImp t . serialiseImp t) x `eq` canonicaliseImp t x where eq = eqImp t t = Proxy :: Proxy t -- | This is the same property as 'prop_encodeImpdecodeImp' but the encoded -- data is split into two chunks provided as input into the decoder. All -- possible 2-chunk splits are tried. This checks that the decoder gives the -- same result irrespective of the chunk boundaries. -- prop_encodeImpdecodeImp_splits2 :: forall t. Token t => Proxy t -> Imp t -> Bool prop_encodeImpdecodeImp_splits2 _ x = and [ deserialiseImp t enc' `eq` x' | let enc = serialiseImp t x x' = canonicaliseImp t x , enc' <- splits2 enc ] where eq = eqImp t t = Proxy :: Proxy t -- | This is the same idea as 'prop_encodeImpdecodeImp_splits2' but with all -- possible 3-chunk splits of the input data. This test is of course more -- expensive and so the size of the input must be limited. -- prop_encodeImpdecodeImp_splits3 :: forall t. Token t => Proxy t -> Imp t -> Bool prop_encodeImpdecodeImp_splits3 _ x = and [ deserialiseImp t enc' `eq` x' | let enc = serialiseImp t x x' = canonicaliseImp t x , enc' <- splits3 enc ] where eq = eqImp t t = Proxy :: Proxy t -- | The property corresponding to the following part of the commuting diagram. -- -- This checks that the reference and real implementation produce the same -- encoded bytes. It starts from a value in the reference implementation. -- -- > canon -- > Ref──────────▶Ref . . . . . ▷. -- > │ △ ╲ . . -- > │ . ╲enc . . -- > │ . ╲ . . -- > │from . ▶Enc▷ . -- > │ . ╱ . . -- > │ . ╱enc . . -- > ▼ . ╱ . ▽ -- > Imp──────────▶Imp . . . . . ▷. -- > id -- -- > enc_imp . id . from = enc_ref . canon_ref -- prop_encodeRefencodeImp1 :: forall t. Token t => Proxy t -> t -> Bool prop_encodeRefencodeImp1 _ x = (serialiseImp t . fromRef) x == (serialiseRef . canonicaliseRef) x where t = Proxy :: Proxy t -- | The property corresponding to the following part of the commuting diagram. -- -- This checks that the reference and real implementation produce the same -- encoded bytes. It starts from a value in the real implementation. -- -- > . . . . . .▷Ref . . . . . ▷. -- > . ▲ ╲ . . -- > . │ ╲enc . . -- > . │ ╲ . . -- > . to│ ▶Enc▷ . -- > . │ ╱ . . -- > . │ ╱enc . . -- > ▽ │ ╱ . ▽ -- > . . . . . .▷Imp . . . . . ▷. -- -- > enc_ref . id . to = enc_imp -- prop_encodeRefencodeImp2 :: forall t. Token t => Proxy t -> Imp t -> Bool prop_encodeRefencodeImp2 _ x = (serialiseRef . toRef t) x == serialiseImp t x where t = Proxy :: Proxy t -- | The property corresponding to the following part of the commuting diagram. -- -- This checks that starting from the same encoding, the reference and real -- implementation deserialise to equivalent values. -- -- > . . . . . .▷Ref . . . . . ▶Ref -- > . △ ╲ ╱ │ -- > . . ╲enc dec╱ │ -- > . . ╲ ╱ │ -- > . . ▶Enc▶ │from -- > . . . ╲ │ -- > . . . dec╲ │ -- > ▽ . . ╲ ▼ -- > . . . . . . ▷.. . . . . . ▶Imp -- -- > dec_imp . enc_ref = from . dec_ref . enc_ref -- prop_decodeRefdecodeImp :: forall t. Token t => Proxy t -> t -> Bool prop_decodeRefdecodeImp _ x = deserialiseImp t enc `eq` (fromRef . deserialiseRef t) enc where enc = serialiseRef x eq = eqImp t t = Proxy :: Proxy t -------------------------------------------------------------------------------- -- Token class instances for unsigned types -- newtype TokWord8 = TokWord8 { unTokWord8 :: UInt } deriving (Eq, Show) instance Token TokWord8 where type Imp TokWord8 = Word8 fromRef = fromIntegral . Ref.fromUInt . unTokWord8 toRef _ = TokWord8 . Ref.toUInt . fromIntegral encodeImp _ = encodeWord8 decodeImp _ = decodeWord8 encodeRef (TokWord8 n) = Ref.encodeToken (Ref.MT0_UnsignedInt n) decodeRef _ = do Ref.MT0_UnsignedInt n <- Ref.decodeToken return (TokWord8 n) instance Arbitrary TokWord8 where arbitrary = TokWord8 <$> oneof arbitraryUInt_Word8 arbitraryUInt_Word8 :: [Gen UInt] arbitraryUInt_Word8 = [ UIntSmall <$> arbitrarySmall , UInt8 <$> arbitrarySmall , UInt8 <$> arbitraryUInt8 ] newtype TokWord16 = TokWord16 { unTokWord16 :: UInt } deriving (Eq, Show) instance Token TokWord16 where type Imp TokWord16 = Word16 fromRef = fromIntegral . Ref.fromUInt . unTokWord16 toRef _ = TokWord16 . Ref.toUInt . fromIntegral encodeImp _ = encodeWord16 decodeImp _ = decodeWord16 encodeRef (TokWord16 n) = Ref.encodeToken (Ref.MT0_UnsignedInt n) decodeRef _ = do Ref.MT0_UnsignedInt n <- Ref.decodeToken return (TokWord16 n) instance Arbitrary TokWord16 where arbitrary = TokWord16 <$> oneof arbitraryUInt_Word16 arbitraryUInt_Word16 :: [Gen UInt] arbitraryUInt_Word16 = arbitraryUInt_Word8 ++ [ UInt16 <$> arbitrarySmall , UInt16 <$> arbitraryUInt8 , UInt16 <$> arbitraryUInt16 ] newtype TokWord32 = TokWord32 { unTokWord32 :: UInt } deriving (Eq, Show) instance Token TokWord32 where type Imp TokWord32 = Word32 fromRef = fromIntegral . Ref.fromUInt . unTokWord32 toRef _ = TokWord32 . Ref.toUInt . fromIntegral encodeImp _ = encodeWord32 decodeImp _ = decodeWord32 encodeRef (TokWord32 n) = Ref.encodeToken (Ref.MT0_UnsignedInt n) decodeRef _ = do Ref.MT0_UnsignedInt n <- Ref.decodeToken return (TokWord32 n) instance Arbitrary TokWord32 where arbitrary = TokWord32 <$> oneof arbitraryUInt_Word32 arbitraryUInt_Word32 :: [Gen UInt] arbitraryUInt_Word32 = arbitraryUInt_Word16 ++ [ UInt32 <$> arbitrarySmall , UInt32 <$> arbitraryUInt8 , UInt32 <$> arbitraryUInt16 , UInt32 <$> arbitraryUInt32 ] newtype TokWord64 = TokWord64 { unTokWord64 :: UInt } deriving (Eq, Show) instance Token TokWord64 where type Imp TokWord64 = Word64 fromRef = fromIntegral . Ref.fromUInt . unTokWord64 toRef _ = TokWord64 . Ref.toUInt . fromIntegral encodeImp _ = encodeWord64 decodeImp _ = decodeWord64 encodeRef (TokWord64 n) = Ref.encodeToken (Ref.MT0_UnsignedInt n) decodeRef _ = do Ref.MT0_UnsignedInt n <- Ref.decodeToken return (TokWord64 n) instance Arbitrary TokWord64 where arbitrary = TokWord64 <$> oneof arbitraryUInt_Word64 arbitraryUInt_Word64 :: [Gen UInt] arbitraryUInt_Word64 = arbitraryUInt_Word32 ++ [ UInt64 <$> arbitrarySmall , UInt64 <$> arbitraryUInt8 , UInt64 <$> arbitraryUInt16 , UInt64 <$> arbitraryUInt32 , UInt64 <$> arbitraryUInt64 ] newtype TokWord = TokWord { unTokWord :: UInt } deriving (Eq, Show) instance Arbitrary TokWord where arbitrary = TokWord <$> oneof arbitraryUInt_Word arbitraryUInt_Word :: [Gen UInt] arbitraryUInt_Word = arbitraryUInt_Word32 ++ [ UInt64 <$> arbitrarySmall , UInt64 <$> arbitraryUInt8 , UInt64 <$> arbitraryUInt16 , UInt64 <$> arbitraryUInt32 #if defined(ARCH_64bit) , UInt64 <$> arbitraryUInt64 #endif ] instance Token TokWord where type Imp TokWord = Word fromRef = fromIntegral . Ref.fromUInt . unTokWord toRef _ = TokWord . Ref.toUInt . fromIntegral encodeImp _ = encodeWord decodeImp _ = decodeWord encodeRef (TokWord n) = Ref.encodeToken (Ref.MT0_UnsignedInt n) decodeRef _ = do Ref.MT0_UnsignedInt n <- Ref.decodeToken return (TokWord n) -------------------------------------------------------------------------------- -- Token class instances for signed types -- data TokInt8 = TokInt8 Bool UInt deriving (Eq, Show) instance Token TokInt8 where type Imp TokInt8 = Int8 fromRef (TokInt8 True n) = (fromIntegral . Ref.fromUInt) n fromRef (TokInt8 False n) = (complement . fromIntegral . Ref.fromUInt) n toRef _ n | n >= 0 = TokInt8 True ((Ref.toUInt . fromIntegral) n) | otherwise = TokInt8 False ((Ref.toUInt . fromIntegral . complement) n) encodeImp _ = encodeInt8 decodeImp _ = decodeInt8 encodeRef (TokInt8 True n) = Ref.encodeToken (Ref.MT0_UnsignedInt n) encodeRef (TokInt8 False n) = Ref.encodeToken (Ref.MT1_NegativeInt n) decodeRef _ = do tok <- Ref.decodeToken case tok of Ref.MT0_UnsignedInt n -> return (TokInt8 True n) Ref.MT1_NegativeInt n -> return (TokInt8 False n) _ -> fail "decodeRef (TokInt)" instance Arbitrary TokInt8 where arbitrary = TokInt8 <$> arbitrary <*> oneof arbitraryUInt_Int8 shrink (TokInt8 sign n) = [ TokInt8 sign' n' | (sign', n') <- shrink (sign, n) ] arbitraryUInt_Int8 :: [Gen UInt] arbitraryUInt_Int8 = [ UIntSmall <$> arbitrarySmall , UInt8 <$> arbitrarySmall , UInt8 <$> arbitraryUInt7 ] data TokInt16 = TokInt16 Bool UInt deriving (Eq, Show) instance Token TokInt16 where type Imp TokInt16 = Int16 fromRef (TokInt16 True n) = (fromIntegral . Ref.fromUInt) n fromRef (TokInt16 False n) = (complement . fromIntegral . Ref.fromUInt) n toRef _ n | n >= 0 = TokInt16 True ((Ref.toUInt . fromIntegral) n) | otherwise = TokInt16 False ((Ref.toUInt . fromIntegral . complement) n) encodeImp _ = encodeInt16 decodeImp _ = decodeInt16 encodeRef (TokInt16 True n) = Ref.encodeToken (Ref.MT0_UnsignedInt n) encodeRef (TokInt16 False n) = Ref.encodeToken (Ref.MT1_NegativeInt n) decodeRef _ = do tok <- Ref.decodeToken case tok of Ref.MT0_UnsignedInt n -> return (TokInt16 True n) Ref.MT1_NegativeInt n -> return (TokInt16 False n) _ -> fail "decodeRef (TokInt16)" instance Arbitrary TokInt16 where arbitrary = TokInt16 <$> arbitrary <*> oneof arbitraryUInt_Int16 arbitraryUInt_Int16 :: [Gen UInt] arbitraryUInt_Int16 = arbitraryUInt_Int8 ++ [ UInt16 <$> arbitrarySmall , UInt16 <$> arbitraryUInt7 , UInt16 <$> arbitraryUInt15 ] data TokInt32 = TokInt32 Bool UInt deriving (Eq, Show) instance Token TokInt32 where type Imp TokInt32 = Int32 fromRef (TokInt32 True n) = (fromIntegral . Ref.fromUInt) n fromRef (TokInt32 False n) = (complement . fromIntegral . Ref.fromUInt) n toRef _ n | n >= 0 = TokInt32 True ((Ref.toUInt . fromIntegral) n) | otherwise = TokInt32 False ((Ref.toUInt . fromIntegral . complement) n) encodeImp _ = encodeInt32 decodeImp _ = decodeInt32 encodeRef (TokInt32 True n) = Ref.encodeToken (Ref.MT0_UnsignedInt n) encodeRef (TokInt32 False n) = Ref.encodeToken (Ref.MT1_NegativeInt n) decodeRef _ = do tok <- Ref.decodeToken case tok of Ref.MT0_UnsignedInt n -> return (TokInt32 True n) Ref.MT1_NegativeInt n -> return (TokInt32 False n) _ -> fail "decodeRef (TokInt32)" instance Arbitrary TokInt32 where arbitrary = TokInt32 <$> arbitrary <*> oneof arbitraryUInt_Int32 arbitraryUInt_Int32 :: [Gen UInt] arbitraryUInt_Int32 = arbitraryUInt_Int16 ++ [ UInt32 <$> arbitrarySmall , UInt32 <$> arbitraryUInt7 , UInt32 <$> arbitraryUInt15 , UInt32 <$> arbitraryUInt31 ] data TokInt64 = TokInt64 Bool UInt deriving (Eq, Show) instance Token TokInt64 where type Imp TokInt64 = Int64 fromRef (TokInt64 True n) = (fromIntegral . Ref.fromUInt) n fromRef (TokInt64 False n) = (complement . fromIntegral . Ref.fromUInt) n toRef _ n | n >= 0 = TokInt64 True ((Ref.toUInt . fromIntegral) n) | otherwise = TokInt64 False ((Ref.toUInt . fromIntegral . complement) n) encodeImp _ = encodeInt64 decodeImp _ = decodeInt64 encodeRef (TokInt64 True n) = Ref.encodeToken (Ref.MT0_UnsignedInt n) encodeRef (TokInt64 False n) = Ref.encodeToken (Ref.MT1_NegativeInt n) decodeRef _ = do tok <- Ref.decodeToken case tok of Ref.MT0_UnsignedInt n -> return (TokInt64 True n) Ref.MT1_NegativeInt n -> return (TokInt64 False n) _ -> fail "decodeRef (TokInt64)" instance Arbitrary TokInt64 where arbitrary = TokInt64 <$> arbitrary <*> oneof arbitraryUInt_Int64 arbitraryUInt_Int64 :: [Gen UInt] arbitraryUInt_Int64 = arbitraryUInt_Int32 ++ [ UInt64 <$> arbitrarySmall , UInt64 <$> arbitraryUInt7 , UInt64 <$> arbitraryUInt15 , UInt64 <$> arbitraryUInt31 , UInt64 <$> arbitraryUInt63 ] data TokInt = TokInt Bool UInt deriving (Eq, Show) instance Token TokInt where type Imp TokInt = Int fromRef (TokInt True n) = (fromIntegral . Ref.fromUInt) n fromRef (TokInt False n) = (complement . fromIntegral . Ref.fromUInt) n toRef _ n | n >= 0 = TokInt True ((Ref.toUInt . fromIntegral) n) | otherwise = TokInt False ((Ref.toUInt . fromIntegral . complement) n) encodeImp _ = encodeInt decodeImp _ = decodeInt encodeRef (TokInt True n) = Ref.encodeToken (Ref.MT0_UnsignedInt n) encodeRef (TokInt False n) = Ref.encodeToken (Ref.MT1_NegativeInt n) decodeRef _ = do tok <- Ref.decodeToken case tok of Ref.MT0_UnsignedInt n -> return (TokInt True n) Ref.MT1_NegativeInt n -> return (TokInt False n) _ -> fail "decodeRef (TokInt)" instance Arbitrary TokInt where arbitrary = TokInt <$> arbitrary <*> oneof arbitraryUInt_Int arbitraryUInt_Int :: [Gen UInt] arbitraryUInt_Int = arbitraryUInt_Int32 ++ [ UInt64 <$> arbitrarySmall , UInt64 <$> arbitraryUInt7 , UInt64 <$> arbitraryUInt15 , UInt64 <$> arbitraryUInt31 #if defined(ARCH_64bit) , UInt64 <$> arbitraryUInt63 #endif ] data TokInteger = TokIntegerUInt UInt | TokIntegerNInt UInt | TokIntegerBig LargeInteger deriving (Eq, Show) instance Arbitrary TokInteger where arbitrary = oneof [ TokIntegerUInt <$> arbitrary , TokIntegerNInt <$> arbitrary , TokIntegerBig <$> arbitrary ] instance Token TokInteger where type Imp TokInteger = Integer fromRef (TokIntegerUInt n) = (fromIntegral . Ref.fromUInt) n fromRef (TokIntegerNInt n) = (complement . fromIntegral . Ref.fromUInt) n fromRef (TokIntegerBig n) = getLargeInteger n toRef _ n | n >= 0 && n <= fromIntegral (maxBound :: Word64) = TokIntegerUInt ((Ref.toUInt . fromIntegral) n) | n < 0 && complement n <= fromIntegral (maxBound :: Word64) = TokIntegerNInt ((Ref.toUInt . fromIntegral . complement) n) | otherwise = TokIntegerBig (LargeInteger n) encodeImp _ = encodeInteger decodeImp _ = decodeInteger encodeRef (TokIntegerUInt n) = Ref.encodeToken (Ref.MT0_UnsignedInt n) encodeRef (TokIntegerNInt n) = Ref.encodeToken (Ref.MT1_NegativeInt n) encodeRef (TokIntegerBig n) = Ref.encodeTerm (Ref.TBigInt (getLargeInteger n)) decodeRef _ = do tok <- Ref.decodeToken case tok of Ref.MT0_UnsignedInt n -> return (TokIntegerUInt n) Ref.MT1_NegativeInt n -> return (TokIntegerNInt n) Ref.MT6_Tag tag -> do Ref.TBigInt n <- Ref.decodeTagged tag return (TokIntegerBig (LargeInteger n)) _ -> fail "decodeRef (TokInteger)" -------------------------------------------------------------------------------- -- Arbitrary helpers for integer types -- arbitrarySmall, arbitraryUInt8, arbitraryUInt16, arbitraryUInt32, arbitraryUInt64, arbitraryUInt7, arbitraryUInt15, arbitraryUInt31, arbitraryUInt63 :: (Num n, Random n) => Gen n arbitrarySmall = chooseZeroToBound (23 :: Word) arbitraryUInt8 = chooseZeroToBound (maxBound :: Word8) arbitraryUInt16 = chooseZeroToBound (maxBound :: Word16) arbitraryUInt32 = chooseZeroToBound (maxBound :: Word32) arbitraryUInt64 = chooseZeroToBound (maxBound :: Word64) arbitraryUInt7 = chooseZeroToBound (maxBound :: Int8) arbitraryUInt15 = chooseZeroToBound (maxBound :: Int16) arbitraryUInt31 = chooseZeroToBound (maxBound :: Int32) arbitraryUInt63 = chooseZeroToBound (maxBound :: Int64) chooseZeroToBound :: (Num a, Random a, Integral a1) => a1 -> Gen a chooseZeroToBound bound = frequency [ (9, choose (0, bound')) , (1, pure bound') ] where bound' = fromIntegral bound -------------------------------------------------------------------------------- -- Token class instances for floating point types -- data TokHalf = TokHalf HalfSpecials deriving (Eq, Show) instance Arbitrary TokHalf where arbitrary = TokHalf <$> arbitrary instance Token TokHalf where type Imp TokHalf = Float eqImp _ = (==) `on` floatToWord fromRef (TokHalf (HalfSpecials n)) = Half.fromHalf n toRef _ = TokHalf . canonicaliseNaN . HalfSpecials . Half.toHalf canonicaliseImp _ = Half.fromHalf . canonicaliseNaN . Half.toHalf canonicaliseRef (TokHalf n) = TokHalf (canonicaliseNaN n) encodeImp _ = encodeFloat16 decodeImp _ = decodeFloat encodeRef (TokHalf n) = Ref.encodeToken (Ref.MT7_Float16 n) decodeRef _ = do Ref.MT7_Float16 n <- Ref.decodeToken return (TokHalf n) data TokFloat = TokFloat FloatSpecials | TokFloatNan deriving (Eq, Show) instance Arbitrary TokFloat where arbitrary = frequency [(19, TokFloat <$> arbitrary), (1, pure TokFloatNan)] instance Token TokFloat where type Imp TokFloat = Float eqImp _ = (==) `on` floatToWord fromRef (TokFloat n) = getFloatSpecials n fromRef TokFloatNan = canonicalNaN toRef _ n | isNaN n = TokFloatNan | otherwise = TokFloat (FloatSpecials n) canonicaliseImp _ = canonicaliseNaN canonicaliseRef TokFloatNan = TokFloatNan canonicaliseRef (TokFloat (FloatSpecials n)) | isNaN n = TokFloatNan | otherwise = TokFloat (FloatSpecials n) encodeImp _ = encodeFloat decodeImp _ = decodeFloat encodeRef (TokFloat n) = Ref.encodeToken (Ref.MT7_Float32 n) encodeRef TokFloatNan = Ref.encodeToken (Ref.MT7_Float16 canonicalNaN) decodeRef _ = do tok <- Ref.decodeToken case tok of Ref.MT7_Float16 n | isNaN n -> return TokFloatNan Ref.MT7_Float32 n -> return (TokFloat n) _ -> fail "decodeRef (TokFloat)" data TokDouble = TokDouble DoubleSpecials | TokDoubleNan deriving (Eq, Show) instance Arbitrary TokDouble where arbitrary = frequency [(19, TokDouble <$> arbitrary), (1, pure TokDoubleNan)] instance Token TokDouble where type Imp TokDouble = Double eqImp _ = (==) `on` doubleToWord fromRef (TokDouble n) = getDoubleSpecials n fromRef TokDoubleNan = canonicalNaN toRef _ n | isNaN n = TokDoubleNan | otherwise = TokDouble (DoubleSpecials n) canonicaliseImp _ = canonicaliseNaN canonicaliseRef TokDoubleNan = TokDoubleNan canonicaliseRef (TokDouble (DoubleSpecials n)) | isNaN n = TokDoubleNan | otherwise = TokDouble (DoubleSpecials n) encodeImp _ = encodeDouble decodeImp _ = decodeDouble encodeRef (TokDouble n) = Ref.encodeToken (Ref.MT7_Float64 n) encodeRef TokDoubleNan = Ref.encodeToken (Ref.MT7_Float16 canonicalNaN) decodeRef _ = do tok <- Ref.decodeToken case tok of Ref.MT7_Float16 n | isNaN n -> return TokDoubleNan Ref.MT7_Float64 n -> return (TokDouble n) _ -> fail "decodeRef (TokDouble)" -------------------------------------------------------------------------------- -- Miscelaneous token class instances -- data TokTag = TokTag { unTokTag :: UInt } deriving (Eq, Show) instance Arbitrary TokTag where arbitrary = TokTag <$> oneof arbitraryUInt_Word instance Token TokTag where type Imp TokTag = Word fromRef = fromIntegral . Ref.fromUInt . unTokTag toRef _ = TokTag . Ref.toUInt . fromIntegral encodeImp _ = encodeTag decodeImp _ = decodeTag encodeRef (TokTag n) = Ref.encodeToken (Ref.MT6_Tag n) decodeRef _ = do Ref.MT6_Tag n <- Ref.decodeToken return (TokTag n) data TokTag64 = TokTag64 { unTokTag64 :: UInt } deriving (Eq, Show) instance Arbitrary TokTag64 where arbitrary = TokTag64 <$> oneof arbitraryUInt_Word64 instance Token TokTag64 where type Imp TokTag64 = Word64 fromRef = fromIntegral . Ref.fromUInt . unTokTag64 toRef _ = TokTag64 . Ref.toUInt . fromIntegral encodeImp _ = encodeTag64 decodeImp _ = decodeTag64 encodeRef (TokTag64 n) = Ref.encodeToken (Ref.MT6_Tag n) decodeRef _ = do Ref.MT6_Tag n <- Ref.decodeToken return (TokTag64 n) instance Token Ref.Simple where type Imp Ref.Simple = Word8 fromRef = Ref.fromSimple toRef _ = Ref.toSimple encodeImp _ = encodeSimple decodeImp _ = decodeSimple encodeRef n = Ref.encodeToken (Ref.MT7_Simple n) decodeRef _ = do Ref.MT7_Simple n <- Ref.decodeToken return n -------------------------------------------------------------------------------- -- Token class instances for Term type -- instance Token Ref.Term where type Imp Ref.Term = Term eqImp _ = eqTerm fromRef = fromRefTerm toRef _ = toRefTerm canonicaliseImp _ = canonicaliseTerm canonicaliseRef = Ref.canonicaliseTerm encodeImp _ = encodeTerm decodeImp _ = decodeTerm encodeRef = Ref.encodeTerm decodeRef _ = Ref.decodeTerm -------------------------------------------------------------------------------- -- Token class instances for ByteArray tokens. -- newtype TokByteArray = TokByteArray { unTokByteArray :: [Word8] } deriving (Eq, Show) instance Arbitrary TokByteArray where arbitrary = TokByteArray <$> arbitrary instance Token TokByteArray where type Imp TokByteArray = Sliced.SlicedByteArray eqImp _ = (==) fromRef = Sliced.fromByteString . BS.pack . unTokByteArray toRef _ = TokByteArray . toList canonicaliseImp _ = id canonicaliseRef = id encodeImp _ = encodeByteArray decodeImp _ = toSliced <$> decodeByteArray encodeRef (TokByteArray bs) = Ref.encodeToken (Ref.MT2_ByteString (lengthUInt bs) bs) decodeRef _ = do Ref.MT2_ByteString _n bs <- Ref.decodeToken -- TODO? check _n == bs return (TokByteArray bs) instance Arbitrary Sliced.SlicedByteArray where -- Taken from cardano-ledger-binary testlib. arbitrary = do NonNegative off <- arbitrary Positive count <- arbitrary NonNegative slack <- arbitrary let len = off + count + slack ba <- genByteArray len pure $ Sliced.SBA ba off count where genShortByteString :: Int -> Gen SBS.ShortByteString genShortByteString n = MkGen (\r _n -> runStateGen_ r (uniformShortByteString n)) genByteArray :: Int -> Gen Prim.ByteArray genByteArray n = do bss <- genShortByteString n case bss of SBS.SBS ba -> pure $ Prim.ByteArray ba -------------------------------------------------------------------------------- -- TestTree API testTree :: TestTree testTree = testGroup "properties" [ testGroup "to . id . from = canon_ref" [ testProperty "Word8" (prop_fromRefToRef (Proxy :: Proxy TokWord8)) , testProperty "Word16" (prop_fromRefToRef (Proxy :: Proxy TokWord16)) , testProperty "Word32" (prop_fromRefToRef (Proxy :: Proxy TokWord32)) , testProperty "Word64" (prop_fromRefToRef (Proxy :: Proxy TokWord64)) , testProperty "Word" (prop_fromRefToRef (Proxy :: Proxy TokWord)) -- , testProperty "NegWord" (prop_fromRefToRef (Proxy :: Proxy TokNegWord)) , testProperty "Int8" (prop_fromRefToRef (Proxy :: Proxy TokInt8)) , testProperty "Int16" (prop_fromRefToRef (Proxy :: Proxy TokInt16)) , testProperty "Int32" (prop_fromRefToRef (Proxy :: Proxy TokInt32)) , testProperty "Int64" (prop_fromRefToRef (Proxy :: Proxy TokInt64)) , testProperty "Int" (prop_fromRefToRef (Proxy :: Proxy TokInt)) , testProperty "Integer" (prop_fromRefToRef (Proxy :: Proxy TokInteger)) , testProperty "Half" (prop_fromRefToRef (Proxy :: Proxy TokHalf)) , testProperty "Float" (prop_fromRefToRef (Proxy :: Proxy TokFloat)) , testProperty "Double" (prop_fromRefToRef (Proxy :: Proxy TokDouble)) , testProperty "Tag" (prop_fromRefToRef (Proxy :: Proxy TokTag)) , testProperty "Tag64" (prop_fromRefToRef (Proxy :: Proxy TokTag64)) , testProperty "Simple" (prop_fromRefToRef (Proxy :: Proxy Ref.Simple)) , testProperty "Term" (prop_fromRefToRef (Proxy :: Proxy Ref.Term)) , testProperty "ByteArray" (prop_fromRefToRef (Proxy :: Proxy TokByteArray)) ] , testGroup "from . id . to = canon_imp" [ testProperty "Word8" (prop_toRefFromRef (Proxy :: Proxy TokWord8)) , testProperty "Word16" (prop_toRefFromRef (Proxy :: Proxy TokWord16)) , testProperty "Word32" (prop_toRefFromRef (Proxy :: Proxy TokWord32)) , testProperty "Word64" (prop_toRefFromRef (Proxy :: Proxy TokWord64)) , testProperty "Word" (prop_toRefFromRef (Proxy :: Proxy TokWord)) -- , testProperty "NegWord" (prop_toRefFromRef (Proxy :: Proxy TokNegWord)) , testProperty "Int8" (prop_toRefFromRef (Proxy :: Proxy TokInt8)) , testProperty "Int16" (prop_toRefFromRef (Proxy :: Proxy TokInt16)) , testProperty "Int32" (prop_toRefFromRef (Proxy :: Proxy TokInt32)) , testProperty "Int64" (prop_toRefFromRef (Proxy :: Proxy TokInt64)) , testProperty "Int" (prop_toRefFromRef (Proxy :: Proxy TokInt)) , testProperty "Integer" (prop_toRefFromRef (Proxy :: Proxy TokInteger)) , testProperty "Half" (prop_toRefFromRef (Proxy :: Proxy TokHalf)) , testProperty "Float" (prop_toRefFromRef (Proxy :: Proxy TokFloat)) , testProperty "Double" (prop_toRefFromRef (Proxy :: Proxy TokDouble)) , testProperty "Tag" (prop_toRefFromRef (Proxy :: Proxy TokTag)) , testProperty "Tag64" (prop_toRefFromRef (Proxy :: Proxy TokTag64)) , testProperty "Simple" (prop_toRefFromRef (Proxy :: Proxy Ref.Simple)) , testProperty "Term" (prop_toRefFromRef (Proxy :: Proxy Ref.Term)) , testProperty "ByteArray" (prop_toRefFromRef (Proxy :: Proxy TokByteArray)) ] , testGroup "dec_ref . enc_ref = id" [ testProperty "Word8" (prop_encodeRefdecodeRef (Proxy :: Proxy TokWord8)) , testProperty "Word16" (prop_encodeRefdecodeRef (Proxy :: Proxy TokWord16)) , testProperty "Word32" (prop_encodeRefdecodeRef (Proxy :: Proxy TokWord32)) , testProperty "Word64" (prop_encodeRefdecodeRef (Proxy :: Proxy TokWord64)) , testProperty "Word" (prop_encodeRefdecodeRef (Proxy :: Proxy TokWord)) -- , testProperty "NegWord" (prop_encodeRefdecodeRef (Proxy :: Proxy TokNegWord)) , testProperty "Int8" (prop_encodeRefdecodeRef (Proxy :: Proxy TokInt8)) , testProperty "Int16" (prop_encodeRefdecodeRef (Proxy :: Proxy TokInt16)) , testProperty "Int32" (prop_encodeRefdecodeRef (Proxy :: Proxy TokInt32)) , testProperty "Int64" (prop_encodeRefdecodeRef (Proxy :: Proxy TokInt64)) , testProperty "Int" (prop_encodeRefdecodeRef (Proxy :: Proxy TokInt)) , testProperty "Integer" (prop_encodeRefdecodeRef (Proxy :: Proxy TokInteger)) , testProperty "Half" (prop_encodeRefdecodeRef (Proxy :: Proxy TokHalf)) , testProperty "Float" (prop_encodeRefdecodeRef (Proxy :: Proxy TokFloat)) , testProperty "Double" (prop_encodeRefdecodeRef (Proxy :: Proxy TokDouble)) , testProperty "Tag" (prop_encodeRefdecodeRef (Proxy :: Proxy TokTag)) , testProperty "Tag64" (prop_encodeRefdecodeRef (Proxy :: Proxy TokTag64)) , testProperty "Simple" (prop_encodeRefdecodeRef (Proxy :: Proxy Ref.Simple)) , testProperty "Term" (prop_encodeRefdecodeRef (Proxy :: Proxy Ref.Term)) , testProperty "ByteArray" (prop_encodeRefdecodeRef (Proxy :: Proxy TokByteArray)) ] , testGroup "dec_imp . enc_imp = canon_imp" [ testProperty "Word8" (prop_encodeImpdecodeImp (Proxy :: Proxy TokWord8)) , testProperty "Word16" (prop_encodeImpdecodeImp (Proxy :: Proxy TokWord16)) , testProperty "Word32" (prop_encodeImpdecodeImp (Proxy :: Proxy TokWord32)) , testProperty "Word64" (prop_encodeImpdecodeImp (Proxy :: Proxy TokWord64)) , testProperty "Word" (prop_encodeImpdecodeImp (Proxy :: Proxy TokWord)) -- , testProperty "NegWord" (prop_encodeImpdecodeImp (Proxy :: Proxy TokNegWord)) , testProperty "Int8" (prop_encodeImpdecodeImp (Proxy :: Proxy TokInt8)) , testProperty "Int16" (prop_encodeImpdecodeImp (Proxy :: Proxy TokInt16)) , testProperty "Int32" (prop_encodeImpdecodeImp (Proxy :: Proxy TokInt32)) , testProperty "Int64" (prop_encodeImpdecodeImp (Proxy :: Proxy TokInt64)) , testProperty "Int" (prop_encodeImpdecodeImp (Proxy :: Proxy TokInt)) , testProperty "Integer" (prop_encodeImpdecodeImp (Proxy :: Proxy TokInteger)) , testProperty "Half" (prop_encodeImpdecodeImp (Proxy :: Proxy TokHalf)) , testProperty "Float" (prop_encodeImpdecodeImp (Proxy :: Proxy TokFloat)) , testProperty "Double" (prop_encodeImpdecodeImp (Proxy :: Proxy TokDouble)) , testProperty "Tag" (prop_encodeImpdecodeImp (Proxy :: Proxy TokTag)) , testProperty "Tag64" (prop_encodeImpdecodeImp (Proxy :: Proxy TokTag64)) , testProperty "Simple" (prop_encodeImpdecodeImp (Proxy :: Proxy Ref.Simple)) , testProperty "Term" (prop_encodeImpdecodeImp (Proxy :: Proxy Ref.Term)) , testProperty "ByteArray" (prop_encodeImpdecodeImp (Proxy :: Proxy TokByteArray)) ] , testGroup "dec_imp . enc_imp = canon_imp (all 2-splits)" [ testProperty "Word8" (prop_encodeImpdecodeImp_splits2 (Proxy :: Proxy TokWord8)) , testProperty "Word16" (prop_encodeImpdecodeImp_splits2 (Proxy :: Proxy TokWord16)) , testProperty "Word32" (prop_encodeImpdecodeImp_splits2 (Proxy :: Proxy TokWord32)) , testProperty "Word64" (prop_encodeImpdecodeImp_splits2 (Proxy :: Proxy TokWord64)) , testProperty "Word" (prop_encodeImpdecodeImp_splits2 (Proxy :: Proxy TokWord)) -- , testProperty "NegWord" (prop_encodeImpdecodeImp_splits2 (Proxy :: Proxy TokNegWord)) , testProperty "Int8" (prop_encodeImpdecodeImp_splits2 (Proxy :: Proxy TokInt8)) , testProperty "Int16" (prop_encodeImpdecodeImp_splits2 (Proxy :: Proxy TokInt16)) , testProperty "Int32" (prop_encodeImpdecodeImp_splits2 (Proxy :: Proxy TokInt32)) , testProperty "Int64" (prop_encodeImpdecodeImp_splits2 (Proxy :: Proxy TokInt64)) , testProperty "Int" (prop_encodeImpdecodeImp_splits2 (Proxy :: Proxy TokInt)) , testProperty "Integer" (prop_encodeImpdecodeImp_splits2 (Proxy :: Proxy TokInteger)) , testProperty "Half" (prop_encodeImpdecodeImp_splits2 (Proxy :: Proxy TokHalf)) , testProperty "Float" (prop_encodeImpdecodeImp_splits2 (Proxy :: Proxy TokFloat)) , testProperty "Double" (prop_encodeImpdecodeImp_splits2 (Proxy :: Proxy TokDouble)) , testProperty "Tag" (prop_encodeImpdecodeImp_splits2 (Proxy :: Proxy TokTag)) , testProperty "Tag64" (prop_encodeImpdecodeImp_splits2 (Proxy :: Proxy TokTag64)) , testProperty "Simple" (prop_encodeImpdecodeImp_splits2 (Proxy :: Proxy Ref.Simple)) , localOption (QuickCheckMaxSize 100) $ testProperty "Term" (prop_encodeImpdecodeImp_splits2 (Proxy :: Proxy Ref.Term)) , testProperty "ByteArray" (prop_encodeImpdecodeImp_splits2 (Proxy :: Proxy TokByteArray)) ] , testGroup "dec_imp . enc_imp = canon_imp (all 3-splits)" [ testProperty "Word8" (prop_encodeImpdecodeImp_splits3 (Proxy :: Proxy TokWord8)) , testProperty "Word16" (prop_encodeImpdecodeImp_splits3 (Proxy :: Proxy TokWord16)) , testProperty "Word32" (prop_encodeImpdecodeImp_splits3 (Proxy :: Proxy TokWord32)) , testProperty "Word64" (prop_encodeImpdecodeImp_splits3 (Proxy :: Proxy TokWord64)) , testProperty "Word" (prop_encodeImpdecodeImp_splits3 (Proxy :: Proxy TokWord)) -- , testProperty "NegWord" (prop_encodeImpdecodeImp_splits3 (Proxy :: Proxy TokNegWord)) , testProperty "Int8" (prop_encodeImpdecodeImp_splits3 (Proxy :: Proxy TokInt8)) , testProperty "Int16" (prop_encodeImpdecodeImp_splits3 (Proxy :: Proxy TokInt16)) , testProperty "Int32" (prop_encodeImpdecodeImp_splits3 (Proxy :: Proxy TokInt32)) , testProperty "Int64" (prop_encodeImpdecodeImp_splits3 (Proxy :: Proxy TokInt64)) , testProperty "Int" (prop_encodeImpdecodeImp_splits3 (Proxy :: Proxy TokInt)) , testProperty "Integer" (prop_encodeImpdecodeImp_splits3 (Proxy :: Proxy TokInteger)) , testProperty "Half" (prop_encodeImpdecodeImp_splits3 (Proxy :: Proxy TokHalf)) , testProperty "Float" (prop_encodeImpdecodeImp_splits3 (Proxy :: Proxy TokFloat)) , testProperty "Double" (prop_encodeImpdecodeImp_splits3 (Proxy :: Proxy TokDouble)) , testProperty "Tag" (prop_encodeImpdecodeImp_splits3 (Proxy :: Proxy TokTag)) , testProperty "Tag64" (prop_encodeImpdecodeImp_splits3 (Proxy :: Proxy TokTag64)) , testProperty "Simple" (prop_encodeImpdecodeImp_splits3 (Proxy :: Proxy Ref.Simple)) , localOption (QuickCheckMaxSize 25) $ testProperty "Term" (prop_encodeImpdecodeImp_splits3 (Proxy :: Proxy Ref.Term)) , testProperty "ByteArray" (prop_encodeImpdecodeImp_splits3 (Proxy :: Proxy TokByteArray)) ] , testGroup "enc_imp . from = enc_ref . canon_ref" [ testProperty "Word8" (prop_encodeRefencodeImp1 (Proxy :: Proxy TokWord8)) , testProperty "Word16" (prop_encodeRefencodeImp1 (Proxy :: Proxy TokWord16)) , testProperty "Word32" (prop_encodeRefencodeImp1 (Proxy :: Proxy TokWord32)) , testProperty "Word64" (prop_encodeRefencodeImp1 (Proxy :: Proxy TokWord64)) , testProperty "Word" (prop_encodeRefencodeImp1 (Proxy :: Proxy TokWord)) -- , testProperty "NegWord" (prop_encodeRefencodeImp1 (Proxy :: Proxy TokNegWord)) , testProperty "Int8" (prop_encodeRefencodeImp1 (Proxy :: Proxy TokInt8)) , testProperty "Int16" (prop_encodeRefencodeImp1 (Proxy :: Proxy TokInt16)) , testProperty "Int32" (prop_encodeRefencodeImp1 (Proxy :: Proxy TokInt32)) , testProperty "Int64" (prop_encodeRefencodeImp1 (Proxy :: Proxy TokInt64)) , testProperty "Int" (prop_encodeRefencodeImp1 (Proxy :: Proxy TokInt)) , testProperty "Integer" (prop_encodeRefencodeImp1 (Proxy :: Proxy TokInteger)) , testProperty "Half" (prop_encodeRefencodeImp1 (Proxy :: Proxy TokHalf)) , testProperty "Float" (prop_encodeRefencodeImp1 (Proxy :: Proxy TokFloat)) , testProperty "Double" (prop_encodeRefencodeImp1 (Proxy :: Proxy TokDouble)) , testProperty "Tag" (prop_encodeRefencodeImp1 (Proxy :: Proxy TokTag)) , testProperty "Tag64" (prop_encodeRefencodeImp1 (Proxy :: Proxy TokTag64)) , testProperty "Simple" (prop_encodeRefencodeImp1 (Proxy :: Proxy Ref.Simple)) , testProperty "Term" (prop_encodeRefencodeImp1 (Proxy :: Proxy Ref.Term)) , testProperty "ByteArray" (prop_encodeRefencodeImp1 (Proxy :: Proxy TokByteArray)) ] , testGroup "enc_ref . to = enc_imp" [ testProperty "Word8" (prop_encodeRefencodeImp2 (Proxy :: Proxy TokWord8)) , testProperty "Word16" (prop_encodeRefencodeImp2 (Proxy :: Proxy TokWord16)) , testProperty "Word32" (prop_encodeRefencodeImp2 (Proxy :: Proxy TokWord32)) , testProperty "Word64" (prop_encodeRefencodeImp2 (Proxy :: Proxy TokWord64)) , testProperty "Word" (prop_encodeRefencodeImp2 (Proxy :: Proxy TokWord)) -- , testProperty "NegWord" (prop_encodeRefencodeImp2 (Proxy :: Proxy TokNegWord)) , testProperty "Int8" (prop_encodeRefencodeImp2 (Proxy :: Proxy TokInt8)) , testProperty "Int16" (prop_encodeRefencodeImp2 (Proxy :: Proxy TokInt16)) , testProperty "Int32" (prop_encodeRefencodeImp2 (Proxy :: Proxy TokInt32)) , testProperty "Int64" (prop_encodeRefencodeImp2 (Proxy :: Proxy TokInt64)) , testProperty "Int" (prop_encodeRefencodeImp2 (Proxy :: Proxy TokInt)) , testProperty "Integer" (prop_encodeRefencodeImp2 (Proxy :: Proxy TokInteger)) , testProperty "Half" (prop_encodeRefencodeImp2 (Proxy :: Proxy TokHalf)) , testProperty "Float" (prop_encodeRefencodeImp2 (Proxy :: Proxy TokFloat)) , testProperty "Double" (prop_encodeRefencodeImp2 (Proxy :: Proxy TokDouble)) , testProperty "Tag" (prop_encodeRefencodeImp2 (Proxy :: Proxy TokTag)) , testProperty "Tag64" (prop_encodeRefencodeImp2 (Proxy :: Proxy TokTag64)) , testProperty "Simple" (prop_encodeRefencodeImp2 (Proxy :: Proxy Ref.Simple)) , testProperty "Term" (prop_encodeRefencodeImp2 (Proxy :: Proxy Ref.Term)) , testProperty "ByteArray" (prop_encodeRefencodeImp2 (Proxy :: Proxy TokByteArray)) ] , testGroup "dec_imp . enc_ref = from . dec_ref . enc_ref" [ testProperty "Word8" (prop_decodeRefdecodeImp (Proxy :: Proxy TokWord8)) , testProperty "Word16" (prop_decodeRefdecodeImp (Proxy :: Proxy TokWord16)) , testProperty "Word32" (prop_decodeRefdecodeImp (Proxy :: Proxy TokWord32)) , testProperty "Word64" (prop_decodeRefdecodeImp (Proxy :: Proxy TokWord64)) , testProperty "Word" (prop_decodeRefdecodeImp (Proxy :: Proxy TokWord)) -- , testProperty "NegWord" (prop_decodeRefdecodeImp (Proxy :: Proxy TokNegWord)) , testProperty "Int8" (prop_decodeRefdecodeImp (Proxy :: Proxy TokInt8)) , testProperty "Int16" (prop_decodeRefdecodeImp (Proxy :: Proxy TokInt16)) , testProperty "Int32" (prop_decodeRefdecodeImp (Proxy :: Proxy TokInt32)) , testProperty "Int64" (prop_decodeRefdecodeImp (Proxy :: Proxy TokInt64)) , testProperty "Int" (prop_decodeRefdecodeImp (Proxy :: Proxy TokInt)) , testProperty "Integer" (prop_decodeRefdecodeImp (Proxy :: Proxy TokInteger)) , testProperty "Half" (prop_decodeRefdecodeImp (Proxy :: Proxy TokHalf)) , testProperty "Float" (prop_decodeRefdecodeImp (Proxy :: Proxy TokFloat)) , testProperty "Double" (prop_decodeRefdecodeImp (Proxy :: Proxy TokDouble)) , testProperty "Tag" (prop_decodeRefdecodeImp (Proxy :: Proxy TokTag)) , testProperty "Tag64" (prop_decodeRefdecodeImp (Proxy :: Proxy TokTag64)) , testProperty "Simple" (prop_decodeRefdecodeImp (Proxy :: Proxy Ref.Simple)) , testProperty "Term" (prop_decodeRefdecodeImp (Proxy :: Proxy Ref.Term)) , testProperty "ByteArray" (prop_decodeRefdecodeImp (Proxy :: Proxy TokByteArray)) ] ] cborg-0.2.10.0/tests/Tests/Reference.hs0000644000000000000000000001560607346545000015723 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE NamedFieldPuns #-} module Tests.Reference ( testTree , termToJson , equalJson ) where import Test.Tasty as Tasty import Test.Tasty.HUnit import Test.Tasty.QuickCheck import qualified Data.ByteString as BS import qualified Data.ByteString.Base64.URL as Base64url import qualified Data.Text as T import qualified Data.Text.Encoding as T import qualified Data.Vector as V import Data.Scientific (fromFloatDigits, toRealFloat) import Data.Aeson as Aeson #if MIN_VERSION_aeson(2,0,0) import Data.Aeson.Key as Aeson.Key #endif import Data.Word import qualified Numeric.Half as Half import Tests.Reference.Implementation as CBOR import Tests.Reference.Generators ( HalfSpecials(..), FloatSpecials(..), DoubleSpecials(..) ) import Tests.Reference.TestVectors ------------------------------------------------------------------------------- -- Unit tests for test vector from https://github.com/cbor/test-vectors/ -- unit_externalTestVector :: [ExternalTestCase] -> Assertion unit_externalTestVector = mapM_ unit_externalTestCase unit_externalTestCase :: ExternalTestCase -> Assertion unit_externalTestCase ExternalTestCase { encoded, decoded = Left expectedJson } = do let term = deserialise encoded actualJson = termToJson term reencoded = serialise term expectedJson `equalJson` actualJson encoded @=? reencoded unit_externalTestCase ExternalTestCase { encoded, decoded = Right expectedDiagnostic } = do let term = deserialise encoded actualDiagnostic = diagnosticNotation term reencoded = serialise term expectedDiagnostic @=? actualDiagnostic encoded @=? reencoded equalJson :: Aeson.Value -> Aeson.Value -> Assertion equalJson (Aeson.Number expected) (Aeson.Number actual) | toRealFloat expected == promoteDouble (toRealFloat actual) = return () where -- This is because the expected JSON output is always using double precision -- where as Aeson's Scientific type preserves the precision of the input. -- So for tests using Float, we're more precise than the reference values. promoteDouble :: Float -> Double promoteDouble = realToFrac equalJson expected actual = expected @=? actual #if MIN_VERSION_aeson(2,0,0) stringToJsonKey :: String -> Aeson.Key.Key stringToJsonKey = Aeson.Key.fromString #else stringToJsonKey :: String -> T.Text stringToJsonKey = T.pack #endif termToJson :: CBOR.Term -> Aeson.Value termToJson (TUInt n) = Aeson.Number (fromIntegral (fromUInt n)) termToJson (TNInt n) = Aeson.Number (-1 - fromIntegral (fromUInt n)) termToJson (TBigInt n) = Aeson.Number (fromIntegral n) termToJson (TBytes ws) = Aeson.String (bytesToBase64Text ws) termToJson (TBytess wss) = Aeson.String (bytesToBase64Text (concat wss)) termToJson (TString cs) = Aeson.String (T.pack cs) termToJson (TStrings css) = Aeson.String (T.pack (concat css)) termToJson (TArray ts) = Aeson.Array (V.fromList (map termToJson ts)) termToJson (TArrayI ts) = Aeson.Array (V.fromList (map termToJson ts)) termToJson (TMap kvs) = Aeson.object [ (stringToJsonKey k, termToJson v) | (TString k,v) <- kvs ] termToJson (TMapI kvs) = Aeson.object [ (stringToJsonKey k, termToJson v) | (TString k,v) <- kvs ] termToJson (TTagged _ t) = termToJson t termToJson TTrue = Aeson.Bool True termToJson TFalse = Aeson.Bool False termToJson TNull = Aeson.Null termToJson TUndef = Aeson.Null -- replacement value termToJson (TSimple _) = Aeson.Null -- replacement value termToJson (TFloat16 f) = Aeson.Number (fromFloatDigits (Half.fromHalf (getHalfSpecials f))) termToJson (TFloat32 f) = Aeson.Number (fromFloatDigits (getFloatSpecials f)) termToJson (TFloat64 f) = Aeson.Number (fromFloatDigits (getDoubleSpecials f)) bytesToBase64Text :: [Word8] -> T.Text bytesToBase64Text = T.decodeLatin1 . Base64url.encode . BS.pack ------------------------------------------------------------------------------- -- Unit tests for test vector from CBOR spec RFC7049 Appendix A -- unit_expectedDiagnosticNotation :: RFC7049TestCase -> Assertion unit_expectedDiagnosticNotation RFC7049TestCase { expectedDiagnostic, encodedBytes } = do let Just (term, []) = runDecoder decodeTerm encodedBytes actualDiagnostic = diagnosticNotation term expectedDiagnostic @=? actualDiagnostic -- | The reference implementation satisfies the roundtrip property for most -- examples (all the ones from Appendix A). It does not satisfy the roundtrip -- property in general however, non-canonical over-long int encodings for -- example. -- unit_encodedRoundtrip :: RFC7049TestCase -> Assertion unit_encodedRoundtrip RFC7049TestCase { expectedDiagnostic, encodedBytes } = do let Just (term, []) = runDecoder decodeTerm encodedBytes reencodedBytes = encodeTerm term assertEqual ("for CBOR: " ++ expectedDiagnostic) encodedBytes reencodedBytes -------------------------------------------------------------------------------- -- TestTree API testTree :: TestTree testTree = testGroup "Reference implementation" [ testGroup "internal properties" [ testProperty "Integer to/from bytes" prop_integerToFromBytes , testProperty "Word16 to/from network byte order" prop_word16ToFromNet , testProperty "Word32 to/from network byte order" prop_word32ToFromNet , testProperty "Word64 to/from network byte order" prop_word64ToFromNet , testProperty "Numeric.Half to/from Float" prop_halfToFromFloat ] , testGroup "properties" [ testProperty "encoding/decoding initial byte" prop_InitialByte , testProperty "encoding/decoding additional info" prop_AdditionalInfo , testProperty "encoding/decoding token header" prop_TokenHeader , testProperty "encoding/decoding token header 2" prop_TokenHeader2 , testProperty "encoding/decoding tokens" prop_Token , --localOption (QuickCheckTests 1000) $ localOption (QuickCheckMaxSize 150) $ testProperty "encoding/decoding terms" prop_Term ] , testCase "RFC7049 test vector: decode" $ mapM_ unit_expectedDiagnosticNotation rfc7049TestVector , testCase "RFC7049 test vector: roundtrip" $ mapM_ unit_encodedRoundtrip rfc7049TestVector , withExternalTestVector $ \getTestVector -> testCase "external test vector" $ getTestVector >>= unit_externalTestVector ] cborg-0.2.10.0/tests/Tests/Reference/0000755000000000000000000000000007346545000015357 5ustar0000000000000000cborg-0.2.10.0/tests/Tests/Reference/Generators.hs0000644000000000000000000001767407346545000020043 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE RecordWildCards #-} {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE FlexibleContexts #-} {-# LANGUAGE TypeFamilies #-} {-# LANGUAGE DeriveFunctor #-} {-# LANGUAGE GeneralizedNewtypeDeriving #-} module Tests.Reference.Generators ( -- * Integer with a large range LargeInteger(..) -- * Floats with NaNs , FloatNaN(..) , canonicaliseNaN -- * Floats with special values , HalfSpecials(..) , FloatSpecials(..) , DoubleSpecials(..) -- * Floating types to bit representation conversion , halfToWord , floatToWord , doubleToWord , wordToHalf , wordToFloat , wordToDouble ) where import Data.Word import Numeric (showHex) import Numeric.Half as Half import GHC.Float (float2Double) import Data.Proxy import Foreign import System.IO.Unsafe import System.Random (Random) import Test.QuickCheck #if !MIN_VERSION_base(4,8,0) import Control.Applicative #endif -- | QuickCheck generator for large integers -- newtype LargeInteger = LargeInteger { getLargeInteger :: Integer } deriving (Show, Eq) instance Arbitrary LargeInteger where arbitrary = sized $ \n -> oneof $ take (1 + n `div` 10) [ LargeInteger . fromIntegral <$> (arbitrary :: Gen Int8) , LargeInteger . fromIntegral <$> choose (minBound, maxBound :: Int64) , LargeInteger . bigger . fromIntegral <$> choose (minBound, maxBound :: Int64) ] where bigger n = n * abs n ---------------------------------------- -- Float <-> Integral conversions -- wordToHalf :: Word16 -> Half wordToHalf = Half.Half . fromIntegral wordToFloat :: Word32 -> Float wordToFloat = toFloat wordToDouble :: Word64 -> Double wordToDouble = toFloat toFloat :: (Storable word, Storable float) => word -> float toFloat w = unsafeDupablePerformIO $ alloca $ \buf -> do poke (castPtr buf) w peek buf halfToWord :: Half -> Word16 halfToWord (Half.Half w) = fromIntegral w floatToWord :: Float -> Word32 floatToWord = fromFloat doubleToWord :: Double -> Word64 doubleToWord = fromFloat fromFloat :: (Storable word, Storable float) => float -> word fromFloat float = unsafeDupablePerformIO $ alloca $ \buf -> do poke (castPtr buf) float peek buf --------------------------------------------------- -- Floats with NaNs -- class RealFloat n => FloatNaN n where canonicalNaN :: n canonicaliseNaN :: FloatNaN n => n -> n canonicaliseNaN n | isNaN n = canonicalNaN | otherwise = n instance FloatNaN Half where canonicalNaN = Half 0x7e00 instance FloatNaN Float where canonicalNaN = Half.fromHalf canonicalNaN instance FloatNaN Double where canonicalNaN = float2Double canonicalNaN --------------------------------------------------- -- Generators for float types with special values -- instance Arbitrary Half where arbitrary = getHalfSpecials <$> arbitrary shrink = shrinkRealFrac newtype HalfSpecials = HalfSpecials { getHalfSpecials :: Half } deriving (Ord, Num, Fractional, RealFrac, Real, Floating, RealFloat, FloatNaN) newtype FloatSpecials = FloatSpecials { getFloatSpecials :: Float } deriving (Ord, Num, Fractional, RealFrac, Real, Floating, RealFloat, FloatNaN) newtype DoubleSpecials = DoubleSpecials { getDoubleSpecials :: Double } deriving (Ord, Num, Fractional, RealFrac, Real, Floating, RealFloat, FloatNaN) instance Eq HalfSpecials where HalfSpecials a == HalfSpecials b = halfToWord a == halfToWord b instance Eq FloatSpecials where FloatSpecials a == FloatSpecials b = floatToWord a == floatToWord b instance Eq DoubleSpecials where DoubleSpecials a == DoubleSpecials b = doubleToWord a == doubleToWord b instance Show HalfSpecials where showsPrec p (HalfSpecials n) | isNaN n = showString "NaN{-0x" . showHex (halfToWord n) . showString "-}" | otherwise = showsPrec p n instance Show FloatSpecials where showsPrec p (FloatSpecials n) | isNaN n = showString "NaN{-0x" . showHex (floatToWord n) . showString "-}" | otherwise = showsPrec p n instance Show DoubleSpecials where showsPrec p (DoubleSpecials n) | isNaN n = showString "NaN{-0x" . showHex (doubleToWord n) . showString "-}" | otherwise = showsPrec p n instance Arbitrary HalfSpecials where arbitrary = HalfSpecials <$> frequency [ (2, arbitraryFloating) , (1, arbitraryFloatSpecials) ] shrink (HalfSpecials n) = [ HalfSpecials n' | n' <- shrinkRealFrac n ] instance Arbitrary FloatSpecials where arbitrary = FloatSpecials <$> frequency [ (2, arbitraryFloating) , (1, arbitraryFloatSpecials) ] shrink (FloatSpecials n) = [ FloatSpecials n' | n' <- shrinkRealFrac n ] instance Arbitrary DoubleSpecials where arbitrary = DoubleSpecials <$> frequency [ (2, arbitraryFloating) , (1, arbitraryFloatSpecials) ] shrink (DoubleSpecials n) = [ DoubleSpecials n' | n' <- shrinkRealFrac n ] -- | Generate a float from a uniformly random bit pattern -- arbitraryFloating :: forall n. RealFloatIEEE n => Gen n arbitraryFloating = wordToFloating <$> arbitraryBoundedIntegral -- | Generate float special values, see 'IeeeSpecials', -- -- In particular we generate more than a single NaN bit pattern so that we can -- test non-canonical representations. The other special values have a single -- bit pattern. -- arbitraryFloatSpecials :: forall n. (RealFloatIEEE n, Random (FloatWord n)) => Gen n arbitraryFloatSpecials = frequency [ (1, pure (wordToFloating positiveInfinity)) , (1, pure (wordToFloating negativeInfinity)) , (1, pure (wordToFloating negativeZero)) , (3, wordToFloating <$> choose nanRange) ] where IeeeSpecials {..} = floatIeeeSpecials (Proxy :: Proxy n) -- | Special values for IEEE float types, including negative 0, -- positive and negative infinity and a range of NaN values. -- data IeeeSpecials n = IeeeSpecials { positiveInfinity :: n, negativeInfinity :: n, negativeZero :: n, nanRange :: (n, n) } deriving (Eq, Functor, Show) -- | The 'IeeeSpecials' values for 'RealFloatIEEE' types (i.e. 'Half', 'Float' -- and 'Double'). -- -- To make sense of the bit-twiddling here, see -- -- -- -- -- floatIeeeSpecials :: RealFloatIEEE n => Proxy n -> IeeeSpecials (FloatWord n) floatIeeeSpecials p = IeeeSpecials {..} where positiveInfinity = (setBit 0 (exponentBits p) - 1) `shiftL` significandBits p negativeInfinity = (setBit 0 (exponentBits p +1) - 1) `shiftL` significandBits p negativeZero = setBit 0 (exponentBits p + significandBits p) nanRange = (positiveInfinity, negativeZero - 1) class (RealFloat n, Integral (FloatWord n), Show (FloatWord n), Bounded (FloatWord n), Bits (FloatWord n)) => RealFloatIEEE n where exponentBits :: Proxy n -> Int significandBits :: Proxy n -> Int type FloatWord n :: * wordToFloating :: FloatWord n -> n --floatingToWord :: n -> FloatWord n instance RealFloatIEEE Half where exponentBits _ = 5 significandBits _ = 10 type FloatWord Half = Word16 wordToFloating = wordToHalf --floatingToWord = halfToWord instance RealFloatIEEE Float where exponentBits _ = 8 significandBits _ = 23 type FloatWord Float = Word32 wordToFloating = wordToFloat --floatingToWord = floatToWord instance RealFloatIEEE Double where exponentBits _ = 11 significandBits _ = 52 type FloatWord Double = Word64 wordToFloating = wordToDouble --floatingToWord = doubleToWord cborg-0.2.10.0/tests/Tests/Reference/Implementation.hs0000644000000000000000000011045407346545000020705 0ustar0000000000000000{-# LANGUAGE CPP, BangPatterns #-} ----------------------------------------------------------------------------- -- | -- Module : Codec.CBOR -- Copyright : 2013 Simon Meier , -- 2013-2014 Duncan Coutts, -- License : BSD3-style (see LICENSE.txt) -- -- Maintainer : Duncan Coutts -- Stability : -- Portability : portable -- -- CBOR format support. -- ----------------------------------------------------------------------------- module Tests.Reference.Implementation ( serialise, deserialise, Term(..), Token(..), canonicaliseTerm, isCanonicalTerm, UInt(..), fromUInt, toUInt, canonicaliseUInt, lengthUInt, Simple(..), fromSimple, toSimple, reservedSimple, unassignedSimple, reservedTag, Decoder, runDecoder, testDecode, decodeTerm, decodeTokens, decodeToken, decodeTagged, diagnosticNotation, Encoder, encodeTerm, encodeToken, prop_InitialByte, prop_AdditionalInfo, prop_TokenHeader, prop_TokenHeader2, prop_Token, prop_Term, -- properties of internal helpers prop_integerToFromBytes, prop_word16ToFromNet, prop_word32ToFromNet, prop_word64ToFromNet, prop_halfToFromFloat, ) where import qualified Control.Monad.Fail as Fail import Data.Bits import Data.Word import qualified Numeric.Half as Half import Data.List import Numeric import GHC.Float (float2Double) import qualified Data.ByteString as BS import qualified Data.ByteString.Lazy as LBS import qualified Data.Text as T import qualified Data.Text.Encoding as T import Data.Monoid ((<>)) import Control.Monad (ap) import Test.QuickCheck.Arbitrary import Test.QuickCheck.Gen #if !MIN_VERSION_base(4,8,0) import Data.Monoid (Monoid(..)) import Control.Applicative #endif import Tests.Reference.Generators serialise :: Term -> LBS.ByteString serialise = LBS.pack . encodeTerm deserialise :: LBS.ByteString -> Term deserialise bytes = case runDecoder decodeTerm (LBS.unpack bytes) of Just (term, []) -> term Just _ -> error "ReferenceImpl.deserialise: trailing data" Nothing -> error "ReferenceImpl.deserialise: decoding failed" ------------------------------------------------------------------------ newtype Decoder a = Decoder { runDecoder :: [Word8] -> Maybe (a, [Word8]) } instance Functor Decoder where fmap f a = a >>= return . f instance Applicative Decoder where pure = return (<*>) = ap instance Monad Decoder where return x = Decoder (\ws -> Just (x, ws)) d >>= f = Decoder (\ws -> case runDecoder d ws of Nothing -> Nothing Just (x, ws') -> runDecoder (f x) ws') #if !MIN_VERSION_base(4,13,0) fail = Fail.fail #endif instance Fail.MonadFail Decoder where fail _ = Decoder (\_ -> Nothing) getByte :: Decoder Word8 getByte = Decoder $ \ws -> case ws of w:ws' -> Just (w, ws') _ -> Nothing getBytes :: Integral n => n -> Decoder [Word8] getBytes n = Decoder $ \ws -> case genericSplitAt n ws of (ws', []) | genericLength ws' == n -> Just (ws', []) | otherwise -> Nothing (ws', ws'') -> Just (ws', ws'') eof :: Decoder Bool eof = Decoder $ \ws -> Just (null ws, ws) type Encoder a = a -> [Word8] -- The initial byte of each data item contains both information about -- the major type (the high-order 3 bits, described in Section 2.1) and -- additional information (the low-order 5 bits). data MajorType = MajorType0 | MajorType1 | MajorType2 | MajorType3 | MajorType4 | MajorType5 | MajorType6 | MajorType7 deriving (Show, Eq, Ord, Enum) instance Arbitrary MajorType where arbitrary = elements [MajorType0 .. MajorType7] encodeInitialByte :: MajorType -> Word -> Word8 encodeInitialByte mt ai | ai < 2^(5 :: Int) = fromIntegral (fromIntegral (fromEnum mt) `shiftL` 5 .|. ai) | otherwise = error "encodeInitialByte: invalid additional info value" decodeInitialByte :: Word8 -> (MajorType, Word) decodeInitialByte ib = ( toEnum $ fromIntegral $ ib `shiftR` 5 , fromIntegral $ ib .&. 0x1f) prop_InitialByte :: Bool prop_InitialByte = and [ (uncurry encodeInitialByte . decodeInitialByte) w8 == w8 | w8 <- [minBound..maxBound] ] -- When the value of the -- additional information is less than 24, it is directly used as a -- small unsigned integer. When it is 24 to 27, the additional bytes -- for a variable-length integer immediately follow; the values 24 to 27 -- of the additional information specify that its length is a 1-, 2-, -- 4-, or 8-byte unsigned integer, respectively. Additional information -- value 31 is used for indefinite-length items, described in -- Section 2.2. Additional information values 28 to 30 are reserved for -- future expansion. -- -- In all additional information values, the resulting integer is -- interpreted depending on the major type. It may represent the actual -- data: for example, in integer types, the resulting integer is used -- for the value itself. It may instead supply length information: for -- example, in byte strings it gives the length of the byte string data -- that follows. data UInt = UIntSmall Word | UInt8 Word8 | UInt16 Word16 | UInt32 Word32 | UInt64 Word64 deriving (Eq, Show) data AdditionalInformation = AiValue UInt | AiIndefLen | AiReserved Word deriving (Eq, Show) instance Arbitrary UInt where arbitrary = sized $ \n -> oneof $ take (1 + n `div` 2) [ UIntSmall <$> choose (0, 23) , UInt8 <$> arbitraryBoundedIntegral , UInt16 <$> arbitraryBoundedIntegral , UInt32 <$> arbitraryBoundedIntegral , UInt64 <$> arbitraryBoundedIntegral ] shrink (UIntSmall n) = [ UIntSmall n' | n' <- shrink n ] shrink (UInt8 n) = [ UInt8 n' | n' <- shrink n ] ++ [ UIntSmall (fromIntegral n) | n <= 23 ] shrink (UInt16 n) = [ UInt16 n' | n' <- shrink n ] ++ [ UInt8 (fromIntegral n) | n <= fromIntegral (maxBound :: Word8) ] shrink (UInt32 n) = [ UInt32 n' | n' <- shrink n ] ++ [ UInt16 (fromIntegral n) | n <= fromIntegral (maxBound :: Word16) ] shrink (UInt64 n) = [ UInt64 n' | n' <- shrink n ] ++ [ UInt32 (fromIntegral n) | n <= fromIntegral (maxBound :: Word32) ] instance Arbitrary AdditionalInformation where arbitrary = frequency [ (7, AiValue <$> arbitrary) , (2, pure AiIndefLen) , (1, AiReserved <$> choose (28, 30)) ] decodeAdditionalInfo :: Word -> Decoder AdditionalInformation decodeAdditionalInfo = dec where dec n | n < 24 = return (AiValue (UIntSmall n)) dec 24 = do w <- getByte return (AiValue (UInt8 w)) dec 25 = do [w1,w0] <- getBytes (2 :: Int) let w = word16FromNet w1 w0 return (AiValue (UInt16 w)) dec 26 = do [w3,w2,w1,w0] <- getBytes (4 :: Int) let w = word32FromNet w3 w2 w1 w0 return (AiValue (UInt32 w)) dec 27 = do [w7,w6,w5,w4,w3,w2,w1,w0] <- getBytes (8 :: Int) let w = word64FromNet w7 w6 w5 w4 w3 w2 w1 w0 return (AiValue (UInt64 w)) dec 31 = return AiIndefLen dec n | n < 31 = return (AiReserved n) dec _ = fail "" encodeAdditionalInfo :: AdditionalInformation -> (Word, [Word8]) encodeAdditionalInfo = enc where enc (AiValue (UIntSmall n)) | n < 24 = (n, []) | otherwise = error "invalid UIntSmall value" enc (AiValue (UInt8 w)) = (24, [w]) enc (AiValue (UInt16 w)) = (25, [w1, w0]) where (w1, w0) = word16ToNet w enc (AiValue (UInt32 w)) = (26, [w3, w2, w1, w0]) where (w3, w2, w1, w0) = word32ToNet w enc (AiValue (UInt64 w)) = (27, [w7, w6, w5, w4, w3, w2, w1, w0]) where (w7, w6, w5, w4, w3, w2, w1, w0) = word64ToNet w enc AiIndefLen = (31, []) enc (AiReserved n) | n >= 28 && n < 31 = (n, []) | otherwise = error "invalid AiReserved value" prop_AdditionalInfo :: AdditionalInformation -> Bool prop_AdditionalInfo ai = let (w, ws) = encodeAdditionalInfo ai Just (ai', _) = runDecoder (decodeAdditionalInfo w) ws in ai == ai' data TokenHeader = TokenHeader MajorType AdditionalInformation deriving (Show, Eq) instance Arbitrary TokenHeader where arbitrary = TokenHeader <$> arbitrary <*> arbitrary decodeTokenHeader :: Decoder TokenHeader decodeTokenHeader = do b <- getByte let (mt, ai) = decodeInitialByte b ai' <- decodeAdditionalInfo ai return (TokenHeader mt ai') encodeTokenHeader :: Encoder TokenHeader encodeTokenHeader (TokenHeader mt ai) = let (w, ws) = encodeAdditionalInfo ai in encodeInitialByte mt w : ws prop_TokenHeader :: TokenHeader -> Bool prop_TokenHeader header = let ws = encodeTokenHeader header Just (header', _) = runDecoder decodeTokenHeader ws in header == header' prop_TokenHeader2 :: Bool prop_TokenHeader2 = and [ w8 : extraused == encoded | w8 <- [minBound..maxBound] , let extra = [1..8] Just (header, unused) = runDecoder decodeTokenHeader (w8 : extra) encoded = encodeTokenHeader header extraused = take (8 - length unused) extra ] data Simple = SimpleSmall Word -- 0 .. 23 | SimpleLarge Word8 -- 0 .. 255, but 0..23 are non-canonical -- and 24..31 are reserved deriving (Eq, Show) fromSimple :: Simple -> Word8 fromSimple (SimpleSmall w) = fromIntegral w fromSimple (SimpleLarge w) = w toSimple :: Word8 -> Simple toSimple w | w <= 23 = SimpleSmall (fromIntegral w) | otherwise = SimpleLarge w reservedSimple :: Word8 -> Bool reservedSimple w = w >= 24 && w <= 31 unassignedSimple :: Word8 -> Bool unassignedSimple w = w < 20 || w > 31 instance Arbitrary Simple where arbitrary = oneof [ SimpleSmall <$> choose (0, 23) , SimpleLarge <$> choose (0, 31) , SimpleLarge <$> choose (32, 255) ] shrink (SimpleSmall n) = [ SimpleSmall n' | n' <- shrink n ] shrink (SimpleLarge n) = [ SimpleSmall (fromIntegral n') | n' <- shrink n, n' <= 23 ] ++ [ SimpleLarge n' | n' <- shrink n ] data Token = MT0_UnsignedInt UInt | MT1_NegativeInt UInt | MT2_ByteString UInt [Word8] | MT2_ByteStringIndef | MT3_String UInt [Word8] | MT3_StringIndef | MT4_ArrayLen UInt | MT4_ArrayLenIndef | MT5_MapLen UInt | MT5_MapLenIndef | MT6_Tag UInt | MT7_Simple Simple | MT7_Float16 HalfSpecials | MT7_Float32 FloatSpecials | MT7_Float64 DoubleSpecials | MT7_Break deriving (Show, Eq) instance Arbitrary Token where arbitrary = oneof [ MT0_UnsignedInt <$> arbitrary , MT1_NegativeInt <$> arbitrary , do ws <- arbitrary MT2_ByteString <$> arbitraryLengthUInt ws <*> pure ws , pure MT2_ByteStringIndef , do cs <- arbitrary let ws = encodeUTF8 cs MT3_String <$> arbitraryLengthUInt ws <*> pure ws , pure MT3_StringIndef , MT4_ArrayLen <$> arbitrary , pure MT4_ArrayLenIndef , MT5_MapLen <$> arbitrary , pure MT5_MapLenIndef , MT6_Tag <$> arbitrary , MT7_Simple <$> arbitrary , MT7_Float16 <$> arbitrary , MT7_Float32 <$> arbitrary , MT7_Float64 <$> arbitrary , pure MT7_Break ] where arbitraryLengthUInt xs = let n = length xs in elements $ [ UIntSmall (fromIntegral n) | n < 24 ] ++ [ UInt8 (fromIntegral n) | n < 255 ] ++ [ UInt16 (fromIntegral n) | n < 65536 ] ++ [ UInt32 (fromIntegral n) , UInt64 (fromIntegral n) ] testDecode :: [Word8] -> Term testDecode ws = case runDecoder decodeTerm ws of Just (x, []) -> x _ -> error "testDecode: parse error" decodeTokens :: Decoder [Token] decodeTokens = do done <- eof if done then return [] else do tok <- decodeToken toks <- decodeTokens return (tok:toks) decodeToken :: Decoder Token decodeToken = do header <- decodeTokenHeader extra <- getBytes (tokenExtraLen header) either fail return (packToken header extra) tokenExtraLen :: TokenHeader -> Word64 tokenExtraLen (TokenHeader MajorType2 (AiValue n)) = fromUInt n -- bytestrings tokenExtraLen (TokenHeader MajorType3 (AiValue n)) = fromUInt n -- unicode strings tokenExtraLen _ = 0 packToken :: TokenHeader -> [Word8] -> Either String Token packToken (TokenHeader mt ai) extra = case (mt, ai) of -- Major type 0: an unsigned integer. The 5-bit additional information -- is either the integer itself (for additional information values 0 -- through 23) or the length of additional data. (MajorType0, AiValue n) -> return (MT0_UnsignedInt n) -- Major type 1: a negative integer. The encoding follows the rules -- for unsigned integers (major type 0), except that the value is -- then -1 minus the encoded unsigned integer. (MajorType1, AiValue n) -> return (MT1_NegativeInt n) -- Major type 2: a byte string. The string's length in bytes is -- represented following the rules for positive integers (major type 0). (MajorType2, AiValue n) -> return (MT2_ByteString n extra) (MajorType2, AiIndefLen) -> return MT2_ByteStringIndef -- Major type 3: a text string, specifically a string of Unicode -- characters that is encoded as UTF-8 [RFC3629]. The format of this -- type is identical to that of byte strings (major type 2), that is, -- as with major type 2, the length gives the number of bytes. (MajorType3, AiValue n) -> return (MT3_String n extra) (MajorType3, AiIndefLen) -> return MT3_StringIndef -- Major type 4: an array of data items. The array's length follows the -- rules for byte strings (major type 2), except that the length -- denotes the number of data items, not the length in bytes that the -- array takes up. (MajorType4, AiValue n) -> return (MT4_ArrayLen n) (MajorType4, AiIndefLen) -> return MT4_ArrayLenIndef -- Major type 5: a map of pairs of data items. A map is comprised of -- pairs of data items, each pair consisting of a key that is -- immediately followed by a value. The map's length follows the -- rules for byte strings (major type 2), except that the length -- denotes the number of pairs, not the length in bytes that the map -- takes up. (MajorType5, AiValue n) -> return (MT5_MapLen n) (MajorType5, AiIndefLen) -> return MT5_MapLenIndef -- Major type 6: optional semantic tagging of other major types. -- The initial bytes of the tag follow the rules for positive integers -- (major type 0). (MajorType6, AiValue n) -> return (MT6_Tag n) -- Major type 7 is for two types of data: floating-point numbers and -- "simple values" that do not need any content. Each value of the -- 5-bit additional information in the initial byte has its own separate -- meaning, as defined in Table 1. -- | 0..23 | Simple value (value 0..23) | -- | 24 | Simple value (value 32..255 in following byte) | -- | 25 | IEEE 754 Half-Precision Float (16 bits follow) | -- | 26 | IEEE 754 Single-Precision Float (32 bits follow) | -- | 27 | IEEE 754 Double-Precision Float (64 bits follow) | -- | 28-30 | (Unassigned) | -- | 31 | "break" stop code for indefinite-length items | (MajorType7, AiValue (UIntSmall w)) -> return (MT7_Simple (SimpleSmall w)) (MajorType7, AiValue (UInt8 w)) -> return (MT7_Simple (SimpleLarge w)) (MajorType7, AiValue (UInt16 w)) -> return (MT7_Float16 (HalfSpecials (wordToHalf w))) (MajorType7, AiValue (UInt32 w)) -> return (MT7_Float32 (FloatSpecials (wordToFloat w))) (MajorType7, AiValue (UInt64 w)) -> return (MT7_Float64 (DoubleSpecials (wordToDouble w))) (MajorType7, AiIndefLen) -> return (MT7_Break) _ -> Left "invalid token header" encodeToken :: Encoder Token encodeToken tok = let (header, extra) = unpackToken tok in encodeTokenHeader header ++ extra unpackToken :: Token -> (TokenHeader, [Word8]) unpackToken tok = (\(mt, ai, ws) -> (TokenHeader mt ai, ws)) $ case tok of (MT0_UnsignedInt n) -> (MajorType0, AiValue n, []) (MT1_NegativeInt n) -> (MajorType1, AiValue n, []) (MT2_ByteString n ws) -> (MajorType2, AiValue n, ws) MT2_ByteStringIndef -> (MajorType2, AiIndefLen, []) (MT3_String n ws) -> (MajorType3, AiValue n, ws) MT3_StringIndef -> (MajorType3, AiIndefLen, []) (MT4_ArrayLen n) -> (MajorType4, AiValue n, []) MT4_ArrayLenIndef -> (MajorType4, AiIndefLen, []) (MT5_MapLen n) -> (MajorType5, AiValue n, []) MT5_MapLenIndef -> (MajorType5, AiIndefLen, []) (MT6_Tag n) -> (MajorType6, AiValue n, []) (MT7_Simple (SimpleSmall n)) -> (MajorType7, AiValue (UIntSmall (fromIntegral n)), []) (MT7_Simple (SimpleLarge n)) -> (MajorType7, AiValue (UInt8 n), []) (MT7_Float16 (HalfSpecials f)) -> (MajorType7, AiValue (UInt16 (halfToWord f)), []) (MT7_Float32 (FloatSpecials f)) -> (MajorType7, AiValue (UInt32 (floatToWord f)), []) (MT7_Float64 (DoubleSpecials f))-> (MajorType7, AiValue (UInt64 (doubleToWord f)), []) MT7_Break -> (MajorType7, AiIndefLen, []) fromUInt :: UInt -> Word64 fromUInt (UIntSmall w) = fromIntegral w fromUInt (UInt8 w) = fromIntegral w fromUInt (UInt16 w) = fromIntegral w fromUInt (UInt32 w) = fromIntegral w fromUInt (UInt64 w) = fromIntegral w toUInt :: Word64 -> UInt toUInt n | n < 24 = UIntSmall (fromIntegral n) | n <= fromIntegral (maxBound :: Word8) = UInt8 (fromIntegral n) | n <= fromIntegral (maxBound :: Word16) = UInt16 (fromIntegral n) | n <= fromIntegral (maxBound :: Word32) = UInt32 (fromIntegral n) | otherwise = UInt64 n lengthUInt :: [a] -> UInt lengthUInt = toUInt . fromIntegral . length decodeUTF8 :: [Word8] -> Either String [Char] decodeUTF8 = either (Left . show) (return . T.unpack) . T.decodeUtf8' . BS.pack encodeUTF8 :: [Char] -> [Word8] encodeUTF8 = BS.unpack . T.encodeUtf8 . T.pack reservedTag :: Word64 -> Bool reservedTag w = w <= 5 prop_Token :: Token -> Bool prop_Token token = let ws = encodeToken token Just (token', []) = runDecoder decodeToken ws in token == token' data Term = TUInt UInt | TNInt UInt | TBigInt Integer | TBytes [Word8] | TBytess [[Word8]] | TString [Char] | TStrings [[Char]] | TArray [Term] | TArrayI [Term] | TMap [(Term, Term)] | TMapI [(Term, Term)] | TTagged UInt Term | TTrue | TFalse | TNull | TUndef | TSimple Simple | TFloat16 HalfSpecials | TFloat32 FloatSpecials | TFloat64 DoubleSpecials deriving (Show, Eq) instance Arbitrary Term where arbitrary = frequency [ (1, TUInt <$> arbitrary) , (1, TNInt <$> arbitrary) , (1, TBigInt . getLargeInteger <$> arbitrary) , (1, TBytes <$> arbitrary) , (1, TBytess <$> arbitrary) , (1, TString <$> arbitrary) , (1, TStrings <$> arbitrary) , (2, TArray <$> listOfSmaller arbitrary) , (2, TArrayI <$> listOfSmaller arbitrary) , (2, TMap <$> listOfSmaller ((,) <$> arbitrary <*> arbitrary)) , (2, TMapI <$> listOfSmaller ((,) <$> arbitrary <*> arbitrary)) , (1, TTagged <$> arbitraryTag <*> sized (\sz -> resize (max 0 (sz-1)) arbitrary)) , (1, pure TFalse) , (1, pure TTrue) , (1, pure TNull) , (1, pure TUndef) , (1, TSimple <$> arbitrary `suchThat` (unassignedSimple . fromSimple)) , (1, TFloat16 <$> arbitrary) , (1, TFloat32 <$> arbitrary) , (1, TFloat64 <$> arbitrary) ] where listOfSmaller :: Gen a -> Gen [a] listOfSmaller gen = sized $ \n -> do k <- choose (0,n) vectorOf k (resize (n `div` (k+1)) gen) arbitraryTag = arbitrary `suchThat` (not . reservedTag . fromUInt) shrink (TUInt n) = [ TUInt n' | n' <- shrink n ] shrink (TNInt n) = [ TNInt n' | n' <- shrink n ] shrink (TBigInt n) = [ TBigInt n' | n' <- shrink n ] shrink (TBytes ws) = [ TBytes ws' | ws' <- shrink ws ] shrink (TBytess wss) = [ TBytess wss' | wss' <- shrink wss ] shrink (TString ws) = [ TString ws' | ws' <- shrink ws ] shrink (TStrings wss) = [ TStrings wss' | wss' <- shrink wss ] shrink (TArray xs@[x]) = x : [ TArray xs' | xs' <- shrink xs ] shrink (TArray xs) = [ TArray xs' | xs' <- shrink xs ] shrink (TArrayI xs@[x]) = x : [ TArrayI xs' | xs' <- shrink xs ] shrink (TArrayI xs) = [ TArrayI xs' | xs' <- shrink xs ] shrink (TMap xys@[(x,y)]) = x : y : [ TMap xys' | xys' <- shrink xys ] shrink (TMap xys) = [ TMap xys' | xys' <- shrink xys ] shrink (TMapI xys@[(x,y)]) = x : y : [ TMapI xys' | xys' <- shrink xys ] shrink (TMapI xys) = [ TMapI xys' | xys' <- shrink xys ] shrink (TTagged w t) = [ TTagged w' t' | (w', t') <- shrink (w, t) , not (reservedTag (fromUInt w')) ] shrink TFalse = [] shrink TTrue = [] shrink TNull = [] shrink TUndef = [] shrink (TSimple n) = [ TSimple n' | n' <- shrink n , unassignedSimple (fromSimple n') ] shrink (TFloat16 f) = [ TFloat16 f' | f' <- shrink f ] shrink (TFloat32 f) = [ TFloat32 f' | f' <- shrink f ] shrink (TFloat64 f) = [ TFloat64 f' | f' <- shrink f ] decodeTerm :: Decoder Term decodeTerm = decodeToken >>= decodeTermFrom decodeTermFrom :: Token -> Decoder Term decodeTermFrom tk = case tk of MT0_UnsignedInt n -> return (TUInt n) MT1_NegativeInt n -> return (TNInt n) MT2_ByteString _ bs -> return (TBytes bs) MT2_ByteStringIndef -> decodeBytess [] MT3_String _ ws -> either fail (return . TString) (decodeUTF8 ws) MT3_StringIndef -> decodeStrings [] MT4_ArrayLen len -> decodeArrayN (fromUInt len) [] MT4_ArrayLenIndef -> decodeArray [] MT5_MapLen len -> decodeMapN (fromUInt len) [] MT5_MapLenIndef -> decodeMap [] MT6_Tag tag -> decodeTagged tag MT7_Simple n | n' == 20 -> return TFalse | n' == 21 -> return TTrue | n' == 22 -> return TNull | n' == 23 -> return TUndef | otherwise -> return (TSimple n) where n' = fromSimple n MT7_Float16 f -> return (TFloat16 f) MT7_Float32 f -> return (TFloat32 f) MT7_Float64 f -> return (TFloat64 f) MT7_Break -> fail "unexpected" decodeBytess :: [[Word8]] -> Decoder Term decodeBytess acc = do tk <- decodeToken case tk of MT7_Break -> return $! TBytess (reverse acc) MT2_ByteString _ bs -> decodeBytess (bs : acc) _ -> fail "unexpected" decodeStrings :: [String] -> Decoder Term decodeStrings acc = do tk <- decodeToken case tk of MT7_Break -> return $! TStrings (reverse acc) MT3_String _ ws -> do cs <- either fail return (decodeUTF8 ws) decodeStrings (cs : acc) _ -> fail "unexpected" decodeArrayN :: Word64 -> [Term] -> Decoder Term decodeArrayN n acc = case n of 0 -> return $! TArray (reverse acc) _ -> do t <- decodeTerm decodeArrayN (n-1) (t : acc) decodeArray :: [Term] -> Decoder Term decodeArray acc = do tk <- decodeToken case tk of MT7_Break -> return $! TArrayI (reverse acc) _ -> do tm <- decodeTermFrom tk decodeArray (tm : acc) decodeMapN :: Word64 -> [(Term, Term)] -> Decoder Term decodeMapN n acc = case n of 0 -> return $! TMap (reverse acc) _ -> do tm <- decodeTerm tm' <- decodeTerm decodeMapN (n-1) ((tm, tm') : acc) decodeMap :: [(Term, Term)] -> Decoder Term decodeMap acc = do tk <- decodeToken case tk of MT7_Break -> return $! TMapI (reverse acc) _ -> do tm <- decodeTermFrom tk tm' <- decodeTerm decodeMap ((tm, tm') : acc) decodeTagged :: UInt -> Decoder Term decodeTagged tag | fromUInt tag == 2 = do MT2_ByteString _ bs <- decodeToken let !n = integerFromBytes bs return (TBigInt n) decodeTagged tag | fromUInt tag == 3 = do MT2_ByteString _ bs <- decodeToken let !n = integerFromBytes bs return (TBigInt (-1 - n)) decodeTagged tag = do tm <- decodeTerm return (TTagged tag tm) integerFromBytes :: [Word8] -> Integer integerFromBytes [] = 0 integerFromBytes (w0:ws0) = go (fromIntegral w0) ws0 where go !acc [] = acc go !acc (w:ws) = go (acc `shiftL` 8 + fromIntegral w) ws integerToBytes :: Integer -> [Word8] integerToBytes n0 | n0 == 0 = [0] | n0 < 0 = reverse (go (-n0)) | otherwise = reverse (go n0) where go n | n == 0 = [] | otherwise = narrow n : go (n `shiftR` 8) narrow :: Integer -> Word8 narrow = fromIntegral prop_integerToFromBytes :: LargeInteger -> Bool prop_integerToFromBytes (LargeInteger n) | n >= 0 = let ws = integerToBytes n n' = integerFromBytes ws in n == n' | otherwise = let ws = integerToBytes n n' = integerFromBytes ws in n == -n' ------------------------------------------------------------------------------- encodeTerm :: Encoder Term encodeTerm (TUInt n) = encodeToken (MT0_UnsignedInt n) encodeTerm (TNInt n) = encodeToken (MT1_NegativeInt n) encodeTerm (TBigInt n) | n >= 0 = encodeToken (MT6_Tag (UIntSmall 2)) <> let ws = integerToBytes n len = lengthUInt ws in encodeToken (MT2_ByteString len ws) | otherwise = encodeToken (MT6_Tag (UIntSmall 3)) <> let ws = integerToBytes (-1 - n) len = lengthUInt ws in encodeToken (MT2_ByteString len ws) encodeTerm (TBytes ws) = let len = lengthUInt ws in encodeToken (MT2_ByteString len ws) encodeTerm (TBytess wss) = encodeToken MT2_ByteStringIndef <> mconcat [ encodeToken (MT2_ByteString len ws) | ws <- wss , let len = lengthUInt ws ] <> encodeToken MT7_Break encodeTerm (TString cs) = let ws = encodeUTF8 cs len = lengthUInt ws in encodeToken (MT3_String len ws) encodeTerm (TStrings css) = encodeToken MT3_StringIndef <> mconcat [ encodeToken (MT3_String len ws) | cs <- css , let ws = encodeUTF8 cs len = lengthUInt ws ] <> encodeToken MT7_Break encodeTerm (TArray ts) = let len = lengthUInt ts in encodeToken (MT4_ArrayLen len) <> mconcat (map encodeTerm ts) encodeTerm (TArrayI ts) = encodeToken MT4_ArrayLenIndef <> mconcat (map encodeTerm ts) <> encodeToken MT7_Break encodeTerm (TMap kvs) = let len = lengthUInt kvs in encodeToken (MT5_MapLen len) <> mconcat [ encodeTerm k <> encodeTerm v | (k,v) <- kvs ] encodeTerm (TMapI kvs) = encodeToken MT5_MapLenIndef <> mconcat [ encodeTerm k <> encodeTerm v | (k,v) <- kvs ] <> encodeToken MT7_Break encodeTerm (TTagged tag t) = encodeToken (MT6_Tag tag) <> encodeTerm t encodeTerm TFalse = encodeToken (MT7_Simple (SimpleSmall 20)) encodeTerm TTrue = encodeToken (MT7_Simple (SimpleSmall 21)) encodeTerm TNull = encodeToken (MT7_Simple (SimpleSmall 22)) encodeTerm TUndef = encodeToken (MT7_Simple (SimpleSmall 23)) encodeTerm (TSimple w) = encodeToken (MT7_Simple w) encodeTerm (TFloat16 f) = encodeToken (MT7_Float16 f) encodeTerm (TFloat32 f) = encodeToken (MT7_Float32 f) encodeTerm (TFloat64 f) = encodeToken (MT7_Float64 f) ------------------------------------------------------------------------------- prop_Term :: Term -> Bool prop_Term term = let ws = encodeTerm term Just (term', []) = runDecoder decodeTerm ws in term == term' isCanonicalTerm :: Term -> Bool isCanonicalTerm t = canonicaliseTerm t == t canonicaliseTerm :: Term -> Term canonicaliseTerm (TUInt n) = TUInt (canonicaliseUInt n) canonicaliseTerm (TNInt n) = TNInt (canonicaliseUInt n) canonicaliseTerm (TBigInt n) | n >= 0 && n <= fromIntegral (maxBound :: Word64) = TUInt (toUInt (fromIntegral n)) | n < 0 && n >= -1 - fromIntegral (maxBound :: Word64) = TNInt (toUInt (fromIntegral (-1 - n))) | otherwise = TBigInt n canonicaliseTerm (TSimple n) = TSimple (canonicaliseSimple n) canonicaliseTerm (TFloat16 f) = canonicaliseFloat TFloat16 f canonicaliseTerm (TFloat32 f) = canonicaliseFloat TFloat32 f canonicaliseTerm (TFloat64 f) = canonicaliseFloat TFloat64 f canonicaliseTerm (TBytess wss) = TBytess (filter (not . null) wss) canonicaliseTerm (TStrings css) = TStrings (filter (not . null) css) canonicaliseTerm (TArray ts) = TArray (map canonicaliseTerm ts) canonicaliseTerm (TArrayI ts) = TArrayI (map canonicaliseTerm ts) canonicaliseTerm (TMap ts) = TMap (map canonicaliseTermPair ts) canonicaliseTerm (TMapI ts) = TMapI (map canonicaliseTermPair ts) canonicaliseTerm (TTagged tag t) = TTagged (canonicaliseUInt tag) (canonicaliseTerm t) canonicaliseTerm t = t canonicaliseUInt :: UInt -> UInt canonicaliseUInt = toUInt . fromUInt canonicaliseSimple :: Simple -> Simple canonicaliseSimple = toSimple . fromSimple canonicaliseFloat :: RealFloat t => (t -> Term) -> t -> Term canonicaliseFloat tfloatNN f | isNaN f = TFloat16 canonicalNaN | otherwise = tfloatNN f canonicaliseTermPair :: (Term, Term) -> (Term, Term) canonicaliseTermPair (x,y) = (canonicaliseTerm x, canonicaliseTerm y) ------------------------------------------------------------------------------- diagnosticNotation :: Term -> String diagnosticNotation = \t -> showsTerm t "" where showsTerm tm = case tm of TUInt n -> shows (fromUInt n) TNInt n -> shows (-1 - fromIntegral (fromUInt n) :: Integer) TBigInt n -> shows n TBytes bs -> showsBytes bs TBytess bss -> surround '(' ')' (underscoreSpace . commaSep showsBytes bss) TString cs -> shows cs TStrings css -> surround '(' ')' (underscoreSpace . commaSep shows css) TArray ts -> surround '[' ']' (commaSep showsTerm ts) TArrayI ts -> surround '[' ']' (underscoreSpace . commaSep showsTerm ts) TMap ts -> surround '{' '}' (commaSep showsMapElem ts) TMapI ts -> surround '{' '}' (underscoreSpace . commaSep showsMapElem ts) TTagged tag t -> shows (fromUInt tag) . surround '(' ')' (showsTerm t) TTrue -> showString "true" TFalse -> showString "false" TNull -> showString "null" TUndef -> showString "undefined" TSimple n -> showString "simple" . surround '(' ')' (shows (fromSimple n)) -- convert to float to work around https://github.com/ekmett/half/issues/2 TFloat16 f -> showFloatCompat (float2Double (Half.fromHalf (getHalfSpecials f))) TFloat32 f -> showFloatCompat (float2Double (getFloatSpecials f)) TFloat64 f -> showFloatCompat (getDoubleSpecials f) surround a b x = showChar a . x . showChar b commaSpace = showChar ',' . showChar ' ' underscoreSpace = showChar '_' . showChar ' ' showsMapElem (k,v) = showsTerm k . showChar ':' . showChar ' ' . showsTerm v catShows :: (a -> ShowS) -> [a] -> ShowS catShows f xs = \s -> foldr (\x r -> f x . r) id xs s sepShows :: ShowS -> (a -> ShowS) -> [a] -> ShowS sepShows sep f xs = foldr (.) id (intersperse sep (map f xs)) commaSep = sepShows commaSpace showsBytes :: [Word8] -> ShowS showsBytes bs = showChar 'h' . showChar '\'' . catShows showFHex bs . showChar '\'' showFHex n | n < 16 = showChar '0' . showHex n | otherwise = showHex n showFloatCompat n | exponent' >= -5 && exponent' <= 15 = showFFloat Nothing n | otherwise = showEFloat Nothing n where exponent' = snd (floatToDigits 10 n) word16FromNet :: Word8 -> Word8 -> Word16 word16FromNet w1 w0 = fromIntegral w1 `shiftL` (8*1) .|. fromIntegral w0 `shiftL` (8*0) word16ToNet :: Word16 -> (Word8, Word8) word16ToNet w = ( fromIntegral ((w `shiftR` (8*1)) .&. 0xff) , fromIntegral ((w `shiftR` (8*0)) .&. 0xff) ) word32FromNet :: Word8 -> Word8 -> Word8 -> Word8 -> Word32 word32FromNet w3 w2 w1 w0 = fromIntegral w3 `shiftL` (8*3) .|. fromIntegral w2 `shiftL` (8*2) .|. fromIntegral w1 `shiftL` (8*1) .|. fromIntegral w0 `shiftL` (8*0) word32ToNet :: Word32 -> (Word8, Word8, Word8, Word8) word32ToNet w = ( fromIntegral ((w `shiftR` (8*3)) .&. 0xff) , fromIntegral ((w `shiftR` (8*2)) .&. 0xff) , fromIntegral ((w `shiftR` (8*1)) .&. 0xff) , fromIntegral ((w `shiftR` (8*0)) .&. 0xff) ) word64FromNet :: Word8 -> Word8 -> Word8 -> Word8 -> Word8 -> Word8 -> Word8 -> Word8 -> Word64 word64FromNet w7 w6 w5 w4 w3 w2 w1 w0 = fromIntegral w7 `shiftL` (8*7) .|. fromIntegral w6 `shiftL` (8*6) .|. fromIntegral w5 `shiftL` (8*5) .|. fromIntegral w4 `shiftL` (8*4) .|. fromIntegral w3 `shiftL` (8*3) .|. fromIntegral w2 `shiftL` (8*2) .|. fromIntegral w1 `shiftL` (8*1) .|. fromIntegral w0 `shiftL` (8*0) word64ToNet :: Word64 -> (Word8, Word8, Word8, Word8, Word8, Word8, Word8, Word8) word64ToNet w = ( fromIntegral ((w `shiftR` (8*7)) .&. 0xff) , fromIntegral ((w `shiftR` (8*6)) .&. 0xff) , fromIntegral ((w `shiftR` (8*5)) .&. 0xff) , fromIntegral ((w `shiftR` (8*4)) .&. 0xff) , fromIntegral ((w `shiftR` (8*3)) .&. 0xff) , fromIntegral ((w `shiftR` (8*2)) .&. 0xff) , fromIntegral ((w `shiftR` (8*1)) .&. 0xff) , fromIntegral ((w `shiftR` (8*0)) .&. 0xff) ) prop_word16ToFromNet :: Word8 -> Word8 -> Bool prop_word16ToFromNet w1 w0 = word16ToNet (word16FromNet w1 w0) == (w1, w0) prop_word32ToFromNet :: Word8 -> Word8 -> Word8 -> Word8 -> Bool prop_word32ToFromNet w3 w2 w1 w0 = word32ToNet (word32FromNet w3 w2 w1 w0) == (w3, w2, w1, w0) prop_word64ToFromNet :: Word8 -> Word8 -> Word8 -> Word8 -> Word8 -> Word8 -> Word8 -> Word8 -> Bool prop_word64ToFromNet w7 w6 w5 w4 w3 w2 w1 w0 = word64ToNet (word64FromNet w7 w6 w5 w4 w3 w2 w1 w0) == (w7, w6, w5, w4, w3, w2, w1, w0) -- Note: some NaNs do not roundtrip https://github.com/ekmett/half/issues/3 -- but all the others had better prop_halfToFromFloat :: Bool prop_halfToFromFloat = all (\w -> roundTrip w || isNaN (Half.Half w)) [minBound..maxBound] where roundTrip w = w == (Half.getHalf . Half.toHalf . Half.fromHalf . Half.Half $ w) cborg-0.2.10.0/tests/Tests/Reference/TestVectors.hs0000644000000000000000000002001007346545000020171 0ustar0000000000000000{-# LANGUAGE NamedFieldPuns #-} {-# LANGUAGE OverloadedStrings #-} module Tests.Reference.TestVectors ( -- * RFC7049 test vector RFC7049TestCase(..) , rfc7049TestVector -- * External test vector , ExternalTestCase(..) , loadExternalTestVector , withExternalTestVector ) where import Test.Tasty as Tasty (TestTree, withResource) import qualified Data.ByteString.Lazy as LBS import qualified Data.ByteString.Base64 as Base64 import qualified Data.ByteString.Base16 as Base16 import qualified Data.Text.Encoding as T import Data.Aeson as Aeson import Control.Applicative import Control.Monad import Data.Word ------------------------------------------------------------ -- Test vector from https://github.com/cbor/test-vectors/ -- -- | A CBOR encoding unit test case for the data from -- -- data ExternalTestCase = ExternalTestCase { encoded :: !LBS.ByteString, decoded :: !(Either Aeson.Value String), roundTrip :: !Bool } deriving Show instance FromJSON ExternalTestCase where parseJSON = withObject "cbor test" $ \obj -> do encoded64 <- T.encodeUtf8 <$> obj .: "cbor" encoded <- either fail return $ Base64.decode encoded64 encoded16 <- T.encodeUtf8 <$> obj .: "hex" let encoded' = Base16.decodeLenient encoded16 when (encoded /= encoded') $ fail "hex and cbor encoding mismatch in input" roundTrip <- obj .: "roundtrip" decoded <- Left <$> obj .: "decoded" <|> Right <$> obj .: "diagnostic" return $! ExternalTestCase { encoded = LBS.fromStrict encoded, roundTrip, decoded } loadExternalTestVector :: IO [ExternalTestCase] loadExternalTestVector = do content <- LBS.readFile "tests/test-vectors/appendix_a.json" either fail return (Aeson.eitherDecode' content) withExternalTestVector :: (IO [ExternalTestCase] -> TestTree) -> TestTree withExternalTestVector = Tasty.withResource loadExternalTestVector (\_ -> return ()) --------------------------------------------------- -- Test vector from CBOR spec RFC7049 Appendix A -- -- | A CBOR encoding unit test case consisting of the encoded bytes and -- corresponding diagnostic notation. -- data RFC7049TestCase = RFC7049TestCase { expectedDiagnostic :: String, encodedBytes :: [Word8] } deriving Show -- | The examples from the CBOR spec RFC7049 Appendix A. -- rfc7049TestVector :: [RFC7049TestCase] rfc7049TestVector = map (uncurry RFC7049TestCase) [ ("0", [0x00]) , ("1", [0x01]) , ("10", [0x0a]) , ("23", [0x17]) , ("24", [0x18, 0x18]) , ("25", [0x18, 0x19]) , ("100", [0x18, 0x64]) , ("1000", [0x19, 0x03, 0xe8]) , ("1000000", [0x1a, 0x00, 0x0f, 0x42, 0x40]) , ("1000000000000", [0x1b, 0x00, 0x00, 0x00, 0xe8, 0xd4, 0xa5, 0x10, 0x00]) , ("18446744073709551615", [0x1b, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff]) , ("18446744073709551616", [0xc2, 0x49, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]) , ("-18446744073709551616", [0x3b, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff]) , ("-18446744073709551617", [0xc3, 0x49, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]) , ("-1", [0x20]) , ("-10", [0x29]) , ("-100", [0x38, 0x63]) , ("-1000", [0x39, 0x03, 0xe7]) , ("0.0", [0xf9, 0x00, 0x00]) , ("-0.0", [0xf9, 0x80, 0x00]) , ("1.0", [0xf9, 0x3c, 0x00]) , ("1.1", [0xfb, 0x3f, 0xf1, 0x99, 0x99, 0x99, 0x99, 0x99, 0x9a]) , ("1.5", [0xf9, 0x3e, 0x00]) , ("65504.0", [0xf9, 0x7b, 0xff]) , ("100000.0", [0xfa, 0x47, 0xc3, 0x50, 0x00]) , ("3.4028234663852886e38", [0xfa, 0x7f, 0x7f, 0xff, 0xff]) , ("1.0e300", [0xfb, 0x7e, 0x37, 0xe4, 0x3c, 0x88, 0x00, 0x75, 0x9c]) , ("5.960464477539063e-8", [0xf9, 0x00, 0x01]) , ("0.00006103515625", [0xf9, 0x04, 0x00]) , ("-4.0", [0xf9, 0xc4, 0x00]) , ("-4.1", [0xfb, 0xc0, 0x10, 0x66, 0x66, 0x66, 0x66, 0x66, 0x66]) , ("Infinity", [0xf9, 0x7c, 0x00]) , ("NaN", [0xf9, 0x7e, 0x00]) , ("-Infinity", [0xf9, 0xfc, 0x00]) , ("Infinity", [0xfa, 0x7f, 0x80, 0x00, 0x00]) , ("-Infinity", [0xfa, 0xff, 0x80, 0x00, 0x00]) , ("Infinity", [0xfb, 0x7f, 0xf0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]) , ("-Infinity", [0xfb, 0xff, 0xf0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]) , ("false", [0xf4]) , ("true", [0xf5]) , ("null", [0xf6]) , ("undefined", [0xf7]) , ("simple(16)", [0xf0]) , ("simple(24)", [0xf8, 0x18]) , ("simple(255)", [0xf8, 0xff]) , ("0(\"2013-03-21T20:04:00Z\")", [0xc0, 0x74, 0x32, 0x30, 0x31, 0x33, 0x2d, 0x30, 0x33, 0x2d, 0x32, 0x31, 0x54, 0x32, 0x30, 0x3a, 0x30, 0x34, 0x3a, 0x30, 0x30, 0x5a]) , ("1(1363896240)", [0xc1, 0x1a, 0x51, 0x4b, 0x67, 0xb0]) , ("1(1363896240.5)", [0xc1, 0xfb, 0x41, 0xd4, 0x52, 0xd9, 0xec, 0x20, 0x00, 0x00]) , ("23(h'01020304')", [0xd7, 0x44, 0x01, 0x02, 0x03, 0x04]) , ("24(h'6449455446')", [0xd8, 0x18, 0x45, 0x64, 0x49, 0x45, 0x54, 0x46]) , ("32(\"http://www.example.com\")", [0xd8, 0x20, 0x76, 0x68, 0x74, 0x74, 0x70, 0x3a, 0x2f, 0x2f, 0x77, 0x77, 0x77, 0x2e, 0x65, 0x78, 0x61, 0x6d, 0x70, 0x6c, 0x65, 0x2e, 0x63, 0x6f, 0x6d]) , ("h''", [0x40]) , ("h'01020304'", [0x44, 0x01, 0x02, 0x03, 0x04]) , ("\"\"", [0x60]) , ("\"a\"", [0x61, 0x61]) , ("\"IETF\"", [0x64, 0x49, 0x45, 0x54, 0x46]) , ("\"\\\"\\\\\"", [0x62, 0x22, 0x5c]) , ("\"\\252\"", [0x62, 0xc3, 0xbc]) , ("\"\\27700\"", [0x63, 0xe6, 0xb0, 0xb4]) , ("\"\\65873\"", [0x64, 0xf0, 0x90, 0x85, 0x91]) , ("[]", [0x80]) , ("[1, 2, 3]", [0x83, 0x01, 0x02, 0x03]) , ("[1, [2, 3], [4, 5]]", [0x83, 0x01, 0x82, 0x02, 0x03, 0x82, 0x04, 0x05]) , ("[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]", [0x98, 0x19, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x18, 0x18, 0x19]) , ("{}", [0xa0]) , ("{1: 2, 3: 4}", [0xa2, 0x01, 0x02, 0x03, 0x04]) , ("{\"a\": 1, \"b\": [2, 3]}", [0xa2, 0x61, 0x61, 0x01, 0x61, 0x62, 0x82, 0x02, 0x03]) , ("[\"a\", {\"b\": \"c\"}]", [0x82, 0x61, 0x61, 0xa1, 0x61, 0x62, 0x61, 0x63]) , ("{\"a\": \"A\", \"b\": \"B\", \"c\": \"C\", \"d\": \"D\", \"e\": \"E\"}", [0xa5, 0x61, 0x61, 0x61, 0x41, 0x61, 0x62, 0x61, 0x42, 0x61, 0x63, 0x61, 0x43, 0x61, 0x64, 0x61, 0x44, 0x61, 0x65, 0x61, 0x45]) , ("(_ h'0102', h'030405')", [0x5f, 0x42, 0x01, 0x02, 0x43, 0x03, 0x04, 0x05, 0xff]) , ("(_ \"strea\", \"ming\")", [0x7f, 0x65, 0x73, 0x74, 0x72, 0x65, 0x61, 0x64, 0x6d, 0x69, 0x6e, 0x67, 0xff]) , ("[_ ]", [0x9f, 0xff]) , ("[_ 1, [2, 3], [_ 4, 5]]", [0x9f, 0x01, 0x82, 0x02, 0x03, 0x9f, 0x04, 0x05, 0xff, 0xff]) , ("[_ 1, [2, 3], [4, 5]]", [0x9f, 0x01, 0x82, 0x02, 0x03, 0x82, 0x04, 0x05, 0xff]) , ("[1, [2, 3], [_ 4, 5]]", [0x83, 0x01, 0x82, 0x02, 0x03, 0x9f, 0x04, 0x05, 0xff]) , ("[1, [_ 2, 3], [4, 5]]", [0x83, 0x01, 0x9f, 0x02, 0x03, 0xff, 0x82, 0x04, 0x05]) , ("[_ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]", [0x9f, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x18, 0x18, 0x19, 0xff]) , ("{_ \"a\": 1, \"b\": [_ 2, 3]}", [0xbf, 0x61, 0x61, 0x01, 0x61, 0x62, 0x9f, 0x02, 0x03, 0xff, 0xff]) , ("[\"a\", {_ \"b\": \"c\"}]", [0x82, 0x61, 0x61, 0xbf, 0x61, 0x62, 0x61, 0x63, 0xff]) , ("{_ \"Fun\": true, \"Amt\": -2}", [0xbf, 0x63, 0x46, 0x75, 0x6e, 0xf5, 0x63, 0x41, 0x6d, 0x74, 0x21, 0xff]) ] -- TODO FIXME: test redundant encodings e.g. -- bigint with zero-length bytestring -- bigint with leading zeros -- bigint using indefinate bytestring encoding -- larger than necessary ints, lengths, tags, simple etc cborg-0.2.10.0/tests/Tests/Regress.hs0000644000000000000000000000074307346545000015433 0ustar0000000000000000module Tests.Regress ( testTree -- :: TestTree ) where import Test.Tasty import qualified Tests.Regress.Issue160 as Issue160 import qualified Tests.Regress.Issue162 as Issue162 import qualified Tests.Regress.FlatTerm as FlatTerm -------------------------------------------------------------------------------- -- Tests and properties testTree :: TestTree testTree = testGroup "Regression tests" [ FlatTerm.testTree , Issue160.testTree , Issue162.testTree ] cborg-0.2.10.0/tests/Tests/Regress/0000755000000000000000000000000007346545000015073 5ustar0000000000000000cborg-0.2.10.0/tests/Tests/Regress/FlatTerm.hs0000644000000000000000000000313007346545000017142 0ustar0000000000000000{-# LANGUAGE CPP #-} module Tests.Regress.FlatTerm ( testTree -- :: TestTree ) where import Data.Int #if !MIN_VERSION_base(4,8,0) import Data.Word #endif import Test.Tasty import Test.Tasty.HUnit import Codec.CBOR.Encoding import Codec.CBOR.Decoding import Codec.CBOR.FlatTerm -------------------------------------------------------------------------------- -- Tests and properties -- | Test an edge case in the FlatTerm implementation: when encoding a word -- larger than @'maxBound' :: 'Int'@, we store it as an 'Integer', and -- need to remember to handle this case when we decode. largeWordTest :: Either String Word largeWordTest = fromFlatTerm decodeWord $ toFlatTerm (encodeWord largeWord) largeWord :: Word largeWord = fromIntegral (maxBound :: Int) + 1 -- | Test an edge case in the FlatTerm implementation: when encoding an -- Int64 that is less than @'minBound' :: 'Int'@, make sure we use an -- 'Integer' to store the result, because sticking it into an 'Int' -- will result in overflow otherwise. smallInt64Test :: Either String Int64 smallInt64Test = fromFlatTerm decodeInt64 $ toFlatTerm (encodeInt64 smallInt64) smallInt64 :: Int64 smallInt64 = fromIntegral (minBound :: Int) - 1 -------------------------------------------------------------------------------- -- TestTree API testTree :: TestTree testTree = testGroup "FlatTerm regressions" [ testCase "Decoding of large-ish words" (Right largeWord @=? largeWordTest) , testCase "Encoding of Int64s on 32bit" (Right smallInt64 @=? smallInt64Test) ] cborg-0.2.10.0/tests/Tests/Regress/Issue160.hs0000644000000000000000000000443707346545000016756 0ustar0000000000000000{-# LANGUAGE CPP #-} module Tests.Regress.Issue160 ( testTree ) where import Codec.CBOR.Decoding import Codec.CBOR.Read import Control.DeepSeq #if !MIN_VERSION_base(4,8,0) import Control.Applicative import Data.Monoid (Monoid(..)) #endif import qualified Data.ByteString as BS import qualified Data.ByteString.Lazy as BSL import Data.Text (Text) import Test.Tasty import Test.Tasty.HUnit testTree :: TestTree testTree = testGroup "Issue 160 - decoder checks" [ nonUtf8FailureTest "fast path" (BSL.fromStrict $ BS.pack [0x61, 128]) , nonUtf8FailureTest "slow path" (BSL.fromChunks $ map BS.singleton [0x61, 128]) , testCase "decodeListLen doesn't produce negative lengths using a Word64" $ do let bs = BSL.fromStrict $ BS.pack [0x9b, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff] case deserialiseFromBytes decodeListLen bs of Left err -> deepseq err $ pure () Right (rest, t) -> deepseq rest $ assertBool "Length is not negative" (t >= 0) , testCase "decodeMapLen doesn't produce negative lengths using a Word64" $ do let bs = BSL.fromStrict $ BS.pack [0xbb, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff] case deserialiseFromBytes decodeMapLen bs of Left err -> deepseq err $ pure () Right (rest, t) -> deepseq rest $ assertBool "Length is not negative" (t >= 0) , testCase "decodeBytes doesn't create bytestrings that cause segfaults or worse" $ do let bs = BSL.fromStrict $ BS.pack $ [0x5b, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff] ++ replicate 100 0x00 case deserialiseFromBytes decodeBytes bs of Left err -> deepseq err $ pure () Right (rest, t) -> deepseq rest $ assertBool "Length is not negative" (BS.length t >= 0) ] where nonUtf8FailureTest pathType bs = let title = mconcat ["decodeString fails on non-utf8 bytes instead of crashing (" , pathType , ")" ] in testCase title $ do case deserialiseFromBytes decodeString bs of Left err -> deepseq err $ pure () Right (rest, t) -> deepseq (rest, t :: Text) $ pure () cborg-0.2.10.0/tests/Tests/Regress/Issue162.hs0000644000000000000000000000534707346545000016761 0ustar0000000000000000module Tests.Regress.Issue162 ( testTree ) where import Control.Monad (void) import Control.Applicative ((<$), (<*)) import Data.Word import Data.ByteString.Lazy (ByteString) import qualified Data.ByteString.Lazy as LBS import Codec.CBOR.Decoding import Codec.CBOR.Read import Test.Tasty import Test.Tasty.HUnit -- This example demonstrates a bug in cborg canonical decoding. -- The bytes used here were drawn from a real application. -- Decodes bigBytes x assuming x is a canonical integer encoding. bigBytesDecoder :: Decoder s () bigBytesDecoder = () <$ decodeListLenCanonical <* decodeMapLenCanonical <* decodeListLenCanonical <* decodeListLenCanonical <* decodeWord16Canonical <* decodeWord16Canonical <* decodeWord8Canonical <* decodeMapLenCanonical <* decodeMapLenCanonical <* decodeMapLenCanonical <* decodeListLenCanonical <* decodeWord8Canonical <* decodeListLenCanonical <* decodeListLenCanonical <* decodeMapLenCanonical <* decodeWord8Canonical <* decodeListLenCanonical <* decodeIntegerCanonical <* decodeWord32Canonical <* decodeWord8Canonical <* decodeListLenCanonical <* decodeIntegerCanonical <* decodeWord32Canonical -- Encoding of 592033 :: BigInteger -- 0xc2 means bignum, 0x43 means a 3-byte sequence bigBytes :: [Word8] -> ByteString bigBytes someEncodedInteger = LBS.pack $ [ -- list of length 7 0x87 -- empty map , 0xa0 -- Just , 0x81 -- list of length 3, all items are 0 , 0x83 , 0x00, 0x00, 0x00 -- empty maps , 0xa0 , 0xa0 , 0xa0 -- singleton list (encoded Just) , 0x81 , 0x00 , 0x80 -- singleton list (encoded Just) , 0x81 , 0xa2 -- key 0 , 0x00 -- value 0: a pair of numbers. , 0x82 , 0x1a, 0x00, 0x04, 0xec, 0xf9 , 0x1a, 0x1a, 0xeb, 0x97, 0x7a -- key 1 , 0x01 -- value 1: a pair of numbers. , 0x82 ] ++ someEncodedInteger ++ [ 0x1a, 0x05, 0xee, 0x4d, 0x20 ] nonCanonicalInteger :: [Word8] nonCanonicalInteger = [0xc2, 0x43, 0x09, 0x08, 0xa1] shouldFailSimple :: Either DeserialiseFailure (LBS.ByteString, ()) shouldFailSimple = deserialiseFromBytes (void decodeIntegerCanonical) (LBS.pack nonCanonicalInteger) shouldFailComposite :: Either DeserialiseFailure (LBS.ByteString, ()) shouldFailComposite = deserialiseFromBytes bigBytesDecoder (bigBytes nonCanonicalInteger) testTree :: TestTree testTree = testGroup "Issue 162 - canonical decoding" [ testCase "simple" (Left (DeserialiseFailure 0 "non-canonical integer") @=? shouldFailSimple) , testCase "composite" (Left (DeserialiseFailure 34 "non-canonical integer") @=? shouldFailComposite) ] cborg-0.2.10.0/tests/Tests/Term.hs0000644000000000000000000002222007346545000014722 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE NamedFieldPuns #-} {-# LANGUAGE OverloadedStrings #-} {-# OPTIONS_GHC -fno-warn-orphans #-} module Tests.Term ( Term , serialise , deserialise , toRefTerm , fromRefTerm , eqTerm , canonicaliseTerm , prop_fromToRefTerm , prop_toFromRefTerm ) where import qualified Data.ByteString as BS import qualified Data.ByteString.Lazy as LBS import qualified Data.Text as T import qualified Data.Text.Lazy as LT import Data.Word import qualified Numeric.Half as Half import Codec.CBOR.Term import Codec.CBOR.Read import Codec.CBOR.Write import Test.QuickCheck import qualified Tests.Reference.Implementation as Ref import Tests.Reference.Generators ( floatToWord, doubleToWord, canonicalNaN , HalfSpecials(..), FloatSpecials(..), DoubleSpecials(..) ) #if !MIN_VERSION_base(4,8,0) import Control.Applicative #endif import Control.Exception (throw) ------------------------------------------------------------------------------ serialise :: Term -> LBS.ByteString serialise = toLazyByteString . encodeTerm deserialise :: LBS.ByteString -> Term deserialise b = case deserialiseFromBytes decodeTerm b of Left failure -> throw failure Right (trailing, _) | not (LBS.null trailing) -> error "Test.deserialise: trailing data" Right (_, t) -> t ------------------------------------------------------------------------------ toRefTerm :: Term -> Ref.Term toRefTerm (TInt n) | n >= 0 = Ref.TUInt (Ref.toUInt (fromIntegral n)) | otherwise = Ref.TNInt (Ref.toUInt (fromIntegral (-1 - n))) toRefTerm (TInteger n) -- = Ref.TBigInt n | n >= 0 && n <= fromIntegral (maxBound :: Word64) = Ref.TUInt (Ref.toUInt (fromIntegral n)) | n < 0 && n >= -1 - fromIntegral (maxBound :: Word64) = Ref.TNInt (Ref.toUInt (fromIntegral (-1 - n))) | otherwise = Ref.TBigInt n toRefTerm (TBytes bs) = Ref.TBytes (BS.unpack bs) toRefTerm (TBytesI bs) = Ref.TBytess (map BS.unpack (LBS.toChunks bs)) toRefTerm (TString st) = Ref.TString (T.unpack st) toRefTerm (TStringI st) = Ref.TStrings (map T.unpack (LT.toChunks st)) toRefTerm (TList ts) = Ref.TArray (map toRefTerm ts) toRefTerm (TListI ts) = Ref.TArrayI (map toRefTerm ts) toRefTerm (TMap ts) = Ref.TMap [ (toRefTerm x, toRefTerm y) | (x,y) <- ts ] toRefTerm (TMapI ts) = Ref.TMapI [ (toRefTerm x, toRefTerm y) | (x,y) <- ts ] toRefTerm (TTagged w t) = Ref.TTagged (Ref.toUInt (fromIntegral w)) (toRefTerm t) toRefTerm (TBool False) = Ref.TFalse toRefTerm (TBool True) = Ref.TTrue toRefTerm TNull = Ref.TNull toRefTerm (TSimple 23) = Ref.TUndef toRefTerm (TSimple w) = Ref.TSimple (Ref.toSimple w) toRefTerm (THalf f) = if isNaN f then Ref.TFloat16 canonicalNaN else Ref.TFloat16 (HalfSpecials (Half.toHalf f)) toRefTerm (TFloat f) = if isNaN f then Ref.TFloat16 canonicalNaN else Ref.TFloat32 (FloatSpecials f) toRefTerm (TDouble f) = if isNaN f then Ref.TFloat16 canonicalNaN else Ref.TFloat64 (DoubleSpecials f) fromRefTerm :: Ref.Term -> Term fromRefTerm (Ref.TUInt u) | n <= fromIntegral (maxBound :: Int) = TInt (fromIntegral n) | otherwise = TInteger (fromIntegral n) where n = Ref.fromUInt u fromRefTerm (Ref.TNInt u) | n <= fromIntegral (maxBound :: Int) = TInt (-1 - fromIntegral n) | otherwise = TInteger (-1 - fromIntegral n) where n = Ref.fromUInt u fromRefTerm (Ref.TBigInt n) = TInteger n fromRefTerm (Ref.TBytes bs) = TBytes (BS.pack bs) fromRefTerm (Ref.TBytess bs) = TBytesI (LBS.fromChunks (map BS.pack bs)) fromRefTerm (Ref.TString st) = TString (T.pack st) fromRefTerm (Ref.TStrings st) = TStringI (LT.fromChunks (map T.pack st)) fromRefTerm (Ref.TArray ts) = TList (map fromRefTerm ts) fromRefTerm (Ref.TArrayI ts) = TListI (map fromRefTerm ts) fromRefTerm (Ref.TMap ts) = TMap [ (fromRefTerm x, fromRefTerm y) | (x,y) <- ts ] fromRefTerm (Ref.TMapI ts) = TMapI [ (fromRefTerm x, fromRefTerm y) | (x,y) <- ts ] fromRefTerm (Ref.TTagged w t) = TTagged (Ref.fromUInt w) (fromRefTerm t) fromRefTerm (Ref.TFalse) = TBool False fromRefTerm (Ref.TTrue) = TBool True fromRefTerm Ref.TNull = TNull fromRefTerm Ref.TUndef = TSimple 23 fromRefTerm (Ref.TSimple w) = TSimple (Ref.fromSimple w) fromRefTerm (Ref.TFloat16 f) = THalf (Half.fromHalf (getHalfSpecials f)) fromRefTerm (Ref.TFloat32 f) = TFloat (getFloatSpecials f) fromRefTerm (Ref.TFloat64 f) = TDouble (getDoubleSpecials f) -- | Compare terms for equality. -- -- It does exact bit for bit equality of floats. This means we can compare -- NaNs, and different NaNs do not compare equal. If you need equality -- modulo different NaNs then use 'canonicaliseTerm'. -- -- If you need equality modulo different representations of 'TInt' vs 'TInteger' -- then use 'canonicaliseTerm'. -- eqTerm :: Term -> Term -> Bool eqTerm (TList ts) (TList ts') = and (zipWith eqTerm ts ts') eqTerm (TListI ts) (TListI ts') = and (zipWith eqTerm ts ts') eqTerm (TMap ts) (TMap ts') = and (zipWith eqTermPair ts ts') eqTerm (TMapI ts) (TMapI ts') = and (zipWith eqTermPair ts ts') eqTerm (TTagged w t) (TTagged w' t') = w == w' && eqTerm t t' eqTerm (THalf f) (THalf f') = floatToWord f == floatToWord f' eqTerm (TFloat f) (TFloat f') = floatToWord f == floatToWord f' eqTerm (TDouble f) (TDouble f') = doubleToWord f == doubleToWord f' eqTerm a b = a == b eqTermPair :: (Term, Term) -> (Term, Term) -> Bool eqTermPair (a,b) (a',b') = eqTerm a a' && eqTerm b b' -- | Both 'toRefTerm' and the encoding \/ decoding round trip canonicalises -- NaNs. So tests involving these often need this in combination with -- comparing for exact equality using 'eqTerm'. -- canonicaliseTerm :: Term -> Term canonicaliseTerm (THalf f) | isNaN f = canonicalTermNaN canonicaliseTerm (TFloat f) | isNaN f = canonicalTermNaN canonicaliseTerm (TDouble f) | isNaN f = canonicalTermNaN canonicaliseTerm (TInteger n) | n <= fromIntegral (maxBound :: Int) , n >= fromIntegral (minBound :: Int) = TInt (fromIntegral n) canonicaliseTerm (TList ts) = TList (map canonicaliseTerm ts) canonicaliseTerm (TListI ts) = TListI (map canonicaliseTerm ts) canonicaliseTerm (TMap ts) = TMap (map canonicaliseTermPair ts) canonicaliseTerm (TMapI ts) = TMapI (map canonicaliseTermPair ts) canonicaliseTerm (TTagged tag t) = TTagged tag (canonicaliseTerm t) canonicaliseTerm t = t canonicalTermNaN :: Term canonicalTermNaN = THalf canonicalNaN canonicaliseTermPair :: (Term, Term) -> (Term, Term) canonicaliseTermPair (a,b) = (canonicaliseTerm a, canonicaliseTerm b) prop_fromToRefTerm :: Ref.Term -> Bool prop_fromToRefTerm term = toRefTerm (fromRefTerm term) == Ref.canonicaliseTerm term prop_toFromRefTerm :: Term -> Bool prop_toFromRefTerm term = fromRefTerm (toRefTerm term) `eqTerm` canonicaliseTerm term instance Arbitrary Term where arbitrary = fromRefTerm <$> arbitrary shrink (TInt n) = [ TInt n' | n' <- shrink n ] shrink (TInteger n) = [ TInteger n' | n' <- shrink n ] shrink (TBytes ws) = [ TBytes (BS.pack ws') | ws' <- shrink (BS.unpack ws) ] shrink (TBytesI wss) = [ TBytesI (LBS.fromChunks (map BS.pack wss')) | wss' <- shrink (map BS.unpack (LBS.toChunks wss)) ] shrink (TString cs) = [ TString (T.pack cs') | cs' <- shrink (T.unpack cs) ] shrink (TStringI css) = [ TStringI (LT.fromChunks (map T.pack css')) | css' <- shrink (map T.unpack (LT.toChunks css)) ] shrink (TList xs@[x]) = x : [ TList xs' | xs' <- shrink xs ] shrink (TList xs) = [ TList xs' | xs' <- shrink xs ] shrink (TListI xs@[x]) = x : [ TListI xs' | xs' <- shrink xs ] shrink (TListI xs) = [ TListI xs' | xs' <- shrink xs ] shrink (TMap xys@[(x,y)]) = x : y : [ TMap xys' | xys' <- shrink xys ] shrink (TMap xys) = [ TMap xys' | xys' <- shrink xys ] shrink (TMapI xys@[(x,y)]) = x : y : [ TMapI xys' | xys' <- shrink xys ] shrink (TMapI xys) = [ TMapI xys' | xys' <- shrink xys ] shrink (TTagged w t) = t : [ TTagged w' t' | (w', t') <- shrink (w, t) , not (Ref.reservedTag (fromIntegral w')) ] shrink (TBool _) = [] shrink TNull = [] shrink (TSimple w) = [ TSimple w' | w' <- shrink w , Ref.unassignedSimple w || w == 23 ] shrink (THalf _f) = [] shrink (TFloat f) = [ TFloat f' | f' <- shrink f ] shrink (TDouble f) = [ TDouble f' | f' <- shrink f ] cborg-0.2.10.0/tests/Tests/UTF8.hs0000644000000000000000000000324307346545000014545 0ustar0000000000000000{-# LANGUAGE CPP #-} module Tests.UTF8 ( testTree -- :: TestTree ) where import Control.DeepSeq import qualified Data.ByteString as BS import qualified Data.ByteString.Lazy as BSL #if ! MIN_VERSION_base(4,11,0) import Data.Monoid #endif import qualified Data.Text.Encoding as T import Codec.CBOR.Decoding import Codec.CBOR.Read import Tests.Util import Test.Tasty import Test.Tasty.QuickCheck #if !MIN_VERSION_base(4,8,0) import Control.Applicative #endif -- | Wrapper for ByteString with Arbitrary instance that might produce a valid -- UTF-8 encoding of a string. newtype MaybeText = MaybeText BS.ByteString deriving Show instance Arbitrary MaybeText where arbitrary = MaybeText . BS.pack <$> arbitrary -- | Test that decoding of both valid and invalid CBOR strings produces output -- without exceptions hidden within. utf8DecodingTest :: MaybeText -> Property utf8DecodingTest (MaybeText bs) = case T.decodeUtf8' bs of Right _ -> collect "valid utf8" $ (and splitsOk) Left _ -> collect "invalid utf8" $ not (or splitsOk) where -- We test 2-splits to check all decoder paths. splitsOk = [ok $ deserialiseFromBytes decodeString v | v <- splits2 s] where ok (Right v) = deepseq v True ok (Left v) = deepseq v False s = mkLengthPrefix True (Length . fromIntegral $ BS.length bs) <> BSL.fromStrict bs ---------------------------------------- testTree :: TestTree testTree = localOption (QuickCheckTests 1000) . testGroup "UTF8" $ [testProperty "Decoding of UTF8 encoded Text works and properly handles decoding failures" utf8DecodingTest ] cborg-0.2.10.0/tests/Tests/UnitTests.hs0000644000000000000000000000654307346545000015767 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE NamedFieldPuns #-} {-# OPTIONS_GHC -fno-warn-orphans #-} module Tests.UnitTests (testTree) where import qualified Data.ByteString.Lazy as LBS import Test.Tasty (TestTree, testGroup) import Test.Tasty.HUnit (Assertion, testCase, assertEqual, (@=?)) import qualified Tests.Reference.Implementation as Ref import Tests.Reference.TestVectors import Tests.Reference (termToJson, equalJson) import Tests.Term as Term (toRefTerm, serialise, deserialise) #if !MIN_VERSION_base(4,8,0) import Control.Applicative #endif ------------------------------------------------------------------------------- -- Unit tests for test vector from CBOR spec RFC7049 Appendix A -- unit_externalTestVector :: [ExternalTestCase] -> Assertion unit_externalTestVector = mapM_ unit_externalTestCase unit_externalTestCase :: ExternalTestCase -> Assertion unit_externalTestCase ExternalTestCase { encoded, decoded = Left expectedJson } = do let term = Term.deserialise encoded actualJson = termToJson (toRefTerm term) reencoded = Term.serialise term expectedJson `equalJson` actualJson encoded @=? reencoded unit_externalTestCase ExternalTestCase { encoded, decoded = Right expectedDiagnostic } = do let term = Term.deserialise encoded actualDiagnostic = Ref.diagnosticNotation (toRefTerm term) reencoded = Term.serialise term expectedDiagnostic @=? actualDiagnostic encoded @=? reencoded ------------------------------------------------------------------------------- -- Unit tests for test vector from CBOR spec RFC7049 Appendix A -- unit_expectedDiagnosticNotation :: RFC7049TestCase -> Assertion unit_expectedDiagnosticNotation RFC7049TestCase { expectedDiagnostic, encodedBytes } = do let term = Term.deserialise (LBS.pack encodedBytes) actualDiagnostic = Ref.diagnosticNotation (toRefTerm term) expectedDiagnostic @=? actualDiagnostic -- | The reference implementation satisfies the roundtrip property for most -- examples (all the ones from Appendix A). It does not satisfy the roundtrip -- property in general however, non-canonical over-long int encodings for -- example. -- unit_encodedRoundtrip :: RFC7049TestCase -> Assertion unit_encodedRoundtrip RFC7049TestCase { expectedDiagnostic, encodedBytes } = do let term = Term.deserialise (LBS.pack encodedBytes) reencodedBytes = LBS.unpack (Term.serialise term) assertEqual ("for CBOR: " ++ expectedDiagnostic) encodedBytes reencodedBytes -------------------------------------------------------------------------------- -- TestTree API testTree :: TestTree testTree = testGroup "unit tests" [ testCase "RFC7049 test vector: decode" $ mapM_ unit_expectedDiagnosticNotation rfc7049TestVector , testCase "RFC7049 test vector: roundtrip" $ mapM_ unit_encodedRoundtrip rfc7049TestVector , withExternalTestVector $ \getTestVector -> testCase "external test vector" $ getTestVector >>= unit_externalTestVector ] cborg-0.2.10.0/tests/Tests/Util.hs0000644000000000000000000000524707346545000014742 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE ScopedTypeVariables #-} module Tests.Util ( splits2 , splits3 , arbitraryWithBounds , Length(..) , mkLengthPrefix ) where import Data.Bits import qualified Data.ByteString as BS import qualified Data.ByteString.Lazy as BSL import Data.Word import Test.Tasty.QuickCheck #if !MIN_VERSION_base(4,8,0) import Control.Applicative #endif -- | Generate all 2-splits of a serialised CBOR value. splits2 :: BSL.ByteString -> [BSL.ByteString] splits2 bs = zipWith (\a b -> BSL.fromChunks [a,b]) (BS.inits sbs) (BS.tails sbs) where sbs = BSL.toStrict bs -- | Generate all 3-splits of a serialised CBOR value. splits3 :: BSL.ByteString -> [BSL.ByteString] splits3 bs = [ BSL.fromChunks [a,b,c] | (a,x) <- zip (BS.inits sbs) (BS.tails sbs) , (b,c) <- zip (BS.inits x) (BS.tails x) ] where sbs = BSL.toStrict bs ---------------------------------------- -- | Generate values of type 'a' embedded within (usually larger) type 'r' with -- upped probabilities of getting neighbourhood of bounds of 'a'. arbitraryWithBounds :: forall a r. (Bounded a, Integral a, Num r, Arbitrary r) => a -> Gen r arbitraryWithBounds _ = frequency [ (70, arbitrary) -- Boundaries , (5, pure $ fromIntegral (minBound :: a)) , (5, pure $ fromIntegral (maxBound :: a)) -- Near boundaries, in range , (5, pure $ fromIntegral (minBound + 1 :: a)) , (5, pure $ fromIntegral (maxBound - 1 :: a)) -- Near boundaries, out of range (assuming there is no overflow). It overflows -- if a ~ r, but it's fine as then we just get a value within range. , (5, pure $ fromIntegral (minBound :: a) - 1) , (5, pure $ fromIntegral (maxBound :: a) + 1) ] ---------------------------------------- -- | Wrapper for list/map length. newtype Length = Length { unLength :: Word } instance Show Length where showsPrec p = showsPrec p . unLength instance Arbitrary Length where arbitrary = Length <$> arbitraryWithBounds (undefined::Int) -- | Generate CBOR prefix of non-empty string/bytes containing its length. mkLengthPrefix :: Bool -> Length -> BSL.ByteString mkLengthPrefix string (Length w) | w <= 23 = BSL.pack $ [64 + stringBit + fromIntegral w] | w <= 0xff = BSL.pack $ [88 + stringBit] ++ f 1 w [] | w <= 0xffff = BSL.pack $ [89 + stringBit] ++ f 2 w [] | w <= 0xffffffff = BSL.pack $ [90 + stringBit] ++ f 4 w [] | otherwise = BSL.pack $ [91 + stringBit] ++ f 8 w [] where stringBit :: Word8 stringBit = if string then 32 else 0 f :: Int -> Word -> [Word8] -> [Word8] f 0 _ acc = acc f k n acc = f (k - 1) (n `shiftR` 8) (fromIntegral n : acc) cborg-0.2.10.0/tests/test-vectors/0000755000000000000000000000000007346545000015021 5ustar0000000000000000cborg-0.2.10.0/tests/test-vectors/README.md0000644000000000000000000000206307346545000016301 0ustar0000000000000000test-vectors ============ This repo collects some simple test vectors in machine-processable form. appendix_a.json --------------- All examples in Appendix A of RFC 7049, encoded as a JSON array. Each element of the test vector is a map (JSON object) with the keys: - cbor: a base-64 encoded CBOR data item - hex: the same CBOR data item in hex encoding - roundtrip: a boolean that indicates whether a generic CBOR encoder would _typically_ produce identical CBOR on re-encoding the decoded data item (your mileage may vary) - decoded: the decoded data item if it can be represented in JSON - diagnostic: the representation of the data item in CBOR diagnostic notation, otherwise To make use of the cases that need diagnostic notation, a diagnostic notation printer is usually all that is needed: decode the CBOR, print the decoded data item in diagnostic notation, and compare. (Note that the diagnostic notation uses full decoration for the indefinite length byte string, while the decoded indefinite length text string represented in JSON necessarily doesn't.) cborg-0.2.10.0/tests/test-vectors/appendix_a.json0000644000000000000000000002356307346545000020035 0ustar0000000000000000[ { "cbor": "AA==", "hex": "00", "roundtrip": true, "decoded": 0 }, { "cbor": "AQ==", "hex": "01", "roundtrip": true, "decoded": 1 }, { "cbor": "Cg==", "hex": "0a", "roundtrip": true, "decoded": 10 }, { "cbor": "Fw==", "hex": "17", "roundtrip": true, "decoded": 23 }, { "cbor": "GBg=", "hex": "1818", "roundtrip": true, "decoded": 24 }, { "cbor": "GBk=", "hex": "1819", "roundtrip": true, "decoded": 25 }, { "cbor": "GGQ=", "hex": "1864", "roundtrip": true, "decoded": 100 }, { "cbor": "GQPo", "hex": "1903e8", "roundtrip": true, "decoded": 1000 }, { "cbor": "GgAPQkA=", "hex": "1a000f4240", "roundtrip": true, "decoded": 1000000 }, { "cbor": "GwAAAOjUpRAA", "hex": "1b000000e8d4a51000", "roundtrip": true, "decoded": 1000000000000 }, { "cbor": "G///////////", "hex": "1bffffffffffffffff", "roundtrip": true, "decoded": 18446744073709551615 }, { "cbor": "wkkBAAAAAAAAAAA=", "hex": "c249010000000000000000", "roundtrip": true, "decoded": 18446744073709551616 }, { "cbor": "O///////////", "hex": "3bffffffffffffffff", "roundtrip": true, "decoded": -18446744073709551616 }, { "cbor": "w0kBAAAAAAAAAAA=", "hex": "c349010000000000000000", "roundtrip": true, "decoded": -18446744073709551617 }, { "cbor": "IA==", "hex": "20", "roundtrip": true, "decoded": -1 }, { "cbor": "KQ==", "hex": "29", "roundtrip": true, "decoded": -10 }, { "cbor": "OGM=", "hex": "3863", "roundtrip": true, "decoded": -100 }, { "cbor": "OQPn", "hex": "3903e7", "roundtrip": true, "decoded": -1000 }, { "cbor": "+QAA", "hex": "f90000", "roundtrip": true, "decoded": 0.0 }, { "cbor": "+YAA", "hex": "f98000", "roundtrip": true, "decoded": -0.0 }, { "cbor": "+TwA", "hex": "f93c00", "roundtrip": true, "decoded": 1.0 }, { "cbor": "+z/xmZmZmZma", "hex": "fb3ff199999999999a", "roundtrip": true, "decoded": 1.1 }, { "cbor": "+T4A", "hex": "f93e00", "roundtrip": true, "decoded": 1.5 }, { "cbor": "+Xv/", "hex": "f97bff", "roundtrip": true, "decoded": 65504.0 }, { "cbor": "+kfDUAA=", "hex": "fa47c35000", "roundtrip": true, "decoded": 100000.0 }, { "cbor": "+n9///8=", "hex": "fa7f7fffff", "roundtrip": true, "decoded": 3.4028234663852886e+38 }, { "cbor": "+3435DyIAHWc", "hex": "fb7e37e43c8800759c", "roundtrip": true, "decoded": 1.0e+300 }, { "cbor": "+QAB", "hex": "f90001", "roundtrip": true, "decoded": 5.960464477539063e-08 }, { "cbor": "+QQA", "hex": "f90400", "roundtrip": true, "decoded": 6.103515625e-05 }, { "cbor": "+cQA", "hex": "f9c400", "roundtrip": true, "decoded": -4.0 }, { "cbor": "+8AQZmZmZmZm", "hex": "fbc010666666666666", "roundtrip": true, "decoded": -4.1 }, { "cbor": "+XwA", "hex": "f97c00", "roundtrip": true, "diagnostic": "Infinity" }, { "cbor": "+X4A", "hex": "f97e00", "roundtrip": true, "diagnostic": "NaN" }, { "cbor": "+fwA", "hex": "f9fc00", "roundtrip": true, "diagnostic": "-Infinity" }, { "cbor": "+n+AAAA=", "hex": "fa7f800000", "roundtrip": false, "diagnostic": "Infinity" }, { "cbor": "+v+AAAA=", "hex": "faff800000", "roundtrip": false, "diagnostic": "-Infinity" }, { "cbor": "+3/wAAAAAAAA", "hex": "fb7ff0000000000000", "roundtrip": false, "diagnostic": "Infinity" }, { "cbor": "+//wAAAAAAAA", "hex": "fbfff0000000000000", "roundtrip": false, "diagnostic": "-Infinity" }, { "cbor": "9A==", "hex": "f4", "roundtrip": true, "decoded": false }, { "cbor": "9Q==", "hex": "f5", "roundtrip": true, "decoded": true }, { "cbor": "9g==", "hex": "f6", "roundtrip": true, "decoded": null }, { "cbor": "9w==", "hex": "f7", "roundtrip": true, "diagnostic": "undefined" }, { "cbor": "8A==", "hex": "f0", "roundtrip": true, "diagnostic": "simple(16)" }, { "cbor": "+Bg=", "hex": "f818", "roundtrip": true, "diagnostic": "simple(24)" }, { "cbor": "+P8=", "hex": "f8ff", "roundtrip": true, "diagnostic": "simple(255)" }, { "cbor": "wHQyMDEzLTAzLTIxVDIwOjA0OjAwWg==", "hex": "c074323031332d30332d32315432303a30343a30305a", "roundtrip": true, "diagnostic": "0(\"2013-03-21T20:04:00Z\")" }, { "cbor": "wRpRS2ew", "hex": "c11a514b67b0", "roundtrip": true, "diagnostic": "1(1363896240)" }, { "cbor": "wftB1FLZ7CAAAA==", "hex": "c1fb41d452d9ec200000", "roundtrip": true, "diagnostic": "1(1363896240.5)" }, { "cbor": "10QBAgME", "hex": "d74401020304", "roundtrip": true, "diagnostic": "23(h'01020304')" }, { "cbor": "2BhFZElFVEY=", "hex": "d818456449455446", "roundtrip": true, "diagnostic": "24(h'6449455446')" }, { "cbor": "2CB2aHR0cDovL3d3dy5leGFtcGxlLmNvbQ==", "hex": "d82076687474703a2f2f7777772e6578616d706c652e636f6d", "roundtrip": true, "diagnostic": "32(\"http://www.example.com\")" }, { "cbor": "QA==", "hex": "40", "roundtrip": true, "diagnostic": "h''" }, { "cbor": "RAECAwQ=", "hex": "4401020304", "roundtrip": true, "diagnostic": "h'01020304'" }, { "cbor": "YA==", "hex": "60", "roundtrip": true, "decoded": "" }, { "cbor": "YWE=", "hex": "6161", "roundtrip": true, "decoded": "a" }, { "cbor": "ZElFVEY=", "hex": "6449455446", "roundtrip": true, "decoded": "IETF" }, { "cbor": "YiJc", "hex": "62225c", "roundtrip": true, "decoded": "\"\\" }, { "cbor": "YsO8", "hex": "62c3bc", "roundtrip": true, "decoded": "ü" }, { "cbor": "Y+awtA==", "hex": "63e6b0b4", "roundtrip": true, "decoded": "水" }, { "cbor": "ZPCQhZE=", "hex": "64f0908591", "roundtrip": true, "decoded": "𐅑" }, { "cbor": "gA==", "hex": "80", "roundtrip": true, "decoded": [ ] }, { "cbor": "gwECAw==", "hex": "83010203", "roundtrip": true, "decoded": [ 1, 2, 3 ] }, { "cbor": "gwGCAgOCBAU=", "hex": "8301820203820405", "roundtrip": true, "decoded": [ 1, [ 2, 3 ], [ 4, 5 ] ] }, { "cbor": "mBkBAgMEBQYHCAkKCwwNDg8QERITFBUWFxgYGBk=", "hex": "98190102030405060708090a0b0c0d0e0f101112131415161718181819", "roundtrip": true, "decoded": [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 ] }, { "cbor": "oA==", "hex": "a0", "roundtrip": true, "decoded": { } }, { "cbor": "ogECAwQ=", "hex": "a201020304", "roundtrip": true, "diagnostic": "{1: 2, 3: 4}" }, { "cbor": "omFhAWFiggID", "hex": "a26161016162820203", "roundtrip": true, "decoded": { "a": 1, "b": [ 2, 3 ] } }, { "cbor": "gmFhoWFiYWM=", "hex": "826161a161626163", "roundtrip": true, "decoded": [ "a", { "b": "c" } ] }, { "cbor": "pWFhYUFhYmFCYWNhQ2FkYURhZWFF", "hex": "a56161614161626142616361436164614461656145", "roundtrip": true, "decoded": { "a": "A", "b": "B", "c": "C", "d": "D", "e": "E" } }, { "cbor": "X0IBAkMDBAX/", "hex": "5f42010243030405ff", "roundtrip": false, "diagnostic": "(_ h'0102', h'030405')" }, { "cbor": "f2VzdHJlYWRtaW5n/w==", "hex": "7f657374726561646d696e67ff", "roundtrip": false, "decoded": "streaming" }, { "cbor": "n/8=", "hex": "9fff", "roundtrip": false, "decoded": [ ] }, { "cbor": "nwGCAgOfBAX//w==", "hex": "9f018202039f0405ffff", "roundtrip": false, "decoded": [ 1, [ 2, 3 ], [ 4, 5 ] ] }, { "cbor": "nwGCAgOCBAX/", "hex": "9f01820203820405ff", "roundtrip": false, "decoded": [ 1, [ 2, 3 ], [ 4, 5 ] ] }, { "cbor": "gwGCAgOfBAX/", "hex": "83018202039f0405ff", "roundtrip": false, "decoded": [ 1, [ 2, 3 ], [ 4, 5 ] ] }, { "cbor": "gwGfAgP/ggQF", "hex": "83019f0203ff820405", "roundtrip": false, "decoded": [ 1, [ 2, 3 ], [ 4, 5 ] ] }, { "cbor": "nwECAwQFBgcICQoLDA0ODxAREhMUFRYXGBgYGf8=", "hex": "9f0102030405060708090a0b0c0d0e0f101112131415161718181819ff", "roundtrip": false, "decoded": [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 ] }, { "cbor": "v2FhAWFinwID//8=", "hex": "bf61610161629f0203ffff", "roundtrip": false, "decoded": { "a": 1, "b": [ 2, 3 ] } }, { "cbor": "gmFhv2FiYWP/", "hex": "826161bf61626163ff", "roundtrip": false, "decoded": [ "a", { "b": "c" } ] }, { "cbor": "v2NGdW71Y0FtdCH/", "hex": "bf6346756ef563416d7421ff", "roundtrip": false, "decoded": { "Fun": true, "Amt": -2 } } ]