bytestring-lexing-0.5.0.2/0000755000000000000000000000000012534743277013536 5ustar0000000000000000bytestring-lexing-0.5.0.2/AUTHORS0000644000000000000000000000151312534743277014606 0ustar0000000000000000=== Haskell bytestring-lexing package AUTHORS/THANKS file === The bytestring-lexing package was originally written by Don Stewart and released under the terms in the LICENSE file. In January 2012 maintainership was taken over by wren gayle romano. I would also like to give thanks to the following contributers: Bryan O'Sullivan --- For adding support for parsing Doubles from lazy bytestrings back during Don's maintainership. Also for inspiring the improved (v0.4.2) packDecimal implementation. Erik de Castro Lopo, Vincent Hanquez, and Christoph Breitkopf --- for excessive tweaking and benchmarking of the readDecimal function. Hirotomo Moriwaki --- for highlighting the inefficiency of the old Alex-based parser by publishing bytestring-read. And for the idea behind the new (v0.5.0) limited-precision parsers.bytestring-lexing-0.5.0.2/bytestring-lexing.cabal0000644000000000000000000000347612534743277020212 0ustar0000000000000000---------------------------------------------------------------- -- wren gayle romano ~ 2015.06.05 ---------------------------------------------------------------- -- By and large Cabal >=1.2 is fine; but >= 1.6 gives tested-with: -- and source-repository:. Cabal-Version: >= 1.6 Build-Type: Simple Name: bytestring-lexing Version: 0.5.0.2 Stability: provisional Homepage: http://code.haskell.org/~wren/ Author: wren gayle romano, Don Stewart Maintainer: wren@community.haskell.org Copyright: Copyright (c) 2012--2015 wren gayle romano, 2008--2011 Don Stewart License: BSD2 License-File: LICENSE Category: Data Synopsis: Parse and produce literals efficiently from strict or lazy bytestrings. Description: Parse and produce literals efficiently from strict or lazy bytestrings. . Some benchmarks for this package can be found at: -- Formerly tested with GHCs 6.8.2, 6.10.1, 6.12.1, 7.0.3, 7.6.1, 7.8.0; but those are no longer verified. Tested-With: GHC ==7.8.3, GHC == 7.10.1 Extra-source-files: AUTHORS, README, CHANGELOG Source-Repository head Type: darcs Location: http://community.haskell.org/~wren/bytestring-lexing ---------------------------------------------------------------- Library Ghc-Options: -O2 Hs-Source-Dirs: src Exposed-Modules: Data.ByteString.Lex.Integral Data.ByteString.Lex.Fractional Other-Modules: Data.ByteString.Lex.Internal -- Should actually be able to work as far back as base-2.0... Build-Depends: base >= 4 && < 5, bytestring ---------------------------------------------------------------- ----------------------------------------------------------- fin.bytestring-lexing-0.5.0.2/CHANGELOG0000644000000000000000000000236412534743277014755 0ustar00000000000000000.5.0.2 (2015-05-06): - Fixed the benchmarking url 0.5.0.1 (2015-05-06): - Cleaned up the README file 0.5.0 (2015-05-06): - Corrected the License field in the .cabal file to say BSD2 (instead of BSD3) - Data.ByteString.Lex.{Double,.Lazy.Double}: removed - Data.ByteString.Lex.Fractional: added based on the inefficiency of the old Alex-based parsers, as demonstrated by Hirotomo Moriwaki's bytestring-read (v0.3.0). 0.4.3.3 (2015-05-30): - Moved VERSION to CHANGELOG 0.4.3.1 (2014-03-07): - Updated the .cabal file to require newer alex for newer ghc. 0.4.3 (2013-03-21): - Data.ByteString.Lex.Integral: Corrected a segmentation fault in packDecimal. 0.4.2 (2013-03-20): - Data.ByteString.Lex.Integral: Improved packDecimal. 0.4.1 (2012-00-00): - Data.ByteString.Lex.Integral: Added buffer overflow check for asHexadecimal 0.4.0 (2012-02-03): - Data.ByteString.Lex.Integral: added readDecimal_ 0.3.0 (2012-01-28): - Added Data.ByteString.Lex.Integral - Converted repo to Darcs-2 hashed format. - wren ng thornton took over maintainership. 0.2.1 (2010-02-14): 0.2 (2008-10-15): - Add support for lexing lazy bytestrings. 0.1.2 (2008-07-23): 0.1.0.2 (2008-07-23): 0.1.0.1 (2008-07-19): 0.1.0 (2008-07-19): bytestring-lexing-0.5.0.2/LICENSE0000644000000000000000000000244512534743277014550 0ustar0000000000000000Copyright (c) wren gayle romano 2012--2015; Don Stewart 2008, 2010 All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. bytestring-lexing-0.5.0.2/README0000644000000000000000000000225012534743277014415 0ustar0000000000000000bytestring-lexing ================= This is a simple package and should be easy to install. You should be able to use one of the following standard methods to install it. -- With cabal-install and without the source: $> cabal install bytestring-lexing -- With cabal-install and with the source already: $> cd bytestring-lexing $> cabal install -- Without cabal-install, but with the source already: $> cd bytestring-lexing $> runhaskell Setup.hs configure --user $> runhaskell Setup.hs build $> runhaskell Setup.hs test $> runhaskell Setup.hs haddock --hyperlink-source $> runhaskell Setup.hs copy $> runhaskell Setup.hs register The test step is optional and currently does nothing. The Haddock step is also optional. Portability =========== An attempt has been made to keep this library portable. However, the decimalPrecision function in Data.ByteString.Lex.Fractional requires ScopedTypeVariables for efficiency. If your compiler does not support ScopedTypeVariables, this should be easy enough to fix. Contact the maintainer if this is an issue for you. ----------------------------------------------------------- fin.bytestring-lexing-0.5.0.2/Setup.hs0000644000000000000000000000016212534743277015171 0ustar0000000000000000#!/usr/bin/env runhaskell module Main (main) where import Distribution.Simple main :: IO () main = defaultMain bytestring-lexing-0.5.0.2/src/0000755000000000000000000000000012534743277014325 5ustar0000000000000000bytestring-lexing-0.5.0.2/src/Data/0000755000000000000000000000000012534743277015176 5ustar0000000000000000bytestring-lexing-0.5.0.2/src/Data/ByteString/0000755000000000000000000000000012534743277017270 5ustar0000000000000000bytestring-lexing-0.5.0.2/src/Data/ByteString/Lex/0000755000000000000000000000000012534743277020020 5ustar0000000000000000bytestring-lexing-0.5.0.2/src/Data/ByteString/Lex/Fractional.hs0000644000000000000000000004471112534743277022445 0ustar0000000000000000{-# OPTIONS_GHC -Wall -fwarn-tabs #-} {-# LANGUAGE ScopedTypeVariables #-} ---------------------------------------------------------------- -- 2015.06.05 -- | -- Module : Data.ByteString.Lex.Fractional -- Copyright : Copyright (c) 2015 wren gayle romano -- License : BSD2 -- Maintainer : wren@community.haskell.org -- Stability : provisional -- Portability : Haskell98 + ScopedTypeVariables -- -- Functions for parsing and producing 'Fractional' values from\/to -- 'ByteString's based on the \"Char8\" encoding. That is, we assume -- an ASCII-compatible encoding of alphanumeric characters. -- -- /Since: 0.5.0/ ---------------------------------------------------------------- module Data.ByteString.Lex.Fractional ( -- * General combinators readSigned -- packSigned -- * Decimal conversions , readDecimal -- packDecimal -- TODO: asDecimal -- this will be really hard to make efficient... -- * Hexadecimal conversions , readHexadecimal -- packHexadecimal -- asHexadecimal -- * Octal conversions , readOctal -- packOctal -- asOctal -- this will be really hard to make efficient... -- * Exponential conversions , readExponential -- packExponential -- asExponential -- * Precision-limited conversions , decimalPrecision , readDecimalLimited , readExponentialLimited ) where import Data.ByteString (ByteString) import qualified Data.ByteString as BS import qualified Data.ByteString.Unsafe as BSU import Data.Word (Word8) import qualified Data.ByteString.Lex.Integral as I import Data.ByteString.Lex.Integral (readSigned) import Data.ByteString.Lex.Internal (numDecimalDigits) ---------------------------------------------------------------- ---------------------------------------------------------------- -- | A helper function to ensure consistent strictness. -- TODO: should we really be this strict? justPair :: a -> b -> Maybe (a,b) {-# INLINE justPair #-} justPair x y | x `seq` y `seq` False = undefined | otherwise = Just (x,y) pair :: a -> b -> (a,b) {-# INLINE pair #-} pair x y | x `seq` y `seq` False = undefined | otherwise = (x,y) -- NOTE: We use 'fromInteger' everywhere instead of 'fromIntegral' -- in order to fix the types of the calls to 'I.readDecimal', etc. -- This is always correct, but for some result types there are other -- intermediate types which may be faster. {-# INLINE isNotPeriod #-} isNotPeriod :: Word8 -> Bool isNotPeriod w = w /= 0x2E {-# INLINE isNotE #-} isNotE :: Word8 -> Bool isNotE w = w /= 0x65 && w /= 0x45 {-# INLINE isDecimal #-} isDecimal :: Word8 -> Bool isDecimal w = 0x39 >= w && w >= 0x30 {-# INLINE isDecimalZero #-} isDecimalZero :: Word8 -> Bool isDecimalZero w = w == 0x30 ---------------------------------------------------------------- ----- Decimal -- | Read an unsigned\/non-negative fractional value in ASCII decimal -- format; that is, anything matching the regex @\\d+(\\.\\d+)?@. -- Returns @Nothing@ if there is no such number at the beginning -- of the string, otherwise returns @Just@ the number read and the -- remainder of the string. -- -- N.B., see 'readDecimalLimited' if your fractional type has limited -- precision and you expect your inputs to have greater precision -- than can be represented. Even for types with unlimited precision -- (e.g., 'Rational'), you may want to check out 'readDecimalLimited'. readDecimal :: (Fractional a) => ByteString -> Maybe (a, ByteString) {-# SPECIALIZE readDecimal :: ByteString -> Maybe (Float, ByteString), ByteString -> Maybe (Double, ByteString), ByteString -> Maybe (Rational, ByteString) #-} readDecimal xs = case I.readDecimal xs of Nothing -> Nothing Just (whole, ys) -> case BS.uncons ys of Nothing -> justPair (fromInteger whole) BS.empty Just (y0,ys0) | isNotPeriod y0 -> justPair (fromInteger whole) ys | otherwise -> case I.readDecimal ys0 of Nothing -> justPair (fromInteger whole) ys Just (part, zs) -> let base = 10 ^ (BS.length ys - 1 - BS.length zs) frac = fromInteger whole + (fromInteger part / base) in justPair frac zs ---------------------------------------------------------------- -- If and only if(!) we have Real, then we can use 'toRational'... -- Similarly, only if we have RealFloat can we use 'decodeFloat'... -- TODO: -- Convert a non-negative fractional number into an (unsigned) -- ASCII decimal string. Returns @Nothing@ on negative inputs. -- packDecimal :: (Fractional a) => a -> Maybe ByteString ---------------------------------------------------------------- ---------------------------------------------------------------- ----- Hexadecimal -- | Read a non-negative integral value in ASCII hexadecimal format. -- Returns @Nothing@ if there is no integer at the beginning of the -- string, otherwise returns @Just@ the integer read and the remainder -- of the string. -- -- This function does not recognize the various hexadecimal sigils -- like \"0x\", but because there are so many different variants, -- those are best handled by helper functions which then use this -- function for the actual numerical parsing. This function recognizes -- both upper-case, lower-case, and mixed-case hexadecimal. readHexadecimal :: (Fractional a) => ByteString -> Maybe (a, ByteString) {-# SPECIALIZE readHexadecimal :: ByteString -> Maybe (Float, ByteString), ByteString -> Maybe (Double, ByteString), ByteString -> Maybe (Rational, ByteString) #-} readHexadecimal xs = case I.readHexadecimal xs of Nothing -> Nothing Just (n, xs') -> justPair (fromInteger n) xs' -- TODO: -- Convert a non-negative integer into a lower-case ASCII hexadecimal -- string. Returns @Nothing@ on negative inputs. -- packHexadecimal :: (Fractional a) => a -> Maybe ByteString ---------------------------------------------------------------- ---------------------------------------------------------------- ----- Octal -- | Read a non-negative integral value in ASCII octal format. -- Returns @Nothing@ if there is no integer at the beginning of the -- string, otherwise returns @Just@ the integer read and the remainder -- of the string. -- -- This function does not recognize the various octal sigils like -- \"0o\", but because there are different variants, those are best -- handled by helper functions which then use this function for the -- actual numerical parsing. readOctal :: (Fractional a) => ByteString -> Maybe (a, ByteString) {-# SPECIALIZE readOctal :: ByteString -> Maybe (Float, ByteString), ByteString -> Maybe (Double, ByteString), ByteString -> Maybe (Rational, ByteString) #-} readOctal xs = case I.readOctal xs of Nothing -> Nothing Just (n, xs') -> justPair (fromInteger n) xs' -- TODO: -- Convert a non-negative integer into an ASCII octal string. -- Returns @Nothing@ on negative inputs. -- packOctal :: (Fractional a) => a -> Maybe ByteString ---------------------------------------------------------------- ---------------------------------------------------------------- ----- Exponential -- | Read an unsigned\/non-negative fractional value in ASCII -- exponential format; that is, anything matching the regex -- @\\d+(\\.\\d+)?([eE][\\+\\-]?\\d+)?@. Returns @Nothing@ if there -- is no such number at the beginning of the string, otherwise -- returns @Just@ the number read and the remainder of the string. -- -- N.B., the current implementation assumes the exponent is small -- enough to fit into an 'Int'. This gives a significant performance -- increase for @a ~ Float@ and @a ~ Double@ and agrees with the -- 'RealFloat' class which has 'exponent' returning an 'Int'. If -- you need a larger exponent, contact the maintainer. -- -- N.B., see 'readExponentialLimited' if your fractional type has -- limited precision and you expect your inputs to have greater -- precision than can be represented. Even for types with unlimited -- precision, you may want to check out 'readExponentialLimited'. readExponential :: (Fractional a) => ByteString -> Maybe (a, ByteString) {-# SPECIALIZE readExponential :: ByteString -> Maybe (Float, ByteString), ByteString -> Maybe (Double, ByteString), ByteString -> Maybe (Rational, ByteString) #-} readExponential xs = case readDecimal xs of Nothing -> Nothing Just (frac, ys) -> case BS.uncons ys of Nothing -> justPair frac BS.empty Just (y0,ys0) | isNotE y0 -> justPair frac ys | otherwise -> -- HACK: monomorphizing @e::Int@ for performance! case readSigned I.readDecimal ys0 of Nothing -> justPair frac ys Just (ex,zs) -> justPair (frac * (10 ^^ (ex::Int))) zs ---------------------------------------------------------------- ---------------------------------------------------------------- ----- Limited -- | A representation of unsigned fractional numbers decomposed -- into a significand\/mantissa and a decimal exponent. This allows -- efficient scaling by decimal exponents (cf., 'scaleDF'). -- -- TODO: the first component should be some @a@-specific intermediate -- representation, as defined by a fundep or typefamily! We use -- 'Integer' which is sufficient for all cases, but it'd be better -- to use @Word24@ for 'Float', @Word53@ for 'Double', and @a@ for -- @'Ratio' a@. data DecimalFraction a = DF !Integer {-# UNPACK #-}!Int -- BUG: Can't unpack integers... -- | A helpful smart constructor. fractionDF :: Integer -> Int -> Integer -> DecimalFraction a {-# INLINE fractionDF #-} fractionDF whole scale part = DF (whole * (10 ^ scale) + part) (negate scale) -- TODO: use an unsafe variant of (^) which has an assertion instead of a runtime check? -- | Extract the fractional number encoded in the record. -- -- > fromDF (DF frac scale) = fromIntegral frac * (10 ^^ scale) fromDF :: Fractional a => DecimalFraction a -> a {-# INLINE fromDF #-} fromDF (DF frac scale) -- Avoid possibility of returning NaN -- TODO: really, ought to check @fromInteger frac == 0@... | frac == 0 = 0 -- Avoid throwing an error due to @negate minBound == minBound@ | scale == minBound = fromInteger frac * (10 ^^ toInteger scale) -- Now we're safe for the default implementation | otherwise = fromInteger frac * (10 ^^ scale) -- TODO: manually implement (^^) so that we get @_ / (10^ _)@ -- instead of @_ * recip (10^ _)@ for negative exponents? -- | Scale a decimal fraction by some power of 10. scaleDF :: DecimalFraction a -> Int -> DecimalFraction a {-# INLINE scaleDF #-} scaleDF (DF frac scale) scale' = DF frac (scale + scale') -- TODO: is there a way to avoid ScopedTypeVariables without losing -- the fact that this is a constant function? -- -- TODO: try looking at core again to see if @n@ gets completely -- optimized away or not. If not, is there a way to help make that -- happen without using TH? -- -- | Return the 'RealFloat' type's inherent decimal precision -- limitation. This is the number of decimal digits in @floatRadix -- proxy ^ floatDigits proxy@. decimalPrecision :: forall proxy a. RealFloat a => proxy a -> Int {-# INLINE decimalPrecision #-} decimalPrecision = let proxy = undefined :: a n = numDecimalDigits (floatRadix proxy ^ floatDigits proxy) in n `seq` \_ -> n -- TODO: for the isDecimalZero instance, use 'BS.breakByte' where -- possible; or design our own similar... -- -- | Drop while the predicate is true, and return the number of -- bytes dropped. lengthDropWhile :: (Word8 -> Bool) -> ByteString -> (Int, ByteString) {-# INLINE lengthDropWhile #-} lengthDropWhile p xs = let ys = BS.dropWhile p xs in (BS.length xs - BS.length ys, ys) {- -- TODO: benchmark let len = BS.length (BS.takeWhile p xs) in (len, BS.drop len xs) case BS.break (not . p) xs of (ys,zs) -> (BS.length ys, zs) -} -- | A variant of 'readDecimal' which only reads up to some limited -- precision. The first argument gives the number of decimal digits -- at which to limit the precision. -- -- For types with inherently limited precision (e.g., 'Float' and -- 'Double'), when you pass in the precision limit (cf., -- 'decimalPrecision') this is far more efficient than 'readDecimal'. -- However, passing in a precision limit which is greater than the -- type's inherent limitation will degrate performance compared to -- 'readDecimal'. -- -- For types with unlimited precision (e.g., 'Rational') this may -- still be far more efficient than 'readDecimal' (it is for -- 'Rational', in fact). The reason being that it delays the scaling -- the significand\/mantissa by the exponent, thus allowing you to -- further adjust the exponent before computing the final value -- (e.g., as in 'readExponentialLimited'). This avoids the need to -- renormalize intermediate results, and allows faster computation -- of the scaling factor by doing it all at once. readDecimalLimited :: (Fractional a) => Int -> ByteString -> Maybe (a, ByteString) {-# INLINE readDecimalLimited #-} readDecimalLimited p xs = case readDecimalLimited_ p xs of Nothing -> Nothing Just (df,ys) -> justPair (fromDF df) ys readDecimalLimited_ :: (Fractional a) => Int -> ByteString -> Maybe (DecimalFraction a, ByteString) {-# SPECIALIZE readDecimalLimited_ :: Int -> ByteString -> Maybe (DecimalFraction Float, ByteString), Int -> ByteString -> Maybe (DecimalFraction Double, ByteString), Int -> ByteString -> Maybe (DecimalFraction Rational, ByteString) #-} readDecimalLimited_ = start where -- All calls to 'I.readDecimal' are monomorphized at 'Integer', -- as specified by what 'DF' needs. -- TODO: verify this is ~inferred~ strict in both @p@ and @xs@ -- without the guard trick or BangPatterns start p xs | p `seq` xs `seq` False = undefined | otherwise = case lengthDropWhile isDecimalZero xs of (0, _) -> readWholePart p xs (_, ys) -> case BS.uncons ys of Nothing -> justPair (DF 0 0) BS.empty Just (y0,ys0) | isDecimal y0 -> readWholePart p ys | isNotPeriod y0 -> justPair (DF 0 0) ys | otherwise -> case lengthDropWhile isDecimalZero ys0 of (0, _) -> readFractionPart p 0 ys (scale, zs) -> afterDroppingZeroes p scale zs afterDroppingZeroes p scale xs = let ys = BS.take p xs in case I.readDecimal ys of Nothing -> justPair (DF 0 0) xs Just (part, ys') -> let scale' = scale + BS.length xs - BS.length ys' in justPair (DF part (negate scale')) (BS.dropWhile isDecimal ys') readWholePart p xs = let ys = BS.take p xs in case I.readDecimal ys of Nothing -> Nothing Just (whole, ys') | BS.null ys' -> case lengthDropWhile isDecimal (BS.drop p xs) of (scale, zs) -> justPair (DF whole scale) (dropFractionPart zs) | otherwise -> let len = BS.length ys - BS.length ys' -- N.B., @xs' == ys' `BS.append` BS.drop p xs@ xs' = BS.drop len xs in -- N.B., @BS.null xs'@ is impossible. Were it to -- happen then returning @pair (DF whole 0) BS.empty@ -- is consistent with the branch where we drop the -- fraction part (the original input is less than -- the original @p@ long); however, reaching this -- branch ia that input would be a control-flow -- error. if isNotPeriod (BSU.unsafeHead xs') then justPair (DF whole 0) xs' else readFractionPart (p-len) whole xs' dropFractionPart xs = case BS.uncons xs of Nothing -> BS.empty -- == xs Just (x0,xs0) | isNotPeriod x0 -> xs | otherwise -> case BS.uncons xs0 of Nothing -> BS.singleton 0x2E -- == xs Just (x1,xs1) | isDecimal x1 -> BS.dropWhile isDecimal xs1 | otherwise -> xs -- NOTES: @BS.null xs@ is impossible as it begins with a period; -- see the call sites. If @not (BS.null ys')@ then the @BS.dropWhile -- isDecimal@ is a noop; but there's no reason to branch on -- testing for that. The @+1@ in @BS.drop (1+scale)@ is for the -- 'BSU.unsafeTail' in @ys@. readFractionPart p whole xs = let ys = BS.take p (BSU.unsafeTail xs) in case I.readDecimal ys of Nothing -> justPair (DF whole 0) xs Just (part, ys') -> let scale = BS.length ys - BS.length ys' in justPair (fractionDF whole scale part) (BS.dropWhile isDecimal (BS.drop (1+scale) xs)) -- | A variant of 'readExponential' which only reads up to some limited -- precision. The first argument gives the number of decimal digits -- at which to limit the precision. See 'readDecimalLimited' for -- more discussion of the performance benefits of using this function. readExponentialLimited :: (Fractional a) => Int -> ByteString -> Maybe (a, ByteString) {-# SPECIALIZE readExponentialLimited :: Int -> ByteString -> Maybe (Float, ByteString), Int -> ByteString -> Maybe (Double, ByteString), Int -> ByteString -> Maybe (Rational, ByteString) #-} readExponentialLimited = start where start p xs = case readDecimalLimited_ p xs of Nothing -> Nothing Just (df,xs') -> Just $! readExponentPart df xs' readExponentPart df xs | BS.null xs = pair (fromDF df) BS.empty | isNotE (BSU.unsafeHead xs) = pair (fromDF df) xs | otherwise = -- HACK: monomorphizing at 'Int' -- TODO: how to handle too-large exponents? case readSigned I.readDecimal (BSU.unsafeTail xs) of Nothing -> pair (fromDF df) xs Just (scale, xs') -> pair (fromDF $ scaleDF df scale) xs' ---------------------------------------------------------------- ----------------------------------------------------------- fin.bytestring-lexing-0.5.0.2/src/Data/ByteString/Lex/Integral.hs0000644000000000000000000006726712534743277022143 0ustar0000000000000000{-# OPTIONS_GHC -Wall -fwarn-tabs #-} ---------------------------------------------------------------- -- 2013.03.21 -- | -- Module : Data.ByteString.Lex.Integral -- Copyright : Copyright (c) 2010--2015 wren gayle romano -- License : BSD2 -- Maintainer : wren@community.haskell.org -- Stability : provisional -- Portability : Haskell98 -- -- Functions for parsing and producing 'Integral' values from\/to -- 'ByteString's based on the \"Char8\" encoding. That is, we assume -- an ASCII-compatible encoding of alphanumeric characters. -- -- /Since: 0.3.0/ ---------------------------------------------------------------- module Data.ByteString.Lex.Integral ( -- * General combinators readSigned -- , packSigned -- * Decimal conversions , readDecimal , readDecimal_ , packDecimal -- TODO: asDecimal -- this will be really hard to make efficient... -- * Hexadecimal conversions , readHexadecimal , packHexadecimal , asHexadecimal -- * Octal conversions , readOctal , packOctal -- asOctal -- this will be really hard to make efficient... ) where import Data.ByteString (ByteString) import qualified Data.ByteString as BS import qualified Data.ByteString.Char8 as BS8 (pack) import qualified Data.ByteString.Internal as BSI import qualified Data.ByteString.Unsafe as BSU import Data.Int import Data.Word import Data.Bits import Foreign.Ptr (Ptr, plusPtr) import qualified Foreign.ForeignPtr as FFI (withForeignPtr) import Foreign.Storable (peek, poke) import Data.ByteString.Lex.Internal ---------------------------------------------------------------- ----- General -- TODO: On the one hand, making this a combinator is "the right -- thing to do" for generality. However, for performance critical -- code, we could optimize away some extraneous guards if we just -- provide both signed and unsigned versions of the -- {read,pack}{Decimal,Octal,Hex} functions... -- TODO: move to somewhere more general, shared by both Integral and Fractional -- | Adjust a reading function to recognize an optional leading -- sign. As with the other functions, we assume an ASCII-compatible -- encoding of the sign characters. readSigned :: (Num a) => (ByteString -> Maybe (a, ByteString)) -> ByteString -> Maybe (a, ByteString) readSigned f xs | BS.null xs = Nothing | otherwise = case BSU.unsafeHead xs of 0x2D -> f (BSU.unsafeTail xs) >>= \(n, ys) -> return (negate n, ys) 0x2B -> f (BSU.unsafeTail xs) _ -> f xs ---------------------------------------------------------------- ----- Decimal {- -- We unroll this definition in order to reduce the number of conversions from native Int to the Integral type. readDecimalSimple :: (Integral a) => ByteString -> Maybe (a, ByteString) readDecimalSimple = start where -- This implementation is near verbatim from -- bytestring-0.9.1.7:Data.ByteString.Char8.readInt. We do -- remove the superstrictness by lifting the 'Just' so it can -- be returned after seeing the first byte. Do beware of the -- scope of 'fromIntegral', we want to avoid unnecessary -- 'Integral' operations and do as much as possible in 'Word8'. start xs | BS.null xs = Nothing | otherwise = case BSU.unsafeHead xs of w | 0x39 >= w && w >= 0x30 -> Just $ loop (fromIntegral (w - 0x30)) (BSU.unsafeTail xs) | otherwise -> Nothing loop n xs | n `seq` xs `seq` False = undefined -- for strictness analysis | BS.null xs = (n, BS.empty) -- not @xs@, to help GC | otherwise = case BSU.unsafeHead xs of w | 0x39 >= w && w >= 0x30 -> loop (n * 10 + fromIntegral (w - 0x30)) (BSU.unsafeTail xs) | otherwise -> (n,xs) -} -- | Read an unsigned\/non-negative integral value in ASCII decimal -- format. Returns @Nothing@ if there is no integer at the beginning -- of the string, otherwise returns @Just@ the integer read and the -- remainder of the string. -- -- If you are extremely concerned with performance, then it is more -- performant to use this function at @Int@ or @Word@ and then to -- call 'fromIntegral' to perform the conversion at the end. However, -- doing this will make your code succeptible to overflow bugs if -- the target type is larger than @Int@. readDecimal :: (Integral a) => ByteString -> Maybe (a, ByteString) {-# SPECIALIZE readDecimal :: ByteString -> Maybe (Int, ByteString), ByteString -> Maybe (Int8, ByteString), ByteString -> Maybe (Int16, ByteString), ByteString -> Maybe (Int32, ByteString), ByteString -> Maybe (Int64, ByteString), ByteString -> Maybe (Integer, ByteString), ByteString -> Maybe (Word, ByteString), ByteString -> Maybe (Word8, ByteString), ByteString -> Maybe (Word16, ByteString), ByteString -> Maybe (Word32, ByteString), ByteString -> Maybe (Word64, ByteString) #-} readDecimal = start where isDecimal :: Word8 -> Bool {-# INLINE isDecimal #-} isDecimal w = 0x39 >= w && w >= 0x30 toDigit :: (Integral a) => Word8 -> a {-# INLINE toDigit #-} toDigit w = fromIntegral (w - 0x30) addDigit :: Int -> Word8 -> Int {-# INLINE addDigit #-} addDigit n w = n * 10 + toDigit w -- TODO: should we explicitly drop all leading zeros before we jump into the unrolled loop? start :: (Integral a) => ByteString -> Maybe (a, ByteString) start xs | BS.null xs = Nothing | otherwise = case BSU.unsafeHead xs of w | isDecimal w -> Just $ loop0 (toDigit w) (BSU.unsafeTail xs) | otherwise -> Nothing loop0 :: (Integral a) => a -> ByteString -> (a, ByteString) loop0 m xs | m `seq` xs `seq` False = undefined | BS.null xs = (m, BS.empty) | otherwise = case BSU.unsafeHead xs of w | isDecimal w -> loop1 m (toDigit w) (BSU.unsafeTail xs) | otherwise -> (m, xs) loop1, loop2, loop3, loop4, loop5, loop6, loop7, loop8 :: (Integral a) => a -> Int -> ByteString -> (a, ByteString) loop1 m n xs | m `seq` n `seq` xs `seq` False = undefined | BS.null xs = (m*10 + fromIntegral n, BS.empty) | otherwise = case BSU.unsafeHead xs of w | isDecimal w -> loop2 m (addDigit n w) (BSU.unsafeTail xs) | otherwise -> (m*10 + fromIntegral n, xs) loop2 m n xs | m `seq` n `seq` xs `seq` False = undefined | BS.null xs = (m*100 + fromIntegral n, BS.empty) | otherwise = case BSU.unsafeHead xs of w | isDecimal w -> loop3 m (addDigit n w) (BSU.unsafeTail xs) | otherwise -> (m*100 + fromIntegral n, xs) loop3 m n xs | m `seq` n `seq` xs `seq` False = undefined | BS.null xs = (m*1000 + fromIntegral n, BS.empty) | otherwise = case BSU.unsafeHead xs of w | isDecimal w -> loop4 m (addDigit n w) (BSU.unsafeTail xs) | otherwise -> (m*1000 + fromIntegral n, xs) loop4 m n xs | m `seq` n `seq` xs `seq` False = undefined | BS.null xs = (m*10000 + fromIntegral n, BS.empty) | otherwise = case BSU.unsafeHead xs of w | isDecimal w -> loop5 m (addDigit n w) (BSU.unsafeTail xs) | otherwise -> (m*10000 + fromIntegral n, xs) loop5 m n xs | m `seq` n `seq` xs `seq` False = undefined | BS.null xs = (m*100000 + fromIntegral n, BS.empty) | otherwise = case BSU.unsafeHead xs of w | isDecimal w -> loop6 m (addDigit n w) (BSU.unsafeTail xs) | otherwise -> (m*100000 + fromIntegral n, xs) loop6 m n xs | m `seq` n `seq` xs `seq` False = undefined | BS.null xs = (m*1000000 + fromIntegral n, BS.empty) | otherwise = case BSU.unsafeHead xs of w | isDecimal w -> loop7 m (addDigit n w) (BSU.unsafeTail xs) | otherwise -> (m*1000000 + fromIntegral n, xs) loop7 m n xs | m `seq` n `seq` xs `seq` False = undefined | BS.null xs = (m*10000000 + fromIntegral n, BS.empty) | otherwise = case BSU.unsafeHead xs of w | isDecimal w -> loop8 m (addDigit n w) (BSU.unsafeTail xs) | otherwise -> (m*10000000 + fromIntegral n, xs) loop8 m n xs | m `seq` n `seq` xs `seq` False = undefined | BS.null xs = (m*100000000 + fromIntegral n, BS.empty) | otherwise = case BSU.unsafeHead xs of w | isDecimal w -> loop0 (m*1000000000 + fromIntegral (addDigit n w)) (BSU.unsafeTail xs) | otherwise -> (m*100000000 + fromIntegral n, xs) ---------------------------------------------------------------- -- | A variant of 'readDecimal' which does not return the tail of -- the string, and returns @0@ instead of @Nothing@. This is twice -- as fast for 'Int64' on 32-bit systems, but has identical performance -- to 'readDecimal' for all other types and architectures. -- -- /Since: 0.4.0/ readDecimal_ :: (Integral a) => ByteString -> a {-# SPECIALIZE readDecimal_ :: ByteString -> Int, ByteString -> Int8, ByteString -> Int16, ByteString -> Int32, ByteString -> Int64, ByteString -> Integer, ByteString -> Word, ByteString -> Word8, ByteString -> Word16, ByteString -> Word32, ByteString -> Word64 #-} readDecimal_ = start where isDecimal :: Word8 -> Bool {-# INLINE isDecimal #-} isDecimal w = 0x39 >= w && w >= 0x30 toDigit :: (Integral a) => Word8 -> a {-# INLINE toDigit #-} toDigit w = fromIntegral (w - 0x30) addDigit :: Int -> Word8 -> Int {-# INLINE addDigit #-} addDigit n w = n * 10 + toDigit w start xs | BS.null xs = 0 | otherwise = case BSU.unsafeHead xs of w | isDecimal w -> loop0 (toDigit w) (BSU.unsafeTail xs) | otherwise -> 0 loop0 :: (Integral a) => a -> ByteString -> a loop0 m xs | m `seq` xs `seq` False = undefined | BS.null xs = m | otherwise = case BSU.unsafeHead xs of w | isDecimal w -> loop1 m (toDigit w) (BSU.unsafeTail xs) | otherwise -> m loop1, loop2, loop3, loop4, loop5, loop6, loop7, loop8 :: (Integral a) => a -> Int -> ByteString -> a loop1 m n xs | m `seq` n `seq` xs `seq` False = undefined | BS.null xs = m*10 + fromIntegral n | otherwise = case BSU.unsafeHead xs of w | isDecimal w -> loop2 m (addDigit n w) (BSU.unsafeTail xs) | otherwise -> m*10 + fromIntegral n loop2 m n xs | m `seq` n `seq` xs `seq` False = undefined | BS.null xs = m*100 + fromIntegral n | otherwise = case BSU.unsafeHead xs of w | isDecimal w -> loop3 m (addDigit n w) (BSU.unsafeTail xs) | otherwise -> m*100 + fromIntegral n loop3 m n xs | m `seq` n `seq` xs `seq` False = undefined | BS.null xs = m*1000 + fromIntegral n | otherwise = case BSU.unsafeHead xs of w | isDecimal w -> loop4 m (addDigit n w) (BSU.unsafeTail xs) | otherwise -> m*1000 + fromIntegral n loop4 m n xs | m `seq` n `seq` xs `seq` False = undefined | BS.null xs = m*10000 + fromIntegral n | otherwise = case BSU.unsafeHead xs of w | isDecimal w -> loop5 m (addDigit n w) (BSU.unsafeTail xs) | otherwise -> m*10000 + fromIntegral n loop5 m n xs | m `seq` n `seq` xs `seq` False = undefined | BS.null xs = m*100000 + fromIntegral n | otherwise = case BSU.unsafeHead xs of w | isDecimal w -> loop6 m (addDigit n w) (BSU.unsafeTail xs) | otherwise -> m*100000 + fromIntegral n loop6 m n xs | m `seq` n `seq` xs `seq` False = undefined | BS.null xs = m*1000000 + fromIntegral n | otherwise = case BSU.unsafeHead xs of w | isDecimal w -> loop7 m (addDigit n w) (BSU.unsafeTail xs) | otherwise -> m*1000000 + fromIntegral n loop7 m n xs | m `seq` n `seq` xs `seq` False = undefined | BS.null xs = m*10000000 + fromIntegral n | otherwise = case BSU.unsafeHead xs of w | isDecimal w -> loop8 m (addDigit n w) (BSU.unsafeTail xs) | otherwise -> m*10000000 + fromIntegral n loop8 m n xs | m `seq` n `seq` xs `seq` False = undefined | BS.null xs = m*100000000 + fromIntegral n | otherwise = case BSU.unsafeHead xs of w | isDecimal w -> loop0 (m*1000000000 + fromIntegral (addDigit n w)) (BSU.unsafeTail xs) | otherwise -> m*100000000 + fromIntegral n ---------------------------------------------------------------- -- | Convert a non-negative integer into an (unsigned) ASCII decimal -- string. Returns @Nothing@ on negative inputs. packDecimal :: (Integral a) => a -> Maybe ByteString {-# INLINE packDecimal #-} packDecimal n | n < 0 = Nothing | otherwise = Just (unsafePackDecimal n) -- This implementation is modified from: -- -- -- | Convert a non-negative integer into an (unsigned) ASCII decimal -- string. This function is unsafe to use on negative inputs. unsafePackDecimal :: (Integral a) => a -> ByteString {-# SPECIALIZE unsafePackDecimal :: Int -> ByteString, Int8 -> ByteString, Int16 -> ByteString, Int32 -> ByteString, Int64 -> ByteString, Integer -> ByteString, Word -> ByteString, Word8 -> ByteString, Word16 -> ByteString, Word32 -> ByteString, Word64 -> ByteString #-} unsafePackDecimal n0 = let size = numDecimalDigits n0 in BSI.unsafeCreate size $ \p0 -> loop n0 (p0 `plusPtr` (size - 1)) where getDigit = BSU.unsafeIndex packDecimal_digits loop n p | n `seq` p `seq` False = undefined -- for strictness analysis | n >= 100 = do let (q,r) = n `quotRem` 100 write2 r p loop q (p `plusPtr` negate 2) | n >= 10 = write2 n p | otherwise = poke p (0x30 + fromIntegral n) write2 i0 p | i0 `seq` p `seq` False = undefined -- for strictness analysis | otherwise = do let i = fromIntegral i0; j = i + i poke p (getDigit $! j + 1) poke (p `plusPtr` negate 1) (getDigit j) packDecimal_digits :: ByteString {-# NOINLINE packDecimal_digits #-} packDecimal_digits = BS8.pack "0001020304050607080910111213141516171819\ \2021222324252627282930313233343536373839\ \4041424344454647484950515253545556575859\ \6061626364656667686970717273747576777879\ \8081828384858687888990919293949596979899" -- BUG: syntax highlighting fail: -> ---------------------------------------------------------------- ---------------------------------------------------------------- ----- Hexadecimal -- | Read a non-negative integral value in ASCII hexadecimal format. -- Returns @Nothing@ if there is no integer at the beginning of the -- string, otherwise returns @Just@ the integer read and the remainder -- of the string. -- -- This function does not recognize the various hexadecimal sigils -- like \"0x\", but because there are so many different variants, -- those are best handled by helper functions which then use this -- function for the actual numerical parsing. This function recognizes -- both upper-case, lower-case, and mixed-case hexadecimal. readHexadecimal :: (Integral a) => ByteString -> Maybe (a, ByteString) {-# SPECIALIZE readHexadecimal :: ByteString -> Maybe (Int, ByteString), ByteString -> Maybe (Int8, ByteString), ByteString -> Maybe (Int16, ByteString), ByteString -> Maybe (Int32, ByteString), ByteString -> Maybe (Int64, ByteString), ByteString -> Maybe (Integer, ByteString), ByteString -> Maybe (Word, ByteString), ByteString -> Maybe (Word8, ByteString), ByteString -> Maybe (Word16, ByteString), ByteString -> Maybe (Word32, ByteString), ByteString -> Maybe (Word64, ByteString) #-} readHexadecimal = start where -- TODO: Would it be worth trying to do the magichash trick -- used by Warp here? It'd really help remove branch prediction -- issues etc. -- -- Beware the urge to make this code prettier, cf 'readDecimal'. start xs | BS.null xs = Nothing | otherwise = case BSU.unsafeHead xs of w | 0x39 >= w && w >= 0x30 -> Just $ loop (fromIntegral (w - 0x30)) (BSU.unsafeTail xs) | 0x46 >= w && w >= 0x41 -> Just $ loop (fromIntegral (w-0x41+10)) (BSU.unsafeTail xs) | 0x66 >= w && w >= 0x61 -> Just $ loop (fromIntegral (w-0x61+10)) (BSU.unsafeTail xs) | otherwise -> Nothing loop n xs | n `seq` xs `seq` False = undefined -- for strictness analysis | BS.null xs = (n, BS.empty) -- not @xs@, to help GC | otherwise = case BSU.unsafeHead xs of w | 0x39 >= w && w >= 0x30 -> loop (n*16 + fromIntegral (w - 0x30)) (BSU.unsafeTail xs) | 0x46 >= w && w >= 0x41 -> loop (n*16 + fromIntegral (w-0x41+10)) (BSU.unsafeTail xs) | 0x66 >= w && w >= 0x61 -> loop (n*16 + fromIntegral (w-0x61+10)) (BSU.unsafeTail xs) | otherwise -> (n,xs) -- | Convert a non-negative integer into a lower-case ASCII hexadecimal -- string. Returns @Nothing@ on negative inputs. packHexadecimal :: (Integral a) => a -> Maybe ByteString {-# INLINE packHexadecimal #-} packHexadecimal n | n < 0 = Nothing | otherwise = Just (unsafePackHexadecimal n) -- | Convert a non-negative integer into a lower-case ASCII hexadecimal -- string. This function is unsafe to use on negative inputs. unsafePackHexadecimal :: (Integral a) => a -> ByteString {-# SPECIALIZE unsafePackHexadecimal :: Int -> ByteString, Int8 -> ByteString, Int16 -> ByteString, Int32 -> ByteString, Int64 -> ByteString, Integer -> ByteString, Word -> ByteString, Word8 -> ByteString, Word16 -> ByteString, Word32 -> ByteString, Word64 -> ByteString #-} unsafePackHexadecimal n0 = let size = numTwoPowerDigits 4 (toInteger n0) -- for Bits in BSI.unsafeCreate size $ \p0 -> loop n0 (p0 `plusPtr` (size - 1)) where -- TODO: benchmark using @hexDigits@ vs using direct manipulations. loop :: (Integral a) => a -> Ptr Word8 -> IO () loop n p | n <= 15 = do poke p (BSU.unsafeIndex hexDigits (fromIntegral n .&. 0x0F)) | otherwise = do let (q,r) = n `quotRem` 16 poke p (BSU.unsafeIndex hexDigits (fromIntegral r .&. 0x0F)) loop q (p `plusPtr` negate 1) -- Inspired by, -- | Convert a bitvector into a lower-case ASCII hexadecimal string. -- This is helpful for visualizing raw binary data, rather than for -- parsing as such. asHexadecimal :: ByteString -> ByteString asHexadecimal = start where start buf | BS.length buf > maxBound `quot` 2 = error _asHexadecimal_overflow | otherwise = BSI.unsafeCreate (2 * BS.length buf) $ \p0 -> do _ <- foldIO step p0 buf return () -- needed for type checking step :: Ptr Word8 -> Word8 -> IO (Ptr Word8) step p w | p `seq` w `seq` False = undefined -- for strictness analysis | otherwise = do let ix = fromIntegral w poke p (BSU.unsafeIndex hexDigits ((ix .&. 0xF0) `shiftR` 4)) poke (p `plusPtr` 1) (BSU.unsafeIndex hexDigits (ix .&. 0x0F)) return (p `plusPtr` 2) _asHexadecimal_overflow :: String {-# NOINLINE _asHexadecimal_overflow #-} _asHexadecimal_overflow = "asHexadecimal: cannot create buffer larger than (maxBound::Int)" -- TODO: benchmark against the magichash hack used in Warp. -- | The lower-case ASCII hexadecimal digits, in numerical order -- for use as a lookup table. hexDigits :: ByteString {-# NOINLINE hexDigits #-} hexDigits = BS8.pack "0123456789abcdef" -- | We can only do this for MonadIO not just any Monad, but that's -- good enough for what we need... foldIO :: (a -> Word8 -> IO a) -> a -> ByteString -> IO a {-# INLINE foldIO #-} foldIO f z0 (BSI.PS fp off len) = FFI.withForeignPtr fp $ \p0 -> do let q = p0 `plusPtr` (off+len) let go z p | z `seq` p `seq` False = undefined -- for strictness analysis | p == q = return z | otherwise = do w <- peek p z' <- f z w go z' (p `plusPtr` 1) go z0 (p0 `plusPtr` off) ---------------------------------------------------------------- ---------------------------------------------------------------- ----- Octal -- | Read a non-negative integral value in ASCII octal format. -- Returns @Nothing@ if there is no integer at the beginning of the -- string, otherwise returns @Just@ the integer read and the remainder -- of the string. -- -- This function does not recognize the various octal sigils like -- \"0o\", but because there are different variants, those are best -- handled by helper functions which then use this function for the -- actual numerical parsing. readOctal :: (Integral a) => ByteString -> Maybe (a, ByteString) {-# SPECIALIZE readOctal :: ByteString -> Maybe (Int, ByteString), ByteString -> Maybe (Int8, ByteString), ByteString -> Maybe (Int16, ByteString), ByteString -> Maybe (Int32, ByteString), ByteString -> Maybe (Int64, ByteString), ByteString -> Maybe (Integer, ByteString), ByteString -> Maybe (Word, ByteString), ByteString -> Maybe (Word8, ByteString), ByteString -> Maybe (Word16, ByteString), ByteString -> Maybe (Word32, ByteString), ByteString -> Maybe (Word64, ByteString) #-} readOctal = start where start xs | BS.null xs = Nothing | otherwise = case BSU.unsafeHead xs of w | 0x37 >= w && w >= 0x30 -> Just $ loop (fromIntegral (w - 0x30)) (BSU.unsafeTail xs) | otherwise -> Nothing loop n xs | n `seq` xs `seq` False = undefined -- for strictness analysis | BS.null xs = (n, BS.empty) -- not @xs@, to help GC | otherwise = case BSU.unsafeHead xs of w | 0x37 >= w && w >= 0x30 -> loop (n * 8 + fromIntegral (w - 0x30)) (BSU.unsafeTail xs) | otherwise -> (n,xs) -- | Convert a non-negative integer into an ASCII octal string. -- Returns @Nothing@ on negative inputs. packOctal :: (Integral a) => a -> Maybe ByteString {-# INLINE packOctal #-} packOctal n | n < 0 = Nothing | otherwise = Just (unsafePackOctal n) -- | Convert a non-negative integer into an ASCII octal string. -- This function is unsafe to use on negative inputs. unsafePackOctal :: (Integral a) => a -> ByteString {-# SPECIALIZE unsafePackOctal :: Int -> ByteString, Int8 -> ByteString, Int16 -> ByteString, Int32 -> ByteString, Int64 -> ByteString, Integer -> ByteString, Word -> ByteString, Word8 -> ByteString, Word16 -> ByteString, Word32 -> ByteString, Word64 -> ByteString #-} unsafePackOctal n0 = let size = numTwoPowerDigits 3 (toInteger n0) -- for Bits in BSI.unsafeCreate size $ \p0 -> loop n0 (p0 `plusPtr` (size - 1)) where loop :: (Integral a) => a -> Ptr Word8 -> IO () loop n p | n <= 7 = do poke p (0x30 + fromIntegral n) | otherwise = do let (q,r) = n `quotRem` 8 poke p (0x30 + fromIntegral r) loop q (p `plusPtr` negate 1) {- -- BUG: This doesn't quite work right... asOctal :: ByteString -> ByteString asOctal buf = BSI.unsafeCreate (ceilEightThirds $ BS.length buf) $ \p0 -> do let (BSI.PS fq off len) = buf FFI.withForeignPtr fq $ \q0 -> do let qF = q0 `plusPtr` (off + len - rem len 3) let loop :: Ptr Word8 -> Ptr Word8 -> IO () loop p q | q /= qF = do {- Take three Word8s and write 8 chars at a time -} i <- peek q j <- peek (q `plusPtr` 1) :: IO Word8 k <- peek (q `plusPtr` 2) :: IO Word8 let w = fromIntegral i .|. (fromIntegral j `shiftL` 8) .|. (fromIntegral k `shiftL` 16) poke p (toC8( w .&. 0x07)) poke (p `plusPtr` 1) (toC8((w `shiftR` 3) .&. 0x07)) poke (p `plusPtr` 2) (toC8((w `shiftR` 6) .&. 0x07)) poke (p `plusPtr` 3) (toC8((w `shiftR` 9) .&. 0x07)) poke (p `plusPtr` 4) (toC8((w `shiftR` 12) .&. 0x07)) poke (p `plusPtr` 5) (toC8((w `shiftR` 15) .&. 0x07)) poke (p `plusPtr` 6) (toC8((w `shiftR` 18) .&. 0x07)) poke (p `plusPtr` 7) (toC8((w `shiftR` 21) .&. 0x07)) loop (p `plusPtr` 8) (q `plusPtr` 3) | 2 == rem len 3 = do {- Handle the last two Word8s -} i <- peek q j <- peek (q `plusPtr` 1) :: IO Word8 let w = fromIntegral i .|. (fromIntegral j `shiftL` 8) poke p (toC8( w .&. 0x07)) poke (p `plusPtr` 1) (toC8((w `shiftR` 3) .&. 0x07)) poke (p `plusPtr` 2) (toC8((w `shiftR` 6) .&. 0x07)) poke (p `plusPtr` 3) (toC8((w `shiftR` 9) .&. 0x07)) poke (p `plusPtr` 4) (toC8((w `shiftR` 12) .&. 0x07)) poke (p `plusPtr` 5) (toC8((w `shiftR` 15) .&. 0x01)) | otherwise = do {- Handle the last Word8 -} i <- peek q let w = fromIntegral i poke p (toC8( w .&. 0x07)) poke (p `plusPtr` 1) (toC8((w `shiftR` 3) .&. 0x07)) poke (p `plusPtr` 2) (toC8((w `shiftR` 6) .&. 0x03)) -- loop p0 (q0 `plusPtr` off) where toC8 :: Int -> Word8 toC8 i = fromIntegral (0x30+i) {-# INLINE toC8 #-} -- We can probably speed that up by using (.|.) in lieu of (+) -- See the benchmark file for credits and implementation details. ceilEightThirds x | x >= 3*(b-1) = error _asOctal_overflow | x >= b = ceiling (fromIntegral x / 3 * 8 :: Double) | otherwise = (x*8 + 2) `quot` 3 where {-# INLINE b #-} b = 2^(28::Int)::Int -- b*8-1 is the last positive number for Int=Int32 -- TODO: need to generalize for Int=Int64 _asOctal_overflow :: String {-# NOINLINE _asOctal_overflow #-} _asOctal_overflow = "asOctal: cannot create buffer larger than (maxBound::Int)" -- -} ---------------------------------------------------------------- ----------------------------------------------------------- fin.bytestring-lexing-0.5.0.2/src/Data/ByteString/Lex/Internal.hs0000644000000000000000000001275412534743277022141 0ustar0000000000000000{-# OPTIONS_GHC -Wall -fwarn-tabs #-} ---------------------------------------------------------------- -- 2015.06.05 -- | -- Module : Data.ByteString.Lex.Internal -- Copyright : Copyright (c) 2010--2015 wren gayle romano -- License : BSD2 -- Maintainer : wren@community.haskell.org -- Stability : provisional -- Portability : Haskell98 -- -- Some functions we want to share across the other modules without actually exposing them to the user. ---------------------------------------------------------------- module Data.ByteString.Lex.Internal ( -- * Integral logarithms numDigits , numTwoPowerDigits , numDecimalDigits ) where import Data.Word (Word64) import Data.Bits (Bits(shiftR)) ---------------------------------------------------------------- ---------------------------------------------------------------- ----- Integral logarithms -- TODO: cf. integer-gmp:GHC.Integer.Logarithms made available in version 0.3.0.0 (ships with GHC 7.2.1). -- -- This implementation is derived from -- -- modified to use 'quot' instead of 'div', to ensure strictness, -- and using more guard notation (but this last one's compiled -- away). See @./test/bench/BenchNumDigits.hs@ for other implementation -- choices. -- -- | @numDigits b n@ computes the number of base-@b@ digits required -- to represent the number @n@. N.B., this implementation is unsafe -- and will throw errors if the base is @(<= 1)@, or if the number -- is negative. If the base happens to be a power of 2, then see -- 'numTwoPowerDigits' for a more efficient implementation. -- -- We must be careful about the input types here. When using small -- unsigned types or very large values, the repeated squaring can -- overflow causing the function to loop. (E.g., the fourth squaring -- of 10 overflows 32-bits (==1874919424) which is greater than the -- third squaring. For 64-bit, the 5th squaring overflows, but it's -- negative so will be caught.) Forcing the type to Integer ensures -- correct behavior, but makes it substantially slower. numDigits :: Integer -> Integer -> Int {-# INLINE numDigits #-} numDigits b0 n0 | b0 <= 1 = error (_numDigits ++ _nonpositiveBase) | n0 < 0 = error (_numDigits ++ _negativeNumber) -- BUG: need to check n0 to be sure we won't overflow Int | otherwise = 1 + fst (ilog b0 n0) where ilog b n | n < b = (0, n) | r < b = ((,) $! 2*e) r | otherwise = ((,) $! 2*e+1) $! (r `quot` b) where (e, r) = ilog (b*b) n -- | Compute the number of base-@2^p@ digits required to represent a -- number @n@. N.B., this implementation is unsafe and will throw -- errors if the base power is non-positive, or if the number is -- negative. For bases which are not a power of 2, see 'numDigits' -- for a more general implementation. numTwoPowerDigits :: (Integral a, Bits a) => Int -> a -> Int {-# INLINE numTwoPowerDigits #-} numTwoPowerDigits p n0 | p <= 0 = error (_numTwoPowerDigits ++ _nonpositiveBase) | n0 < 0 = error (_numTwoPowerDigits ++ _negativeNumber) | n0 == 0 = 1 -- BUG: need to check n0 to be sure we won't overflow Int | otherwise = go 0 n0 where go d n | d `seq` n `seq` False = undefined | n > 0 = go (d+1) (n `shiftR` p) | otherwise = d -- This implementation is from: -- -- -- | Compute the number of base-@10@ digits required to represent -- a number @n@. N.B., this implementation is unsafe and will throw -- errors if the number is negative. numDecimalDigits :: (Integral a) => a -> Int {-# INLINE numDecimalDigits #-} numDecimalDigits n0 | n0 < 0 = error (_numDecimalDigits ++ _negativeNumber) -- Unfortunately this causes significant (1.2x) slowdown since -- GHC can't see it will always fail for types other than Integer... | n0 > limit = numDigits 10 (toInteger n0) | otherwise = go 1 (fromIntegral n0 :: Word64) where limit = fromIntegral (maxBound :: Word64) fin n bound = if n >= bound then 1 else 0 go k n | k `seq` False = undefined -- For strictness analysis | n < 10 = k | n < 100 = k + 1 | n < 1000 = k + 2 | n < 1000000000000 = k + if n < 100000000 then if n < 1000000 then if n < 10000 then 3 else 4 + fin n 100000 else 6 + fin n 10000000 else if n < 10000000000 then 8 + fin n 1000000000 else 10 + fin n 100000000000 | otherwise = go (k + 12) (n `quot` 1000000000000) _numDigits :: String _numDigits = "numDigits" {-# NOINLINE _numDigits #-} _numTwoPowerDigits :: String _numTwoPowerDigits = "numTwoPowerDigits" {-# NOINLINE _numTwoPowerDigits #-} _numDecimalDigits :: String _numDecimalDigits = "numDecimalDigits" {-# NOINLINE _numDecimalDigits #-} _nonpositiveBase :: String _nonpositiveBase = ": base must be greater than one" {-# NOINLINE _nonpositiveBase #-} _negativeNumber :: String _negativeNumber = ": number must be non-negative" {-# NOINLINE _negativeNumber #-} ---------------------------------------------------------------- ----------------------------------------------------------- fin.