gridtables-0.1.0.0/0000755000000000000000000000000007346545000012157 5ustar0000000000000000gridtables-0.1.0.0/CHANGELOG.md0000644000000000000000000000131407346545000013767 0ustar0000000000000000# Changelog `gridtables` uses [PVP Versioning][]. ## gridtables-0.1.0.0 Release pending. - Added support for table foots. ## gridtables-0.0.3.0 Released 2022-08-18. - Missing cells no longer cause an error, but are replaced with empty cells. - The borders of the last cell in a row are allowed to be shorter than the cell. Previously the last column was discarded in that case. ## gridtables-0.0.2.0 Released 2022-07-30. - Treat "combining" Unicode characters, such as the zero-width space or the word joiner, as having no width. ## gridtables-0.0.1.0 Released 2022-07-29. - Boldly going where no Haskell library has gone before. [PVP Versioning]: https://pvp.haskell.org gridtables-0.1.0.0/LICENSE0000644000000000000000000000205407346545000013165 0ustar0000000000000000MIT License Copyright © 2022 RStudio, PBC Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. gridtables-0.1.0.0/README.md0000644000000000000000000000546607346545000013451 0ustar0000000000000000# gridtables Parser for reStructuredText-style grid tables. This package provides a parser for plain-text representations of tables, like the one given below. ``` +---------------------+-----------------------+ | Location | Temperature 1961-1990 | | | in degree Celsius | | +-------+-------+-------+ | | min | mean | max | +=====================+=======+=======+=======+ | Antarctica | -89.2 | N/A | 19.8 | +---------------------+-------+-------+-------+ | Earth | -89.2 | 14 | 56.7 | +---------------------+-------+-------+-------+ ``` ## Character widths The tables are intended to look good when viewed in a monospace font. Therefore, wide and full-width characters, as those in East Asian scripts, are counted as two characters, while zero-width and combining characters are treated as if they have no width. ## Column alignments The parser re-implements a table extensions from John MacFarlane's pandoc, namely support for column-wide cell alignments. The alignment of cells is determined by placing colons in the row that separates the table head from the body, like so: +------+--------+-------+ | left | center | right | +:=====+:======:+======:+ | 1 | 2 | 3 | +------+--------+-------+ The first line must be used for headless tables: +:-----+:------:+------:+ | left | center | right | +------+--------+-------+ | a 1 | b 2 | c 3 | +------+--------+-------+ ## Table Foot This library implements an extension that enables to create tables with table foots: If the *last* separator line is a part separator, i.e., if it consists of `=` instead of `-`, then all rows after the *second-to-last* part separator are treated as the table foot. E.g., consider the following table: +------+-------+ | Item | Price | +======+=======+ | Eggs | 5£ | +------+-------+ | Spam | 3£ | +======+=======+ | Sum | 8£ | +======+=======+ Here, the last row, containing "Sum" and "8£", would be the table foot. ## Algorithm The cell tracing algorithm used in this package has been translated from the original Python implementation for reStructuredText. The parser has been placed in the public domain. ## Usage The usual way to use this package will be to use it as part of a parsec parser: ``` haskell main :: IO () main = do let gt = T.unlines [ "+------+--------+-------+" , "| left | center | right |" , "+:=====+:======:+======:+" , "| 1 | 2 | 3 |" , "+------+--------+-------+" ] in print (runParser GT.gridTable () "table" gt) ``` Use `traceLines :: [Text] -> Maybe (GridTable [Text])`, if the table's raw lines have been retrieved in a different way. gridtables-0.1.0.0/gridtables.cabal0000644000000000000000000000562107346545000015267 0ustar0000000000000000cabal-version: 2.4 name: gridtables version: 0.1.0.0 synopsis: Parser for reStructuredText-style grid tables. description: Provides a parser for plain-text representations of tables. This package supports table headers, cells spanning multiple columns or rows, as well as a way to specfiy column alignments. homepage: https://github.com/tarleb/gridtables bug-reports: https://github.com/tarleb/gridtables/issues license: MIT license-file: LICENSE author: Albert Krewinkel maintainer: Albert Krewinkel copyright: © 2022 RStudio, PBC category: Text extra-doc-files: README.md , CHANGELOG.md tested-with: GHC == 8.6.5 GHC == 8.8.4 GHC == 8.10.7 GHC == 9.0.2 GHC == 9.2.2 source-repository head type: git location: https://github.com/tarleb/gridtables.git common common-options build-depends: base >= 4.12 && < 5 , array , parsec >= 3.1 && < 3.2 , text >= 1.1.1.0 && < 1.3 || >= 2.0 && < 2.1 default-language: Haskell2010 default-extensions: OverloadedStrings other-extensions: FlexibleContexts , LambdaCase ghc-options: -Wall -Wcompat -Widentities -Wincomplete-record-updates -Wincomplete-uni-patterns -Wmissing-export-lists -Wpartial-fields -Wredundant-constraints -fhide-source-paths if impl(ghc >= 8.8) ghc-options: -Wmissing-deriving-strategies if impl(ghc >= 8.10) ghc-options: -Wunused-packages if impl(ghc >= 9.0) ghc-options: -Winvalid-haddock library import: common-options hs-source-dirs: src build-depends: containers , doclayout exposed-modules: Text.GridTable , Text.GridTable.ArrayTable , Text.GridTable.Parse , Text.GridTable.Trace other-extensions: DeriveFunctor , DerivingStrategies , GeneralizedNewtypeDeriving test-suite test-gridtables import: common-options type: exitcode-stdio-1.0 hs-source-dirs: test main-is: test-gridtables.hs build-depends: gridtables , tasty >= 0.11 , tasty-hunit >= 0.9 ghc-options: -threaded -rtsopts -with-rtsopts=-N gridtables-0.1.0.0/src/Text/0000755000000000000000000000000007346545000013672 5ustar0000000000000000gridtables-0.1.0.0/src/Text/GridTable.hs0000644000000000000000000000303407346545000016063 0ustar0000000000000000{-# LANGUAGE DerivingStrategies #-} {-# LANGUAGE FlexibleContexts #-} {-# LANGUAGE LambdaCase #-} {-# LANGUAGE RankNTypes #-} {- | Module : Text.GridTable Copyright : © 2022 Albert Krewinkel License : MIT Maintainer : Albert Krewinkel Parse reStructuredText-style grid tables. -} module Text.GridTable ( module Text.GridTable.ArrayTable -- * Parse from character stream , gridTable -- * List-based representation , Cell (..) , rows ) where import Prelude hiding (lines) import Data.Array (Array, elems, bounds) import Data.Bifunctor (bimap) import Data.Maybe (mapMaybe) import Text.GridTable.ArrayTable import Text.GridTable.Parse (gridTable) colBounds :: Array CellIndex a -> (ColIndex, ColIndex) colBounds = bimap snd snd . bounds -- | Returns the rows of a grid table as lists of simple cells. rows :: ArrayTable a -> [[Cell a]] rows gt = let tarr = arrayTableCells gt ncols = fromColIndex . uncurry (flip (-)) $ colBounds tarr toSimpleCell = \case ContentCell rs cs c -> Just $ Cell c rs cs ContinuationCell {} -> Nothing mkRows :: [[Cell a]] -> [GridCell a] -> [[Cell a]] mkRows rs = \case [] -> reverse rs xs -> let (r, xs') = splitAt (ncols + 1) xs in mkRows (mapMaybe toSimpleCell r:rs) xs' in mkRows [] $ elems tarr -- | Raw grid table cell data Cell a = Cell { cellContent :: a , cellRowSpan :: RowSpan , cellColSpan :: ColSpan } deriving stock (Eq, Ord, Show) gridtables-0.1.0.0/src/Text/GridTable/0000755000000000000000000000000007346545000015527 5ustar0000000000000000gridtables-0.1.0.0/src/Text/GridTable/ArrayTable.hs0000644000000000000000000000502207346545000020110 0ustar0000000000000000{-# LANGUAGE DerivingStrategies #-} {-# LANGUAGE GeneralizedNewtypeDeriving #-} {-# LANGUAGE LambdaCase #-} {- | Module : Text.GridTable.ArrayTable Copyright : © 2022 RStudio, PBC License : MIT Maintainer : Albert Krewinkel Grid table representation based on arrays. -} module Text.GridTable.ArrayTable ( ArrayTable (..) , GridCell (..) , RowSpan (..) , ColSpan (..) , CellIndex , RowIndex (..) , ColIndex (..) , Alignment (..) , mapCells ) where import Data.Array (Array, Ix) import Data.Array.MArray (mapArray, thaw) import Data.Array.ST (runSTArray) -- | Table representation based on an array; cells are placed on a grid, -- with indices spanned by other cells containing placeholder cells that -- point to the spanning cell. data ArrayTable a = ArrayTable { arrayTableCells :: Array CellIndex (GridCell a) , arrayTableHead :: Maybe RowIndex , arrayTableFoot :: Maybe RowIndex , arrayTableColSpecs :: Array ColIndex (Alignment, Int) } deriving stock (Eq, Show) -- | Apply a function to all cell contents in a grid table. mapCells :: (a -> b) -> ArrayTable a -> ArrayTable b mapCells f gt = let f' = \case ContentCell rs cs c -> ContentCell rs cs $ f c ContinuationCell idx -> ContinuationCell idx cellArray = runSTArray $ do mut <- thaw $ arrayTableCells gt mapArray f' mut in gt { arrayTableCells = cellArray } -- | Row index in a table array. newtype RowIndex = RowIndex { fromRowIndex :: Int } deriving stock (Eq, Ix, Ord) deriving newtype (Enum, Num, Show) -- | Column index in a table array. newtype ColIndex = ColIndex { fromColIndex :: Int } deriving stock (Eq, Ix, Ord) deriving newtype (Enum, Num, Show) -- | Index to a cell in a table part. type CellIndex = (RowIndex, ColIndex) -- | A grid cell contains either a real table cell, or is the -- continuation of a column or row-spanning cell. In the latter case, -- the index of the continued cell is provided. data GridCell a = ContentCell RowSpan ColSpan a | ContinuationCell CellIndex deriving stock (Eq, Show) -- | The number of rows spanned by a cell. newtype RowSpan = RowSpan Int deriving stock (Eq, Ord) deriving newtype (Enum, Num, Read, Show) -- | The number of columns spanned by a cell. newtype ColSpan = ColSpan Int deriving stock (Eq, Ord) deriving newtype (Enum, Num, Read, Show) -- | Cell alignment data Alignment = AlignDefault | AlignLeft | AlignCenter | AlignRight deriving stock (Enum, Eq, Ord, Read, Show) gridtables-0.1.0.0/src/Text/GridTable/Parse.hs0000644000000000000000000000316507346545000017142 0ustar0000000000000000{-# LANGUAGE DerivingStrategies #-} {-# LANGUAGE FlexibleContexts #-} {- | Module : Text.GridTable Copyright : © 2022 Albert Krewinkel License : MIT Maintainer : Albert Krewinkel Parse reStructuredText-style grid tables. -} module Text.GridTable.Parse ( gridTable , tableLine ) where import Prelude hiding (lines) import Data.Text (Text) import Text.GridTable.ArrayTable import Text.GridTable.Trace (traceLines) import Text.Parsec import qualified Data.Text as T -- | Parses a grid table. gridTable :: Stream s m Char => ParsecT s u m (ArrayTable [Text]) gridTable = try $ do firstLine <- (:) <$> char '+' <*> (mconcat <$> many1 (gridPart '-')) <* skipSpaces <* newline lines <- many1 tableLine case traceLines (T.pack firstLine : lines) of Nothing -> fail "tracing failed" Just gt -> return gt skipSpaces :: Stream s m Char => ParsecT s u m () skipSpaces = skipMany (satisfy $ \c -> c == '\t' || c == ' ') -- | Parses a line that's part of a table. The line must start with -- either a plus @+@ or a pipe @|@. tableLine :: Stream s m Char => ParsecT s u m Text tableLine = try $ do let borderChar = char '+' <|> char '|' firstChar <- borderChar rest <- manyTill (noneOf "\n\r") newline return $ T.stripEnd $ T.pack (firstChar : rest) gridPart :: Stream s m Char => Char -> ParsecT s u m String gridPart ch = do leftColon <- option id ((:) <$> char ':') dashes <- many1 (char ch) rightColon <- option id ((:) <$> char ':') plus <- char '+' return . leftColon . (dashes ++) . rightColon $ [plus] gridtables-0.1.0.0/src/Text/GridTable/Trace.hs0000644000000000000000000004512007346545000017123 0ustar0000000000000000{-# LANGUAGE DerivingStrategies #-} {-# LANGUAGE GeneralizedNewtypeDeriving #-} {-# LANGUAGE LambdaCase #-} {-# LANGUAGE RankNTypes #-} {- | Module : Text.GridTable.Trace Copyright : © 2022 Albert Krewinkel License : MIT Maintainer : Albert Krewinkel Trace cells of a grid table. -} module Text.GridTable.Trace ( traceLines , TraceInfo (..) , initialTraceInfo , tableFromTraceInfo ) where import Prelude hiding (lines) import Control.Applicative ((<|>)) import Control.Monad (forM_) import Control.Monad.ST import Data.Array import Data.Array.MArray import Data.Array.ST import Data.Function (on) import Data.List (foldl') import Data.Maybe (fromMaybe) import Data.Set (Set) import Data.Text (Text) import Text.DocLayout (charWidth) import Text.GridTable.ArrayTable import qualified Data.Map.Strict as Map import qualified Data.Set as Set import qualified Data.Text as T -- | Traces out the cells in the given lines and converts them to a -- table containing the bare cell lines. traceLines :: [Text] -> Maybe (ArrayTable [Text]) traceLines lines = let charGrid = toCharGrid lines specs1 = colSpecsInLine '-' charGrid 1 partSeps = findSeparators charGrid -- The first separator can never be a part separator line (with -- =), but it can contain column alignment markers, so it is -- always normalized it as well. charGrid' = convertToNormalLines (1:map partSepLine partSeps) charGrid traceInfo = traceCharGrid charGrid' initialTraceInfo in if Set.null (gridCells traceInfo) then fail "no cells" else return $ tableFromTraceInfo traceInfo partSeps specs1 -- | Type used to represent the 2D layout of table characters type CharGrid = Array (CharRow, CharCol) GChar -- | Index of a half-width character in the character-wise -- representation. type CharIndex = (CharRow, CharCol) -- | Character row newtype CharRow = CharRow Int deriving stock (Eq, Show) deriving newtype (Enum, Ix, Num, Ord) -- | Character column newtype CharCol = CharCol { fromCharCol :: Int } deriving stock (Eq, Show) deriving newtype (Enum, Ix, Num, Ord) data GChar = C Char -- ^ half- or full-width character | CZ [Char] Char -- ^ character preceded by zero-width chars | WP -- ^ padding for wide characters | Missing -- ^ padding for short lines deriving stock (Eq) -- | Info on the grid. Used to keep track of information collected while -- tracing a character grid. The set of cells is used as a kind of queue -- during parsing, while the other data is required to assemble the -- final table. data TraceInfo = TraceInfo { gridRowSeps :: Set CharRow , gridColSeps :: Set CharCol , gridCorners :: Set CharIndex , gridCells :: Set CellTrace } -- | Initial tracing info. initialTraceInfo :: TraceInfo initialTraceInfo = TraceInfo { gridRowSeps = Set.fromList [CharRow 1] , gridColSeps = Set.fromList [CharCol 1] , gridCorners = Set.fromList [(CharRow 1, CharCol 1)] , gridCells = Set.empty } -- | Converts a list of lines into a char array. toCharGrid :: [Text] -> CharGrid toCharGrid lines = let chars = foldr (\t m -> max m (T.length t)) 0 lines -- potential overcount gbounds = ( (CharRow 1, CharCol 1) , (CharRow (length lines), CharCol chars) ) toGChars [] = repeat Missing toGChars (c:cs) = case charWidth c of 2 -> C c : WP : toGChars cs 1 -> C c : toGChars cs _ -> case span ((== 0) . charWidth) cs of (zw, []) -> [CZ (c:zw) '\0'] (zw, c':cs') -> CZ (c:zw) c' : case charWidth c' of 2 -> WP : toGChars cs' _ -> toGChars cs' extendedLines = map (take chars . toGChars . T.unpack) lines in listArray gbounds (mconcat extendedLines) -- | Information on, and extracted from, a body separator line. This is a line -- that uses @=@ instead of @-@ to demark cell borders. data PartSeparator = PartSeparator { partSepLine :: CharRow , partSepColSpec :: [ColSpec] } -- | Alignment and character grid position of a column. data ColSpec = ColSpec { colStart :: CharCol , colEnd :: CharCol , colAlign :: Alignment } -- | Finds the row indices of all separator lines, i.e., lines that -- contain only @+@ and @=@ characters. findSeparators :: CharGrid -> [PartSeparator] findSeparators charGrid = foldr go [] rowIdxs where gbounds = bounds charGrid rowIdxs = [fst (fst gbounds) .. fst (snd gbounds)] go i seps = case colSpecsInLine '=' charGrid i of Nothing -> seps Just colspecs -> PartSeparator i colspecs : seps -- | Checks for a separator in the given line, returning the column properties -- if it finds such a line. colSpecsInLine :: Char -- ^ Character used in line (usually @-@) -> CharGrid -> CharRow -> Maybe [ColSpec] colSpecsInLine c charGrid i = case charGrid ! (i, firstCol) of C '+' -> loop [] (firstCol + 1) _ -> Nothing where loop acc j = case colSpecAt j of Nothing -> Nothing Just Nothing -> Just $ reverse acc Just (Just colspec) -> loop (colspec:acc) (colEnd colspec + 1) gbounds = bounds charGrid firstCol = snd (fst gbounds) lastCol = snd (snd gbounds) colSpecAt :: CharCol -> Maybe (Maybe ColSpec) colSpecAt j | j >= lastCol = Just Nothing | otherwise = case findEnd (j + 1) of Nothing -> Nothing Just (end, rightMark) -> let leftMark = charGrid ! (i, j) == C ':' align = case (leftMark, rightMark) of (False , False) -> AlignDefault (True , False) -> AlignLeft (False , True ) -> AlignRight (True , True ) -> AlignCenter colspec = ColSpec { colStart = j , colEnd = end , colAlign = align } in pure (pure colspec) findEnd j = case charGrid ! (i, j) of C '+' -> pure (j, False) C ':' -> if charGrid ! (i, j + 1) == C '+' then pure (j + 1, True) else Nothing C c' | c' == c -> findEnd (j + 1) _ -> Nothing -- | Returns new character grid in which the given lines have been -- converted to normal cell-separating lines. convertToNormalLines :: [CharRow] -> CharGrid -> CharGrid convertToNormalLines sepLines charGrid = runSTArray $ do mutGrid <- thaw charGrid let gbounds = bounds charGrid cols = [snd (fst gbounds) .. snd (snd gbounds)] forM_ sepLines $ \rowidx -> do forM_ cols $ \colidx -> do let idx = (rowidx, colidx) c <- readArray mutGrid idx -- convert `=` to `-` and remove alignment markers case c of C '=' -> writeArray mutGrid idx (C '-') C ':' -> writeArray mutGrid idx (C '-') _ -> pure () return mutGrid -- | Trace the given char grid and collect all relevant info. -- This function calls itself recursively. traceCharGrid :: CharGrid -> TraceInfo -> TraceInfo traceCharGrid charGrid traceInfo = -- Get the next corner an remove it from the set of unparsed corners. case Set.minView (gridCorners traceInfo) of Nothing -> traceInfo Just (startIdx@(top, left), corners) -> case traceCell charGrid startIdx of Nothing -> -- Corner is not a top-left corner of another cell. Continue -- with the remaining corners. traceCharGrid charGrid traceInfo { gridCorners = corners } Just ((bottom, right), newrowseps, newcolseps) -> do let content = getLines charGrid startIdx (bottom, right) let cell = CellTrace content left right top bottom let rowseps = gridRowSeps traceInfo let colseps = gridColSeps traceInfo let cells = gridCells traceInfo traceCharGrid charGrid $ TraceInfo { gridRowSeps = newrowseps `Set.union` rowseps , gridColSeps = newcolseps `Set.union` colseps , gridCorners = Set.insert (top, right) $ Set.insert (bottom, left) corners , gridCells = cell `Set.insert` cells } type ScanResult = (CharIndex, Set CharRow, Set CharCol) type RowSeps = Set CharRow type ColSeps = Set CharCol -- | Traces a single cell on the grid, starting at the given position. traceCell :: CharGrid -> CharIndex -> Maybe ScanResult traceCell = scanRight -- | Scans right from the given index, following a cell separator line -- to the next column marker (@+@), then scans down. Returns the -- bottom-right index of the cell if it can complete the trace, and -- nothing if it reaches the end of line before the trace is complete. -- -- All row and column markers found during scanning are seen are -- collected and returned as part of the result. scanRight :: CharGrid -> CharIndex -> Maybe ScanResult scanRight charGrid start@(top, left) = do loop Set.empty (left + 1) where loop :: ColSeps -> CharCol -> Maybe ScanResult loop colseps j | not (bounds charGrid `inRange` (top, j)) = Nothing | otherwise = case charGrid ! (top, j) of C '-' -> loop colseps (j + 1) C '+' -> let colseps' = Set.insert j colseps in case scanDown charGrid start j of Nothing -> loop colseps' (j + 1) <|> lastCellInRow charGrid start (j + 1) Just (end, rowseps, newcolseps) -> pure ( end , rowseps , colseps' `Set.union` newcolseps ) _ -> Nothing -- | Like 'scanRight', but scans down in the given column. scanDown :: CharGrid -> CharIndex -- ^ top-left corner of cell -> CharCol -- ^ column of the cell's right border -> Maybe ScanResult scanDown charGrid start@(top, _left) right = do loop Set.empty (top + 1) where loop :: RowSeps -> CharRow -> Maybe ScanResult loop rowseps i = if not (bounds charGrid `inRange` (i, right)) then Nothing else case charGrid ! (i, right) of C '+' -> let rowseps' = Set.insert i rowseps in case scanLeft charGrid start (i, right) of Nothing -> loop rowseps' (i + 1) Just (newrowseps, colseps) -> Just ( (i, right) , rowseps' `Set.union` newrowseps , colseps ) C '|' -> loop rowseps (i + 1) _ -> -- all but the final column must be terminated if right == snd (snd (bounds charGrid)) then loop rowseps (i + 1) else Nothing -- | Like 'scanRight', but scans left starting at the bottom-right -- corner. scanLeft :: CharGrid -> CharIndex -> CharIndex -> Maybe (RowSeps, ColSeps) scanLeft charGrid start@(_top,left) end@(bottom, right) = let go :: CharCol -> Maybe ColSeps -> Maybe ColSeps go _ Nothing = Nothing go j (Just colseps) = case charGrid ! (bottom, j) of C '+' -> Just (Set.insert j colseps) C '-' -> Just colseps _ -> Nothing in if charGrid ! (bottom, left) /= C '+' then Nothing else case foldr go (Just Set.empty) [(right - 1), right - 2 .. (left + 1)] of Nothing -> Nothing Just colseps -> case scanUp charGrid start end of Just rowseps -> Just (rowseps, colseps) Nothing -> Nothing -- | Scans up from the bottom-left corner back to the top-left corner. scanUp :: CharGrid -> CharIndex -> CharIndex -> Maybe RowSeps scanUp charGrid (top, left) (bottom, _right) = let go :: CharRow -> Maybe RowSeps -> Maybe RowSeps go _ Nothing = Nothing go i (Just rowseps) = case charGrid ! (i, left) of C '+' -> Just (Set.insert i rowseps) C '|' -> Just rowseps _ -> Nothing in foldr go (Just Set.empty) [bottom - 1, bottom - 2 .. top + 1] lastCellInRow :: CharGrid -> CharIndex -> CharCol -> Maybe ScanResult lastCellInRow charGrid start@(top, _left) right = if bounds charGrid `inRange` (top, right) && charGrid ! (top, right) == Missing then scanRestOfLines charGrid start else Nothing lastColumn :: CharGrid -> CharCol lastColumn = snd . snd . bounds lastRow :: CharGrid -> CharRow lastRow = fst . snd . bounds scanRightRestOfLine :: CharGrid -> CharIndex -> CharRow -> Maybe ColSeps scanRightRestOfLine charGrid (_top, left) bottom = let go :: CharCol -> Maybe ColSeps -> Maybe ColSeps go _ Nothing = Nothing go j (Just colseps) = case charGrid ! (bottom, j) of C '+' -> Just (Set.insert j colseps) C '-' -> Just colseps Missing -> Just colseps _ -> Nothing in if charGrid ! (bottom, left) /= C '+' then Nothing else foldr go (Just Set.empty) [left + 1 .. lastColumn charGrid] scanRestOfLines :: CharGrid -> CharIndex -> Maybe ScanResult scanRestOfLines charGrid start@(top, _left) = let go :: Maybe CharIndex -> CharRow -> Maybe CharIndex go idx i = idx <|> case scanRightRestOfLine charGrid start i of Nothing -> Nothing Just _colseps -> Just (i, lastColumn charGrid) in case foldl' go Nothing [top + 1 .. lastRow charGrid] of Just (bottom, right) -> Just ( (bottom, right) , Set.singleton bottom , Set.singleton right) Nothing -> Nothing -- | Gets the textual contents, i.e. the lines of a cell. getLines :: CharGrid -> CharIndex -> CharIndex -> [Text] getLines charGrid (top, left) (bottom, right) = let rowIdxs = [top + 1 .. bottom - 1] colIdxs = [left + 1 .. right - 1] toChars rowIdx colIdx = case charGrid ! (rowIdx, colIdx) of C c -> [c] CZ zw c -> zw ++ [c] _ -> [] in map (\ir -> T.pack $ concatMap (toChars ir) colIdxs) rowIdxs -- | Traced cell with raw contents and border positions. data CellTrace = CellTrace { cellTraceContent :: [Text] , cellTraceLeft :: CharCol , cellTraceRight :: CharCol , cellTraceTop :: CharRow , cellTraceBottom :: CharRow } deriving stock (Eq, Show) instance Ord CellTrace where x `compare` y = case (compare `on` cellTraceTop) x y of EQ -> (compare `on` cellTraceLeft) x y o -> o -- | Create a final grid table from line scanning data. tableFromTraceInfo :: TraceInfo -> [PartSeparator] -> Maybe [ColSpec] -> ArrayTable [Text] tableFromTraceInfo traceInfo partSeps colSpecsFirstLine = let rowseps = Set.toAscList $ gridRowSeps traceInfo colseps = Set.toAscList $ gridColSeps traceInfo rowindex = Map.fromList $ zip rowseps [1..] colindex = Map.fromList $ zip colseps [1..] colwidths = [ b - a - 1 | (b, a) <- zip (tail colseps) colseps ] colSpecs = zip (map colAlign (case partSeps of partSep:_ -> partSepColSpec partSep [] -> fromMaybe [] colSpecsFirstLine) ++ repeat AlignDefault) (map fromCharCol colwidths) lastCol = ColIndex (length colwidths) mlastLine = Set.lookupMax (gridRowSeps traceInfo) tableHead = case partSeps of ps:_ -> subtract 1 <$> partSepLine ps `Map.lookup` rowindex [] -> Nothing tableFoot = case reverse partSeps of rps:rps':_ | Just (partSepLine rps) == mlastLine -> partSepLine rps' `Map.lookup` rowindex _ -> Nothing in ArrayTable { arrayTableCells = runSTArray (toMutableArray traceInfo rowindex colindex) , arrayTableHead = tableHead , arrayTableFoot = tableFoot , arrayTableColSpecs = listArray (1, lastCol) colSpecs } -- | Create a mutable cell array from the scanning data. toMutableArray :: TraceInfo -> Map.Map CharRow RowIndex -> Map.Map CharCol ColIndex -> ST s (STArray s CellIndex (GridCell [Text])) toMutableArray traceInfo rowindex colindex = do let nrows = Map.size rowindex - 1 let ncols = Map.size colindex - 1 let gbounds = ( (RowIndex 1, ColIndex 1) , (RowIndex nrows, ColIndex ncols) ) tblgrid <- newArray gbounds FreeCell forM_ (Set.toAscList $ gridCells traceInfo) $ \(CellTrace content left right top bottom) -> do let cellPos = do rnum <- Map.lookup top rowindex cnum <- Map.lookup left colindex rs <- RowSpan . fromRowIndex . subtract rnum <$> Map.lookup bottom rowindex cs <- ColSpan . fromColIndex . subtract cnum <$> Map.lookup right colindex pure ((rnum, cnum), rs, cs) let (idx, rowspan, colspan) = case cellPos of Just cp -> cp Nothing -> error "A cell or row index was not found" writeArray tblgrid idx . FilledCell $ ContentCell rowspan colspan content forM_ (continuationIndices idx rowspan colspan) $ \contIdx -> do -- FIXME: ensure that the cell has not been filled yet writeArray tblgrid contIdx $ FilledCell (ContinuationCell idx) -- Swap BuilderCells with normal GridCells. let fromBuilderCell :: BuilderCell -> GridCell [Text] fromBuilderCell = \case FilledCell c -> c FreeCell -> -- Found an unassigned cell; replace with empty cell. TODO: This -- should be reported as a warning. ContentCell 1 1 mempty mapArray fromBuilderCell tblgrid -- | Calculate the array indices that are spanned by a cell. continuationIndices :: (RowIndex, ColIndex) -> RowSpan -> ColSpan -> [CellIndex] continuationIndices (RowIndex ridx, ColIndex cidx) rowspan colspan = let (RowSpan rs) = rowspan (ColSpan cs) = colspan in [ (RowIndex r, ColIndex c) | r <- [ridx..(ridx + rs - 1)] , c <- [cidx..(cidx + cs - 1)] , (r, c) /= (ridx, cidx)] -- | Helper type used to track which indices have been already been -- filled in a mutable cell array. data BuilderCell = FilledCell (GridCell [Text]) | FreeCell gridtables-0.1.0.0/test/0000755000000000000000000000000007346545000013136 5ustar0000000000000000gridtables-0.1.0.0/test/test-gridtables.hs0000644000000000000000000004257107346545000016600 0ustar0000000000000000{-# LANGUAGE FlexibleContexts #-} {-| Module : Main Copyright : © 2022 Albert Krewinkel License : MIT Maintainer : Albert Krewinkel Tests for the gridtables library. -} module Main (main) where import Data.Array (Array, listArray) import Data.Functor.Identity (Identity) import Data.Text (Text) import Text.GridTable import Text.GridTable.Parse (tableLine) import Text.Parsec import Test.Tasty (TestTree, defaultMain, testGroup) import Test.Tasty.HUnit ((@?=), testCase) import qualified Data.Text as T main :: IO () main = defaultMain $ testGroup "gridtables" [ linesTests , gridTableTests ] parse' :: ParsecT Text () Identity a -> Text -> Either ParseError a parse' p t = runParser p () "" t -- | Test parsing into lines linesTests :: TestTree linesTests = testGroup "lines" [ testCase "get lines" $ parse' (many1 tableLine) "| one | two |\n| three |\n| four |\n" @?= Right ([ "| one | two |" , "| three |" , "| four |" ]) , testCase "fail if not a table" $ parse' (many tableLine) "nope\nnada\n" @?= Right [] ] -- | Test parsing of a text as grid tables. gridTableTests :: TestTree gridTableTests = testGroup "parseArrayTable" [ testCase "single cell" $ let gt = T.unlines [ "+-----+" , "| one |" , "+-----+" ] gbounds = ( (RowIndex 1, ColIndex 1) , (RowIndex 1, ColIndex 1) ) in parse' gridTable gt @?= Right (ArrayTable { arrayTableCells = listArray gbounds [ContentCell 1 1 [" one "]] , arrayTableHead = Nothing , arrayTableFoot = Nothing , arrayTableColSpecs = defaultAlign [5] }) , testCase "multi-cell row" $ let gt = T.unlines [ "+-----+-----+" , "| one | two |" , "+-----+-----+" ] gbounds = ( (RowIndex 1, ColIndex 1) , (RowIndex 1, ColIndex 2) ) in parse' gridTable gt @?= Right (ArrayTable { arrayTableCells = listArray gbounds [ ContentCell 1 1 [" one "] , ContentCell 1 1 [" two "] ] , arrayTableHead = Nothing , arrayTableFoot = Nothing , arrayTableColSpecs = defaultAlign [5, 5] }) , testCase "wide character" $ let gt = T.unlines [ "+----+------+" , "| 魚 | fish |" , "+----+------+" ] gbounds = ( (RowIndex 1, ColIndex 1) , (RowIndex 1, ColIndex 2) ) in parse' gridTable gt @?= Right (ArrayTable { arrayTableCells = listArray gbounds [ ContentCell 1 1 [" 魚 "] , ContentCell 1 1 [" fish "] ] , arrayTableHead = Nothing , arrayTableFoot = Nothing , arrayTableColSpecs = defaultAlign [4, 6] }) , testCase "two-row table" $ let gt = T.unlines [ "+-----+" , "| one |" , "+-----+" , "| two |" , "+-----+" ] gbounds = ( (RowIndex 1, ColIndex 1) , (RowIndex 2, ColIndex 1) ) in parse' gridTable gt @?= Right (ArrayTable { arrayTableCells = listArray gbounds [ ContentCell 1 1 [" one "] , ContentCell 1 1 [" two "] ] , arrayTableHead = Nothing , arrayTableFoot = Nothing , arrayTableColSpecs = defaultAlign [5] }) , testCase "rowspan" $ let gt = T.unlines [ "+-----+-------+" , "| one | two |" , "| +-------+" , "| | three |" , "+-----+-------+" ] gbounds = ( (RowIndex 1, ColIndex 1) , (RowIndex 2, ColIndex 2) ) in parse' gridTable gt @?= Right (ArrayTable { arrayTableCells = listArray gbounds [ ContentCell 2 1 [" one ", " ", " "] , ContentCell 1 1 [" two "] , ContinuationCell (1, 1) , ContentCell 1 1 [" three "] ] , arrayTableHead = Nothing , arrayTableFoot = Nothing , arrayTableColSpecs = defaultAlign [5, 7] }) , testGroup "table head" [ testCase "simple head" $ let gt = T.unlines [ "+-----+-----+" , "| one | two |" , "+=====+=====+" , "| 1 | 2 |" , "+-----+-----+" ] gbounds = ( (RowIndex 1, ColIndex 1) , (RowIndex 2, ColIndex 2) ) in parse' gridTable gt @?= Right (ArrayTable { arrayTableCells = listArray gbounds [ ContentCell 1 1 [" one "] , ContentCell 1 1 [" two "] , ContentCell 1 1 [" 1 "] , ContentCell 1 1 [" 2 "] ] , arrayTableHead = Just 1 , arrayTableFoot = Nothing , arrayTableColSpecs = defaultAlign [5, 5] }) , testCase "alignment markers" $ let gt = T.unlines [ "+------+--------+-------+" , "| left | center | right |" , "+:=====+:======:+======:+" , "| 1 | 2 | 3 |" , "+------+--------+-------+" ] in parse' gridTable gt @?= Right (ArrayTable { arrayTableCells = listArray ((1,1), (2, 3)) [ ContentCell 1 1 [" left "] , ContentCell 1 1 [" center "] , ContentCell 1 1 [" right "] , ContentCell 1 1 [" 1 "] , ContentCell 1 1 [" 2 "] , ContentCell 1 1 [" 3 "] ] , arrayTableHead = Just 1 , arrayTableFoot = Nothing , arrayTableColSpecs = listArray (1, 3) [ (AlignLeft, 6) , (AlignCenter, 8) , (AlignRight, 7) ] }) ] , testGroup "table foot" [ testCase "simple foot" $ let gt = T.unlines [ "+------+-------+" , "| Item | Price |" , "+======+=======+" , "| Eggs | 5£ |" , "+------+-------+" , "| Spam | 3£ |" , "+======+=======+" , "| Sum | 8£ |" , "+======+=======+" ] in parse' gridTable gt @?= Right (ArrayTable { arrayTableCells = listArray ((1,1), (4, 2)) [ ContentCell 1 1 [" Item "] , ContentCell 1 1 [" Price "] , ContentCell 1 1 [" Eggs "] , ContentCell 1 1 [" 5£ "] , ContentCell 1 1 [" Spam "] , ContentCell 1 1 [" 3£ "] , ContentCell 1 1 [" Sum "] , ContentCell 1 1 [" 8£ "] ] , arrayTableHead = Just 1 , arrayTableFoot = Just 4 , arrayTableColSpecs = defaultAlign [6, 7] }) , testCase "table without body" $ let gt = T.unlines [ "+------+-------+" , "| Item | Price |" , "+======+=======+" , "| Sum | 8£ |" , "+======+=======+" ] in parse' gridTable gt @?= Right (ArrayTable { arrayTableCells = listArray ((1,1), (2, 2)) [ ContentCell 1 1 [" Item "] , ContentCell 1 1 [" Price "] , ContentCell 1 1 [" Sum "] , ContentCell 1 1 [" 8£ "] ] , arrayTableHead = Just 1 , arrayTableFoot = Just 2 , arrayTableColSpecs = defaultAlign [6, 7] }) ] , testCase "marker in first line" $ let gt = T.unlines [ "+:-----+:------:+------:+" , "| left | center | right |" , "+------+--------+-------+" , "| a 1 | b 2 | c 3 |" , "+------+--------+-------+" ] in parse' gridTable gt @?= Right (ArrayTable { arrayTableCells = listArray ((1,1), (2, 3)) [ ContentCell 1 1 [" left "] , ContentCell 1 1 [" center "] , ContentCell 1 1 [" right "] , ContentCell 1 1 [" a 1 "] , ContentCell 1 1 [" b 2 "] , ContentCell 1 1 [" c 3 "] ] , arrayTableHead = Nothing , arrayTableFoot = Nothing , arrayTableColSpecs = listArray (1, 3) [ (AlignLeft, 6) , (AlignCenter, 8) , (AlignRight, 7) ] }) , testGroup "Char widths" [ testCase "wide character" $ let gt = T.unlines [ "+--+---+" , "|魚| x |" , "+--+---+" ] in parse' gridTable gt @?= Right (ArrayTable { arrayTableCells = listArray ((1,1), (1, 2)) [ ContentCell 1 1 ["魚"] , ContentCell 1 1 [" x "]] , arrayTableHead = Nothing , arrayTableFoot = Nothing , arrayTableColSpecs = defaultAlign [2, 3] }) , testCase "zero-width space" $ let gt = T.unlines [ "+--+---+" , "|x\8203y| z |" , "+--+---+" ] in parse' gridTable gt @?= Right (ArrayTable { arrayTableCells = listArray ((1,1), (1, 2)) [ ContentCell 1 1 ["x\8203y"] , ContentCell 1 1 [" z "]] , arrayTableHead = Nothing , arrayTableFoot = Nothing , arrayTableColSpecs = defaultAlign [2, 3] }) , testCase "zero-width space after wide character" $ let gt = T.unlines [ "+---+---+" , "|魚\8203y| z |" , "+---+---+" ] in parse' gridTable gt @?= Right (ArrayTable { arrayTableCells = listArray ((1,1), (1, 2)) [ ContentCell 1 1 ["魚\8203y"] , ContentCell 1 1 [" z "]] , arrayTableHead = Nothing , arrayTableFoot = Nothing , arrayTableColSpecs = defaultAlign [3, 3] }) , testCase "wide character after zero-width space" $ let gt = T.unlines [ "+---+---+" , "|y\8203魚| z |" , "+---+---+" ] in parse' gridTable gt @?= Right (ArrayTable { arrayTableCells = listArray ((1,1), (1, 2)) [ ContentCell 1 1 ["y\8203魚"] , ContentCell 1 1 [" z "]] , arrayTableHead = Nothing , arrayTableFoot = Nothing , arrayTableColSpecs = defaultAlign [3, 3] }) , testCase "multiple zero-width characters" $ let gt = T.unlines [ "+--+---+" , "|a\8204\8205b| c |" , "+--+---+" ] in parse' gridTable gt @?= Right (ArrayTable { arrayTableCells = listArray ((1,1), (1, 2)) [ ContentCell 1 1 ["a\8204\8205b"] , ContentCell 1 1 [" c "]] , arrayTableHead = Nothing , arrayTableFoot = Nothing , arrayTableColSpecs = defaultAlign [2, 3] }) , testCase "many wide chars" $ let gt = T.unlines [ "+----------+-+" , "|12345|a|" , "+----------+-+" ] in parse' gridTable gt @?= Right (ArrayTable { arrayTableCells = listArray ((1,1), (1, 2)) [ ContentCell 1 1 ["12345"] , ContentCell 1 1 ["a"]] , arrayTableHead = Nothing , arrayTableFoot = Nothing , arrayTableColSpecs = defaultAlign [10, 1] }) ] , testCase "unterminated row" $ let gt = T.unlines [ "+-----+" , "| one" , "+-----+" ] gbounds = ( (RowIndex 1, ColIndex 1) , (RowIndex 1, ColIndex 1) ) in parse' gridTable gt @?= Right (ArrayTable { arrayTableCells = listArray gbounds [ ContentCell 1 1 [" one"]] , arrayTableHead = Nothing , arrayTableFoot = Nothing , arrayTableColSpecs = defaultAlign [5] }) , testCase "trailing spaces" $ let ls = T.unlines [ "+---+ " , "| 1 | " , "+---+" ] in parse' gridTable ls @?= Right (ArrayTable { arrayTableCells = listArray ((1,1), (1,1)) [ ContentCell 1 1 [" 1 "]] , arrayTableHead = Nothing , arrayTableFoot = Nothing , arrayTableColSpecs = defaultAlign [3] }) , testCase "top row that's too short" $ let ls = T.unlines [ "+---+-+" , "|one|two|" , "+---+---+" , "|one|two|" , "+---+---+" ] in parse' gridTable ls @?= Right (ArrayTable { arrayTableCells = listArray ((1,1), (2,2)) [ ContentCell 1 1 ["one"] , ContentCell 1 1 ["two"] , ContentCell 1 1 ["one"] , ContentCell 1 1 ["two"]] , arrayTableHead = Nothing , arrayTableFoot = Nothing , arrayTableColSpecs = defaultAlign [3, 3] }) , testCase "all vertical seps in last column too short" $ let ls = T.unlines [ "+----+:-:+" , "|eins|long text|" , "+----+---+" , "|zwei|long text|" , "+----+---+" , "|drei|more text|" , "+----+---+" ] in parse' gridTable ls @?= Right (ArrayTable { arrayTableCells = listArray ((1,1), (3,2)) [ ContentCell 1 1 ["eins"] , ContentCell 1 1 ["long text"] , ContentCell 1 1 ["zwei"] , ContentCell 1 1 ["long text"] , ContentCell 1 1 ["drei"] , ContentCell 1 1 ["more text"] ] , arrayTableHead = Nothing , arrayTableFoot = Nothing , arrayTableColSpecs = defaultAlign [4, 9] }) , testCase "missing cell" $ let ls = T.unlines [ "+---+" , "|one|" , "+---+---+" , "|one|two|" , "+---+---+" ] in parse' gridTable ls @?= Right (ArrayTable { arrayTableCells = listArray ((1,1), (2,2)) [ ContentCell 1 1 ["one"] , ContentCell 1 1 [] , ContentCell 1 1 ["one"] , ContentCell 1 1 ["two"]] , arrayTableHead = Nothing , arrayTableFoot = Nothing , arrayTableColSpecs = defaultAlign [3, 3] }) , testCase "followed by non-empty line" $ let ls = T.unlines [ "+-----+" , "| one |" , "+-----+" , "text" ] in parse' (gridTable *> many1 letter) ls @?= Right "text" , testCase "followed by non-empty line after blank line" $ let gt = T.unlines [ "+-----+" , "| one |" , "+-----+" , "" , "Hi Mom!" ] in parse' (gridTable *> newline *> many1 (letter <|> space)) gt @?= Right "Hi Mom" , testGroup "access functions" [ testCase "rows" $ let gt = ArrayTable { arrayTableCells = listArray ((1, 1), (2, 2)) [ ContentCell 2 1 "1" , ContentCell 1 1 "2" , ContinuationCell (1, 1) , ContentCell 1 1 "3" ] , arrayTableHead = Nothing , arrayTableFoot = Nothing , arrayTableColSpecs = defaultAlign [5, 7] } :: ArrayTable Text in rows gt @?= [ [Cell "1" 2 1, Cell "2" 1 1] , [Cell "3" 1 1] ] ] ] defaultAlign :: [Int] -> Array ColIndex (Alignment, Int) defaultAlign widths = listArray (1, ColIndex (length widths)) $ map (\w -> (AlignDefault, w)) widths