text-manipulate-0.2.0.1/0000755000000000000000000000000012514150117013154 5ustar0000000000000000text-manipulate-0.2.0.1/text-manipulate.cabal0000644000000000000000000000461012514150117017262 0ustar0000000000000000name: text-manipulate version: 0.2.0.1 synopsis: Case conversion, word boundary manipulation, and textual subjugation. homepage: https://github.com/brendanhay/text-manipulate license: OtherLicense license-file: LICENSE author: Brendan Hay maintainer: Brendan Hay copyright: Copyright (c) 2014-2015 Brendan Hay category: Data, Text build-type: Simple extra-source-files: README.md cabal-version: >= 1.10 description: Manipulate identifiers and structurally non-complex pieces of text by delimiting word boundaries via a combination of whitespace, control-characters, and case-sensitivity. . Has support for common idioms like casing of programmatic variable names, taking, dropping, and splitting by word, and modifying the first character of a piece of text. . /Caution:/ this library makes heavy use of the text library's internal loop optimisation framework. Since internal modules are not guaranteed to have a stable API there is potential for build breakage when the text dependency is upgraded. Consider yourself warned! 
source-repository head type: git location: git://github.com/brendanhay/text-manipulate.git library default-language: Haskell2010 hs-source-dirs: src ghc-options: -Wall exposed-modules: Data.Text.Manipulate , Data.Text.Lazy.Manipulate other-modules: Data.Text.Manipulate.Internal.Fusion , Data.Text.Manipulate.Internal.Types build-depends: base >= 4.5 && < 5 , text >= 1.1 benchmark benchmarks type: exitcode-stdio-1.0 default-language: Haskell2010 main-is: Main.hs hs-source-dirs: bench ghc-options: -Wall -O2 -threaded -with-rtsopts=-T build-depends: base , criterion >= 1.0.0.2 , text-manipulate , text test-suite tests type: exitcode-stdio-1.0 default-language: Haskell2010 hs-source-dirs: test main-is: Main.hs ghc-options: -Wall -threaded build-depends: base , text-manipulate , tasty >= 0.8 , tasty-hunit >= 0.8 , text text-manipulate-0.2.0.1/LICENSE0000644000000000000000000004052512514150117014167 0ustar0000000000000000Mozilla Public License Version 2.0 ================================== 1. Definitions -------------- 1.1. "Contributor" means each individual or legal entity that creates, contributes to the creation of, or owns Covered Software. 1.2. "Contributor Version" means the combination of the Contributions of others (if any) used by a Contributor and that particular Contributor's Contribution. 1.3. "Contribution" means Covered Software of a particular Contributor. 1.4. "Covered Software" means Source Code Form to which the initial Contributor has attached the notice in Exhibit A, the Executable Form of such Source Code Form, and Modifications of such Source Code Form, in each case including portions thereof. 1.5. "Incompatible With Secondary Licenses" means (a) that the initial Contributor has attached the notice described in Exhibit B to the Covered Software; or (b) that the Covered Software was made available under the terms of version 1.1 or earlier of the License, but not also under the terms of a Secondary License. 1.6. 
"Executable Form" means any form of the work other than Source Code Form. 1.7. "Larger Work" means a work that combines Covered Software with other material, in a separate file or files, that is not Covered Software. 1.8. "License" means this document. 1.9. "Licensable" means having the right to grant, to the maximum extent possible, whether at the time of the initial grant or subsequently, any and all of the rights conveyed by this License. 1.10. "Modifications" means any of the following: (a) any file in Source Code Form that results from an addition to, deletion from, or modification of the contents of Covered Software; or (b) any new file in Source Code Form that contains any Covered Software. 1.11. "Patent Claims" of a Contributor means any patent claim(s), including without limitation, method, process, and apparatus claims, in any patent Licensable by such Contributor that would be infringed, but for the grant of the License, by the making, using, selling, offering for sale, having made, import, or transfer of either its Contributions or its Contributor Version. 1.12. "Secondary License" means either the GNU General Public License, Version 2.0, the GNU Lesser General Public License, Version 2.1, the GNU Affero General Public License, Version 3.0, or any later versions of those licenses. 1.13. "Source Code Form" means the form of the work preferred for making modifications. 1.14. "You" (or "Your") means an individual or a legal entity exercising rights under this License. For legal entities, "You" includes any entity that controls, is controlled by, or is under common control with You. For purposes of this definition, "control" means (a) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (b) ownership of more than fifty percent (50%) of the outstanding shares or beneficial ownership of such entity. 2. License Grants and Conditions -------------------------------- 2.1. 
Grants Each Contributor hereby grants You a world-wide, royalty-free, non-exclusive license: (a) under intellectual property rights (other than patent or trademark) Licensable by such Contributor to use, reproduce, make available, modify, display, perform, distribute, and otherwise exploit its Contributions, either on an unmodified basis, with Modifications, or as part of a Larger Work; and (b) under Patent Claims of such Contributor to make, use, sell, offer for sale, have made, import, and otherwise transfer either its Contributions or its Contributor Version. 2.2. Effective Date The licenses granted in Section 2.1 with respect to any Contribution become effective for each Contribution on the date the Contributor first distributes such Contribution. 2.3. Limitations on Grant Scope The licenses granted in this Section 2 are the only rights granted under this License. No additional rights or licenses will be implied from the distribution or licensing of Covered Software under this License. Notwithstanding Section 2.1(b) above, no patent license is granted by a Contributor: (a) for any code that a Contributor has removed from Covered Software; or (b) for infringements caused by: (i) Your and any other third party's modifications of Covered Software, or (ii) the combination of its Contributions with other software (except as part of its Contributor Version); or (c) under Patent Claims infringed by Covered Software in the absence of its Contributions. This License does not grant any rights in the trademarks, service marks, or logos of any Contributor (except as may be necessary to comply with the notice requirements in Section 3.4). 2.4. Subsequent Licenses No Contributor makes additional grants as a result of Your choice to distribute the Covered Software under a subsequent version of this License (see Section 10.2) or under the terms of a Secondary License (if permitted under the terms of Section 3.3). 2.5. 
Representation Each Contributor represents that the Contributor believes its Contributions are its original creation(s) or it has sufficient rights to grant the rights to its Contributions conveyed by this License. 2.6. Fair Use This License is not intended to limit any rights You have under applicable copyright doctrines of fair use, fair dealing, or other equivalents. 2.7. Conditions Sections 3.1, 3.2, 3.3, and 3.4 are conditions of the licenses granted in Section 2.1. 3. Responsibilities ------------------- 3.1. Distribution of Source Form All distribution of Covered Software in Source Code Form, including any Modifications that You create or to which You contribute, must be under the terms of this License. You must inform recipients that the Source Code Form of the Covered Software is governed by the terms of this License, and how they can obtain a copy of this License. You may not attempt to alter or restrict the recipients' rights in the Source Code Form. 3.2. Distribution of Executable Form If You distribute Covered Software in Executable Form then: (a) such Covered Software must also be made available in Source Code Form, as described in Section 3.1, and You must inform recipients of the Executable Form how they can obtain a copy of such Source Code Form by reasonable means in a timely manner, at a charge no more than the cost of distribution to the recipient; and (b) You may distribute such Executable Form under the terms of this License, or sublicense it under different terms, provided that the license for the Executable Form does not attempt to limit or alter the recipients' rights in the Source Code Form under this License. 3.3. Distribution of a Larger Work You may create and distribute a Larger Work under terms of Your choice, provided that You also comply with the requirements of this License for the Covered Software. 
If the Larger Work is a combination of Covered Software with a work governed by one or more Secondary Licenses, and the Covered Software is not Incompatible With Secondary Licenses, this License permits You to additionally distribute such Covered Software under the terms of such Secondary License(s), so that the recipient of the Larger Work may, at their option, further distribute the Covered Software under the terms of either this License or such Secondary License(s). 3.4. Notices You may not remove or alter the substance of any license notices (including copyright notices, patent notices, disclaimers of warranty, or limitations of liability) contained within the Source Code Form of the Covered Software, except that You may alter any license notices to the extent required to remedy known factual inaccuracies. 3.5. Application of Additional Terms You may choose to offer, and to charge a fee for, warranty, support, indemnity or liability obligations to one or more recipients of Covered Software. However, You may do so only on Your own behalf, and not on behalf of any Contributor. You must make it absolutely clear that any such warranty, support, indemnity, or liability obligation is offered by You alone, and You hereby agree to indemnify every Contributor for any liability incurred by such Contributor as a result of warranty, support, indemnity or liability terms You offer. You may include additional disclaimers of warranty and limitations of liability specific to any jurisdiction. 4. Inability to Comply Due to Statute or Regulation --------------------------------------------------- If it is impossible for You to comply with any of the terms of this License with respect to some or all of the Covered Software due to statute, judicial order, or regulation then You must: (a) comply with the terms of this License to the maximum extent possible; and (b) describe the limitations and the code they affect. 
Such description must be placed in a text file included with all distributions of the Covered Software under this License. Except to the extent prohibited by statute or regulation, such description must be sufficiently detailed for a recipient of ordinary skill to be able to understand it. 5. Termination -------------- 5.1. The rights granted under this License will terminate automatically if You fail to comply with any of its terms. However, if You become compliant, then the rights granted under this License from a particular Contributor are reinstated (a) provisionally, unless and until such Contributor explicitly and finally terminates Your grants, and (b) on an ongoing basis, if such Contributor fails to notify You of the non-compliance by some reasonable means prior to 60 days after You have come back into compliance. Moreover, Your grants from a particular Contributor are reinstated on an ongoing basis if such Contributor notifies You of the non-compliance by some reasonable means, this is the first time You have received notice of non-compliance with this License from such Contributor, and You become compliant prior to 30 days after Your receipt of the notice. 5.2. If You initiate litigation against any entity by asserting a patent infringement claim (excluding declaratory judgment actions, counter-claims, and cross-claims) alleging that a Contributor Version directly or indirectly infringes any patent, then the rights granted to You by any and all Contributors for the Covered Software under Section 2.1 of this License shall terminate. 5.3. In the event of termination under Sections 5.1 or 5.2 above, all end user license agreements (excluding distributors and resellers) which have been validly granted by You or Your distributors under this License prior to termination shall survive termination. ************************************************************************ * * * 6. 
Disclaimer of Warranty * * ------------------------- * * * * Covered Software is provided under this License on an "as is" * * basis, without warranty of any kind, either expressed, implied, or * * statutory, including, without limitation, warranties that the * * Covered Software is free of defects, merchantable, fit for a * * particular purpose or non-infringing. The entire risk as to the * * quality and performance of the Covered Software is with You. * * Should any Covered Software prove defective in any respect, You * * (not any Contributor) assume the cost of any necessary servicing, * * repair, or correction. This disclaimer of warranty constitutes an * * essential part of this License. No use of any Covered Software is * * authorized under this License except under this disclaimer. * * * ************************************************************************ ************************************************************************ * * * 7. Limitation of Liability * * -------------------------- * * * * Under no circumstances and under no legal theory, whether tort * * (including negligence), contract, or otherwise, shall any * * Contributor, or anyone who distributes Covered Software as * * permitted above, be liable to You for any direct, indirect, * * special, incidental, or consequential damages of any character * * including, without limitation, damages for lost profits, loss of * * goodwill, work stoppage, computer failure or malfunction, or any * * and all other commercial damages or losses, even if such party * * shall have been informed of the possibility of such damages. This * * limitation of liability shall not apply to liability for death or * * personal injury resulting from such party's negligence to the * * extent applicable law prohibits such limitation. Some * * jurisdictions do not allow the exclusion or limitation of * * incidental or consequential damages, so this exclusion and * * limitation may not apply to You. 
* * * ************************************************************************ 8. Litigation ------------- Any litigation relating to this License may be brought only in the courts of a jurisdiction where the defendant maintains its principal place of business and such litigation shall be governed by laws of that jurisdiction, without reference to its conflict-of-law provisions. Nothing in this Section shall prevent a party's ability to bring cross-claims or counter-claims. 9. Miscellaneous ---------------- This License represents the complete agreement concerning the subject matter hereof. If any provision of this License is held to be unenforceable, such provision shall be reformed only to the extent necessary to make it enforceable. Any law or regulation which provides that the language of a contract shall be construed against the drafter shall not be used to construe this License against a Contributor. 10. Versions of the License --------------------------- 10.1. New Versions Mozilla Foundation is the license steward. Except as provided in Section 10.3, no one other than the license steward has the right to modify or publish new versions of this License. Each version will be given a distinguishing version number. 10.2. Effect of New Versions You may distribute the Covered Software under the terms of the version of the License under which You originally received the Covered Software, or under the terms of any subsequent version published by the license steward. 10.3. Modified Versions If you create software not governed by this License, and you want to create a new license for such software, you may create and use a modified version of this License if you rename the license and remove any references to the name of the license steward (except to note that such modified license differs from this License). 10.4. 
Distributing Source Code Form that is Incompatible With Secondary Licenses If You choose to distribute Source Code Form that is Incompatible With Secondary Licenses under the terms of this version of the License, the notice described in Exhibit B of this License must be attached. Exhibit A - Source Code Form License Notice ------------------------------------------- This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0. If a copy of the MPL was not distributed with this file, You can obtain one at http://mozilla.org/MPL/2.0/. If it is not possible or desirable to put the notice in a particular file, then You may include the notice in a location (such as a LICENSE file in a relevant directory) where a recipient would be likely to look for such a notice. You may add additional accurate notices of copyright ownership. Exhibit B - "Incompatible With Secondary Licenses" Notice --------------------------------------------------------- This Source Code Form is "Incompatible With Secondary Licenses", as defined by the Mozilla Public License, v. 2.0. text-manipulate-0.2.0.1/README.md0000644000000000000000000000055612514150117014441 0ustar0000000000000000# Text Manipulate Manipulate identifiers and structurally non-complex pieces of text by delimiting word boundaries via a combination of whitespace, control-characters, and case-sensitivity. Has support for common idioms like casing of programmatic variable names, taking, dropping, and splitting text by word, and modifying the first character of a piece of text. 
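The idioms above can be sketched as a small program. This is a minimal usage sketch, assuming the package is installed and `OverloadedStrings` is enabled; the name `sample` is illustrative only, and the results shown in the comments are the ones the package's own test suite (`test/Main.hs`) expects for the input `camelCasedPhrase`.

```haskell
{-# LANGUAGE OverloadedStrings #-}

module Main (main) where

import           Data.Text            (Text)
import qualified Data.Text.IO         as Text
import           Data.Text.Manipulate

-- An illustrative identifier; 'sample' is not part of the library.
sample :: Text
sample = "camelCasedPhrase"

main :: IO ()
main = do
    -- Casing conversions
    Text.putStrLn (toSnake  sample)        -- camel_cased_phrase
    Text.putStrLn (toSpinal sample)        -- camel-cased-phrase
    Text.putStrLn (toTrain  sample)        -- Camel-Cased-Phrase
    Text.putStrLn (toTitle  sample)        -- Camel Cased Phrase
    Text.putStrLn (toPascal sample)        -- CamelCasedPhrase
    -- Taking and dropping by word
    Text.putStrLn (takeWord sample)        -- camel
    Text.putStrLn (dropWord sample)        -- CasedPhrase
    -- Ordinals
    Text.putStrLn (toOrdinal (21 :: Int))  -- 21st
```

The lazy variants live in `Data.Text.Lazy.Manipulate` and are used the same way with lazy `Text` values.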
text-manipulate-0.2.0.1/Setup.hs0000644000000000000000000000005612514150117014611 0ustar0000000000000000import Distribution.Simple main = defaultMain text-manipulate-0.2.0.1/test/0000755000000000000000000000000012514150117014133 5ustar0000000000000000text-manipulate-0.2.0.1/test/Main.hs0000644000000000000000000001527212514150117015362 0ustar0000000000000000{-# LANGUAGE OverloadedStrings #-} {-# OPTIONS_GHC -fno-warn-type-defaults #-} -- Module : Main -- Copyright : (c) 2014-2015 Brendan Hay -- License : This Source Code Form is subject to the terms of -- the Mozilla Public License, v. 2.0. -- A copy of the MPL can be found in the LICENSE file or -- you can obtain it at http://mozilla.org/MPL/2.0/. -- Maintainer : Brendan Hay -- Stability : experimental -- Portability : non-portable (GHC extensions) module Main (main) where import Data.Text (Text) import Data.Text.Manipulate import Test.Tasty import Test.Tasty.HUnit main :: IO () main = defaultMain $ testGroup "tests" [ exampleGroup "lowerHead" lowerHead [ "" , "title cased phrase" , "camelCasedPhrase" , "pascalCasedPhrase" , "snake_cased_phrase" , "spinal-cased-phrase" , "train-Cased-Phrase" , "1-Mixed_string AOK" , "a double--stop__phrase" , "hTML5" , "είναιΥπάρχουν πολλές-Αντίθετα" , "je_obecněÚvodní-Španěl" ] , exampleGroup "upperHead" upperHead [ "" , "Title cased phrase" , "CamelCasedPhrase" , "PascalCasedPhrase" , "Snake_cased_phrase" , "Spinal-cased-phrase" , "Train-Cased-Phrase" , "1-Mixed_string AOK" , "A double--stop__phrase" , "HTML5" , "ΕίναιΥπάρχουν πολλές-Αντίθετα" , "Je_obecněÚvodní-Španěl" ] , exampleGroup "takeWord" takeWord [ "" , "Title" , "camel" , "Pascal" , "snake" , "spinal" , "Train" , "1" , "a" , "HTML5" , "Είναι" , "Je" ] , exampleGroup "dropWord" dropWord [ "" , "cased phrase" , "CasedPhrase" , "CasedPhrase" , "cased_phrase" , "cased-phrase" , "Cased-Phrase" , "Mixed_string AOK" , "double-stop_phrase" , "" , "Υπάρχουν πολλές-Αντίθετα" , "obecněÚvodní-Španěl" ] , exampleGroup "stripWord" 
stripWord [ Nothing , Just "cased phrase" , Just "CasedPhrase" , Just "CasedPhrase" , Just "cased_phrase" , Just "cased-phrase" , Just "Cased-Phrase" , Just "Mixed_string AOK" , Just "double-stop_phrase" , Just "" , Just "Υπάρχουν πολλές-Αντίθετα" , Just "obecněÚvodní-Španěl" ] , testGroup "splitWords" [ ] , testGroup "indentLines" [ ] , testGroup "toAcronym" [ ] , testGroup "toOrdinal" [ testCase "1st" ("1st" @=? toOrdinal 1) , testCase "2nd" ("2nd" @=? toOrdinal 2) , testCase "3rd" ("3rd" @=? toOrdinal 3) , testCase "4th" ("4th" @=? toOrdinal 4) , testCase "5th" ("5th" @=? toOrdinal 5) , testCase "21st" ("21st" @=? toOrdinal 21) , testCase "33rd" ("33rd" @=? toOrdinal 33) , testCase "102nd" ("102nd" @=? toOrdinal 102) , testCase "203rd" ("203rd" @=? toOrdinal 203) ] , exampleGroup "toCamel" toCamel [ "" , "titleCasedPhrase" , "camelCasedPhrase" , "pascalCasedPhrase" , "snakeCasedPhrase" , "spinalCasedPhrase" , "trainCasedPhrase" , "1MixedStringAOK" , "aDoubleStopPhrase" , "hTML5" , "είναιΥπάρχουνΠολλέςΑντίθετα" , "jeObecněÚvodníŠpaněl" ] , exampleGroup "toPascal" toPascal [ "" , "TitleCasedPhrase" , "CamelCasedPhrase" , "PascalCasedPhrase" , "SnakeCasedPhrase" , "SpinalCasedPhrase" , "TrainCasedPhrase" , "1MixedStringAOK" , "ADoubleStopPhrase" , "HTML5" , "ΕίναιΥπάρχουνΠολλέςΑντίθετα" , "JeObecněÚvodníŠpaněl" ] , exampleGroup "toSnake" toSnake [ "" , "title_cased_phrase" , "camel_cased_phrase" , "pascal_cased_phrase" , "snake_cased_phrase" , "spinal_cased_phrase" , "train_cased_phrase" , "1_mixed_string_aok" , "a_double_stop_phrase" , "html5" , "είναι_υπάρχουν_πολλές_αντίθετα" , "je_obecně_úvodní_španěl" ] , exampleGroup "toSpinal" toSpinal [ "" , "title-cased-phrase" , "camel-cased-phrase" , "pascal-cased-phrase" , "snake-cased-phrase" , "spinal-cased-phrase" , "train-cased-phrase" , "1-mixed-string-aok" , "a-double-stop-phrase" , "html5" , "είναι-υπάρχουν-πολλές-αντίθετα" , "je-obecně-úvodní-španěl" ] , exampleGroup "toTrain" toTrain [ "" , "Title-Cased-Phrase" 
, "Camel-Cased-Phrase" , "Pascal-Cased-Phrase" , "Snake-Cased-Phrase" , "Spinal-Cased-Phrase" , "Train-Cased-Phrase" , "1-Mixed-String-AOK" , "A-Double-Stop-Phrase" , "HTML5" , "Είναι-Υπάρχουν-Πολλές-Αντίθετα" , "Je-Obecně-Úvodní-Španěl" ] , exampleGroup "toTitle" toTitle [ "" , "Title Cased Phrase" , "Camel Cased Phrase" , "Pascal Cased Phrase" , "Snake Cased Phrase" , "Spinal Cased Phrase" , "Train Cased Phrase" , "1 Mixed String AOK" , "A Double Stop Phrase" , "HTML5" , "Είναι Υπάρχουν Πολλές Αντίθετα" , "Je Obecně Úvodní Španěl" ] ] exampleGroup :: (Show a, Eq a) => TestName -> (Text -> a) -> [a] -> TestTree exampleGroup n f = testGroup n . zipWith run reference where run (c, x) y = testCase c (f x @=? y) reference = [ ("Empty", "") , ("Title", "Title cased phrase") , ("Camel", "camelCasedPhrase") , ("Pascal", "PascalCasedPhrase") , ("Snake", "snake_cased_phrase") , ("Spinal", "spinal-cased-phrase") , ("Train", "Train-Cased-Phrase") , ("Mixed", "1-Mixed_string AOK") , ("Double Stop", "a double--stop__phrase") , ("Acronym", "HTML5") , ("Greek", "ΕίναιΥπάρχουν πολλές-Αντίθετα") , ("Czech", "Je_obecněÚvodní-Španěl") ] text-manipulate-0.2.0.1/bench/0000755000000000000000000000000012514150117014233 5ustar0000000000000000text-manipulate-0.2.0.1/bench/Main.hs0000644000000000000000000000375512514150117015465 0ustar0000000000000000{-# LANGUAGE OverloadedStrings #-} -- Module : Main -- Copyright : (c) 2014-2015 Brendan Hay -- License : This Source Code Form is subject to the terms of -- the Mozilla Public License, v. 2.0. -- A copy of the MPL can be found in the LICENSE file or -- you can obtain it at http://mozilla.org/MPL/2.0/. 
-- Maintainer : Brendan Hay -- Stability : experimental -- Portability : non-portable (GHC extensions) module Main (main) where import Criterion import Criterion.Main import Data.List (intersperse) import Data.Monoid import Data.Text (Text) import qualified Data.Text as Text import Data.Text.Manipulate main :: IO () main = defaultMain [ bgroup "Data.Text" [ bench "takeWord" $ whnf (Text.takeWhile (not . isWordBoundary)) phrase , bench "toCamel" $ whnf (lowerHead . mconcat . group Text.toTitle) phrase , bench "toPascal" $ whnf (upperHead . mconcat . group Text.toTitle) phrase , bench "toSnake" $ whnf (mconcat . intersperse "_" . group Text.toLower) phrase , bench "toSpinal" $ whnf (mconcat . intersperse "-" . group Text.toLower) phrase , bench "toTrain" $ whnf (mconcat . intersperse "-" . group Text.toTitle) phrase ] , bgroup "Data.Text.Case" [ bench "takeWord" $ whnf takeWord phrase , bench "toCamel" $ whnf toCamel phrase , bench "toPascal" $ whnf toPascal phrase , bench "toSnake" $ whnf toSnake phrase , bench "toSpinal" $ whnf toSpinal phrase , bench "toTrain" $ whnf toTrain phrase ] ] phrase :: Text phrase = "Supercalifragilistic, world! This-is A multipleDelimiter_String" group :: (Text -> Text) -> Text -> [Text] group f = map (f . Text.dropWhile isBoundary) . Text.groupBy (const (not . isWordBoundary)) text-manipulate-0.2.0.1/src/0000755000000000000000000000000012514150117013743 5ustar0000000000000000text-manipulate-0.2.0.1/src/Data/0000755000000000000000000000000012514150117014614 5ustar0000000000000000text-manipulate-0.2.0.1/src/Data/Text/0000755000000000000000000000000012514150117015540 5ustar0000000000000000text-manipulate-0.2.0.1/src/Data/Text/Manipulate.hs0000644000000000000000000001655012514150117020202 0ustar0000000000000000{-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE ViewPatterns #-} -- Module : Data.Text.Manipulate -- Copyright : (c) 2014-2015 Brendan Hay -- License : This Source Code Form is subject to the terms of -- the Mozilla Public License, v. 
2.0. -- A copy of the MPL can be found in the LICENSE file or -- you can obtain it at http://mozilla.org/MPL/2.0/. -- Maintainer : Brendan Hay -- Stability : experimental -- Portability : non-portable (GHC extensions) -- | Manipulate identifiers and structurally non-complex pieces -- of text by delimiting word boundaries via a combination of whitespace, -- control-characters, and case-sensitivity. -- -- Assumptions have been made about word boundary characteristics inherent -- in predominantly English text, please see individual function documentation -- for further details and behaviour. module Data.Text.Manipulate ( -- * Strict vs lazy types -- $strict -- * Unicode -- $unicode -- * Fusion -- $fusion -- * Subwords -- ** Removing words takeWord , dropWord , stripWord -- ** Breaking on words , breakWord , splitWords -- * Character manipulation , lowerHead , upperHead , mapHead -- * Line manipulation , indentLines , prependLines -- * Ellipsis , toEllipsis , toEllipsisWith -- * Acronyms , toAcronym -- * Ordinals , toOrdinal -- * Casing , toTitle , toCamel , toPascal , toSnake , toSpinal , toTrain -- * Boundary predicates , isBoundary , isWordBoundary ) where import qualified Data.Char as Char import Data.List (intersperse) import Data.Monoid import Data.Text (Text) import qualified Data.Text as Text import qualified Data.Text.Lazy as LText import qualified Data.Text.Lazy.Manipulate as LMan import Data.Text.Manipulate.Internal.Fusion (strict) import qualified Data.Text.Manipulate.Internal.Fusion as Fusion import Data.Text.Manipulate.Internal.Types -- $strict -- This library provides functions for manipulating both strict and lazy Text types. -- The strict functions are provided by the "Data.Text.Manipulate" module, while the lazy -- functions are provided by the "Data.Text.Lazy.Manipulate" module. 
-- $unicode -- While this library supports Unicode in a similar fashion to the -- underlying library, -- more explicit Unicode handling of word boundaries can be found in the -- library. -- $fusion -- Many functions in this module are subject to fusion, meaning that -- a pipeline of such functions will usually allocate at most one Text value. -- -- Functions that can be fused by the compiler are documented with the -- phrase /Subject to fusion/. -- DEBUG: -- import Data.Text.Internal.Fusion (stream) -- import Data.Text.Internal.Fusion.Common (unstreamList) -- tokens = unstreamList . Fusion.tokenise . stream -- FIXME: -- dropWord "ALong" == "" -- | Lowercase the first character of a piece of text. -- -- >>> lowerHead "Title Cased" -- "title Cased" lowerHead :: Text -> Text lowerHead = mapHead Char.toLower -- | Uppercase the first character of a piece of text. -- -- >>> upperHead "snake_cased" -- "Snake_cased" upperHead :: Text -> Text upperHead = mapHead Char.toUpper -- | Apply a function to the first character of a piece of text. mapHead :: (Char -> Char) -> Text -> Text mapHead f x = case Text.uncons x of Just (c, cs) -> Text.singleton (f c) <> cs Nothing -> x -- | Indent newlines by the given number of spaces. -- -- /See:/ 'prependLines' indentLines :: Int -> Text -> Text indentLines n = prependLines (Text.replicate n " ") -- | Prepend newlines with the given separator prependLines :: Text -> Text -> Text prependLines sep = mappend sep . Text.unlines . intersperse sep . Text.lines -- | O(n) Truncate text to a specific length. -- If the text was truncated the ellipsis sign "..." will be appended. -- -- /See:/ 'toEllipsisWith' toEllipsis :: Int -> Text -> Text toEllipsis n = toEllipsisWith n "..." -- | O(n) Truncate text to a specific length. -- If the text was truncated the given ellipsis sign will be appended. toEllipsisWith :: Int -- ^ Length. -> Text -- ^ Ellipsis. 
-> Text -> Text toEllipsisWith n suf x | Text.length x > n = Text.take n x <> suf | otherwise = x -- | O(n) Returns the first word, or the original text if no word -- boundary is encountered. /Subject to fusion./ takeWord :: Text -> Text takeWord = strict Fusion.takeWord -- | O(n) Return the suffix after dropping the first word. If no word -- boundary is encountered, the result will be empty. /Subject to fusion./ dropWord :: Text -> Text dropWord = strict Fusion.dropWord -- | Break a piece of text after the first word boundary is encountered. -- -- >>> breakWord "PascalCasedVariable" -- ("Pascal", "CasedVariable") -- -- >>> breakWord "spinal-cased-variable" -- ("spinal", "cased-variable") breakWord :: Text -> (Text, Text) breakWord x = (takeWord x, dropWord x) -- | O(n) Return the suffix after removing the first word, or 'Nothing' -- if no word boundary is encountered. -- -- >>> stripWord "HTML5Spaghetti" -- Just "Spaghetti" -- -- >>> stripWord "noboundaries" -- Nothing stripWord :: Text -> Maybe Text stripWord x | Text.length y < Text.length x = Just y | otherwise = Nothing where y = dropWord x -- | O(n) Split into a list of words delimited by boundaries. -- -- >>> splitWords "SupercaliFrag_ilistic" -- ["Supercali","Frag","ilistic"] splitWords :: Text -> [Text] splitWords = go where go x = case breakWord x of (h, t) | Text.null h -> go t | Text.null t -> [h] | otherwise -> h : go t -- | O(n) Create an adhoc acronym from a piece of cased text. -- -- >>> toAcronym "AmazonWebServices" -- Just "AWS" -- -- >>> toAcronym "Learn-You Some_Haskell" -- Just "LYSH" -- -- >>> toAcronym "this_is_all_lowercase" -- Nothing toAcronym :: Text -> Maybe Text toAcronym (Text.filter Char.isUpper -> x) | Text.length x > 1 = Just x | otherwise = Nothing -- | Render an ordinal used to denote the position in an ordered sequence. -- -- >>> toOrdinal (101 :: Int) -- "101st" -- -- >>> toOrdinal (12 :: Int) -- "12th" toOrdinal :: Integral a => a -> Text toOrdinal = LText.toStrict . 
LMan.toOrdinal -- | O(n) Convert casing to @Title Cased Phrase@. /Subject to fusion./ toTitle :: Text -> Text toTitle = strict Fusion.toTitle -- | O(n) Convert casing to @camelCasedPhrase@. /Subject to fusion./ toCamel :: Text -> Text toCamel = strict Fusion.toCamel -- | O(n) Convert casing to @PascalCasedPhrase@. /Subject to fusion./ toPascal :: Text -> Text toPascal = strict Fusion.toPascal -- | O(n) Convert casing to @snake_cased_phrase@. /Subject to fusion./ toSnake :: Text -> Text toSnake = strict Fusion.toSnake -- | O(n) Convert casing to @spinal-cased-phrase@. /Subject to fusion./ toSpinal :: Text -> Text toSpinal = strict Fusion.toSpinal -- | O(n) Convert casing to @Train-Cased-Phrase@. /Subject to fusion./ toTrain :: Text -> Text toTrain = strict Fusion.toTrain text-manipulate-0.2.0.1/src/Data/Text/Manipulate/0000755000000000000000000000000012514150117017637 5ustar0000000000000000text-manipulate-0.2.0.1/src/Data/Text/Manipulate/Internal/0000755000000000000000000000000012514150117021413 5ustar0000000000000000text-manipulate-0.2.0.1/src/Data/Text/Manipulate/Internal/Fusion.hs0000644000000000000000000001604712514150117023222 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE ViewPatterns #-} {-# LANGUAGE RankNTypes #-} -- Module : Data.Text.Manipulate.Internal.Fusion -- Copyright : (c) 2014-2015 Brendan Hay -- License : This Source Code Form is subject to the terms of -- the Mozilla Public License, v. 2.0. -- A copy of the MPL can be found in the LICENSE file or -- you can obtain it at http://mozilla.org/MPL/2.0/. 
-- Maintainer  : Brendan Hay
-- Stability   : experimental
-- Portability : non-portable (GHC extensions)

module Data.Text.Manipulate.Internal.Fusion where

import qualified Data.Char                             as Char
import           Data.Text                             (Text)
import qualified Data.Text.Internal.Fusion             as Fusion
import           Data.Text.Internal.Fusion.CaseMapping (upperMapping, lowerMapping)
import           Data.Text.Internal.Fusion.Common
import           Data.Text.Internal.Fusion.Types
import qualified Data.Text.Internal.Lazy.Fusion        as LFusion
import qualified Data.Text.Lazy                        as LText
import           Data.Text.Manipulate.Internal.Types

takeWord :: Stream Char -> Stream Char
takeWord = transform (const Done) yield . tokenise
{-# INLINE [0] takeWord #-}

dropWord :: Stream Char -> Stream Char
dropWord (tokenise -> Stream next0 s0 len) = Stream next (True :*: s0) len
  where
    next (skip :*: s) =
        case next0 s of
            Done       -> Done
            Skip    s' -> Skip (skip :*: s')
            Yield t s' ->
                case t of
                    B '\0'     -> Skip    (False :*: s')
                    B _ | skip -> Skip    (False :*: s')
                    B c        -> Yield c (False :*: s')
                    _   | skip -> Skip    (skip  :*: s')
                    U c        -> Yield c (skip  :*: s')
                    L c        -> Yield c (skip  :*: s')
{-# INLINE [0] dropWord #-}

toTitle :: Stream Char -> Stream Char
toTitle = mapHead toUpper . transformWith (yield ' ') upper lower . tokenise
{-# INLINE [0] toTitle #-}

toCamel :: Stream Char -> Stream Char
toCamel = mapHead toLower . transformWith skip' upper lower . tokenise
{-# INLINE [0] toCamel #-}

toPascal :: Stream Char -> Stream Char
toPascal = mapHead toUpper . transformWith skip' upper lower . tokenise
{-# INLINE [0] toPascal #-}

toSnake :: Stream Char -> Stream Char
toSnake = transform (yield '_') lower . tokenise
{-# INLINE [0] toSnake #-}

toSpinal :: Stream Char -> Stream Char
toSpinal = transform (yield '-') lower . tokenise
{-# INLINE [0] toSpinal #-}

toTrain :: Stream Char -> Stream Char
toTrain = mapHead toUpper . transformWith (yield '-') upper lower .
    tokenise
{-# INLINE [0] toTrain #-}

strict :: (Stream Char -> Stream Char) -> Text -> Text
strict f t = Fusion.unstream (f (Fusion.stream t))
{-# INLINE [0] strict #-}

lazy :: (Stream Char -> Stream Char) -> LText.Text -> LText.Text
lazy f t = LFusion.unstream (f (LFusion.stream t))
{-# INLINE [0] lazy #-}

skip' :: forall s. s -> Step (CC s) Char
skip' s = Skip (CC s '\0' '\0')

yield, upper, lower :: forall s. Char -> s -> Step (CC s) Char
yield !c s = Yield c (CC s '\0' '\0')
upper !c s = upperMapping c s
lower !c s = lowerMapping c s

-- | Step across word boundaries using a custom action, and transform
-- both subsequent uppercase and lowercase characters uniformly.
--
-- /See:/ 'transformWith'
transform :: (forall s. s -> Step (CC s) Char)
             -- ^ Boundary action.
          -> (forall s. Char -> s -> Step (CC s) Char)
             -- ^ Character mapping.
          -> Stream Token
             -- ^ Input stream.
          -> Stream Char
transform s m = transformWith s m m
{-# INLINE [0] transform #-}

-- | Step across word boundaries using a custom action, and transform
-- subsequent characters after the word boundary is encountered with a mapping
-- depending on case.
transformWith :: (forall s. s -> Step (CC s) Char)
                 -- ^ Boundary action.
              -> (forall s. Char -> s -> Step (CC s) Char)
                 -- ^ Boundary mapping.
              -> (forall s. Char -> s -> Step (CC s) Char)
                 -- ^ Subsequent character mapping.
              -> Stream Token
                 -- ^ Input stream.
              -> Stream Char
transformWith md mu mc (Stream next0 s0 len) =
    -- HINT: len incorrect when the boundary replacement yields a char.
    Stream next (CC (False :*: False :*: s0) '\0' '\0') len
  where
    next (CC (up :*: prev :*: s) '\0' _) =
        case next0 s of
            Done       -> Done
            Skip    s' -> Skip (CC (up :*: prev :*: s') '\0' '\0')
            Yield t s' ->
                case t of
                    B _        -> md   (False :*: True  :*: s')
                    U c | prev -> mu c (True  :*: False :*: s')
                    L c | prev -> mu c (False :*: False :*: s')
                    U c | up   -> mu c (True  :*: False :*: s')
                    U c        -> mc c (True  :*: False :*: s')
                    L c        -> mc c (False :*: False :*: s')
    next (CC s a b) = Yield a (CC s b '\0')
{-# INLINE [0] transformWith #-}

-- | A token representing characters and boundaries in a stream.
data Token
    = B {-# UNPACK #-} !Char -- ^ Word boundary.
    | U {-# UNPACK #-} !Char -- ^ Upper case character.
    | L {-# UNPACK #-} !Char -- ^ Lower case character.
      deriving (Show)

-- | Tokenise a character stream using the default 'isBoundary' predicate.
--
-- /See:/ 'tokeniseWith'
tokenise :: Stream Char -- ^ Input stream.
         -> Stream Token
tokenise = tokeniseWith isBoundary
{-# INLINE [0] tokenise #-}

-- | Tokenise a character stream using a custom boundary predicate.
tokeniseWith :: (Char -> Bool) -- ^ Boundary predicate.
             -> Stream Char    -- ^ Input stream.
             -> Stream Token
tokeniseWith f (Stream next0 s0 len) =
    -- HINT: len incorrect if there are adjacent boundaries, which are skipped.
    Stream next (CC (True :*: False :*: False :*: s0) '\0' '\0') len
  where
    next (CC (start :*: up :*: prev :*: s) '\0' _) =
        case next0 s of
            Done    -> Done
            Skip s' -> Skip (CC (start :*: up :*: prev :*: s') '\0' '\0')
            Yield c s'
                | not b, start -> push
                | up           -> push
                | b, prev      -> Skip (step start)
                | otherwise    -> push
              where
                push | b         = Yield (B c) (step False)
                     | u, skip   = Yield (U c) (step False)
                     | u         = Yield (B '\0') (CC (False :*: u :*: b :*: s') c '\0')
                     | otherwise = Yield (L c) (step False)

                step p = CC (p :*: u :*: b :*: s') '\0' '\0'

                skip = up || start || prev
                b    = f c
                u    = Char.isUpper c
    next (CC s a b) = Yield (U a) (CC s b '\0')
{-# INLINE [0] tokeniseWith #-}

mapHead :: (Stream Char -> Stream Char) -> Stream Char -> Stream Char
mapHead f s = maybe s (\(x, s') -> f (singleton x) `append` s') (uncons s)
{-# INLINE [0] mapHead #-}
text-manipulate-0.2.0.1/src/Data/Text/Manipulate/Internal/Types.hs0000644000000000000000000000366512514150117023065 0ustar0000000000000000
{-# LANGUAGE MagicHash         #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE ViewPatterns      #-}

-- Module      : Data.Text.Manipulate.Internal.Types
-- Copyright   : (c) 2014-2015 Brendan Hay
-- License     : This Source Code Form is subject to the terms of
--               the Mozilla Public License, v. 2.0.
--               A copy of the MPL can be found in the LICENSE file or
--               you can obtain it at http://mozilla.org/MPL/2.0/.
-- Maintainer  : Brendan Hay
-- Stability   : experimental
-- Portability : non-portable (GHC extensions)

module Data.Text.Manipulate.Internal.Types where

import           Control.Monad
import qualified Data.Char              as Char
import           Data.Monoid
import           Data.Text.Lazy.Builder (Builder, singleton)
import           GHC.Base

-- | Returns 'True' for any boundary or uppercase character.
isWordBoundary :: Char -> Bool
isWordBoundary c = Char.isUpper c || isBoundary c

-- | Returns 'True' for any boundary character.
isBoundary :: Char -> Bool
isBoundary = not .
    Char.isAlphaNum

ordinal :: Integral a => a -> Builder
ordinal (toInteger -> n) = decimal n <> suf
  where
    suf | x `elem` [11..13] = "th"
        | y == 1            = "st"
        | y == 2            = "nd"
        | y == 3            = "rd"
        | otherwise         = "th"

    (flip mod 100 -> x, flip mod 10 -> y) = join (,) (abs n)
{-# NOINLINE[0] ordinal #-}

decimal :: Integral a => a -> Builder
{-# SPECIALIZE decimal :: Int -> Builder #-}
decimal i
    | i < 0     = singleton '-' <> go (-i)
    | otherwise = go i
  where
    go n | n < 10    = digit n
         | otherwise = go (n `quot` 10) <> digit (n `rem` 10)
{-# NOINLINE[0] decimal #-}

digit :: Integral a => a -> Builder
digit n = singleton $! i2d (fromIntegral n)
{-# INLINE digit #-}

i2d :: Int -> Char
i2d (I# i#) = C# (chr# (ord# '0'# +# i#))
{-# INLINE i2d #-}
text-manipulate-0.2.0.1/src/Data/Text/Lazy/0000755000000000000000000000000012514150117016457 5ustar0000000000000000
text-manipulate-0.2.0.1/src/Data/Text/Lazy/Manipulate.hs0000644000000000000000000001616512514150117021123 0ustar0000000000000000
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE ViewPatterns      #-}

-- Module      : Data.Text.Lazy.Manipulate
-- Copyright   : (c) 2014-2015 Brendan Hay
-- License     : This Source Code Form is subject to the terms of
--               the Mozilla Public License, v. 2.0.
--               A copy of the MPL can be found in the LICENSE file or
--               you can obtain it at http://mozilla.org/MPL/2.0/.
-- Maintainer  : Brendan Hay
-- Stability   : experimental
-- Portability : non-portable (GHC extensions)

-- | Manipulate identifiers and structurally non-complex pieces
-- of text by delimiting word boundaries via a combination of whitespace,
-- control-characters, and case-sensitivity.
--
-- Assumptions have been made about word boundary characteristics inherent
-- in predominantly English text; please see individual function documentation
-- for further details and behaviour.
module Data.Text.Lazy.Manipulate
    (
    -- * Strict vs lazy types
    -- $strict

    -- * Unicode
    -- $unicode

    -- * Fusion
    -- $fusion

    -- * Subwords
    -- ** Removing words
      takeWord
    , dropWord
    , stripWord

    -- ** Breaking on words
    , breakWord
    , splitWords

    -- * Character manipulation
    , lowerHead
    , upperHead
    , mapHead

    -- * Line manipulation
    , indentLines
    , prependLines

    -- * Ellipsis
    , toEllipsis
    , toEllipsisWith

    -- * Acronyms
    , toAcronym

    -- * Ordinals
    , toOrdinal

    -- * Casing
    , toTitle
    , toCamel
    , toPascal
    , toSnake
    , toSpinal
    , toTrain

    -- * Boundary predicates
    , isBoundary
    , isWordBoundary
    ) where

import qualified Data.Char                            as Char
import           Data.Int
import           Data.List                            (intersperse)
import           Data.Monoid
import           Data.Text.Lazy                       (Text)
import qualified Data.Text.Lazy                       as LText
import           Data.Text.Lazy.Builder               (toLazyText)
import           Data.Text.Manipulate.Internal.Fusion (lazy)
import qualified Data.Text.Manipulate.Internal.Fusion as Fusion
import           Data.Text.Manipulate.Internal.Types

-- $strict
-- This library provides functions for manipulating both strict and lazy Text types.
-- The strict functions are provided by the "Data.Text.Manipulate" module, while the lazy
-- functions are provided by the "Data.Text.Lazy.Manipulate" module.

-- $unicode
-- While this library supports Unicode in a similar fashion to the
-- underlying library,
-- more explicit Unicode specific handling of word boundaries can be found in the
-- library.

-- $fusion
-- Many functions in this module are subject to fusion, meaning that
-- a pipeline of such functions will usually allocate at most one Text value.
--
-- Functions that can be fused by the compiler are documented with the
-- phrase /Subject to fusion/.

-- | Lowercase the first character of a piece of text.
--
-- >>> lowerHead "Title Cased"
-- "title Cased"
lowerHead :: Text -> Text
lowerHead = mapHead Char.toLower

-- | Uppercase the first character of a piece of text.
--
-- >>> upperHead "snake_cased"
-- "Snake_cased"
upperHead :: Text -> Text
upperHead = mapHead Char.toUpper

-- | Apply a function to the first character of a piece of text.
mapHead :: (Char -> Char) -> Text -> Text
mapHead f x =
    case LText.uncons x of
        Just (c, cs) -> LText.singleton (f c) <> cs
        Nothing      -> x

-- | Indent newlines by the given number of spaces.
indentLines :: Int -> Text -> Text
indentLines n = prependLines (LText.replicate (fromIntegral n) " ")

-- | Prepend newlines with the given separator
prependLines :: Text -> Text -> Text
prependLines sep = mappend sep . LText.unlines . intersperse sep . LText.lines

-- | O(n) Truncate text to a specific length.
-- If the text was truncated the ellipsis sign "..." will be appended.
--
-- /See:/ 'toEllipsisWith'
toEllipsis :: Int64 -> Text -> Text
toEllipsis n = toEllipsisWith n "..."

-- | O(n) Truncate text to a specific length.
-- If the text was truncated the given ellipsis sign will be appended.
toEllipsisWith :: Int64 -- ^ Length.
               -> Text  -- ^ Ellipsis.
               -> Text
               -> Text
toEllipsisWith n suf x
    | LText.length x > n = LText.take n x <> suf
    | otherwise          = x

-- | O(n) Returns the first word, or the original text if no word
-- boundary is encountered. /Subject to fusion./
takeWord :: Text -> Text
takeWord = lazy Fusion.takeWord

-- | O(n) Return the suffix after dropping the first word. If no word
-- boundary is encountered, the result will be empty. /Subject to fusion./
dropWord :: Text -> Text
dropWord = lazy Fusion.dropWord

-- | Break a piece of text after the first word boundary is encountered.
--
-- >>> breakWord "PascalCasedVariable"
-- ("Pascal", "CasedVariable")
--
-- >>> breakWord "spinal-cased-variable"
-- ("spinal", "cased-variable")
breakWord :: Text -> (Text, Text)
breakWord x = (takeWord x, dropWord x)

-- | O(n) Return the suffix after removing the first word, or 'Nothing'
-- if no word boundary is encountered.
--
-- >>> stripWord "HTML5Spaghetti"
-- Just "Spaghetti"
--
-- >>> stripWord "noboundaries"
-- Nothing
stripWord :: Text -> Maybe Text
stripWord x
    | LText.length y < LText.length x = Just y
    | otherwise                       = Nothing
  where
    y = dropWord x

-- | O(n) Split into a list of words delimited by boundaries.
--
-- >>> splitWords "SupercaliFrag_ilistic"
-- ["Supercali","Frag","ilistic"]
splitWords :: Text -> [Text]
splitWords = go
  where
    go x = case breakWord x of
        (h, t) | LText.null h -> go t
               | LText.null t -> [h]
               | otherwise    -> h : go t

-- | O(n) Create an adhoc acronym from a piece of cased text.
--
-- >>> toAcronym "AmazonWebServices"
-- Just "AWS"
--
-- >>> toAcronym "Learn-You Some_Haskell"
-- Just "LYSH"
--
-- >>> toAcronym "this_is_all_lowercase"
-- Nothing
toAcronym :: Text -> Maybe Text
toAcronym (LText.filter Char.isUpper -> x)
    | LText.length x > 1 = Just x
    | otherwise          = Nothing

-- | Render an ordinal used to denote the position in an ordered sequence.
--
-- >>> toOrdinal (101 :: Int)
-- "101st"
--
-- >>> toOrdinal (12 :: Int)
-- "12th"
toOrdinal :: Integral a => a -> Text
toOrdinal = toLazyText . ordinal

-- | O(n) Convert casing to @Title Cased Phrase@. /Subject to fusion./
toTitle :: Text -> Text
toTitle = lazy Fusion.toTitle

-- | O(n) Convert casing to @camelCasedPhrase@. /Subject to fusion./
toCamel :: Text -> Text
toCamel = lazy Fusion.toCamel

-- | O(n) Convert casing to @PascalCasePhrase@. /Subject to fusion./
toPascal :: Text -> Text
toPascal = lazy Fusion.toPascal

-- | O(n) Convert casing to @snake_cased_phrase@. /Subject to fusion./
toSnake :: Text -> Text
toSnake = lazy Fusion.toSnake

-- | O(n) Convert casing to @spinal-cased-phrase@. /Subject to fusion./
toSpinal :: Text -> Text
toSpinal = lazy Fusion.toSpinal

-- | O(n) Convert casing to @Train-Cased-Phrase@. /Subject to fusion./
toTrain :: Text -> Text
toTrain = lazy Fusion.toTrain