xss-sanitize-0.3.4/ 0000755 0000000 0000000 00000000000 12164063407 012350 5 ustar 00 0000000 0000000 xss-sanitize-0.3.4/README.md 0000644 0000000 0000000 00000012674 12164063407 013641 0 ustar 00 0000000 0000000 # Summary
xss-sanitize allows you to accept html from untrusted sources by first filtering it through a white list.
For example, this allows a web application to safely use a rich text editor or allow html in comments.
If you trust the HTML (you wrote it), you do not need to use this.
If you don't trust the html you probably also do not trust that the tags are balanced and should use the sanitizeBalance function.
# Usage
provides 2 functions in the module Text.HTML.SanitizeXSS
* sanitize - filters html to prevent XSS attacks.
* sanitizeBalance - same as sanitize but makes sure there are no lone opening/closing tags - useful to protect against a user's html messing up your page
# Details
This is not escaping! Escaping html does prevent XSS attacks. Strings (that aren't meant to be HTML) should be HTML escaped to show up properly and to prevent XSS attacks. However, escaping will ruin the display of actual HTML.
This function removes any HTML tags or attributes that are not in its white-list. This may sound picky, but most HTML should make it through unchanged, making the process unnoticeable to the user but giving us safe HTML.
## Integration
It is recommended to integrate this so that it is automatically used whenever an application receives untrusted html data (instead of before it is displayed). See the Yesod web framework as an example.
## Limitations
### Lowercase
All tag names and attribute names are converted to lower case as a matter of convenience. If you have a use case where this is undesirable let me know.
### Balancing - sanitizeBalance
The goal of this function is to prevent your html from breaking when (unknown) html with unbalanced tags are placed inside it. I would expect it to work very well in practice and don't see a downside to using it unless you have an alternative approach. However, this function does not at all guarantee valid html. In fact, it is likely that the result of balancing will still be invalid HTML. There is no guarantee for how a browser will display invalid HTML, so there is no guarantee that this function will protect your HTML from being broken by a user's html. Other possible approaches would be to run the HTML through a library like libxml2 which understands HTML or to first render the HTML in a hidden iframe or hidden div at the bottom of the page so that it is isolated, and then use JavaScript to insert it into the page where you want it.
### TagSoup Parser
TagSoup is used to parse the HTML, and it does a good job. However TagSoup does not maintain all white space. TagSoup does not distinguish between the following cases:
, , ,
In the third case, img and br tags will be output as a single self-closing tags. Other self-closing tags will be output as an open and closing pair. So ` or ` converts to ``, and ` or ` converts to ``. There are future updates to TagSoup planned so that TagSoup will be able to render tags exactly the same as they were parsed.
## Security
### Where is the white list from?
Ultimately this is where your security comes from. I would expect that a faulty white list would act as a strong deterrent, but this library strives for correctness.
The [source code of html5lib](http://code.google.com/p/html5lib/source/browse/python/html5lib/sanitizer.py) is the source of the white list and my implementation reference. They reference [a wiki page containing a white list](http://wiki.whatwg.org/wiki/Sanitization_rules), and hopefully they are careful of when they import into their code. Working with the maintainers of html5lib may make sense, but it doesn't make sense to merge the projects because sanitization is just one aspect of html5lib (They have a parser also).
If anyone knows of better sources or thinks a particular tag/attribute/value may be vulnerable, please let me know.
[HTML Purifier](http://htmlpurifier.org/live/smoketests/printDefinition.php) does have a more permissive and configurable (yet safe) white list if you are looking to add anything.
### Where is the code from?
Original code was taken from John MacFarlane's Pandoc (with permission), but modified by Greg Weber to be faster and with parsing redone using TagSoup, and to use html5lib's white list.
Michael Snoyman added the balanced tags functionality and released css-text specifically to help with css parsing.
html5lib's sanitizer.py is used as a reference implementation, and most of the code should look the same. For css parsing, html5lib's regexes were translated to a parser.
### style attribute
style attributes are now parsed with the css-text and autoparsec-text dependencies. They are then ran through a white list for properties and keywords. Whitespace is not preserved. This code was again translated from sanitizer.py, but uses attopoarsec instead of regexes. If you don't care about stripping css you can avoid the attoparsec dependendcy by using the older < 0.3 version of this library.
### data attributes
data attributes are not on the white list.
The href and style attributes are white listed, but its values must pass through a white list also. This is how the data attributes could work also.
### svg and mathml
A mathml white list is fully implemented. There is some support for svg styling.
There is a full white list for svg elements and attributes. However, some elements are not included because they need further filtering (just like the data attributes) and this has not been done yet.
xss-sanitize-0.3.4/xss-sanitize.cabal 0000644 0000000 0000000 00000003363 12164063407 016002 0 ustar 00 0000000 0000000 name: xss-sanitize
version: 0.3.4
license: BSD3
license-file: LICENSE
author: Greg Weber
maintainer: Greg Weber
synopsis: sanitize untrusted HTML to prevent XSS attacks
description: run untrusted HTML through Text.HTML.SanitizeXSS.sanitizeXSS to prevent XSS attacks. see README.md for more details
category: Web
stability: Stable
cabal-version: >= 1.8
build-type: Simple
homepage: http://github.com/yesodweb/haskell-xss-sanitize
extra-source-files: README.md
library
build-depends: base == 4.*, containers
, tagsoup >= 0.12.2 && < 1
, utf8-string >= 0.3 && < 1
, network >= 2 && < 3
, css-text >= 0.1.1 && < 0.2
, text >= 0.11 && < 1
, attoparsec >= 0.10.0.3 && < 1
exposed-modules: Text.HTML.SanitizeXSS
other-modules: Text.HTML.SanitizeXSS.Css
ghc-options: -Wall
test-suite test
type: exitcode-stdio-1.0
main-is: test/main.hs
cpp-options: -DTEST
build-depends: base == 4.* , containers
, tagsoup >= 0.12.2 && < 1
, utf8-string >= 0.3 && < 1
, network >= 2 && < 3
, css-text >= 0.1.1 && < 0.2
, text >= 0.11 && < 1
, attoparsec >= 0.10.0.3 && < 1
, hspec >= 1.3
, HUnit >= 1.2
source-repository head
type: git
location: http://github.com/yesodweb/haskell-xss-sanitize.git
xss-sanitize-0.3.4/LICENSE 0000644 0000000 0000000 00000002523 12164063407 013357 0 ustar 00 0000000 0000000 The following license covers this documentation, and the source code, except
where otherwise indicated.
Copyright 2010, Greg Weber. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS "AS IS" AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
EVENT SHALL THE COPYRIGHT HOLDERS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA,
OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
xss-sanitize-0.3.4/Setup.lhs 0000644 0000000 0000000 00000000162 12164063407 014157 0 ustar 00 0000000 0000000 #!/usr/bin/env runhaskell
> module Main where
> import Distribution.Simple
> main :: IO ()
> main = defaultMain
xss-sanitize-0.3.4/Text/ 0000755 0000000 0000000 00000000000 12164063407 013274 5 ustar 00 0000000 0000000 xss-sanitize-0.3.4/Text/HTML/ 0000755 0000000 0000000 00000000000 12164063407 014040 5 ustar 00 0000000 0000000 xss-sanitize-0.3.4/Text/HTML/SanitizeXSS.hs 0000644 0000000 0000000 00000027300 12164063407 016562 0 ustar 00 0000000 0000000 {-# LANGUAGE OverloadedStrings #-}
-- | Sanatize HTML to prevent XSS attacks.
--
-- See README.md for more details.
module Text.HTML.SanitizeXSS
(
-- * Sanitize
sanitize
, sanitizeBalance
, sanitizeXSS
-- * Custom filtering
, filterTags
, safeTags
, balanceTags
-- * Utilities
, sanitizeAttribute
) where
import Text.HTML.SanitizeXSS.Css
import Text.HTML.TagSoup
import Data.Set (Set(), member, notMember, (\\), fromList, fromAscList)
import Data.Char ( toLower )
import Data.Text (Text)
import qualified Data.Text as T
import Network.URI ( parseURIReference, URI (..),
isAllowedInURI, escapeURIString, uriScheme )
import Codec.Binary.UTF8.String ( encodeString )
import qualified Data.Map as Map
import Data.Maybe (catMaybes)
-- | Sanitize HTML to prevent XSS attacks. This is equivalent to @filterTags safeTags@.
sanitize :: Text -> Text
sanitize = sanitizeXSS
-- | alias of sanitize function
sanitizeXSS :: Text -> Text
sanitizeXSS = filterTags safeTags
-- | Sanitize HTML to prevent XSS attacks and also make sure the tags are balanced.
-- This is equivalent to @filterTags (balanceTags . safeTags)@.
sanitizeBalance :: Text -> Text
sanitizeBalance = filterTags (balanceTags . safeTags)
-- | Filter which makes sure the tags are balanced. Use with 'filterTags' and 'safeTags' to create a custom filter.
balanceTags :: [Tag Text] -> [Tag Text]
balanceTags = balance Map.empty
-- | Parse the given text to a list of tags, apply the given filtering function, and render back to HTML.
-- You can insert your own custom filtering but make sure you compose your filtering function with 'safeTags'!
filterTags :: ([Tag Text] -> [Tag Text]) -> Text -> Text
filterTags f = renderTagsOptions renderOptions {
optMinimize = \x -> x `member` voidElems -- converts to , converts to
} . f . canonicalizeTags . parseTags
voidElems :: Set T.Text
voidElems = fromAscList $ T.words $ T.pack "area base br col command embed hr img input keygen link meta param source track wbr"
balance :: Map.Map Text Int -> [Tag Text] -> [Tag Text]
balance m [] =
concatMap go $ Map.toList m
where
go (name, i)
| noClosing name = []
| otherwise = replicate i $ TagClose name
noClosing = flip member voidElems
balance m (t@(TagClose name):tags) =
case Map.lookup name m of
Nothing -> TagOpen name [] : TagClose name : balance m tags
Just i ->
let m' = if i == 1
then Map.delete name m
else Map.insert name (i - 1) m
in t : balance m' tags
balance m (TagOpen name as : tags) =
TagOpen name as : balance m' tags
where
m' = case Map.lookup name m of
Nothing -> Map.insert name 1 m
Just i -> Map.insert name (i + 1) m
balance m (t:ts) = t : balance m ts
-- | Filters out any usafe tags and attributes. Use with filterTags to create a custom filter.
safeTags :: [Tag Text] -> [Tag Text]
safeTags [] = []
safeTags (t@(TagClose name):tags)
| safeTagName name = t : safeTags tags
| otherwise = safeTags tags
safeTags (TagOpen name attributes:tags)
| safeTagName name = TagOpen name
(catMaybes $ map sanitizeAttribute attributes) : safeTags tags
| otherwise = safeTags tags
safeTags (t:tags) = t:safeTags tags
safeTagName :: Text -> Bool
safeTagName tagname = tagname `member` sanitaryTags
safeAttribute :: (Text, Text) -> Bool
safeAttribute (name, value) = name `member` sanitaryAttributes &&
(name `notMember` uri_attributes || sanitaryURI value)
-- | low-level API if you have your own HTML parser. Used by safeTags.
sanitizeAttribute :: (Text, Text) -> Maybe (Text, Text)
sanitizeAttribute ("style", value) =
let css = sanitizeCSS value
in if T.null css then Nothing else Just ("style", css)
sanitizeAttribute attr | safeAttribute attr = Just attr
| otherwise = Nothing
-- | Returns @True@ if the specified URI is not a potential security risk.
sanitaryURI :: Text -> Bool
sanitaryURI u =
case parseURIReference (escapeURI $ T.unpack u) of
Just p -> (null (uriScheme p)) ||
((map toLower $ init $ uriScheme p) `member` safeURISchemes)
Nothing -> False
-- | Escape unicode characters in a URI. Characters that are
-- already valid in a URI, including % and ?, are left alone.
escapeURI :: String -> String
escapeURI = escapeURIString isAllowedInURI . encodeString
safeURISchemes :: Set String
safeURISchemes = fromList acceptable_protocols
sanitaryTags :: Set Text
sanitaryTags = fromList (acceptable_elements ++ mathml_elements ++ svg_elements)
\\ (fromList svg_allow_local_href) -- extra filtering not implemented
sanitaryAttributes :: Set Text
sanitaryAttributes = fromList (allowed_html_uri_attributes ++ acceptable_attributes ++ mathml_attributes ++ svg_attributes)
\\ (fromList svg_attr_val_allows_ref) -- extra unescaping not implemented
allowed_html_uri_attributes :: [Text]
allowed_html_uri_attributes = ["href", "src", "cite", "action", "longdesc"]
uri_attributes :: Set Text
uri_attributes = fromList $ allowed_html_uri_attributes ++ ["xlink:href", "xml:base"]
acceptable_elements :: [Text]
acceptable_elements = ["a", "abbr", "acronym", "address", "area",
"article", "aside", "audio", "b", "big", "blockquote", "br", "button",
"canvas", "caption", "center", "cite", "code", "col", "colgroup",
"command", "datagrid", "datalist", "dd", "del", "details", "dfn",
"dialog", "dir", "div", "dl", "dt", "em", "event-source", "fieldset",
"figure", "footer", "font", "form", "header", "h1", "h2", "h3", "h4",
"h5", "h6", "hr", "i", "img", "input", "ins", "keygen", "kbd",
"label", "legend", "li", "m", "map", "menu", "meter", "multicol",
"nav", "nextid", "ol", "output", "optgroup", "option", "p", "pre",
"progress", "q", "s", "samp", "section", "select", "small", "sound",
"source", "spacer", "span", "strike", "strong", "sub", "sup", "table",
"tbody", "td", "textarea", "time", "tfoot", "th", "thead", "tr", "tt",
"u", "ul", "var", "video"]
mathml_elements :: [Text]
mathml_elements = ["maction", "math", "merror", "mfrac", "mi",
"mmultiscripts", "mn", "mo", "mover", "mpadded", "mphantom",
"mprescripts", "mroot", "mrow", "mspace", "msqrt", "mstyle", "msub",
"msubsup", "msup", "mtable", "mtd", "mtext", "mtr", "munder",
"munderover", "none"]
-- this should include altGlyph I think
svg_elements :: [Text]
svg_elements = ["a", "animate", "animateColor", "animateMotion",
"animateTransform", "clipPath", "circle", "defs", "desc", "ellipse",
"font-face", "font-face-name", "font-face-src", "g", "glyph", "hkern",
"linearGradient", "line", "marker", "metadata", "missing-glyph",
"mpath", "path", "polygon", "polyline", "radialGradient", "rect",
"set", "stop", "svg", "switch", "text", "title", "tspan", "use"]
acceptable_attributes :: [Text]
acceptable_attributes = ["abbr", "accept", "accept-charset", "accesskey",
"align", "alt", "autocomplete", "autofocus", "axis",
"background", "balance", "bgcolor", "bgproperties", "border",
"bordercolor", "bordercolordark", "bordercolorlight", "bottompadding",
"cellpadding", "cellspacing", "ch", "challenge", "char", "charoff",
"choff", "charset", "checked", "class", "clear", "color",
"cols", "colspan", "compact", "contenteditable", "controls", "coords",
-- "data", TODO: allow this with further filtering
"datafld", "datapagesize", "datasrc", "datetime", "default",
"delay", "dir", "disabled", "draggable", "dynsrc", "enctype", "end",
"face", "for", "form", "frame", "galleryimg", "gutter", "headers",
"height", "hidefocus", "hidden", "high", "hreflang", "hspace",
"icon", "id", "inputmode", "ismap", "keytype", "label", "leftspacing",
"lang", "list", "loop", "loopcount", "loopend",
"loopstart", "low", "lowsrc", "max", "maxlength", "media", "method",
"min", "multiple", "name", "nohref", "noshade", "nowrap", "open",
"optimum", "pattern", "ping", "point-size", "prompt", "pqg",
"radiogroup", "readonly", "rel", "repeat-max", "repeat-min",
"replace", "required", "rev", "rightspacing", "rows", "rowspan",
"rules", "scope", "selected", "shape", "size", "span", "start",
"step",
"style", -- gets further filtering
"summary", "suppress", "tabindex", "target",
"template", "title", "toppadding", "type", "unselectable", "usemap",
"urn", "valign", "value", "variable", "volume", "vspace", "vrml",
"width", "wrap", "xml:lang"]
acceptable_protocols :: [String]
acceptable_protocols = [ "ed2k", "ftp", "http", "https", "irc",
"mailto", "news", "gopher", "nntp", "telnet", "webcal",
"xmpp", "callto", "feed", "urn", "aim", "rsync", "tag",
"ssh", "sftp", "rtsp", "afs" ]
mathml_attributes :: [Text]
mathml_attributes = ["actiontype", "align", "columnalign", "columnalign",
"columnalign", "columnlines", "columnspacing", "columnspan", "depth",
"display", "displaystyle", "equalcolumns", "equalrows", "fence",
"fontstyle", "fontweight", "frame", "height", "linethickness", "lspace",
"mathbackground", "mathcolor", "mathvariant", "mathvariant", "maxsize",
"minsize", "other", "rowalign", "rowalign", "rowalign", "rowlines",
"rowspacing", "rowspan", "rspace", "scriptlevel", "selection",
"separator", "stretchy", "width", "width", "xlink:href", "xlink:show",
"xlink:type", "xmlns", "xmlns:xlink"]
svg_attributes :: [Text]
svg_attributes = ["accent-height", "accumulate", "additive", "alphabetic",
"arabic-form", "ascent", "attributeName", "attributeType",
"baseProfile", "bbox", "begin", "by", "calcMode", "cap-height",
"class", "clip-path", "color", "color-rendering", "content", "cx",
"cy", "d", "dx", "dy", "descent", "display", "dur", "end", "fill",
"fill-opacity", "fill-rule", "font-family", "font-size",
"font-stretch", "font-style", "font-variant", "font-weight", "from",
"fx", "fy", "g1", "g2", "glyph-name", "gradientUnits", "hanging",
"height", "horiz-adv-x", "horiz-origin-x", "id", "ideographic", "k",
"keyPoints", "keySplines", "keyTimes", "lang", "marker-end",
"marker-mid", "marker-start", "markerHeight", "markerUnits",
"markerWidth", "mathematical", "max", "min", "name", "offset",
"opacity", "orient", "origin", "overline-position",
"overline-thickness", "panose-1", "path", "pathLength", "points",
"preserveAspectRatio", "r", "refX", "refY", "repeatCount",
"repeatDur", "requiredExtensions", "requiredFeatures", "restart",
"rotate", "rx", "ry", "slope", "stemh", "stemv", "stop-color",
"stop-opacity", "strikethrough-position", "strikethrough-thickness",
"stroke", "stroke-dasharray", "stroke-dashoffset", "stroke-linecap",
"stroke-linejoin", "stroke-miterlimit", "stroke-opacity",
"stroke-width", "systemLanguage", "target", "text-anchor", "to",
"transform", "type", "u1", "u2", "underline-position",
"underline-thickness", "unicode", "unicode-range", "units-per-em",
"values", "version", "viewBox", "visibility", "width", "widths", "x",
"x-height", "x1", "x2", "xlink:actuate", "xlink:arcrole",
"xlink:href", "xlink:role", "xlink:show", "xlink:title", "xlink:type",
"xml:base", "xml:lang", "xml:space", "xmlns", "xmlns:xlink", "y",
"y1", "y2", "zoomAndPan"]
-- the values for these need to be escaped
svg_attr_val_allows_ref :: [Text]
svg_attr_val_allows_ref = ["clip-path", "color-profile", "cursor", "fill",
"filter", "marker", "marker-start", "marker-mid", "marker-end",
"mask", "stroke"]
svg_allow_local_href :: [Text]
svg_allow_local_href = ["altGlyph", "animate", "animateColor",
"animateMotion", "animateTransform", "cursor", "feImage", "filter",
"linearGradient", "pattern", "radialGradient", "textpath", "tref",
"set", "use"]
xss-sanitize-0.3.4/Text/HTML/SanitizeXSS/ 0000755 0000000 0000000 00000000000 12164063407 016224 5 ustar 00 0000000 0000000 xss-sanitize-0.3.4/Text/HTML/SanitizeXSS/Css.hs 0000644 0000000 0000000 00000011406 12164063407 017312 0 ustar 00 0000000 0000000 {-# LANGUAGE OverloadedStrings, CPP #-}
module Text.HTML.SanitizeXSS.Css (
sanitizeCSS
#ifdef TEST
, allowedCssAttributeValue
#endif
) where
import Data.Text (Text)
import qualified Data.Text as T
import Data.Attoparsec.Text
import Data.Text.Lazy.Builder (toLazyText)
import Data.Text.Lazy (toStrict)
import Data.Set (member, fromList, Set)
import Data.Char (isDigit)
import Control.Applicative ((<|>), pure)
import Text.CSS.Render (renderAttrs)
import Text.CSS.Parse (parseAttrs)
import Prelude hiding (takeWhile)
-- import FileLocation (debug, debugM)
-- this is a direct translation from sanitizer.py, except
-- sanitizer.py filters out url(), but this is redundant
sanitizeCSS :: Text -> Text
sanitizeCSS css = toStrict . toLazyText .
renderAttrs . filter isSanitaryAttr . filterUrl $ parseAttributes
where
filterUrl :: [(Text,Text)] -> [(Text,Text)]
filterUrl = map filterUrlAttribute
where
filterUrlAttribute :: (Text, Text) -> (Text, Text)
filterUrlAttribute (prop,value) =
case parseOnly rejectUrl value of
Left _ -> (prop,value)
Right noUrl -> filterUrlAttribute (prop, noUrl)
rejectUrl = do
pre <- manyTill anyChar (string "url")
skipMany space
_<-char '('
skipWhile (/= ')')
_<-char ')'
rest <- takeText
return $ T.append (T.pack pre) rest
parseAttributes = case parseAttrs css of
Left _ -> []
Right as -> as
isSanitaryAttr (_, "") = False
isSanitaryAttr ("",_) = False
isSanitaryAttr (prop, value)
| prop `member` allowed_css_properties = True
| (T.takeWhile (/= '-') prop) `member` allowed_css_unit_properties &&
all allowedCssAttributeValue (T.words value) = True
| prop `member` allowed_svg_properties = True
| otherwise = False
allowed_css_unit_properties :: Set Text
allowed_css_unit_properties = fromList ["background","border","margin","padding"]
allowedCssAttributeValue :: Text -> Bool
allowedCssAttributeValue val =
val `member` allowed_css_keywords ||
case parseOnly allowedCssAttributeParser val of
Left _ -> False
Right b -> b
where
allowedCssAttributeParser = do
rgb <|> hex <|> rgb <|> cssUnit
aToF = fromList "abcdef"
hex = do
_ <- char '#'
hx <- takeText
return $ T.all (\c -> isDigit c || (c `member` aToF)) hx
-- should have used sepBy (symbol ",")
rgb = do
_<- string "rgb("
skipMany1 digit >> skipOk (== '%')
skip (== ',')
skipMany digit >> skipOk (== '%')
skip (== ',')
skipMany digit >> skipOk (== '%')
skip (== ')')
return True
cssUnit = do
skip isDigit
skipOk isDigit
skipOk (== '.')
skipOk isDigit >> skipOk isDigit
skipSpace
unit <- takeText
return $ T.null unit || unit `member` allowed_css_attribute_value_units
skipOk :: (Char -> Bool) -> Parser ()
skipOk p = skip p <|> pure ()
allowed_css_attribute_value_units :: Set Text
allowed_css_attribute_value_units = fromList
[ "cm", "em", "ex", "in", "mm", "pc", "pt", "px", "%", ",", "\\"]
allowed_css_properties :: Set Text
allowed_css_properties = fromList acceptable_css_properties
where
acceptable_css_properties = ["azimuth", "background-color",
"border-bottom-color", "border-collapse", "border-color",
"border-left-color", "border-right-color", "border-top-color", "clear",
"color", "cursor", "direction", "display", "elevation", "float", "font",
"font-family", "font-size", "font-style", "font-variant", "font-weight",
"height", "letter-spacing", "line-height", "overflow", "pause",
"pause-after", "pause-before", "pitch", "pitch-range", "richness",
"speak", "speak-header", "speak-numeral", "speak-punctuation",
"speech-rate", "stress", "text-align", "text-decoration", "text-indent",
"unicode-bidi", "vertical-align", "voice-family", "volume",
"white-space", "width"]
allowed_css_keywords :: Set Text
allowed_css_keywords = fromList acceptable_css_keywords
where
acceptable_css_keywords = ["auto", "aqua", "black", "block", "blue",
"bold", "both", "bottom", "brown", "center", "collapse", "dashed",
"dotted", "fuchsia", "gray", "green", "!important", "italic", "left",
"lime", "maroon", "medium", "none", "navy", "normal", "nowrap", "olive",
"pointer", "purple", "red", "right", "solid", "silver", "teal", "top",
"transparent", "underline", "white", "yellow"]
-- used in css filtering
allowed_svg_properties :: Set Text
allowed_svg_properties = fromList acceptable_svg_properties
where
acceptable_svg_properties = [ "fill", "fill-opacity", "fill-rule",
"stroke", "stroke-width", "stroke-linecap", "stroke-linejoin",
"stroke-opacity"]
xss-sanitize-0.3.4/test/ 0000755 0000000 0000000 00000000000 12164063407 013327 5 ustar 00 0000000 0000000 xss-sanitize-0.3.4/test/main.hs 0000644 0000000 0000000 00000007264 12164063407 014620 0 ustar 00 0000000 0000000 {-# LANGUAGE OverloadedStrings #-}
import Text.HTML.SanitizeXSS
import Text.HTML.SanitizeXSS.Css
import Data.Text (Text)
import Test.Hspec
import Test.HUnit (assert, (@?=), Assertion)
test :: (Text -> Text) -> Text -> Text -> Assertion
test f actual expected = do
let result = f actual
result @?= expected
sanitized :: Text -> Text -> Expectation
sanitized = test sanitize
sanitizedB = test sanitizeBalance
main :: IO ()
main = hspec $ do
describe "html sanitizing" $ do
it "big test" $ do
let testHTML = " safeanchor Unbalanced"
test sanitizeBalance testHTML " safeanchor Unbalanced"
sanitized testHTML " safeanchor Unbalanced"
it "relativeURI" $ do
let testRelativeURI = "bar"
sanitized testRelativeURI testRelativeURI
it "protocol hack" $
sanitized "" ""
it "object hack" $
sanitized "" ""
it "embed hack" $
sanitized "" ""
it "ucase image hack" $
sanitized "" ""
describe "allowedCssAttributeValue" $ do
it "allows hex" $ do
assert $ allowedCssAttributeValue "#abc"
assert $ allowedCssAttributeValue "#123"
assert $ not $ allowedCssAttributeValue "abc"
assert $ not $ allowedCssAttributeValue "123abc"
it "allows rgb" $ do
assert $ allowedCssAttributeValue "rgb(1,3,3)"
assert $ not $ allowedCssAttributeValue "rgb()"
it "allows units" $ do
assert $ allowedCssAttributeValue "10 px"
assert $ not $ allowedCssAttributeValue "10 abc"
describe "css sanitizing" $ do
it "removes style when empty" $
sanitized "" ""
it "allows any non-url value for white-listed properties" $ do
let whiteCss = ""
sanitized whiteCss whiteCss
it "rejects any url value" $ do
let whiteCss = ""
sanitized whiteCss ""
it "rejects properties not on the white list" $ do
let blackCss = ""
sanitized blackCss ""
it "rejects invalid units for grey-listed css" $ do
let greyCss = ""
sanitized greyCss ""
it "allows valid units for grey-listed css" $ do
let grey2Css = ""
sanitized grey2Css grey2Css
describe "balancing" $ do
it "adds missing elements" $ do
sanitizedB "foo" "foo"
it "doesn't add closing voids" $ do
sanitizedB "" ""
it "removes closing voids" $ do
sanitizedB "" ""