evalidate-2.0.3/NOTES0000644000000000000000000000032713615410400011207 0ustar00
add security verification for attrs/nodes ... (test all jailbreaks against configuration)
rebuild __builtins__ ?

make fast filter code? (verify it, extend, and run as fast filter)

try asteval filtering, maybe faster
evalidate-2.0.3/setup.py0000644000000000000000000000206513615410400012107 0ustar00
from setuptools import setup
from importlib.machinery import SourceFileLoader
import os

def read(fname):
    return open(os.path.join(os.path.dirname(__file__), fname), encoding='utf-8').read()

def get_version(path):
    foo = SourceFileLoader(os.path.basename(path), path).load_module()
    return foo.__version__

setup(name='evalidate',
      version=get_version('evalidate/__init__.py'),
      url='http://github.com/yaroslaff/evalidate',
      author='Yaroslav Polyakov',
      author_email='xenon@sysattack.com',
      license='MIT',
      packages=['evalidate'],
      description='Validation and secure evaluation of untrusted python expressions',
      long_description=read('README.md'),
      long_description_content_type='text/markdown',
      classifiers=[
          "Programming Language :: Python :: 3",
          "License :: OSI Approved :: MIT License",
          "Operating System :: OS Independent",
          'Topic :: Utilities',
          "Topic :: Security",
          "Topic :: Software Development :: Interpreters",
      ],
      zip_safe=False)
evalidate-2.0.3/x.py0000644000000000000000000000221713615410400011215 0ustar00
import evalidate

def eval_expression(input_string):
    # Step 1
    allowed_names = {"sum": sum, "int": int, "a":1, "b":2, 'startswith': str.startswith}
    # Step 2
    code = compile(input_string, "", "eval")
    # Step 3
    print(code.co_names)
    for name in code.co_names:
        if name not in allowed_names:
            # Step 4
            raise NameError(f"Use of {name!r} not allowed")
    return eval(code, {"__builtins__": {}}, allowed_names)

try:
    src="""
(lambda fc=(
    lambda n: [
        c for c in
            ().__class__.__bases__[0].__subclasses__()
            if c.__name__ == n
        ][0]
    ):
    fc("function")(
        fc("code")(
            0,0,0,0,0,0,b"BOOM",(),(),(),"","",0,b""
        ),{}
    )()
)()
"""
    #src = 
"'asdf'.startswith('as')" #src2 ="""__builtins__['eval']("print(1)")""" #node = evalidate.evalidate(src, # addnodes=['Call', 'Attribute', 'ListComp', 'comprehension', 'Store'], # attrs=['startswith']) #code = compile(node, '', 'eval') #r = eval(code) #print("eval:", r) r = eval_expression(src) print(r) except Exception as e: print(e)evalidate-2.0.3/xx.py0000644000000000000000000000107613615410400011407 0ustar00import evalidate from dataclasses import dataclass from evalidate import Expr, EvalException, base_eval_model @dataclass class Person: name: str weight: float john = Person(name="John", weight=100) jack = Person(name="Jack", weight=60) passengers = {"john": john, "jack": jack} sum_expr = "john.weight + jack.weight" mymodel = base_eval_model.clone() mymodel.nodes.append('Attribute') mymodel.attributes.append('weight') validated_expr = evalidate.Expr(sum_expr, model=mymodel) total_weight = eval(validated_expr.code, passengers, None) print(total_weight) evalidate-2.0.3/xx2.py0000644000000000000000000000123713615410400011470 0ustar00import evalidate from dataclasses import dataclass from evalidate import Expr, EvalException, base_eval_model @dataclass class Person: name: str weight: float def get_weight(self): return self.weight john = Person(name="John", weight=100) jack = Person(name="Jack", weight=60) passengers = {"john": john, "jack": jack} sum_expr = "john.get_weight() + jack.get_weight()" mymodel = base_eval_model.clone() mymodel.nodes.extend(['Attribute', 'Call']) mymodel.attributes.append('get_weight') validated_expr = evalidate.Expr(sum_expr, model=mymodel) total_weight = eval(validated_expr.code, {"john": john, "jack": jack}, None) print(total_weight) evalidate-2.0.3/xx3.py0000644000000000000000000000123613615410400011470 0ustar00import evalidate from dataclasses import dataclass from evalidate import Expr, EvalException, base_eval_model @dataclass class Person: name: str weight: float def get_weight(self): return self.weight john = Person(name="John", 
weight=100)
jack = Person(name="Jack", weight=60)

passengers = {"john": john, "jack": jack}

sum_expr = "john.get_weight() + jack.get_weight()"

mymodel = evalidate.EvalModel()
mymodel.nodes.extend(['Attribute', 'Call'])
mymodel.attributes.append('get_weight')

validated_expr = evalidate.Expr(sum_expr, model=mymodel)
total_weight = eval(validated_expr.code, {"john": john, "jack": jack}, None)
print(total_weight)
evalidate-2.0.3/y.py0000644000000000000000000000030413615410400011211 0ustar00
from evalidate import Expr, EvalException

src = '-a + 40 > b'
# src = "__import__('os').system('clear')"

try:
    print(Expr(src).eval({'a':10, 'b':42}))
except EvalException as e:
    print(e)
evalidate-2.0.3/benchmark/benchmark.py0000755000000000000000000000426713615410400014644 0ustar00
#!/usr/bin/env python3

from asteval import Interpreter
import timeit
import evalidate
import requests
import simpleeval
import json
from pprint import pprint

# context = {'a':1, 'b': 2}
# src="""a+b"""
# node = evalidate.evalidate(src)
# code = compile(node, '', 'eval')

"""
Our test: we prepare a large list of products and filter it, finding all items cheaper than 20
"""

mult = 10000
# mult = 1

products = requests.get('https://dummyjson.com/products?limit=100').json()['products'] * mult
accurate_counter = 8 * mult

# gi.symtable = context

def test_asteval():
    aeval = Interpreter()
    src = """
counter=0
for p in products:
    if p['price']<20:
        counter += 1
counter
"""
    aeval.symtable['products'] = products
    result = aeval(src)
    assert(result == accurate_counter)

def test_simpleeval():
    code = """price < 20"""
    s = simpleeval.SimpleEval()
    parsed = s.parse(code)
    counter = 0
    for p in products:
        s.names = p
        if s.eval(code, previously_parsed=parsed, ):
            counter += 1
    assert(counter == accurate_counter)

def evalidate_raw_eval():
    src="""price < 20"""
    e = evalidate.Expr(src)
    counter = 0
    for p in products:
        if eval(e.code, None, p):
            counter+=1
    assert(counter == accurate_counter)

def evalidate_eval():
    src="""price < 20"""
    e = 
evalidate.Expr(src)
    counter = 0
    for p in products:
        if e.eval(p):
            counter+=1
    assert(counter == accurate_counter)

def main():
    print(f"Products: {len(products)} items")

    t = timeit.timeit('evalidate_raw_eval()', setup='from __main__ import evalidate_raw_eval, products', number=1)
    print(f"evalidate_raw_eval(): {t:.3f}s")

    t = timeit.timeit('evalidate_eval()', setup='from __main__ import evalidate_eval, products', number=1)
    print(f"evalidate_eval(): {t:.3f}s")

    t = timeit.timeit('test_simpleeval()', setup='from __main__ import test_simpleeval, products', number=1)
    print(f"test_simpleeval(): {t:.3f}s")

    t = timeit.timeit('test_asteval()', setup='from __main__ import test_asteval, products', number=1)
    print(f"test_asteval(): {t:.3f}s")

if __name__ == '__main__':
    main()
evalidate-2.0.3/evalidate/__init__.py0000755000000000000000000001004613615410400014445 0ustar00
#!/usr/bin/python

"""Safe user-supplied python expression evaluation."""

import ast
import dataclasses
from typing import Callable

__version__ = '2.0.3'

class EvalException(Exception):
    pass

class ValidationException(EvalException):
    pass

class CompilationException(EvalException):
    exc = None
    def __init__(self, exc):
        super().__init__(exc)
        self.exc = exc

class ExecutionException(EvalException):
    exc = None
    def __init__(self, exc):
        super().__init__(exc)
        self.exc = exc

@dataclasses.dataclass
class EvalModel:
    """ eval security model """
    nodes: list = dataclasses.field(default_factory=list)
    allowed_functions: list = dataclasses.field(default_factory=list)
    imported_functions: dict = dataclasses.field(default_factory=dict)
    attributes: list = dataclasses.field(default_factory=list)

    def clone(self):
        return EvalModel(**dataclasses.asdict(self))

class SafeAST(ast.NodeVisitor):

    """AST-tree walker class."""

    def __init__(self, model: EvalModel):
        self.model = model

    def generic_visit(self, node):
        """Check node, raise exception if node is not in whitelist."""

        if type(node).__name__ in self.model.nodes:

            if isinstance(node, ast.Attribute):
                if node.attr not in self.model.attributes:
                    raise ValidationException(
                        "Attribute {aname} is not allowed".format(
                            aname=node.attr))

            if isinstance(node, ast.Call):
                if isinstance(node.func, ast.Name):
                    if node.func.id not in self.model.allowed_functions and node.func.id not in self.model.imported_functions:
                        raise ValidationException(
                            "Call to function {fname}() is not allowed".format(
                                fname=node.func.id))
                    else:
                        # Call to allowed function. good. No exception
                        pass
                elif isinstance(node.func, ast.Attribute):
                    pass
                    # print("attr:", node.func.attr)
                else:
                    raise ValidationException('Indirect function call')

            ast.NodeVisitor.generic_visit(self, node)
        else:
            raise ValidationException(
                "Node type {optype!r} is not allowed. (whitelist it manually)".format(
                    optype=type(node).__name__))

base_eval_model = EvalModel(
    nodes = [
        # 123, 'asdf'
        'Num', 'Str',
        # any expression or constant
        'Expression', 'Constant',
        # == ...
        'Compare', 'Eq', 'NotEq', 'Gt', 'GtE', 'Lt', 'LtE',
        # variable name
        'Name', 'Load',
        'BinOp',
        'Add', 'Sub', 'USub',
        'Subscript', 'Index',  # person['name']
        'BoolOp', 'And', 'Or', 'UnaryOp', 'Not',  # True and True
        "In", "NotIn",  # "aaa" in i['list']
        "IfExp",  # for if expressions, like: expr1 if expr2 else expr3
        "NameConstant",  # for True and False constants
        "Div", "Mod"
    ],
)

mult_eval_model = base_eval_model.clone()
mult_eval_model.nodes.append('Mult')  # the AST node for "*" is named Mult, not Mul

class Expr():
    def __init__(self, expr, model=None, filename=None):
        self.expr = expr
        self.model = model or base_eval_model
        try:
            self.node = ast.parse(self.expr, '', 'eval')
        except SyntaxError as e:
            raise CompilationException(e)

        v = SafeAST(model = self.model)
        v.visit(self.node)

        self.code = compile(self.node, filename or '', 'eval')

    def eval(self, ctx=None):
        try:
            result = eval(self.code, self.model.imported_functions, ctx)
        except Exception as e:
            raise ExecutionException(e)

        return result

    def __str__(self):
        return("Expr(expr={expr!r})".format(expr=self.expr))
evalidate-2.0.3/evalidate/credits.txt0000644000000000000000000000014713615410400014530 0ustar00
Yaroslav Polyakov: main author
Daniel Rech: python3 porting, ideas for IfExp function call filtering
evalidate-2.0.3/evalidate/security.py0000644000000000000000000000471213615410400014555 0ustar00
from evalidate import ValidationException, Expr, EvalModel, base_eval_model

simple_attacks = [
"""
os.system("clear")
""",
"""
__import__('os').system('clear')
""",
"""
__builtins__['eval']("print(1)")
""",
]

boom = """
(lambda fc=(
    lambda n: [
        c for c in
            ().__class__.__bases__[0].__subclasses__()
            if c.__name__ == n
        ][0]
    ):
    fc("function")(
        fc("code")(
            {payload}
        ),{{}}
    )()
)()
"""

boom_payload = [
    # 2.7:
    '0,0,0,0,"BOOM",(),(),(),"","",0,""',
    # 3.5-3.7:
    '0,0,0,0,0,b"BOOM",(),(),(),"","",0,b""',
    # 3.8-3.10:
    '0,0,0,0,0,0,b"BOOM",(),(),(),"","",0,b""',
    # 3.11:
    '0,0,0,0,0,0,b"BOOM",(),(),(),"","","",0,b"",b"",b"",b"",(),()'
]

def test_attack(attack: str, model: EvalModel, verbose=False):
    '''
        test attack. return True if attack detected on validation, good. (False if passed, bad)
    '''
    if verbose:
        print("Testing attack code:\n{}".format(attack))

    try:
        e = Expr(attack, model=model)
        # node = evalidate(attack, safenodes=safenodes, addnodes=addnodes, funcs=funcs, attrs=attrs)
    except ValidationException as e:
        if verbose:
            print("Good! Attack blocked: {}".format(e))
        return True
    else:
        print("Problem! Attack passed validation without exception!\nCode:\n{}".format(attack))
        return False

def test_security(attacks=None, model: EvalModel = None, verbose=False):
    '''
        test all user-given attacks, or built-in attacks.
        Return value: True if good (all attacks detected), False if at least one attack passes validation
    '''
    model = model or base_eval_model

    # test user-supplied attacks
    if attacks:
        return all( [test_attack(attack=attack, model=model, verbose=verbose) for attack in attacks] )

    # test built-in set of attacks
    if not all([test_attack(attack=attack, model=model, verbose=verbose) for attack in simple_attacks]):
        return False

    for payload in boom_payload:
        attack = boom.format(payload=payload)
        if not test_attack(attack=attack, model=model, verbose=verbose):
            return False

    return True

if __name__ == '__main__':
    attacks = [
        '1*2'
    ]

    model = base_eval_model.clone()
    # model.nodes.append('Mult')

    test_security(attacks=attacks, model=model, verbose=True)
    test_security(model=model, verbose=True)
evalidate-2.0.3/examples/basic.py0000755000000000000000000000033113615410400013643 0ustar00
#!/usr/bin/env python3

from evalidate import Expr, EvalException

src = 'a + 40 > b'
# src="__import__('os').system('clear')"

try:
    print(Expr(src).eval({'a':10, 'b':42}))
except EvalException as e:
    print(e)
evalidate-2.0.3/examples/products.py0000755000000000000000000000124113615410400014426 0ustar00
#!/usr/bin/env python3

import requests
from evalidate import Expr, ValidationException, CompilationException, ExecutionException
import json
import sys

data = requests.get('https://dummyjson.com/products?limit=100').json()

try:
    src = sys.argv[1]
except IndexError:
    src = 'True'

try:
    expr = Expr(src)
except (ValidationException, CompilationException) as e:
    print(e)
    sys.exit(1)

c=0
for p in data['products']:
    # print(p)
    try:
        r = expr.eval(p)
        if r:
            print(json.dumps(p, indent=2))
            c+=1
    except ExecutionException as e:
        print("Runtime exception:", e)
print("# {} products matches".format(c))
evalidate-2.0.3/tests/test_exc.py0000644000000000000000000000222513615410400013725 0ustar00
from evalidate import Expr, ValidationException, ExecutionException, CompilationException, base_eval_model
import pytest

class TestExceptions():
    def 
test_exec(self):
        with pytest.raises(ExecutionException) as e:
            r = Expr('1/0').eval()

        with pytest.raises(ExecutionException) as e:
            r = Expr('1+x').eval()

    def test_badcode(self):
        with pytest.raises(CompilationException) as e:
            r = Expr('')

        with pytest.raises(CompilationException) as e:
            r = Expr(';')

        with pytest.raises(CompilationException) as e:
            r = Expr("""
1+1
2+2""")

    def test_call(self):
        ctx = {'a':36.6}
        with pytest.raises(ValidationException) as e:
            r = Expr('int(a)').eval(ctx)

    def test_return(self):
        with pytest.raises(CompilationException) as e:
            Expr('return 42').eval()

    def test_startswith(self):
        with pytest.raises(ValidationException) as e:
            src = '"abcdef".startswith("abc")'
            m = base_eval_model.clone()
            m.nodes.append('Call')
            r = Expr(src).eval()
evalidate-2.0.3/tests/test_expr.py0000644000000000000000000000642713615410400014134 0ustar00
from evalidate import Expr, EvalException, ValidationException, base_eval_model, EvalModel
import pytest

class TestExpr():
    def test_basic(self):
        r = Expr('1+2').eval()
        assert r == 3

    def test_mult(self):
        with pytest.raises(ValidationException):
            r = Expr('42 * 42').eval()

        mult_model = base_eval_model.clone()
        mult_model.nodes.append('Mult')
        r = Expr('42 * 42', mult_model).eval()
        assert r == 1764

    def test_blank(self):
        with pytest.raises(ValidationException):
            model = EvalModel(nodes=[])
            r = Expr('1 + 2', model=model).eval()

        model = EvalModel(nodes=['Expression', 'BinOp', 'Constant', 'Add'])
        r = Expr('1 + 2', model=model).eval()
        assert r == 3

    def test_cleanup(self):
        ctx=dict()
        Expr('None').eval(ctx)
        assert '__builtins__' not in ctx

    def test_context(self):
        e = Expr('a + b')
        assert e.eval({'a': 1, 'b': 2}) == 3
        assert e.eval({'a': 0, 'b': 2}) == 2

    def test_functions(self):
        with pytest.raises(ValidationException):
            r = Expr('int(x)').eval({'x': 1.3})

        model = base_eval_model.clone()
        model.nodes.append('Call')
        model.allowed_functions.append('int')
        r = Expr('int(x)', model=model).eval({ "x": 1.3 })
        assert r == 1

    def test_attributes(self):
        class 
Person:
            pass

        class Movie:
            pass

        person = Person()
        person.name = 'Audrey Hepburn'
        person.birth = 1929

        movie1 = Movie()
        movie1.title = "Roman Holiday"
        movie1.year = 1953

        movie2 = Movie()
        movie2.title = "Breakfast at Tiffany's"
        movie2.year = 1961

        movies = [movie1, movie2]

        with pytest.raises(ValidationException):
            r = Expr('movie.year - person.birth').eval({"person": person, "movie": movie1})

        m = base_eval_model.clone()
        m.nodes.append('Attribute')
        m.attributes.extend(['year', 'birth'])

        e = Expr('movie.year - person.birth', model=m)
        age = e.eval({"person":person, "movie": movie1})
        assert age == 24
        age = e.eval({"person": person, "movie": movie2})
        assert age == 32

        ages = [ e.eval({"person": person, "movie": movie}) for movie in movies ]
        assert ages == [ 24, 32 ]

    def test_dicts(self):
        person = {
            'name': 'Audrey Hepburn',
            'birth': 1929
        }
        movies = [
            {
                'title': "Roman Holiday",
                'year': 1953
            },
            {
                'title': "Breakfast at Tiffany's",
                'year': 1961
            }
        ]

        e = Expr('movie["year"] - person["birth"]')
        ages = [ e.eval({"movie": movie, "person": person}) for movie in movies ]
        assert ages == [24,32]

    def test_my_funcs(self):
        def double(x):
            return x * 2

        ctx = {'a': 123.45}
        m = base_eval_model.clone()
        m.nodes.append('Call')
        m.allowed_functions.append('int')
        m.imported_functions['double'] = double

        e = Expr('int(a) + double(a)', model=m)
        r = e.eval(ctx)
        assert r == 369.9
evalidate-2.0.3/tests/test_feature.py0000644000000000000000000000303613615410400014602 0ustar00
import evalidate
import pytest

class TestSafeEval():
    def test_sum(self):
        r = evalidate.Expr('1+2').eval()
        assert(r==3)

    def test_context(self):
        ctx = {'a':100, 'b': 200 }
        r = evalidate.Expr('a+b').eval(ctx)
        assert(r==300)

    def test_call(self):
        ctx = {'a':36.6}
        m = evalidate.base_eval_model.clone()
        m.nodes.append('Call')
        m.allowed_functions.append('int')
        r = evalidate.Expr('int(a)', model=m).eval(ctx)
        assert(r==36)

    def test_normal(self):
        codelist = [
            ('1+1',2),
            ('a+b',3),
            ('int(pi)', 3)
        ]

        ctx={'a':1, 'b':2, 'pi': 3.14}
        m = 
evalidate.base_eval_model.clone()
        m.nodes.append('Call')
        m.allowed_functions.append('int')

        for src, r in codelist:
            result = evalidate.Expr(src, model=m).eval(ctx)
            assert(result==r)

    def test_stack(self):
        src='int(1)'
        m = evalidate.base_eval_model.clone()
        m.nodes.append('Call')
        m.allowed_functions.append('int')

        for i in range(199):
            src = f'int( {src} )'
        r = evalidate.Expr(src, model=m).eval()
        assert( r==1 )

    def test_startswith(self):
        m = evalidate.base_eval_model.clone()
        m.nodes.extend(['Call', 'Attribute'])
        m.allowed_functions.append('int')
        m.attributes.append('startswith')

        src = '"abcdef".startswith("abc")'
        r = evalidate.Expr(src, model=m).eval()
        assert r
evalidate-2.0.3/tests/test_jailbreak.py0000644000000000000000000000475713615410400015106 0ustar00
from evalidate import ExecutionException, ValidationException, Expr, base_eval_model
import pytest

class TestJailbreak():
    def test_ossystem_nocall(self):
        # must fail because calls are not allowed at all
        with pytest.raises(ValidationException):
            Expr('os.system("clear")')

    def test_ossystem_call_int(self):
        # must fail because this function not allowed
        with pytest.raises(ValidationException):
            m = base_eval_model.clone()
            m.nodes.append('Call')
            m.allowed_functions.append('int')
            Expr('os.system("clear")', model=m)

    def test_ossystem_import(self):
        # must fail anyway
        with pytest.raises(ValidationException):
            m = base_eval_model.clone()
            m.nodes.append('Call')
            m.allowed_functions.append('int')
            Expr("__import__('os').system('clear')", model=m)

    def test_builtins(self):
        # indirect call
        src="""__builtins__['eval']("print(1)")"""
        with pytest.raises(ValidationException):
            m = base_eval_model.clone()
            m.nodes.append('Call')
            result = Expr(src, model=m)

    def test_bomb(self):
        bomb_list = ["""
(lambda fc=(
    lambda n: [
        c for c in
            ().__class__.__bases__[0].__subclasses__()
            if c.__name__ == n
        ][0]
    ):
    fc("function")(
        fc("code")(
            0,0,0,0,0,b"BOOM",(),(),(),"","",0,b""
        ),{}
    )()
)()
""",
"""
(lambda fc=(
    lambda n: [
        c for c in
            ().__class__.__bases__[0].__subclasses__()
            if c.__name__ == n
        ][0]
    ):
    fc("function")(
        fc("code")(
            0,0,0,0,0,0,b"BOOM",(),(),(),"","",0,b""
        ),{}
    )()
)()
""",
"""
(lambda fc=(
    lambda n: [
        c for c in
            ().__class__.__bases__[0].__subclasses__()
            if c.__name__ == n
        ][0]
    ):
    fc("function")(
        fc("code")(
            0,0,0,0,0,0,b"BOOM",(),(),(),"","","",0,b"",b"",b"",b"",(),()
        ),{}
    )()
)()
"""
        ]

        m = base_eval_model.clone()
        m.nodes.append('Call')

        for bomb in bomb_list:
            with pytest.raises(ValidationException):
                Expr(expr=bomb, model=m)

    def test_mul_overflow(self):
        src = '"a"*1000000*1000000*1000000*1000000'
        with pytest.raises(ExecutionException):
            m = base_eval_model.clone()
            m.nodes.append('Mult')
            Expr(src, model=m).eval()
evalidate-2.0.3/.gitignore0000644000000000000000000000006513615410400012363 0ustar00
*~
__pycache__
evalidate.egg-info/
*pyc
build/
dist/
evalidate-2.0.3/LICENSE0000644000000000000000000000206213615410400011377 0ustar00
MIT License

Copyright (c) 2022 Yaroslav Polyakov

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
evalidate-2.0.3/README.md0000644000000000000000000002250413615410400011654 0ustar00
# Evalidate

Evalidate is a simple Python module for safe and very fast evaluation of user-supplied (possibly malicious) Python expressions.

## Upgrade warning

Version 2.0 is backward incompatible with older versions. The `safeeval()` and `evalidate()` functions were removed, and the `EvalModel` class was introduced. See the [upgrade example in ticket](https://github.com/yaroslaff/evalidate/issues/5), or use an older version (any before 2.0.0, e.g. [v1.1.0](https://pypi.org/project/evalidate/1.1.0/)) if you have old code and do not want to upgrade. But upgrading is easy, so please consider this option.

## Purpose

Originally it was developed for filtering complex data structures, e.g. to find cheap smartphones available for sale:

```python
category=="smartphones" and price<300 and stock>0
```

But it can also be used for other expressions, e.g. arithmetic ones, like

```python
a+b-100
```

Evalidate is the fastest of all secure eval Python modules known to me.

## Install

```shell
pip3 install evalidate
```

## Security

Built-in Python features such as compile() or eval() are powerful enough to run any kind of user-supplied code, but they are insecure if the code is malicious, like `os.system("rm -rf /")`. Evalidate works on the whitelist principle: it allows code only if it consists solely of safe operations (based on the author's views about what is safe and what is not; your mileage may vary, but you can supply your own list of safe operations).

## TL;DR. Just give me safe eval!

```python
from evalidate import Expr, EvalException

src = 'a + 40 > b'
# src = "__import__('os').system('clear')"

try:
    print(Expr(src).eval({'a':10, 'b':42}))
except EvalException as e:
    print(e)
```

Gives output:

`True`

In case of dangerous code (uncomment the second `src` line to test), the output will be:

`Node type 'Call' is not allowed. (whitelist it manually)`

## Exceptions

Evalidate throws exceptions `CompilationException`, `ValidationException`, `ExecutionException`.
All of them inherit from the base exception class `EvalException`.

## Configure validation

Evalidate is very flexible: depending on the security model, the same code can either pass validation or raise an exception.

`EvalModel` is the security model class for eval. It holds lists of allowed AST nodes, allowed function calls and allowed attributes, and a dict of imported functions. There is a built-in model, `base_eval_model`, with basic operations allowed (those that are safe from the author's point of view).

You can create a custom empty model (and extend it later):
~~~python
my_model = evalidate.EvalModel()
~~~
(nothing is allowed by default, even `1+2` will not be considered safe)

or you may start from `base_eval_model` and extend it:
~~~python
from evalidate import Expr, base_eval_model

my_model = base_eval_model.clone()
my_model.nodes.append('Mult')

Expr('2*2', model=my_model).eval()
~~~

To enable the `int()` function, allow the `'Call'` node and add the function to the list of allowed functions:
~~~python
my_model.nodes.append('Call')
my_model.allowed_functions.append('int')
Expr('int(36.6)', model=my_model).eval()
~~~

Or, to call methods (attributes):
~~~python
import evalidate
from evalidate import base_eval_model

m = base_eval_model.clone()
m.nodes.extend(['Call', 'Attribute'])
m.attributes.append('startswith')

src = '"abcdef".startswith("abc")'
r = evalidate.Expr(src, model=m).eval()
~~~

But even with these settings, an exploit attempt with an expression like `__builtins__["eval"](1)` will fail (good!).

### Exporting my functions to eval code
~~~python
from evalidate import Expr, base_eval_model

def one():
    return 1

m = base_eval_model.clone()
m.nodes.append('Call')
# register the function; otherwise validation rejects the call to one()
m.imported_functions['one'] = one
Expr('one()', model=m).eval()
~~~

## Improve speed by using native eval() with validated code

Evalidate is very fast, but it still takes CPU cycles...
If you want to achieve the maximum possible speed, you can use the native Python [eval](https://docs.python.org/3/library/functions.html#eval) with this kind of code:

~~~python
from evalidate import Expr

d = dict(a=1, b=2)
expr = Expr('a+b')
eval(expr.code, None, d) # <-- native python eval, will run at eval() speed
~~~

This is as secure as `expr.eval()`, because `expr.code` is already validated. The difference is small: executing `expr.code` directly can raise any exception, while `expr.eval()` raises only `ExecutionException`. Also, `expr.eval()` passes the model's `imported_functions` to eval() for you; with native eval() you have to export your functions manually (pass them in the globals or locals dict yourself).

## Limitations

evalidate uses [ast.parse()](https://docs.python.org/3/library/ast.html#ast.parse) to get an [AST node](https://docs.python.org/3/library/ast.html#node-classes) and validates it.

>Warning
>
>It is possible to crash the Python interpreter with a sufficiently large/complex string due to stack depth limitations in Python’s AST compiler.

In my test, it works well with 200 nested `int()` calls, `int(int(.... int(1)...))`, but not with 201. The source code is then 1000+ characters long. But even if evalidate gets such code, it will just raise `CompilationException`.

### evalidate.security.test_security()

Evalidate is very flexible, and it's possible to shoot yourself in the foot if you try hard. `test_security()` checks your configuration (the model's nodes, functions and attributes) against a given list of possible attack code, or against the built-in list of attacks. `test_security()` returns True if everything is OK (every attack raised `ValidationException`) or False if something passed validation.

This code will never print (I hope).
~~~python
from evalidate.security import test_security

test_security() or print("default rules are vulnerable!")
~~~

But this check will fail, because the model allows the attack code to pass validation (suppose you do not want anyone to call `int()`):
~~~python
from evalidate import base_eval_model
from evalidate.security import test_security

attacks = ['int(1)']
model = base_eval_model.clone()
model.nodes.append('Call')
model.allowed_functions.append('int')
test_security(attacks, model=model, verbose=True)
~~~

It will print:
~~~
Testing attack code:
int(1)
Problem! Attack passed validation without exception!
Code:
int(1)
~~~

## Example

### Filtering by user-supplied condition ###

This is the code of `examples/products.py`. The expression is validated and compiled once and executed (as byte-code, very fast) many times, so filtering is both fast and secure.

~~~python
#!/usr/bin/env python3

import requests
from evalidate import Expr, ValidationException, CompilationException, ExecutionException
import json
import sys

data = requests.get('https://dummyjson.com/products?limit=100').json()

try:
    src = sys.argv[1]
except IndexError:
    src = 'True'

try:
    expr = Expr(src)
except (ValidationException, CompilationException) as e:
    print(e)
    sys.exit(1)

c=0
for p in data['products']:
    # print(p)
    try:
        r = expr.eval(p)
        if r:
            print(json.dumps(p, indent=2))
            c+=1
    except ExecutionException as e:
        print("Runtime exception:", e)
print("# {} products matches".format(c))
~~~

~~~shell
# print all 100 products
./products.py

# Only cheap products, 8 matches
./products.py 'price<20'

# smartphones (5)
./products.py 'category=="smartphones"'

# good smartphones
./products.py 'category=="smartphones" and rating>4.5'

# cheap smartphones
./products.py 'category=="smartphones" and price<300'
~~~

## Similar projects and benchmark

[asteval](https://newville.github.io/asteval/)

While asteval can compute much more complex code (define functions, use Python math libraries), it has drawbacks:

- asteval is much slower (evalidate can be used at the speed of eval() on Python bytecode)
- user can provide source code which runs for a very long time and
  consumes many resources

[simpleeval](https://github.com/danthedeckie/simpleeval)

A very similar project that also uses the AST approach and is optimized to re-evaluate pre-parsed expressions. But parsed expressions are stored as the higher-level [ast.Expr](https://docs.python.org/3/library/ast.html#ast.Expr) type, and this approach is a few times slower, while evalidate uses the native Python `code` type, so evaluation itself runs at the speed of Python eval().

evalidate is good for running the same expression against different data.

## Benchmarking

We use `benchmark/benchmark.py` in this repository. We prepare a list of 1 million products (actually, we take a sample of just 100 products and repeat it 10 000 times to get 1 million), and then filter it, finding only specific products by an "untrusted user-supplied expression" (`price < 20` in this case).

~~~
Products: 1000000 items
evalidate_raw_eval(): 0.266s
evalidate_eval(): 0.326s
test_simpleeval(): 1.824s
test_asteval(): 26.106s
~~~

As you see, evalidate is a few times faster than simpleeval, and both are much faster than asteval. Maybe my test is not perfectly optimized (I'm not an expert with simpleeval/asteval); if you can suggest better filtering sample code (which produces a faster result), I will include it. (Benchmark code must treat the expression as unknown in advance and untrusted.)

## Read about eval() risks

- https://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html
- https://netsec.expert/posts/breaking-python3-eval-protections/
- https://realpython.com/python-eval-function/

Note: the realpython article shows an example with a nice short method of validating source (using `code.co_names`), but it is vulnerable: it passes the "bomb" from the Ned Batchelder article (the bomb has an empty `co_names` tuple), which crashes the interpreter. Evalidate can block this code and similar bombs (unless you intentionally configure evalidate to pass specific bomb code. Yes, with evalidate it is hard to shoot yourself in the foot, but it is possible if you try hard).
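
The difference between the two validation approaches can be sketched with the standard library alone. The snippet below is an illustration, not evalidate's actual code: `ALLOWED_NODES`, `names_seen_by_co_names()` and `rejected_node_types()` are hypothetical names made up for this example, and the whitelist is only a small subset of `base_eval_model.nodes`. The bomb is compiled and parsed here, but never evaluated.

~~~python
import ast

# Ned Batchelder's "bomb" with the 3.8-3.10 payload from evalidate/security.py.
# It is only compiled and parsed below, never evaluated.
BOMB = """
(lambda fc=(
    lambda n: [
        c for c in
            ().__class__.__bases__[0].__subclasses__()
            if c.__name__ == n
        ][0]
    ):
    fc("function")(
        fc("code")(
            0,0,0,0,0,0,b"BOOM",(),(),(),"","",0,b""
        ),{}
    )()
)()
"""

# Hypothetical whitelist for this sketch (a small subset of base_eval_model.nodes).
ALLOWED_NODES = {"Expression", "Constant", "Name", "Load",
                 "BinOp", "Add", "Compare", "Lt"}


def names_seen_by_co_names(src):
    """Return the names a co_names-based check would look at (top-level code object only)."""
    return compile(src.strip(), "<expr>", "eval").co_names


def rejected_node_types(src):
    """Return the sorted AST node type names that are not on the whitelist."""
    tree = ast.parse(src.strip(), mode="eval")
    return sorted({type(node).__name__ for node in ast.walk(tree)} - ALLOWED_NODES)


print(names_seen_by_co_names(BOMB))      # nothing for a co_names check to reject
print(rejected_node_types(BOMB))         # Lambda, Call, Attribute, ... are all rejected
print(rejected_node_types("a + 1 < b"))  # a plain filter expression passes
~~~

A `co_names` check inspects only the top-level code object, while the attribute lookups of the bomb live in nested code objects, so it sees nothing suspicious. A walk over the AST sees every node and would reject the expression at the first `Lambda` or `Call`.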
## More info

Want more info? Check the source code of the module: it's very short and simple, and easy to modify.

## Contact

Write me: yaroslaff at gmail.com
evalidate-2.0.3/pyproject.toml0000644000000000000000000000141013615410400013302 0ustar00
[build-system]
# requires = ["setuptools >= 40.6.0", "wheel"]
requires = ["hatchling", "wheel"]
# build-backend = "setuptools.build_meta"
build-backend = "hatchling.build"

[project]
name="evalidate"
dynamic = [ "version" ]
dependencies = []
authors = [
  { name="Yaroslav Polyakov", email="yaroslaff@gmail.com" },
]
description = "Validation and secure evaluation of untrusted python expressions"
readme = "README.md"
requires-python = ">=3.8"
classifiers = [
    "Programming Language :: Python :: 3",
    "License :: OSI Approved :: MIT License",
    "Operating System :: OS Independent",
]

[project.urls]
Homepage = "https://github.com/yaroslaff/evalidate"
Issues = "https://github.com/yaroslaff/evalidate/issues"

[tool.hatch.version]
path = 'evalidate/__init__.py'
evalidate-2.0.3/PKG-INFO0000644000000000000000000002354213615410400011475 0ustar00
Metadata-Version: 2.3
Name: evalidate
Version: 2.0.3
Summary: Validation and secure evaluation of untrusted python expressions
Project-URL: Homepage, https://github.com/yaroslaff/evalidate
Project-URL: Issues, https://github.com/yaroslaff/evalidate/issues
Author-email: Yaroslav Polyakov
License-File: LICENSE
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.8
Description-Content-Type: text/markdown

# Evalidate

Evalidate is a simple Python module for safe and very fast evaluation of user-supplied (possibly malicious) Python expressions.

## Upgrade warning

Version 2.0 is backward incompatible with older versions. The `safeeval()` and `evalidate()` functions were removed, and the `EvalModel` class was introduced.
See the [upgrade example in the ticket](https://github.com/yaroslaff/evalidate/issues/5), or use an older version (any before 2.0.0, e.g. [v1.1.0](https://pypi.org/project/evalidate/1.1.0/)) if you have old code and do not want to upgrade. But upgrading is easy, so please consider this option.

## Purpose

Originally it was developed for filtering complex data structures, e.g.

Find cheap smartphones available for sale:
```python
category=="smartphones" and price<300 and stock>0
```

But it can also be used for other expressions, e.g. arithmetic ones, like
```python
a+b-100
```

Evalidate is the fastest among all (known to me) secure eval Python modules.

## Install

```shell
pip3 install evalidate
```

## Security

Built-in Python features such as compile() or eval() are powerful enough to run any kind of user-supplied code, but they are insecure if that code is malicious, like `os.system("rm -rf /")`. Evalidate works on the whitelist principle, allowing code only if it consists solely of safe operations (based on the author's views about what is safe and what is not; your mileage may vary, but you can supply your own list of safe operations).

## TL;DR. Just give me safe eval!

```python
from evalidate import Expr, EvalException

src = 'a + 40 > b'
# src = "__import__('os').system('clear')"

try:
    print(Expr(src).eval({'a':10, 'b':42}))
except EvalException as e:
    print(e)
```

Gives output:
`True`

In case of dangerous code (uncomment the second src line to test), the output will be:
`ERR: Operation type Call is not allowed`

## Exceptions

Evalidate throws the exceptions `CompilationException`, `ValidationException` and `ExecutionException`. All of them inherit from the base exception class `EvalException`.

## Configure validation

Evalidate is very flexible: depending on the security model, the same code can either pass validation or raise an exception.

EvalModel is the security model class for eval: it holds lists of allowed AST nodes, function calls and attributes, and a dict of imported functions.
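The whitelist idea behind such a model can be sketched with the standard library alone. This is a simplified illustration, not evalidate's actual implementation: `ALLOWED_NODES` and `check_whitelist()` are hypothetical names, and the node list is just large enough for the TL;DR expression.

```python
import ast

# Assumed minimal whitelist of AST node type names (illustrative only).
ALLOWED_NODES = {
    "Expression", "BinOp", "Add", "Sub", "Compare", "Gt", "Lt",
    "Name", "Load", "Constant",
}

def check_whitelist(src, allowed=ALLOWED_NODES):
    """Parse src as an expression and reject any node type not in the whitelist."""
    tree = ast.parse(src, mode="eval")
    for node in ast.walk(tree):
        name = type(node).__name__
        if name not in allowed:
            raise ValueError(f"Operation type {name} is not allowed")
    # Only a fully whitelisted tree is compiled to a code object.
    return compile(tree, "<expr>", "eval")

code = check_whitelist("a + 40 > b")
result = eval(code, {"__builtins__": {}}, {"a": 10, "b": 42})
print(result)
```

An expression such as `__import__('os').system('clear')` contains `Call` and `Attribute` nodes, so under this whitelist `check_whitelist()` raises before anything is compiled or executed.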
There is a built-in model, `base_eval_model`, with basic operations allowed (which are safe from the author's point of view).

You can create a custom empty model (and extend it later):
~~~python
import evalidate

my_model = evalidate.EvalModel()
~~~
(nothing is allowed by default; even `1+2` will not be considered safe)

or you may start from `base_eval_model` and extend it:
~~~python
from evalidate import Expr, base_eval_model

my_model = base_eval_model.clone()
my_model.nodes.append('Mult')

Expr('2*2', model=my_model).eval()
~~~

To enable the `int()` function, you need to allow the `'Call'` node and add this function to the list of allowed functions:
~~~python
my_model.nodes.append('Call')
my_model.allowed_functions.append('int')

Expr('int(36.6)', model=my_model).eval()
~~~

Or, to call attributes:
~~~python
import evalidate
from evalidate import base_eval_model

m = base_eval_model.clone()
m.nodes.extend(['Call', 'Attribute'])
m.attributes.append('startswith')

src = '"abcdef".startswith("abc")'
r = evalidate.Expr(src, model=m).eval()
~~~

But even with these settings, exploiting it with an expression like `__builtins__["eval"](1)` will fail (good!).

### Exporting my functions to eval code

~~~python
def one():
    return 1

m = base_eval_model.clone()
m.nodes.append('Call')
Expr('one()', model=m).eval()
~~~

## Improve speed by using native eval() with validated code

Evalidate is very fast, but it still takes CPU cycles... If you want to achieve the maximum possible speed, you can use the native Python [eval](https://docs.python.org/3/library/functions.html#eval) with this kind of code:

~~~python
from evalidate import Expr

d = dict(a=1, b=2)
expr = Expr('a+b')
eval(expr.code, None, d) # <-- native python eval, will run at eval() speed
~~~

This is as secure as expr.eval(), because `expr.code` is already validated to be secure.

The difference is small: execution of `expr.code` can throw any exception, while `expr.eval()` can throw only ExecutionException. Also, if you want to export your functions to eval, you should do this manually.
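Passing functions "manually" to the native eval() can be sketched as follows. This example uses only the standard library so that it runs on its own: the `compile()` call stands in for an already validated code object (with evalidate that would be `expr.code`), and `one()`, `globals_ctx` and `rows` are hypothetical names for illustration.

```python
# Hypothetical helper that the expression is allowed to call.
def one():
    return 1

# Assumption: in real use this would be `expr.code` from a validated Expr.
code = compile("one() + a + b", "<expr>", "eval")

# Functions are exported by placing them in the globals mapping;
# per-row data is passed as the locals mapping.
globals_ctx = {"__builtins__": {}, "one": one}
rows = [{"a": 1, "b": 2}, {"a": 10, "b": 20}]

# The code object is built once and evaluated against many data sets.
results = [eval(code, globals_ctx, row) for row in rows]
print(results)
```

Keeping the exported functions in the globals mapping and the row data in the locals mapping means that only the data dict changes between evaluations.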
## Limitations

evalidate uses [ast.parse()](https://docs.python.org/3/library/ast.html#ast.parse) to get an [AST node](https://docs.python.org/3/library/ast.html#node-classes) and validate it.

>Warning
>
>It is possible to crash the Python interpreter with a sufficiently large/complex string due to stack depth limitations in Python’s AST compiler.

In my test, it works well with 200 nested int() calls: `int(int(.... int(1)...))`, but not with 201. The source code is 1000+ characters long. But even if evalidate gets such code, it will just raise `CompilationException`.

### evalidate.security.test_security()

Evalidate is very flexible, and it is possible to shoot yourself in the foot if you try hard. `test_security()` checks your configuration (nodes, funcs, attrs) against a given list of possible attack code, or against the built-in list of attacks. `test_security()` returns True if everything is OK (all attacks raised ValidationException) or False if something passed.

This code will never print (I hope).
~~~python
from evalidate.security import test_security

test_security() or print("default rules are vulnerable!")
~~~

But this will fail, because these nodes/funcs let the attack pass validation (suppose you do not want anyone to call `int()`):
~~~python
from evalidate.security import test_security

attacks = ['int(1)']
test_security(attacks, addnodes=['Call'], funcs=['int'], verbose=True)
~~~

It will print:
~~~
Testing attack code:
int(1)
Problem! Attack passed validation without exception!
Code:
int(1)
~~~

## Example

### Filtering by user-supplied condition ###

This is the code of `examples/products.py`. The expression is validated and compiled once and executed (as byte-code, very fast) many times, so filtering is both fast and secure.
~~~python
#!/usr/bin/env python3

import requests
from evalidate import Expr, ValidationException, CompilationException, ExecutionException
import json
import sys

data = requests.get('https://dummyjson.com/products?limit=100').json()

try:
    src = sys.argv[1]
except IndexError:
    src = 'True'

try:
    expr = Expr(src)
except (ValidationException, CompilationException) as e:
    print(e)
    sys.exit(1)

c=0
for p in data['products']:
    # print(p)
    try:
        r = expr.eval(p)
        if r:
            print(json.dumps(p, indent=2))
            c+=1
    except ExecutionException as e:
        print("Runtime exception:", e)
print("# {} products matches".format(c))
~~~

~~~shell
# print all 100 products
./products.py

# Only cheap products, 8 matches
./products.py 'price<20'

# smartphones (5)
./products.py 'category=="smartphones"'

# good smartphones
./products.py 'category=="smartphones" and rating>4.5'

# cheap smartphones
./products.py 'category=="smartphones" and price<300'
~~~

## Similar projects and benchmark

[asteval](https://newville.github.io/asteval/)

While asteval can compute much more complex code (define functions, use python math libraries) it has drawbacks:
- asteval is much slower (evalidate can be used at speed of eval() python bytecode)
- user can provide source code which runs very long time and consumes many resources

[simpleeval](https://github.com/danthedeckie/simpleeval)

Very similar project, using AST approach too and optimized to re-evaluate pre-parsed expressions. But parsed expressions are stored as more high-level [ast.Expr](https://docs.python.org/3/library/ast.html#ast.Expr) type and this approach is few times slower, while evalidate uses python native `code` type and evaluation itself goes at speed of python eval()

evalidate is good to run same expression against different data.

## Benchmarking

We use `benchmark/benchmark.py` in this repository.
We prepare a list of 1 million products (actually, we take a sample of just 100 products and repeat it 10 000 times to get 1 million), and then filter it, selecting only the products that match an "untrusted user-supplied expression" (`price < 20` in this case).

~~~
Products: 1000000 items
evalidate_raw_eval(): 0.266s
evalidate_eval(): 0.326s
test_simpleeval(): 1.824s
test_asteval(): 26.106s
~~~

As you can see, evalidate is a few times faster than simpleeval (roughly 5 to 7 times in this run), and both are much faster than asteval.

Maybe my test is not perfectly optimized (I'm not an expert in simpleeval/asteval). If you can suggest better filtering sample code (which produces a faster result), I will include it. (Benchmark code must treat the expression as untrusted and unknown in advance.)

## Read about eval() risks

- https://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html
- https://netsec.expert/posts/breaking-python3-eval-protections/
- https://realpython.com/python-eval-function/

Note: the realpython article shows an example with a nice short method of validating source (using `code.co_names`), but it is vulnerable: it passes the "bomb" from the Ned Batchelder article (the bomb has an empty `co_names` tuple), which crashes the interpreter. Evalidate can block this code and similar bombs (unless you intentionally configure evalidate to pass specific bomb code. Yes, with evalidate it is hard to shoot yourself in the foot, but it is possible if you try hard).

## More info

Want more info? Check the source code of the module; it is very short and simple, and easy to modify.

## Contact

Write me: yaroslaff at gmail.com