======================
Matching Python Syntax
======================

The ``peak.rules.syntax`` module allows you to define pattern-matching
predicates against snippets of parameterized Python code, such that a rule
expression like::

    syntax.match(expr, type(`x`) is `y`) and isinstance(y, Const)

will return true if ``expr`` is a PEAK-Rules AST of the form::

    Compare(Call(Const(type), (v1,)), ('is', Const(v2)))

(where v1 and v2 are arbitrary values).

.. contents:: **Table of Contents**


Bind Variables
==============

Bind variables are placeholders in a pattern that "bind" themselves to the
value found in that location in the matched data structure.  Thus, in the
example above, ```x``` and ```y``` are bind variables, and cause "y" in the
later part of the expression to refer to the right-hand side of the ``is``
operator being matched.  (The arbitrary value ``v2`` in the example above.)

Bind variables are represented within a tree as an AST node created from the
variable name::

    >>> from peak.rules.syntax import Bind
    >>> Bind('x')
    Bind('x')


Compiling Tree-Match Predicates
===============================

The ``match_predicate(pattern, expr, binds)`` function is used to combine a
pattern AST and an expression AST to create a PEAK-Rules predicate object
that will match the specified pattern.  The `binds` argument is a dictionary
mapping from bind-variable names to lists of expression ASTs, and is
modified in-place as the predicate is assembled::

    >>> from peak.rules.syntax import match_predicate

Rules defined for this function will determine what to do based on the type
of `pattern`.  If `pattern` is a bind variable, the `binds` dictionary is
updated in-place, inserting `expr` under the bind variable's name, and
``True`` is returned, indicating that this part of the pattern will always
match::

    >>> from peak.util.assembler import Local
    >>> b = {}
    >>> match_predicate(Bind('x'), Local('y'), b)
    True
    >>> b
    {'x': [Local('y')]}

If there's already an entry for that variable in the `binds` dictionary, a
more complex predicate is returned, performing an equality comparison
between the new binding and the old binding of the variable, and the value
in `binds` is updated::

    >>> match_predicate(Bind('x'), Local('z'), b)
    Test(Truth(Compare(Local('z'), (('==', Local('y')),))), True)

This is so that patterns like "`x` is not `x`" will actually compare the two
"x"s and see if they're equal.  Of course, if you bind the same variable
more than once to equal expression ASTs, you will not get back a comparison,
and the `binds` will be unchanged::

    >>> match_predicate(Bind('x'), Local('z'), b)
    True
    >>> b
    {'x': [Local('y'), Local('z')]}

Finally, there is a special exception for bind variables named ```_```: that
is, a single underscore.  Bind variables of this name are never stored in
the `binds`, and always return ``True`` as a predicate, allowing you to use
them as "don't care" placeholders::

    >>> any = Bind('_')
    >>> match_predicate(any, Local('q'), b)
    True
    >>> b
    {'x': [Local('y'), Local('z')]}


Matching Structures and AST nodes
---------------------------------

For most node types other than ``Bind``, the predicates are a bit more
complex.  By default, the predicate should be an exact (``istype``) match of
the node type, intersected with a recursive application of
``match_predicate()`` to each of the target node's children.
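In outline, this default behaves something like the following sketch (purely
illustrative pseudocode -- the real implementation is a set of rules added
to ``match_predicate`` itself, and ``intersect`` here merely stands in for
however the resulting predicates are actually combined)::

    def match_node(pattern, expr, binds):   # hypothetical helper
        # exact-type test for the target node itself...
        result = Test(IsInstance(expr), istype(type(pattern), True))
        # ...intersected with a recursive match of each child (children
        # start at position 1, and are addressed via Getitem() so that
        # criteria apply to the right part of the tree)
        for pos, child in enumerate(pattern):
            if pos:
                result = intersect(
                    result,
                    match_predicate(child, Getitem(expr, Const(pos)), binds)
                )
        return result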
For example::

    >>> b = {}
    >>> from peak.util.assembler import *
    >>> from peak.rules.codegen import *

    >>> match_predicate(Add(any, any), Local('q'), b)
    Test(IsInstance(Local('q')), istype(<...Add...>, True))
    >>> b
    {}

Each child is defined via a ``Getitem()`` operation on the target node, so
that any placeholders and criteria will target the right part of the tree::

    >>> match_predicate(Add(Bind('x'), Bind('y')), Local('q'), b)
    Test(IsInstance(Local('q')), istype(<...Add...>, True))
    >>> b
    {'y': [Getitem(Local('q'), Const(2))], 'x': [Getitem(Local('q'), Const(1))]}

Non-node patterns are treated as equality comparisons::

    >>> b = {}
    >>> match_predicate(42, Local('q'), b)
    Test(Comparison(Local('q')), Value(42, True))
    >>> b
    {}

Except for ``None``, which produces an ``is None`` test::

    >>> match_predicate(None, Local('q'), b)
    Test(Identity(Local('q')), IsObject(None, True))
    >>> b
    {}

And sequences are matched by comparing their length::

    >>> match_predicate((), Local('q'), b)
    Test(Comparison(Call(Const(<... len>), (Local('q'),),...)), Value(0, True))

    >>> match_predicate([], Local('q'), b)
    Test(Comparison(Call(Const(<... len>), (Local('q'),),...)), Value(0, True))

    >>> b
    {}

And recursively matching their contents::

    >>> match_predicate((Bind('x'), Add(Bind('y'), any)), Local('q'), b)
    Signature([Test(Comparison(Call(Const(<... len>), (Local('q'),),...)), Value(2, True)), Test(IsInstance(Getitem(Local('q'), Const(1))), istype(<...Add...>, True))])

    >>> b
    {'y': [Getitem(Getitem(Local('q'), Const(1)), Const(1))], 'x': [Getitem(Local('q'), Const(0))]}


Parsing Syntax Patterns
=======================

The ``syntax.SyntaxBuilder`` class is used to parse Python expressions into
AST patterns suitable for use with ``match_predicate``::

    >>> from peak.rules.syntax import SyntaxBuilder, match
    >>> builder = SyntaxBuilder({}, locals(), globals(), __builtins__)
    >>> pe = builder.parse

It parses backquoted identifiers into ``Bind`` nodes::

    >>> pe('type(`x`) is `y`')
    Compare(Call(Const(<...type...>), (Bind('x'),), (), (), (), True), (('is', Bind('y')),))

And rejects all other use of backquotes::

    >>> pe('`type(x)`')
    Traceback (most recent call last):
      ...
    SyntaxError: backquotes may only be used around an identifier

In all other respects, it's essentially the same as ``codegen.ExprBuilder``.


The ``match()`` Pseudo-function
-------------------------------

This isn't really a function, but you can use it in a predicate string in
order to perform a pattern match on a PEAK-Rules AST.  It's mainly intended
for use in extending PEAK-Rules to recognize and replace various kinds of
subexpression patterns (e.g. by adding rules to
``predicates.expressionSignature()``), but it can of course also be used in
any other tools you build atop PEAK-Rules' expression machinery.

In this example, we show it being used to define a rule that will recognize
expressions of the form ``"type(x) is y"``, where x and y are arbitrary
expressions::

    >>> from peak.rules.syntax import match
    >>> from peak.rules.predicates import CriteriaBuilder
    >>> builder = CriteriaBuilder(
    ...     {'expr':Local('expr')}, locals(), globals(), __builtins__
    ... )
    >>> pe = builder.parse

    >>> pe('match(expr, type(`x`) is `y`)')
    Signature([Test(IsInstance(Local('expr')), istype(<...Compare...>, True)),
               Test(IsInstance(Getitem(Local('expr'), Const(1))), istype(<...Call...>, True)),
               Test(Comparison(Getitem(Getitem(Local('expr'), Const(1)), Const(1))), Value(Const(<...type...>), True)),
               Test(Comparison(Call(Const(<... len>), (Getitem(Getitem(Local('expr'), Const(1)), Const(2)),), (), (), (), True)), Value(1, True)),
               Test(Comparison(Call(Const(<... len>), (Getitem(Getitem(Local('expr'), Const(1)), Const(3)),), (), (), (), True)), Value(0, True)),
               Test(Comparison(Call(Const(<... len>), (Getitem(Getitem(Local('expr'), Const(1)), Const(4)),), (), (), (), True)), Value(0, True)),
               Test(Comparison(Call(Const(<... len>), (Getitem(Getitem(Local('expr'), Const(1)), Const(5)),), (), (), (), True)), Value(0, True)),
               Test(Comparison(Getitem(Getitem(Local('expr'), Const(1)), Const(6))), Value(True, True)),
               Test(Comparison(Call(Const(<... len>), (Getitem(Local('expr'), Const(2)),), (), (), (), True)), Value(1, True)),
               Test(Comparison(Call(Const(<... len>), (Getitem(Getitem(Local('expr'), Const(2)), Const(0)),), (), (), (), True)), Value(2, True)),
               Test(Comparison(Getitem(Getitem(Getitem(Local('expr'), Const(2)), Const(0)), Const(0))), Value('is', True))])

    >>> builder.bindings[0]
    {'y': Getitem(Getitem(Getitem(Local('expr'), Const(2)), Const(0)), Const(1)), 'x': Getitem(Getitem(Getitem(Local('expr'), Const(1)), Const(2)), Const(0))}


===========================================
Creating Generic Functions using PEAK-Rules
===========================================

PEAK-Rules is a highly-extensible framework for creating and using generic
functions, from the very simple to the very complex.  Out of the box, it
supports multiple-dispatch on positional arguments using tuples of types,
full predicate dispatch using strings containing Python expressions, and
CLOS-like method combining.  (But the framework allows you to mix and match
dispatch engines and custom method combinations, if you need or want to.)

Basic usage::

    >>> from peak.rules import abstract, when, around, before, after

    >>> @abstract()
    ... def pprint(ob):
    ...     """A pretty-printing generic function"""

    >>> @when(pprint, (list,))
    ... def pprint_list(ob):
    ...     print "pretty-printing a list"

    >>> @when(pprint, "isinstance(ob,list) and len(ob)>50")
    ... def pprint_long_list(ob):
    ...     print "pretty-printing a long list"

    >>> pprint([1,2,3])
    pretty-printing a list

    >>> pprint([42]*1000)
    pretty-printing a long list

    >>> pprint(42)
    Traceback (most recent call last):
      ...
    NoApplicableMethods: ...

PEAK-Rules works with Python 2.3 and up -- just omit the ``@`` signs if your
code needs to run under 2.3.  Also, note that with PEAK-Rules, *any*
function can be generic: you don't have to predeclare a function as generic.
(The ``abstract`` decorator is used to declare a function with no *default*
method; i.e., one that will raise ``NoApplicableMethods`` instead of
executing a default implementation, if no rules match the arguments it's
invoked with.)

PEAK-Rules is still under development; it lacks much in the way of error
checking, so if you mess up your rules, it may not be obvious where or how
you did.  User documentation is also lacking, although there are extensive
doctests describing and testing most of its internals, including:

* `Introduction`_ (Method combination, porting from RuleDispatch)

* `Core Design Overview`_ (Terminology, method precedence, etc.)

* The `Basic AST Builder`_ and advanced `Code Generation`_

* `Criteria`_, `Indexing`_, and `Predicates`_

* `Syntax pattern matching`_

(Please note that these documents are still in a state of flux and some may
still be incomplete or disorganized, prior to the first official release.)

Source distribution snapshots are generated daily, but you can also update
directly from the `development version`_ in SVN.

.. _development version: svn://svn.eby-sarna.com/svnroot/PEAK-Rules#egg=PEAK_Rules-dev

.. _Introduction: http://peak.telecommunity.com/DevCenter/PEAK-Rules#toc
.. _Core Design Overview: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Design
.. _Basic AST Builder: http://peak.telecommunity.com/DevCenter/PEAK-Rules/AST-Builder
.. _Code Generation: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Code-Generation
.. _Criteria: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Criteria
.. _Indexing: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Indexing
.. _Predicates: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Predicates
.. _Syntax pattern matching: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Syntax-Matching

.. _toc:
.. contents:: **Table of Contents**


-----------------
Developer's Guide
-----------------

XXX basics tutorial should go here


Method Combination and Custom Method Types
==========================================

Sometimes, more than one method of a generic function applies in a given
circumstance.  For example, you might need to sum the results of a series of
pricing rules in order to compute a product's price.  Or, sometimes you'd
like a method to be able to modify the result of a less-specific method.
For these scenarios, you will want to use "method combination", either using
PEAK-Rules' built-in method decorators, or custom method types of your own.


Using ``next_method``
---------------------

By default, a generic function will only invoke the most-specific applicable
method.  However, if you add a ``next_method`` argument to the beginning of
an individual method's signature, you can use it to call the "next method"
that applies.  That is, the second-most-specific method.  If that method
also has a ``next_method`` argument, it too will be able to invoke the next
method after it, and so on, down through all the applicable methods.  For
example::

    >>> from peak.rules import DispatchError

    >>> @abstract()
    ... def foo(bar, baz):
    ...     """Foo bar and baz"""

    >>> @when(foo, "bar>1 and baz=='spam'")
    ... def foo_one_spam(next_method, bar, baz):
    ...     return bar + next_method(bar, baz)

    >>> @when(foo, "baz=='spam'")
    ... def foo_spam(bar, baz):
    ...     return 42

    >>> @when(foo, "baz=='blue'")
    ... def foo_spam(next_method, bar, baz):
    ...     # if next_method is an instance of DispatchError, it means
    ...     # that calling it will raise that error (NoApplicableMethods
    ...     # or AmbiguousMethods)
    ...     assert isinstance(next_method, DispatchError)
    ...
    ...     # but we'll call it anyway, just to demo the error
    ...     return 22 + next_method(bar, baz)

    >>> foo(2,"spam")   # 2 + 42
    44

    >>> foo(2,"blue")   # 22 + ...no next method!
    Traceback (most recent call last):
      File ... combiners.txt... in foo_spam
        return 22 + next_method(bar, baz)
      ...
    NoApplicableMethods: ...

Notice that ``next_method`` comes *before* ``self`` in the arguments if the
generic function is an instance method.  (If used, it must be the *very
first* argument of the method.)  Its value is supplied automatically by the
generic function machinery, so when you call ``next_method`` you do not have
to care whether the next method needs to know *its* next method; just pass
in all of the *other* arguments (including ``self`` if applicable) and the
``next_method`` implementation will do the rest.

Also notice that methods that do not call their next method do not need to
have a ``next_method`` argument.
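To make the instance-method case concrete, here's a minimal sketch (the
``Greeter`` class and its condition are invented for illustration, following
the same pattern as the ``BankAccount`` example below)::

    >>> class Greeter:
    ...     def greet(self, name):
    ...         return "Hello, " + name
    ...
    ...     @around(greet, "name=='world'")
    ...     def greet_world(next_method, self, name):
    ...         # note that ``next_method`` comes first, *before* ``self``
    ...         return next_method(self, name) + "!"

    >>> Greeter().greet("world")
    'Hello, world!'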
If a method calls ``next_method`` when there are no further methods
available, ``NoApplicableMethods`` is raised.  Similarly, if there is more
than one "next method" and they are all equally specific (i.e. ambiguous),
then ``AmbiguousMethods`` is raised.

Most of the time, you will know when writing a routine whether it's safe to
call ``next_method``.  But sometimes you need a routine to behave
differently depending on whether a next method is available.  If calling
``next_method`` will raise an error, then ``next_method`` will be an
instance of the error class, so you can detect it with ``isinstance()``.

If there are no remaining methods, then ``next_method`` will be an instance
of ``NoApplicableMethods``, and if the next method is ambiguous, it will be
an ``AmbiguousMethods`` instance.  In either case, calling ``next_method``
will raise that error with the supplied arguments.  (And ``DispatchError``
is a base class of both ``AmbiguousMethods`` and ``NoApplicableMethods``, so
you can just check for that.)


Before/After Methods
--------------------

Sometimes you'd like for some additional validation or notification to occur
before or after the "normal" or "primary" methods.  This is what "before",
"after", and "around" methods are for.  For example::

    >>> class BankAccount:
    ...
    ...     def __init__(self,balance,protection=0):
    ...         self.balance = balance
    ...         self.protection = protection
    ...
    ...     def withdraw(self,amount):
    ...         """Withdraw 'amount' from bank"""
    ...         self.balance -= amount      # nominal case
    ...
    ...     @before(withdraw, "amount>self.balance and self.protection==0")
    ...     def prevent_overdraft(self, amount):
    ...         raise ValueError("Insufficient funds")
    ...
    ...     @after(withdraw, "amount>self.balance")
    ...     def automatic_overdraft(self, amount):
    ...         print "Transferring",-self.balance,"from overdraft protection"
    ...         self.protection += self.balance
    ...         self.balance = 0

    >>> acct = BankAccount(200)
    >>> acct.withdraw(400)
    Traceback (most recent call last):
      ...
    ValueError: Insufficient funds

    >>> acct.protection = 300
    >>> acct.withdraw(400)
    Transferring 200 from overdraft protection

    >>> acct.balance
    0
    >>> acct.protection
    100

This specific example could have been written entirely with normal
``when()`` methods, by using more complex conditions.  But, in more complex
scenarios, where different modules may be adding rules to the same generic
function, it's not possible for one module to predict whether its conditions
will be more specific than another's, and whether it will need to call
``next_method``, etc.

So, generic functions offer ``before()`` and ``after()`` methods that run
before and after the ``when()`` (aka "primary") methods, respectively.
Unlike primary methods, ``before()`` and ``after()`` methods:

* Are allowed to have ambiguous conditions (and if they do, they execute in
  the order in which they were added to the generic function)

* Are *always* run when their conditions apply, with no need to call
  ``next_method`` to invoke the next method

* Cannot return a useful value and do not have access to the return value of
  any other method

The overall order of method execution is:

1. All applicable ``before()`` methods, from most-specific to
   least-specific; methods at the same level of specificity execute in the
   order they were added.

2. Most-specific primary method, which may optionally chain to less-specific
   primary methods.  ``AmbiguousMethods`` or ``NoApplicableMethods`` may be
   raised if the most-specific method is ambiguous or no primary methods are
   applicable.

3.
All applicable ``after()`` methods, from *least-specific* to most-specific, with methods at the same level of specificity executing in the reverse order from the order they were added. (In other words, the more specific the ``after()`` condition, the "more after" it gets run!) If any of these methods raises an uncaught exception, the overall function execution terminates at that point, and methods later in the order are not run. "Around" Methods ---------------- Sometimes you need to recognize certain special cases, and perhaps not run the entire generic function, or need to alter its return value in some way, or perhaps trap and handle certain exceptions, etc. You can do this with "around" methods, which run "around" the entire "before/primary/after" sequence described in the previous section. A good way to think of this is that it's as if the "around" methods form a separate generic function, whose default (least-specific) method is the original, "inner" generic function. When "around" methods are applicable on a given invocation of the generic function, the most-specific "around" method is invoked. It may then choose to call its ``next_method`` to invoke the next-most-specific "around" method, and so on. When there are no more "around" methods, calling ``next_method`` instead invokes the "before", "primary", and "after" methods, according to the sequence described in the previous section. For example:: >>> @around(BankAccount.withdraw, "amount > self.balance") ... def overdraft_fee(next_method,self,amount): ... print "Adding overdraft fee of $25" ... return next_method(self,amount+25) >>> acct.withdraw(20) Adding overdraft fee of $25 Transferring 45 from overdraft protection Values ------ Sometimes, if you're defining a generic function whose job is to classify things, it can get to be a pain defining a bunch of functions or lambdas just to return a few values -- especially if the generic function has a complex signature! So ``peak.rules`` provides a convenience function, ``value()`` for doing this:: >>> from peak.rules import value >>> value(42) value(42) >>> value(42)('whatever') 42 >>> classify = abstract(lambda age:None) >>> when(classify, "age<2")(value("infant")) value('infant') >>> when(classify, "age<13")(value("preteen")) value('preteen') >>> when(classify, "age<5")(value("preschooler")) value('preschooler') >>> when(classify, "age<20")(value("teenager")) value('teenager') >>> when(classify, "age>=20")(value("adult")) value('adult') >>> when(classify, "age>=55")(value("senior")) value('senior') >>> when(classify, "age==16")(value("sweet sixteen")) value('sweet sixteen') >>> classify(17) 'teenager' >>> classify(42) 'adult' Method Combination ------------------ The ``combine_using()`` decorator marks a function as yielding its method results (most-specific to least-specific, with later-defined methods taking precedence), and optionally specifies how the resulting iteration will be post-processed:: >>> from peak.rules import combine_using Let's take a look at how it works, by trying it with different ways of postprocessing on an example generic function. We'll start by defining a function to recreate a generic function with the same set of methods, so you can see what happens when we pass different arguments to ``combine_using``:: >>> class A: pass >>> class B(A): pass >>> class C(A, B): pass >>> class D(B, A): pass >>> def demo(*args): ... """We'll be setting this function up multiple times, so we do it in ... a function. In normal code, you won't need this outer function! ... 
""" ... @combine_using(*args) ... def func(ob): ... return "default" ... ... when(func, (object,))(value("object")) ... when(func, (int,)) (value("int")) ... when(func, (str,)) (value("str")) ... when(func, (A,)) (value("A")) ... when(func, (B,)) (value("B")) ... ... return func In the simplest case, you can just call ``@combine_using()`` with no arguments, and get a generic function that yields the results returned by its methods, in order from most-specific to least-specific:: >>> func = demo() >>> list(func(A())) ['A', 'object', 'default'] >>> list(func(42)) ['int', 'object', 'default'] In the event of ambiguity between methods, methods defined later are called first:: >>> list(func(C())) ['B', 'A', 'object', 'default'] >>> list(func(D())) ['B', 'A', 'object', 'default'] Passing a function to ``@combine_using()``, however, makes it wrap the result iterator with that function, e.g.:: >>> func = demo(list) >>> func(A()) ['A', 'object', 'default'] While including ``abstract`` anywhere in the wrapper sequence makes the function abstract (i.e., it omits the original function's body from the defined methods):: >>> func = demo(abstract, list) >>> func(A()) # 'default' isn't included any more: ['A', 'object'] You can also include more than one function in the wrapper list, and they will be called on the result iterator, first function outermost, ignoring any ``abstract`` in the sequence:: >>> func = demo(str.title, ' '.join) >>> func(B()) 'B A Object Default' >>> func = demo(str.title, abstract, ' '.join) >>> func(B()) 'B A Object' Some stdlib functions you might find useful for ``combine_using()`` include: * ``itertools.chain`` * ``sorted`` * ``reversed`` * ``list`` * ``set`` * ``"".join`` (or other string) * ``any`` * ``all`` * ``sum`` * ``min`` * ``max`` (And of course, you can write and use arbitrary functions of your own.) By the way, when using "around" methods with a method combination, the innermost ``next_method`` will return the *fully processed* combination of all the "when" methods, with the "before/after" methods running before and after the result is returned:: >>> from peak.rules import before, after, around >>> def b(ob): print "before" >>> def a(ob): print "after" >>> def ar(next_method, ob): ... print "entering around" ... print next_method(ob) ... print "leaving around" >>> b = before(func, ())(b) >>> a = after(func, ())(a) >>> ar = around(func, ())(ar) >>> func(B()) entering around before after B A Object leaving around Custom Method Types ------------------- If the standard before/after/around/when/combine_using decorators don't work for your application, you can create custom ones by defining your own "method types" and decorators. Suppose, for example, that you are using a "pricing rules" generic function that operates by summing its methods' return values to produce a total:: >>> @combine_using(sum) ... def getPrice(product, customer=None, options=()): ... """Get this product's price""" ... return 0 # base price for arbitrary items >>> class Product: ... @when(getPrice) ... def __addBasePrice(self, customer, options): ... """Always include the product's base price""" ... return self.base_price >>> @when(getPrice, "'blue suede' in options") ... def blueSuedeUpcharge(product,customer,options): ... 
return 24 >>> getPrice("arbitrary thing") 0 >>> shoes = Product() >>> shoes.base_price = 42 >>> getPrice(shoes) 42 >>> getPrice(shoes, options=['blue suede']) 66 This is useful, sure, but what if you also want to be able to compute discounts or tax as a percentage of the total, rather than as flat additional amounts? We can do this by implementing a custom "method type" and a corresponding decorator, to let us mark rules as computing a discount instead of a flat amount. We'll start by defining the template that will be used to generate our method's implementation. This format for method templates is taken from the DecoratorTools package's ``@template_method`` decorator. ``$args`` is used in places where the original generic function's calling signature is needed, and all local variables should be named so as not to conflict with possible argument names. The first argument of the template method will be the generic function the method is being used with, and all other arguments are defined by the method type's creator. In our case, we'll need two arguments: one for the "body" (the discount method being decorated) and one for the "next method" that will be called to get the base price:: >>> def discount_template(__func, __body, __next_method): ... return """ ... __price = __next_method($args) ... return __price - (__body($args) * __price) ... """ Okay, that's the easy bit. Now we need to define a bunch of other stuff to turn it into a method type and a decorator:: >>> from peak.rules.core import Around, MethodList, compile_method, \ ... combine_actions >>> class DiscountMethod(Around): ... """Subtract a discount""" ... ... def override(self, other): ... if self.__class__ == other.__class__: ... return self.override(other.tail) # drop the other one ... return self.tail_with(combine_actions(self.tail, other)) ... ... def compiled(self, engine): ... body = compile_method(self.body, engine) ... next = compile_method(self.tail, engine) ... return engine.apply_template(discount_template, body, next) >>> discount_when = DiscountMethod.make_decorator( ... "discount_when", "Discount price by the returned multiplier" ... ) >>> DiscountMethod >> MethodList # mark precedence The ``make_decorator()`` method of ``Method`` objects lets you create decorators similar to ``when()``, that we can now use to add a discount:: >>> @discount_when(getPrice, ... "customer=='Elvis' and 'blue suede' in options and product is shoes" ... ) ... def ElvisGetsTenPercentOff(product,customer,options): ... return .1 >>> getPrice(shoes) 42 >>> print getPrice(shoes, 'Elvis', options=['blue suede']) 59.4 >>> getPrice(shoes, 'Elvis') # no suede, no discount! 42 XXX This is still pretty hard; but without some real-world use cases for custom methods, it's hard to tell how to streamline the common cases. Porting Code from RuleDispatch ============================== The major design differences between PEAK-Rules and RuleDispatch are: 1. It's designed for extensibility/pluggability from the ground up 2. It's built from the ground up using generic functions instead of adaptation, so its code is a lot more straightforward. (The current implementation, combined with all its dependencies, is roughly the same number of lines as RuleDispatch *without* any of its dependencies -- and already has features that can't even be *added* to RuleDispatch.) 3. It generates custom bytecode for each generic function, to minimize calling and interpreter overhead, and to potentially allow compatibility with Psyco and PyPy in the future. 
(Currently, neither Psyco nor PyPy support the "computed jump" trick used in
the generated code, so don't try to Psyco-optimize any generic functions yet
-- it'll probably core dump!)

Because of its extensible design, PEAK-Rules can use custom-tuned engines
for specific application scenarios, and over time it may evolve the ability
to accept "tuning hints" to adjust the indexing techniques for special
cases.

PEAK-Rules also supports the full method combination semantics of
RuleDispatch using a new decentralized approach that allows you to easily
create new method types or combination semantics, complete with their own
decorators (like ``when``, ``around``, etc.)

These decorators also all work with *existing* functions; you do not have to
predeclare a function generic in order to use it.  You can also omit the
condition from the decorator call, in which case the effect is the same as
RuleDispatch's ``strategy.default``, i.e. there is no condition.  Thus, you
can actually use PEAK-Rules's ``around()`` as a quick way to monkeypatch
existing functions, even ones defined by other packages.  (And the
decorators use the ``DecoratorTools`` package, so you can omit the ``@``
signs for Python 2.3 compatibility.)

RuleDispatch was always conceived as a single implementation of a single
dispatch algorithm intended to be "good enough" for all uses.  Guido's
argument on the Py3K mailing list, however, was that applications with
custom dispatch needs should write custom dispatchers.  And I almost agree
-- except that I think they should get a RuleDispatch-like dispatcher for
free, and be able to tune or write ones to plug in for specialized needs.

The kicker was that Guido's experiment with type-tuple caching (a
predecessor algorithm to the Chambers-and-Chen algorithm used by
RuleDispatch) showed it to be fast *enough* for common uses, even without
any C code, as long as you were willing to do a little code generation.  The
code was super-small, simple, and fast enough that it got me thinking it was
good enough for maybe 50% of what you need generic functions for, especially
if you added method combination.

And thus, PEAK-Rules was born, and RuleDispatch doomed to obsolescence.  (It
didn't help that RuleDispatch was a hurriedly-thrown-together experiment,
with poor testing and little documentation, either.)

So, if you are currently using RuleDispatch, we strongly advise that you
port your code.  To convert the most common RuleDispatch usages, simply do
the following (a side-by-side sketch appears below, under `RuleDispatch
Emulation`_):

* Replace ``@dispatch.on()`` and ``@dispatch.generic()`` with ``@abstract()``

* Replace ``@func.when(sig)`` with ``@when(func, sig)`` (and the same for
  ``before``, ``after``, and ``around``)

* When replacing ``@func.when(type)`` calls where ``func`` was defined with
  ``@dispatch.on``, use ``@func.when("isinstance(arg, type)")``, where
  ``arg`` is the argument that was named in the ``@dispatch.on()`` call.


RuleDispatch Emulation
----------------------

If your code doesn't use much of the RuleDispatch API, you may be able to
use PEAK-Rules' "emulation API", which supports the following RuleDispatch
APIs:

* ``dispatch.on``, ``dispatch.generic``, and ``dispatch.as``

* ``strategy.default``, ``strategy.Min``, ``strategy.Max``

* ``DispatchError``, ``NoApplicableMethods``, and ``AmbiguousMethod`` errors

* The ``when()``, ``before()``, ``after()`` and ``around()`` methods of
  generic functions.

(Note that some APIs may issue deprecation warnings (e.g. ``dispatch.as``),
and over time, the entire API will be deprecated.  Please update your code
as soon as practical.)
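Here's the promised side-by-side sketch of a straight (non-emulation) port
(the ``describe`` function and its condition are invented for
illustration)::

    # RuleDispatch style:
    import dispatch

    @dispatch.generic()
    def describe(ob):
        """Describe an object"""

    @describe.when("isinstance(ob, int)")
    def describe_int(ob):
        print "an integer"

    # The equivalent PEAK-Rules port:
    from peak.rules import abstract, when

    @abstract()
    def describe(ob):
        """Describe an object"""

    @when(describe, "isinstance(ob, int)")
    def describe_int(ob):
        print "an integer"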
The emulation API does NOT support: * custom combiners (use custom method types instead) * The ``addMethod`` or ``__setitem__`` APIs for adding rules * the ``clone()`` method of generics created with ``dispatch.on`` * PyProtocols (i.e., interfaces cannot be used for dispatching) In the future, a PyProtocols emulation API may be added, but it doesn't exist yet. To use the emulation API, simply import ``dispatch`` from ``peak.rules``:: >>> from peak.rules import dispatch >>> @dispatch.generic() # roughly equivalent to @abstract() ... def a_function(an_arg, other_arg): ... """Blah""" >>> @a_function.when((int, str)) ... def a_when_int_str(an_arg, other_arg): ... print "int and str" >>> a_function(42, "blue") int and str >>> a_function("blue", 42) Traceback (most recent call last): ... NoApplicableMethods: (('blue', 42), {}) Whether you use ``dispatch.generic`` or ``dispatch.on`` to define a generic function, you can begin using ``peak.rules.when`` to declare methods immediately:: >>> @when(a_function, (str, int)) ... def a_when_str_int(an_arg, other_arg): ... print "str and int" >>> a_function("blue", 42) str and int This means that you don't have to update your entire codebase at once; you can port your method definitions incrementally, if desired. ------------ Mailing List ------------ Please direct questions regarding this package to the PEAK mailing list; see http://www.eby-sarna.com/mailman/listinfo/PEAK/ for details. PEAK-Rules-0.5a1.dev-r2713/Code-Generation.txt0000775000076600007660000006653611430374112017622 0ustar telec3telec3======================================== Code Generation from Python Syntax Trees ======================================== The ``peak.rules.codegen`` module extends ``peak.util.assembler`` (from the "BytecodeAssembler" project) with additional AST node types to allow generation of code for simple Python expressions (i.e., those without lambdas, comprehensions, generators, or yields). It also provides "builder" classes that work with the ``peak.rules.ast_builder`` module to generate expression ASTs from Python source code, thus creating an end-to-end compiler tool chain, common subexpression caching support, and a state-machine interpreter generator. This document describes the design (and tests the implementation) of the ``codegen`` module. You don't need to read it unless you want to use this module directly in your own programs, or to create specialized add-ons to PEAK-Rules. If you do want to use it directly, keep in mind that it inherits the limitations and restrictions of both ``peak.util.assembler`` and ``peak.rules.ast_builder``, so you should consult the documentation for those tools before proceeding. .. contents:: **Table of Contents** -------------- AST Generation -------------- To generate an AST from Python code, you need the ``ast_builder.parse_expr()`` function, and the ``codegen.ExprBuilder`` type:: >>> from peak.rules.ast_builder import parse_expr >>> from peak.rules.codegen import ExprBuilder ``ExprBuilder`` instances are created using one or more namespaces. The first namespace maps names to arbitrary AST nodes that will be substituted for any matching names found in an expression. The second and remaining namespaces will have their values wrapped in ``Const`` nodes, so they can be used for constant-folding. 
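As a quick taste of that constant-folding (covered in detail in the
`Bytecode Generation`_ section below), node types fold constant operands
immediately::

    >>> from peak.rules.codegen import Add
    >>> Add(1, 2)
    Const(3)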
For our examples, we'll define a base namespace containing arguments named
"a" through "g"::

    >>> from peak.util.assembler import Local
    >>> argmap = dict([(name,Local(name)) for name in 'abcdefg'])
    >>> builder = ExprBuilder(argmap, locals(), globals(), __builtins__)

And, for convenience, we'll save the builder's ``parse()`` method as
``pe``::

    >>> pe = builder.parse


Names and Constants
===================

Constants are wrapped in BytecodeAssembler ``Const()`` nodes::

    >>> pe("1")
    Const(1)

Names found in the first namespace are mapped to whatever value is in the
namespace::

    >>> pe("a")
    Local('a')

Names found in subsequent namespaces get their values wrapped in ``Const()``
nodes::

    >>> pe("ExprBuilder")
    Const(<...peak.rules.codegen.ExprBuilder...>)

    >>> pe("isinstance")
    Const(<...isinstance...>)

And unfound names produce a compile-time error::

    >>> pe("fubar")
    Traceback (most recent call last):
      ...
    NameError: fubar


Namespaces and Binding
======================

An ``ExprBuilder`` object's ``bindings`` attribute is a list of
dictionaries, mapping names to the desired outputs::

    >>> builder.bindings
    [{}, {'a': Local('a'), 'c': Local('c'), 'b': Local('b'), 'e': Local('e'), 'd': Local('d'), 'g': Local('g'), 'f': Local('f')}, {...}, {...}, {...}]

You can add more bindings temporarily with the ``.push()`` method, then
remove them with ``.pop()``::

    >>> builder.push({'q': pe('42')})
    >>> builder.Name('q')
    Const(42)

    >>> builder.pop()
    {'q': Const(42)}

    >>> builder.Name('q')
    Traceback (most recent call last):
      ...
    NameError: q

If you omit the argument to ``.push()``, it just adds an empty namespace to
the bindings::

    >>> builder.push()
    >>> builder.bindings
    [{}, {}, {'a': Local('a'), 'c': Local('c'), 'b': Local('b'), 'e': Local('e'), 'd': Local('d'), 'g': Local('g'), 'f': Local('f')}, {...}, {...}, {...}]

Which you can then modify using ``.bind()``::

    >>> builder.bind({'x': pe('99')})
    >>> builder.bindings
    [{'x': Const(99)}, {}, {'a': Local('a'), 'c': Local('c'), 'b': Local('b'), 'e': Local('e'), 'd': Local('d'), 'g': Local('g'), 'f': Local('f')}, {...}, {...}, {...}]

And finally remove with ``.pop()``::

    >>> builder.pop()
    {'x': Const(99)}

    >>> builder.bindings
    [{}, {'a': Local('a'), 'c': Local('c'), 'b': Local('b'), 'e': Local('e'), 'd': Local('d'), 'g': Local('g'), 'f': Local('f')}, {...}, {...}, {...}]


Operators
=========

Unary operators::

    >>> pe("not - + ~ `a`")
    Not(Minus(Plus(Invert(Repr(Local('a'))))))

Attribute access::

    >>> pe("a.b.c")
    Getattr(Getattr(Local('a'), 'b'), 'c')

Simple binary operators::

    >>> pe("a+b")
    Add(Local('a'), Local('b'))
    >>> pe("b-a")
    Sub(Local('b'), Local('a'))
    >>> pe("c*d")
    Mul(Local('c'), Local('d'))
    >>> pe("c/d")
    Div(Local('c'), Local('d'))
    >>> pe("c%d")
    Mod(Local('c'), Local('d'))
    >>> pe("c//d")
    FloorDiv(Local('c'), Local('d'))
    >>> pe("a**b")
    Power(Local('a'), Local('b'))
    >>> pe("a<<b")
    LeftShift(Local('a'), Local('b'))
    >>> pe("a>>b")
    RightShift(Local('a'), Local('b'))
    >>> pe("a[1]")
    Getitem(Local('a'), Const(1))
    >>> pe("a[1][2]")
    Getitem(Getitem(Local('a'), Const(1)), Const(2))
    >>> pe("a&b&c")
    Bitand(Bitand(Local('a'), Local('b')), Local('c'))
    >>> pe("a|b|c")
    Bitor(Bitor(Local('a'), Local('b')), Local('c'))
    >>> pe("a^b^c")
    Bitxor(Bitxor(Local('a'), Local('b')), Local('c'))

List operators::

    >>> pe("a and b")
    And((Local('a'), Local('b')))
    >>> pe("a or b")
    Or((Local('a'), Local('b')))
    >>> pe("a and b and c")
    And((Local('a'), Local('b'), Local('c')))
    >>> pe("a or b or c")
    Or((Local('a'), Local('b'), Local('c')))
    >>> pe("[]")
    Const([])
    >>> pe("[a]")
    List((Local('a'),))
    >>> pe("[a,b]")
    List((Local('a'), Local('b')))
    >>> pe("()")
    Const(())
    >>> pe("a,")
    Tuple((Local('a'),))
    >>> pe("a,b")
    Tuple((Local('a'), Local('b')))

Slicing::

    >>> pe("a[:]")
    GetSlice(Local('a'), Pass, Pass)
    >>> pe("a[1:2]")
    GetSlice(Local('a'), Const(1), Const(2))
    >>> pe("a[1:]")
    GetSlice(Local('a'), Const(1), Pass)
    >>> pe("a[:2]")
    GetSlice(Local('a'), Pass, Const(2))
    >>> pe("a[::]")
    Getitem(Local('a'), Const(slice(None, None, None)))
    >>> pe("a[1:2:3]")
    Getitem(Local('a'), Const(slice(1, 2, 3)))
    >>> pe("a[b:c:d]")
    Getitem(Local('a'), BuildSlice(Local('b'), Local('c'), Local('d')))

Comparisons::

    >>> pe("a>b")
    Compare(Local('a'), (('>', Local('b')),))
    >>> pe("a>=b")
    Compare(Local('a'), (('>=', Local('b')),))
    >>> pe("a<b")
    Compare(Local('a'), (('<', Local('b')),))
    >>> pe("a<=b")
    Compare(Local('a'), (('<=', Local('b')),))
    >>> pe("a<>b")
    Compare(Local('a'), (('!=', Local('b')),))
    >>> pe("a!=b")
    Compare(Local('a'), (('!=', Local('b')),))
    >>> pe("a==b")
    Compare(Local('a'), (('==', Local('b')),))
    >>> pe("a in b")
    Compare(Local('a'), (('in', Local('b')),))
    >>> pe("a is b")
    Compare(Local('a'), (('is', Local('b')),))
    >>> pe("a not in b")
    Compare(Local('a'), (('not in', Local('b')),))
    >>> pe("a is not b")
    Compare(Local('a'), (('is not', Local('b')),))
    >>> pe("a>=b>c")
    Compare(Local('a'), (('>=', Local('b')), ('>', Local('c'))))

Dictionaries::

    >>> pe("{a:b,c:d}")
    Dict(((Local('a'), Local('b')), (Local('c'), Local('d'))))

Conditional Expressions::

    >>> import sys
    >>> if sys.version>='2.5':
    ...     pe("a if b else c")
    ... else:
    ...     print "IfElse(Local('a'), Local('b'), Local('c'))"
    IfElse(Local('a'), Local('b'), Local('c'))

Calls::

    >>> pe("a()")
    Call(Local('a'), (), (), (), (), True)
    >>> pe("a(1,2)")
    Call(Local('a'), (Const(1), Const(2)), (), (), (), True)
    >>> pe("a(1, b=2)")
    Call(Local('a'), (Const(1),), ((Const('b'), Const(2)),), (), (), True)
    >>> pe("a(*b)")
    Call(Local('a'), (), (), Local('b'), (), True)
    >>> pe("a(**c)")
    Call(Local('a'), (), (), (), Local('c'), True)
    >>> pe("a(*b, **c)")
    Call(Local('a'), (), (), Local('b'), Local('c'), True)


-------------------
Bytecode Generation
-------------------

AST's generated using ``ExprBuilder`` can be used directly with
BytecodeAssembler ``Code`` objects to generate bytecode, complete with
constant-folding.  Note that the node types not demonstrated below (e.g.
``And``, ``Or``, ``Compare``, ``Call``) are not defined by the ``codegen``
module, but instead are imported from ``peak.util.assembler``::

    >>> from peak.rules.codegen import *
    >>> from peak.util.assembler import Const, Pass

    >>> Minus(1), Plus(2), Not(True), Invert(-1), Repr(4)
    (Const(-1), Const(2), Const(False), Const(0), Const('4'))

    >>> Add(1,2), Sub(3,2), Mul(4,5), Div(10,2), Mod(7,3), FloorDiv(7,3)
    (Const(3), Const(1), Const(20), Const(5), Const(1), Const(2))

    >>> Power(2,3), LeftShift(1,4), RightShift(12,2)
    (Const(8), Const(16), Const(3))

    >>> Getitem(Const([1,2]), 1)
    Const(2)

    >>> Bitand(3, 1), Bitor(1,2), Bitxor(3,1)
    (Const(1), Const(3), Const(2))

    >>> Dict([(1,2)])
    Const({1: 2})

    >>> aList = Const([1,2,3,4])
    >>> GetSlice(aList)
    Const([1, 2, 3, 4])
    >>> GetSlice(aList, 1)
    Const([2, 3, 4])
    >>> GetSlice(aList, 1, -1)
    Const([2, 3])
    >>> GetSlice(aList, Pass, -1)
    Const([1, 2, 3])

    >>> BuildSlice(1, 2, 3)
    Const(slice(1, 2, 3))
    >>> BuildSlice(1, 2)
    Const(slice(1, 2, None))

    >>> Tuple([1,2])
    Const((1, 2))

    >>> List([1,2])
    Const([1, 2])

    >>> IfElse(1,2,3)
    Const(1)
    >>> IfElse(1,0,3)
    Const(3)


State Machine Interpreter Generation
====================================

PEAK-Rules often processes fairly large dispatch trees that would take a
long time to generate if translated entirely to bytecode.
Plus, they would need to be regenerated every time rules were added to a dispatch tree. So, instead of generating bytecode that encodes the entire dispatch tree, PEAK-Rules uses a "state machine interpreter" approach. The dispatch tree is represented as a tree of objects. Each node consists of an "action" and an "argument". The generated code is simply an interpreter with inlined bytecode to implement the actions associated with the nodes. To minimize interpretation overhead, actions are encoded in the dispatch tree as jump offsets into the generated bytecode. Interpreter functions are generated using the ``SMIGenerator`` class, instantiated with a function whose calling signature will serve as a template for the interpreter function:: >>> from peak.rules.codegen import SMIGenerator >>> def interpreter(input): ... return input >>> smig = SMIGenerator(interpreter) To generate the interpreter function, you call the ``generate()`` method with a root node: an action/argument tuple:: >>> exit_node = (0, interpreter) >>> gfunc = smig.generate(exit_node) The action must either be zero, or a value returned by the ``action_id()`` method (described later below). When the generated interpreter encounters action zero, it will treat the argument as a callback. The callback must accept the same number and type of arguments as the interpreter function, and it will be called with the values of the corresponding local variables. The interpreter will invoke the callback, and then exit, returning whatever value or exception was provided by the exit callback:: >>> gfunc(23) 23 Now let's use the same generator, but add some more actions to it. Actions are added using the ``action_id()`` method, which takes a code generation target and returns an action ID for use in the interpreter. The code generation target will execute with no values on the stack, and must finish execution with one value on the stack -- another (action, argument) pair. It can use the generator's ``ARG`` attribute to refer to the action argument, and the generator's ``NEXT_STATE`` attribute to jump back to the action dispatch loop. A ``NEXT_STATE`` jump is automatically generated after each action, so you don't need to include it. For demonstration and testing, we'll create two new actions: an action that sets the ``input`` local variable to its argument, and an action that simply treats the argument as the next state -- a sort of "pass" action. We'll start with the "pass" action:: >>> pass_id = smig.action_id(smig.ARG) This is about the simplest possible action that meets the requirements of an action: it takes no values on the stack, and puts one value on the stack. In this case, the argument part of the current state. Now let's create a slightly more complex action, ``set_input``:: >>> from peak.util.assembler import nodetype >>> def SetInput(code=None): ... """Argument is a (value, nextstate) tuple; sets 'input' to value""" ... if code is None: return () ... code(smig.ARG) ... code.UNPACK_SEQUENCE(2) ... code.STORE_FAST('input') >>> SetInput = nodetype()(SetInput) >>> set_input = smig.action_id(SetInput()) This action treats its argument as a (`value`, `nextstate`) pair, where `value` is stored in the ``input`` local variable, and `nextstate` is the next state to proceed to. 
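Conceptually -- and only conceptually, since the real interpreter is
generated bytecode with the actions inlined as jump targets -- the function
we've built so far behaves like this Python sketch (hypothetical code, for
illustration only)::

    def interpreter_sketch(input, node):
        # ``node`` is an (action, argument) pair from the dispatch tree
        while True:
            action, argument = node
            if action == 0:                  # exit: argument is a callback
                return argument(input)       # call it with the local vars
            elif action == pass_id:          # "pass": argument is next node
                node = argument
            elif action == set_input:        # argument is (value, nextstate)
                input, node = argument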
By the way, note that Action ID's are cached, so that passing in equivalent code targets will return the same ID each time:: >>> set_input == smig.action_id(SetInput()) True >>> pass_id == smig.action_id(smig.ARG) True >>> pass_id == smig.action_id(SetInput()) False Whenever you add new actions, you must regenerate the interpreter function in order to be able to use them in the dispatch tree. So we'll regenerate our input function, this time using the ``set_input`` action:: >>> gfunc = smig.generate((set_input, (99, exit_node))) >>> gfunc(27) 99 Now let's create a conditional action and try a more complex tree. This action will proceed to its argument if the input is true, otherwise it will exit immediately:: >>> input_arg_or_exit = smig.action_id( ... IfElse(smig.ARG, Local('input'), Const(exit_node)) ... ) >>> gfunc = smig.generate( ... (input_arg_or_exit, (set_input, (True, exit_node))) ... ) >>> gfunc(27) True >>> gfunc('') '' By the way, using unrecognized action IDs in a dispatch tree will cause an ``AssertionError`` at the point where the action is encountered:: >>> gfunc = smig.generate(("foo", "bar")) >>> gfunc(643) Traceback (most recent call last): ... AssertionError: Invalid action: foo, bar Finally, note that ``SMIGenerator`` objects have a ``maybe_cache`` method, that allows you to do `subexpression caching`_ as described in the next section:: >>> smig.maybe_cache > Note, however, that the cache lifetime is one full run of the generated interpreter function, so take care when choosing candidate expressions for caching. Subexpression Caching ===================== The ``peak.rules.codegen`` module includes a common-subexpression caching extension of ``peak.util.assembler``, used to implement "at most once" calculation of any intermediate results during rule evaluation. It works by setting aside a local variable (``$CSECache``) to hold a dictionary of temporary values, keyed by strings. Any time a cached value is needed, the dictionary is checked first. However, the local variable is initially set to ``None``, to avoid creating a dictionary unnecessarily. In this way, only those portions of the dispatch tree that require intermediate expression evaluation will incur the cost of creating or accessing the dictionary. Note that this caching mechanism is not primarily aimed at improving the performance of the underlying code, although in some cases it *might* have this effect. It is also not aimed at producing compact code; the code it generates may be considerably larger than the unadorned code would be! Rather, the goal is to provide the desired semantics (i.e. no duplicated calculations) with better performance than the ``RuleDispatch`` package provides for the same operations. In ``RuleDispatch``, expressions are calculated using partial functions and a similar cache dictionary to this one, whereas here the functions are effectively inlined as Python bytecode. The ``CSECode`` class replaces the ``assembler.Code`` class:: >>> from dis import dis >>> c = CSECode() >>> a, b = Local('a'), Local('b') >>> dis(c.code()) And the added ``cache()`` method takes an expression to cache. 
If no previous expressions were cached, a preamble is emitted to initialize the cache:: >>> c.cache(Add(a,b)) >>> dis(c.code()) 0 0 LOAD_CONST 0 (None) 3 STORE_FAST 0 ($CSECache) But subsequent ``cache()`` calls of course do not repeat the preamble:: >>> c.cache(Add(a,b)) # deliberate dupe to verify above only happens once >>> dis(c.code()) 0 0 LOAD_CONST 0 (None) 3 STORE_FAST 0 ($CSECache) Generating a cached object results in extra code being added to ensure that the cache variable is initialized and to retrieve the cached value, if present. The resulting code looks complex, but each of the possible code paths are actually fairly short. The cache keys are the string forms of the cached expressions, with an added number to ensure uniqueness:: >>> c.return_(Add(a,b)) >>> from peak.util.assembler import dump >>> dump(c.code()) LOAD_CONST 0 (None) STORE_FAST 0 ($CSECache) LOAD_CONST 1 ("Add(Local('a'), Local('b')) #1") LOAD_FAST 0 ($CSECache) JUMP_IF_TRUE L1 POP_TOP BUILD_MAP 0 DUP_TOP STORE_FAST 0 ($CSECache) L1: COMPARE_OP 6 (in) JUMP_IF_FALSE L2 POP_TOP LOAD_FAST 0 ($CSECache) LOAD_CONST 1 ("Add(Local('a'), Local('b')) #1") BINARY_SUBSCR JUMP_FORWARD L3 L2: POP_TOP LOAD_FAST 1 (a) LOAD_FAST 2 (b) BINARY_ADD DUP_TOP LOAD_FAST 0 ($CSECache) LOAD_CONST 1 ("Add(Local('a'), Local('b')) #1") STORE_SUBSCR L3: RETURN_VALUE While the ``cache()`` method marks an expression as definitely cacheable, the ``maybe_cache()`` method allows the code object to decide for itself whether the expression should be cached. Specifically, the given expression and all its subexpressions are evaluated against a dummy code object, and its tree structure is examined. Any non-leaf node that appears as a child of two or more parents, or twice or more as a child of the same parent, is considered suitable for caching. In our first example, the expression ``(a+b)/c*d`` is cached, because it's passed to ``maybe_cache()`` twice -- once by itself, and once as a child of ``((a+b)/c*d) % 3``:: >>> a_plus_b = Add(a,b) >>> c_times_d = Mul(Local('c'), Local('d')) >>> abcd = Div(a_plus_b, c_times_d) >>> m3 = Mod(abcd, 3) >>> c = CSECode() >>> c.maybe_cache(abcd) >>> c.maybe_cache(m3) >>> c.return_(m3) >>> dump(c.code()) LOAD_CONST 0 (None) STORE_FAST 0 ($CSECache) LOAD_CONST 1 ("Div(Add(Local('a'), Local('b')), Mul(Local('c'), Local('d'))) #1") LOAD_FAST 0 ($CSECache) JUMP_IF_TRUE L1 POP_TOP BUILD_MAP 0 DUP_TOP STORE_FAST 0 ($CSECache) L1: COMPARE_OP 6 (in) JUMP_IF_FALSE L2 POP_TOP LOAD_FAST 0 ($CSECache) LOAD_CONST 1 ("Div(Add(Local('a'), Local('b')), Mul(Local('c'), Local('d'))) #1") BINARY_SUBSCR JUMP_FORWARD L3 L2: POP_TOP LOAD_FAST 1 (a) LOAD_FAST 2 (b) BINARY_ADD LOAD_FAST 3 (c) LOAD_FAST 4 (d) BINARY_MULTIPLY BINARY_DIVIDE DUP_TOP LOAD_FAST 0 ($CSECache) LOAD_CONST 1 ("Div(Add(Local('a'), Local('b')), Mul(Local('c'), Local('d'))) #1") STORE_SUBSCR L3: LOAD_CONST 2 (3) BINARY_MODULO RETURN_VALUE In the next example, we compute ``(a+b)*(a+b)`` after inspecting ``(a+b)*(b+a)`` and ``(b+a)*(a+b)`` for recurring sub-expressions. 
Naturally, we detect that ``(a+b)`` is used more than once, so it is cached:: >>> c = CSECode() >>> b_plus_a = Add(b,a) >>> ab_2 = Mul(a_plus_b, a_plus_b) >>> c.maybe_cache(Mul(b_plus_a, a_plus_b)) >>> c.maybe_cache(Mul(a_plus_b, b_plus_a)) >>> c.return_(ab_2) >>> dump(c.code()) LOAD_CONST 0 (None) STORE_FAST 0 ($CSECache) LOAD_CONST 1 ("Add(Local('a'), Local('b')) #1") LOAD_FAST 0 ($CSECache) JUMP_IF_TRUE L1 POP_TOP BUILD_MAP 0 DUP_TOP STORE_FAST 0 ($CSECache) L1: COMPARE_OP 6 (in) JUMP_IF_FALSE L2 POP_TOP LOAD_FAST 0 ($CSECache) LOAD_CONST 1 ("Add(Local('a'), Local('b')) #1") BINARY_SUBSCR JUMP_FORWARD L3 L2: POP_TOP LOAD_FAST 1 (a) LOAD_FAST 2 (b) BINARY_ADD DUP_TOP LOAD_FAST 0 ($CSECache) LOAD_CONST 1 ("Add(Local('a'), Local('b')) #1") STORE_SUBSCR L3: LOAD_CONST 1 ("Add(Local('a'), Local('b')) #1") LOAD_FAST 0 ($CSECache) JUMP_IF_TRUE L4 POP_TOP BUILD_MAP 0 DUP_TOP STORE_FAST 0 ($CSECache) L4: COMPARE_OP 6 (in) JUMP_IF_FALSE L5 POP_TOP LOAD_FAST 0 ($CSECache) LOAD_CONST 1 ("Add(Local('a'), Local('b')) #1") BINARY_SUBSCR JUMP_FORWARD L6 L5: POP_TOP LOAD_FAST 1 (a) LOAD_FAST 2 (b) BINARY_ADD DUP_TOP LOAD_FAST 0 ($CSECache) LOAD_CONST 1 ("Add(Local('a'), Local('b')) #1") STORE_SUBSCR L6: BINARY_MULTIPLY RETURN_VALUE And in this example, we also compute ``(a+b)*(a+b)``, but this time only inspecting that one expression for recurrences. We still find the recurrence, because ``(a+b)`` occurs more than once under the parent expression:: >>> c = CSECode() >>> c.maybe_cache(ab_2) >>> c.return_(ab_2) >>> dump(c.code()) LOAD_CONST 0 (None) STORE_FAST 0 ($CSECache) LOAD_CONST 1 ("Add(Local('a'), Local('b')) #1") LOAD_FAST 0 ($CSECache) JUMP_IF_TRUE L1 POP_TOP BUILD_MAP 0 DUP_TOP STORE_FAST 0 ($CSECache) L1: COMPARE_OP 6 (in) JUMP_IF_FALSE L2 POP_TOP LOAD_FAST 0 ($CSECache) LOAD_CONST 1 ("Add(Local('a'), Local('b')) #1") BINARY_SUBSCR JUMP_FORWARD L3 L2: POP_TOP LOAD_FAST 1 (a) LOAD_FAST 2 (b) BINARY_ADD DUP_TOP LOAD_FAST 0 ($CSECache) LOAD_CONST 1 ("Add(Local('a'), Local('b')) #1") STORE_SUBSCR L3: LOAD_CONST 1 ("Add(Local('a'), Local('b')) #1") LOAD_FAST 0 ($CSECache) JUMP_IF_TRUE L4 POP_TOP BUILD_MAP 0 DUP_TOP STORE_FAST 0 ($CSECache) L4: COMPARE_OP 6 (in) JUMP_IF_FALSE L5 POP_TOP LOAD_FAST 0 ($CSECache) LOAD_CONST 1 ("Add(Local('a'), Local('b')) #1") BINARY_SUBSCR JUMP_FORWARD L6 L5: POP_TOP LOAD_FAST 1 (a) LOAD_FAST 2 (b) BINARY_ADD DUP_TOP LOAD_FAST 0 ($CSECache) LOAD_CONST 1 ("Add(Local('a'), Local('b')) #1") STORE_SUBSCR L6: BINARY_MULTIPLY RETURN_VALUE Finally, it's important to note that only subexpressions that increase the stack size by exactly 1 are considered for caching:: >>> from peak.util.assembler import Suite, Code >>> c = CSECode() >>> s = Suite([a, b]) >>> ss = Suite([s, s]) >>> c.maybe_cache(ss) >>> c.return_(ss) >>> dis(c.code()) 0 0 LOAD_FAST 0 (a) 3 LOAD_FAST 1 (b) 6 LOAD_FAST 0 (a) 9 LOAD_FAST 1 (b) 12 RETURN_VALUE >>> c = CSECode() >>> s = Suite([a, b, Code.POP_TOP, Code.POP_TOP, Code.POP_TOP]) >>> ss = Suite([a, s, a, s]) >>> c.maybe_cache(ss) >>> c(ss) >>> dis(c.code()) 0 0 LOAD_FAST 0 (a) 3 LOAD_FAST 0 (a) 6 LOAD_FAST 1 (b) 9 POP_TOP 10 POP_TOP 11 POP_TOP 12 LOAD_FAST 0 (a) 15 LOAD_FAST 0 (a) 18 LOAD_FAST 1 (b) 21 POP_TOP 22 POP_TOP 23 POP_TOP PEAK-Rules-0.5a1.dev-r2713/ez_setup/0000755000076600007660000000000012506203664015737 5ustar telec3telec3PEAK-Rules-0.5a1.dev-r2713/ez_setup/README.txt0000775000076600007660000000114610761563234017447 0ustar telec3telec3This directory exists so that Subversion-based projects can share a single copy of the ``ez_setup`` bootstrap 
module for ``setuptools``, and have it automatically updated in their projects when ``setuptools`` is updated. For your convenience, you may use the following svn:externals definition:: ez_setup svn://svn.eby-sarna.com/svnroot/ez_setup You can set this by executing this command in your project directory:: svn propedit svn:externals . And then adding the line shown above to the file that comes up for editing. Then, whenever you update your project, ``ez_setup`` will be updated as well. PEAK-Rules-0.5a1.dev-r2713/ez_setup/__init__.py0000755000076600007660000002400011516250102020034 0ustar telec3telec3#!python """Bootstrap setuptools installation If you want to use setuptools in your package's setup.py, just include this file in the same directory with it, and add this to the top of your setup.py:: from ez_setup import use_setuptools use_setuptools() If you want to require a specific version of setuptools, set a download mirror, or use an alternate download directory, you can do so by supplying the appropriate options to ``use_setuptools()``. This file can also be run as a script to install or upgrade setuptools. """ import sys DEFAULT_VERSION = "0.6c11" DEFAULT_URL = "http://pypi.python.org/packages/%s/s/setuptools/" % sys.version[:3] md5_data = { 'setuptools-0.6b1-py2.3.egg': '8822caf901250d848b996b7f25c6e6ca', 'setuptools-0.6b1-py2.4.egg': 'b79a8a403e4502fbb85ee3f1941735cb', 'setuptools-0.6b2-py2.3.egg': '5657759d8a6d8fc44070a9d07272d99b', 'setuptools-0.6b2-py2.4.egg': '4996a8d169d2be661fa32a6e52e4f82a', 'setuptools-0.6b3-py2.3.egg': 'bb31c0fc7399a63579975cad9f5a0618', 'setuptools-0.6b3-py2.4.egg': '38a8c6b3d6ecd22247f179f7da669fac', 'setuptools-0.6b4-py2.3.egg': '62045a24ed4e1ebc77fe039aa4e6f7e5', 'setuptools-0.6b4-py2.4.egg': '4cb2a185d228dacffb2d17f103b3b1c4', 'setuptools-0.6c1-py2.3.egg': 'b3f2b5539d65cb7f74ad79127f1a908c', 'setuptools-0.6c1-py2.4.egg': 'b45adeda0667d2d2ffe14009364f2a4b', 'setuptools-0.6c10-py2.3.egg': 'ce1e2ab5d3a0256456d9fc13800a7090', 'setuptools-0.6c10-py2.4.egg': '57d6d9d6e9b80772c59a53a8433a5dd4', 'setuptools-0.6c10-py2.5.egg': 'de46ac8b1c97c895572e5e8596aeb8c7', 'setuptools-0.6c10-py2.6.egg': '58ea40aef06da02ce641495523a0b7f5', 'setuptools-0.6c11-py2.3.egg': '2baeac6e13d414a9d28e7ba5b5a596de', 'setuptools-0.6c11-py2.4.egg': 'bd639f9b0eac4c42497034dec2ec0c2b', 'setuptools-0.6c11-py2.5.egg': '64c94f3bf7a72a13ec83e0b24f2749b2', 'setuptools-0.6c11-py2.6.egg': 'bfa92100bd772d5a213eedd356d64086', 'setuptools-0.6c2-py2.3.egg': 'f0064bf6aa2b7d0f3ba0b43f20817c27', 'setuptools-0.6c2-py2.4.egg': '616192eec35f47e8ea16cd6a122b7277', 'setuptools-0.6c3-py2.3.egg': 'f181fa125dfe85a259c9cd6f1d7b78fa', 'setuptools-0.6c3-py2.4.egg': 'e0ed74682c998bfb73bf803a50e7b71e', 'setuptools-0.6c3-py2.5.egg': 'abef16fdd61955514841c7c6bd98965e', 'setuptools-0.6c4-py2.3.egg': 'b0b9131acab32022bfac7f44c5d7971f', 'setuptools-0.6c4-py2.4.egg': '2a1f9656d4fbf3c97bf946c0a124e6e2', 'setuptools-0.6c4-py2.5.egg': '8f5a052e32cdb9c72bcf4b5526f28afc', 'setuptools-0.6c5-py2.3.egg': 'ee9fd80965da04f2f3e6b3576e9d8167', 'setuptools-0.6c5-py2.4.egg': 'afe2adf1c01701ee841761f5bcd8aa64', 'setuptools-0.6c5-py2.5.egg': 'a8d3f61494ccaa8714dfed37bccd3d5d', 'setuptools-0.6c6-py2.3.egg': '35686b78116a668847237b69d549ec20', 'setuptools-0.6c6-py2.4.egg': '3c56af57be3225019260a644430065ab', 'setuptools-0.6c6-py2.5.egg': 'b2f8a7520709a5b34f80946de5f02f53', 'setuptools-0.6c7-py2.3.egg': '209fdf9adc3a615e5115b725658e13e2', 'setuptools-0.6c7-py2.4.egg': '5a8f954807d46a0fb67cf1f26c55a82e', 'setuptools-0.6c7-py2.5.egg': 
'45d2ad28f9750e7434111fde831e8372', 'setuptools-0.6c8-py2.3.egg': '50759d29b349db8cfd807ba8303f1902', 'setuptools-0.6c8-py2.4.egg': 'cba38d74f7d483c06e9daa6070cce6de', 'setuptools-0.6c8-py2.5.egg': '1721747ee329dc150590a58b3e1ac95b', 'setuptools-0.6c9-py2.3.egg': 'a83c4020414807b496e4cfbe08507c03', 'setuptools-0.6c9-py2.4.egg': '260a2be2e5388d66bdaee06abec6342a', 'setuptools-0.6c9-py2.5.egg': 'fe67c3e5a17b12c0e7c541b7ea43a8e6', 'setuptools-0.6c9-py2.6.egg': 'ca37b1ff16fa2ede6e19383e7b59245a', } import sys, os try: from hashlib import md5 except ImportError: from md5 import md5 def _validate_md5(egg_name, data): if egg_name in md5_data: digest = md5(data).hexdigest() if digest != md5_data[egg_name]: print >>sys.stderr, ( "md5 validation of %s failed! (Possible download problem?)" % egg_name ) sys.exit(2) return data def use_setuptools( version=DEFAULT_VERSION, download_base=DEFAULT_URL, to_dir=os.curdir, download_delay=15 ): """Automatically find/download setuptools and make it available on sys.path `version` should be a valid setuptools version number that is available as an egg for download under the `download_base` URL (which should end with a '/'). `to_dir` is the directory where setuptools will be downloaded, if it is not already available. If `download_delay` is specified, it should be the number of seconds that will be paused before initiating a download, should one be required. If an older version of setuptools is installed, this routine will print a message to ``sys.stderr`` and raise SystemExit in an attempt to abort the calling script. """ was_imported = 'pkg_resources' in sys.modules or 'setuptools' in sys.modules def do_download(): egg = download_setuptools(version, download_base, to_dir, download_delay) sys.path.insert(0, egg) import setuptools; setuptools.bootstrap_install_from = egg try: import pkg_resources except ImportError: return do_download() try: pkg_resources.require("setuptools>="+version); return except pkg_resources.VersionConflict, e: if was_imported: print >>sys.stderr, ( "The required version of setuptools (>=%s) is not available, and\n" "can't be installed while this script is running. Please install\n" " a more recent version first, using 'easy_install -U setuptools'." "\n\n(Currently using %r)" ) % (version, e.args[0]) sys.exit(2) except pkg_resources.DistributionNotFound: pass del pkg_resources, sys.modules['pkg_resources'] # reload ok return do_download() def download_setuptools( version=DEFAULT_VERSION, download_base=DEFAULT_URL, to_dir=os.curdir, delay = 15 ): """Download setuptools from a specified location and return its filename `version` should be a valid setuptools version number that is available as an egg for download under the `download_base` URL (which should end with a '/'). `to_dir` is the directory where the egg will be downloaded. `delay` is the number of seconds to pause before an actual download attempt. """ import urllib2, shutil egg_name = "setuptools-%s-py%s.egg" % (version,sys.version[:3]) url = download_base + egg_name saveto = os.path.join(to_dir, egg_name) src = dst = None if not os.path.exists(saveto): # Avoid repeated downloads try: from distutils import log if delay: log.warn(""" --------------------------------------------------------------------------- This script requires setuptools version %s to run (even to display help). I will attempt to download it for you (from %s), but you may need to enable firewall access for this script first. I will start the download in %d seconds. 
(Note: if this machine does not have network access, please obtain the file %s and place it in this directory before rerunning this script.) ---------------------------------------------------------------------------""", version, download_base, delay, url ); from time import sleep; sleep(delay) log.warn("Downloading %s", url) src = urllib2.urlopen(url) # Read/write all in one block, so we don't create a corrupt file # if the download is interrupted. data = _validate_md5(egg_name, src.read()) dst = open(saveto,"wb"); dst.write(data) finally: if src: src.close() if dst: dst.close() return os.path.realpath(saveto) def main(argv, version=DEFAULT_VERSION): """Install or upgrade setuptools and EasyInstall""" try: import setuptools except ImportError: egg = None try: egg = download_setuptools(version, delay=0) sys.path.insert(0,egg) from setuptools.command.easy_install import main return main(list(argv)+[egg]) # we're done here finally: if egg and os.path.exists(egg): os.unlink(egg) else: if setuptools.__version__ == '0.0.1': print >>sys.stderr, ( "You have an obsolete version of setuptools installed. Please\n" "remove it from your system entirely before rerunning this script." ) sys.exit(2) req = "setuptools>="+version import pkg_resources try: pkg_resources.require(req) except pkg_resources.VersionConflict: try: from setuptools.command.easy_install import main except ImportError: from easy_install import main main(list(argv)+[download_setuptools(delay=0)]) sys.exit(0) # try to force an exit else: if argv: from setuptools.command.easy_install import main main(argv) else: print "Setuptools version",version,"or greater has been installed." print '(Run "ez_setup.py -U setuptools" to reinstall or upgrade.)' def update_md5(filenames): """Update our built-in md5 registry""" import re for name in filenames: base = os.path.basename(name) f = open(name,'rb') md5_data[base] = md5(f.read()).hexdigest() f.close() data = [" %r: %r,\n" % it for it in md5_data.items()] data.sort() repl = "".join(data) import inspect srcfile = inspect.getsourcefile(sys.modules[__name__]) f = open(srcfile, 'rb'); src = f.read(); f.close() match = re.search("\nmd5_data = {\n([^}]+)}", src) if not match: print >>sys.stderr, "Internal error!" sys.exit(2) src = src[:match.start(1)] + repl + src[match.end(1):] f = open(srcfile,'w') f.write(src) f.close() if __name__=='__main__': if len(sys.argv)>2 and sys.argv[1]=='--md5update': update_md5(sys.argv[2:]) else: main(sys.argv[1:]) PEAK-Rules-0.5a1.dev-r2713/wikiup.cfg0000775000076600007660000000046711430374112016076 0ustar telec3telec3[PEAK] PEAK-Rules = README.txt PEAK-Rules/AST-Builder = AST-Builder.txt PEAK-Rules/Code-Generation = Code-Generation.txt PEAK-Rules/Criteria = Criteria.txt PEAK-Rules/Design = DESIGN.txt PEAK-Rules/Indexing = Indexing.txt PEAK-Rules/Predicates = Predicates.txt PEAK-Rules/Syntax-Matching = Syntax-Matching.txt PEAK-Rules-0.5a1.dev-r2713/DESIGN.txt0000755000076600007660000010307211631103276015615 0ustar telec3telec3============================= The PEAK Rules Core Framework ============================= NOTE: This document is for people who are extending the core framework in some way, e.g. adding custom action types to specialize method combination, or creating new kinds of engines or conditions. It isn't intended to be user documentation for the built-in rule facility. .. 
contents:: **Table of Contents**

------------------------
Overview and Terminology
------------------------

The PEAK-Rules core framework provides a generic API for creating and
manipulating generic functions, with a high degree of extensibility.  Almost
any concept implemented by the core can be replaced by a third-party
implementation on a function-by-function basis.  In this way, an individual
library or application can provide for its specific needs, without needing to
reinvent the entire spectrum of tools.

The main concepts implemented by the core are:

Generic functions
    A function with a "dispatching" add-on that manages a collection of
    methods, where each method has a rule to determine its applicability.
    When a generic function is invoked, a combination of the methods that
    apply to the invocation (as determined by their rules) is invoked.

Method combination
    The ability to compose a set of methods into a single function, with
    their precedence determined by the type of method and the logical
    implication relationships of their applicability rules.

Development Roadmap
===================

The first versions will focus on developing a core framework for extensible
functions that is itself implemented using extensible functions.  This
self-bootstrapping core will implement a type-tuple-caching engine using
relatively primitive operations, and will then have a method combination
system built on that.

The core will thus be capable of implementing generic functions with multiple
dispatch based on positional argument types, and the decorator APIs will be
built around that.

The next phase of development will add alternative engines that are oriented
towards predicate dispatch and more sophisticated ways of specifying regular
class dispatch (e.g. being able to say things like ``isinstance(x,Foo) or
isinstance(y,Foo)``).  To some extent this will be a port of the expression
machinery from RuleDispatch to the new core, but in a lot of ways it'll just
be redone from scratch.  Having type-based multiple dispatch available to
implement the framework should enable a significant reduction in the
complexity of the resulting library.

An additional phase will focus on adding new features not possible with the
RuleDispatch engine, such as "predicate functions" (a kind of dynamic macro
or rule expansion feature), "classifiers" (a way of priority-sequencing a set
of alternative criteria), and others.  Finally, specialty features such as
index customization, thread-safety, and event-oriented rulesets will be
introduced.

Design Concepts
===============

(Note: Criteria, signatures, and predicates are described and tested in
detail by the ``Criteria.txt`` document.)

Criterion
    A criterion is a symbolic representation of a test that returns a boolean
    for a given value, for example by testing its type.  The simplest
    criterion is just a class or type object, meaning that the value should
    be of that type.

Signature
    A condition expressed purely in terms of simple tests "and"ed together,
    using no "or" operations of any kind.  A signature specifies what
    argument expressions are tested, and which criteria should be applied to
    them.  The simplest possible signature is a tuple of criteria, with each
    criterion applied to the corresponding argument in an argument tuple.
    (An empty tuple matches any possible input.)  Signatures are also
    described in more detail in ``Criteria.txt``.

Predicate
    One or more signatures "or"ed together.  (Note that this means that
    signatures are predicates, but predicates are not necessarily
    signatures.)
Rule A combination of a predicate, an action type, and a body (usually a function.) The existence of a rule implies the existence of one or more actions of the given action type and body, one for each possible signature that could match the predicate. Action Type A factory that can produce an Action when supplied with a signature, body, and sequence. (Examples in ``peak.rules`` will include the ``MethodList``, ``MethodChain``, ``Around``, ``Before``, and ``After`` types.) Action An object representing the behavior of a single invocation of a generic function. Action objects may be combined (using a generic function of the form ``combine_actions(a1,a2)``) to create combined methods ala RuleDispatch. Each action comprises at least a signature and a body, but actions of more complex types may include other information. Rule Set A collection of rules, combined with some policy information (such as the default action type) and optional optimization hints. A rule set does not directly implement dispatching. Instead, rule engines subscribe to rule sets, and the rule set informs them when actions are added and removed due to changes in the rule set's rules. This would almost be better named an "action set" than a "rule set", in that it's (virtually speaking) a collection of actions rather than rules. However, you do add and remove entries from it by specifying rules; the actions are merely implied by the rules. Generic functions will have a ``__rules__`` attribute that points to their rule set, so that the various decorators can add rules to them. You will probably be able to subclass the base RuleSet class or create alternate implementations, as might be useful for supporting persistent or database-stored rules. (Although you'd probably also need a custom rule engine for that.) Rule Engine An object that manages the dispatching of a given rule set to implement a specific generic function. Generic functions will have an ``__engine__`` attribute that points to their current engine. Engines will be responsible for doing any indexing, caching, or code generation that may be required to implement the resulting generic function. The default engine will implement simple type-based multiple dispatch with type-tuple caching. For simple generic functions this is likely to be faster than almost anything else, even C-assisted RuleDispatch. It also should have far less definition-time overhead than a RuleDispatch-style engine would. Engines will be pluggable, and in fact there will be a mechanism to allow engines to be switched at runtime when certain conditions are met. For example, the default engine could switch automatically to a RuleDispatch-like engine if a rule is added whose conditions can't be translated to simple type dispatching. There will also be some type of hint system to allow users to suggest what kind of engine implementation or special indexing might be appropriate for a particular function. ------------------ Method Combination ------------------ Method combination is performed using the ``combine_actions()`` API function:: >>> from peak.rules.core import combine_actions ``combine_actions()`` takes two arguments: a pair of actions. They are compared using the ``overrides()`` generic function to see if one is more specific than the other. If so, the more specific action's ``override()`` method is called, passing in the less-specific action. If neither action can override the other, the first action's ``merge()`` method is called, passing in the other action. 
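Schematically, that flow looks something like the following (an illustrative
sketch only -- the real implementation is itself a generic function, with
additional rules relating the various action types)::

    def sketch_combine_actions(a1, a2):
        # ``overrides()`` decides whether one action is more specific
        if overrides(a1, a2):
            return a1.override(a2)   # a1 takes precedence over a2
        elif overrides(a2, a1):
            return a2.override(a1)   # a2 takes precedence over a1
        return a1.merge(a2)          # neither overrides: merge them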
In either case, the result of calling the ``merge()`` or ``override()``
method is returned.  So, to define a custom action type for method
combination, the type needs to implement ``merge()`` and ``override()``
methods, and it must be comparable to other method types via the
``overrides()`` generic function.

Signature Implication
=====================

The ``implies()`` function is used to determine the logical implication
relationship between two signatures.  A signature ``s1`` implies a signature
``s2`` if ``s2`` will always match an invocation matched by ``s1``.  (Action
implication is based on signature implication; see the `Action Types`_
section below for more details.)

For the simplest signatures (tuples of types), this corresponds to a subclass
relationship between the elements of the tuples::

    >>> from peak.rules.core import implies

    >>> implies(int, object)
    True
    >>> implies(object, int)
    False
    >>> implies(int, str)
    False
    >>> implies(int, int)
    True

    >>> implies( (int,str), (object,object) )
    True
    >>> implies( (object,int), (object,str) )
    False

It's possible for a longer tuple to imply a shorter one::

    >>> implies( (int,int), (object,) )
    True

But not the other way around::

    >>> implies( (int,), (object,object) )
    False

And as a special case of type implication, any classic class implies both
``object`` and ``InstanceType``, but cannot imply any other new-style
classes.  This special-casing is used to work around the fact that
``isinstance()`` will say that a classic class instance is an instance of
both ``object`` and ``InstanceType``, but ``issubclass()`` doesn't agree.
PEAK-Rules wants to conform with ``isinstance()`` here::

    >>> class X: pass

    >>> implies(X, object)
    True

    >>> from types import InstanceType
    >>> implies(X, InstanceType)
    True

``istype()`` objects
--------------------

Type or class objects are used to represent "this class or a subclass", but
``istype()`` objects are used to represent either "this exact type" (using
``istype(aType, True)``), or "anything but this exact type"
(``istype(aType, False)``).  So their implication rules are different.

Internally, PEAK-Rules uses ``istype`` objects to represent a call signature
being matched, because the argument being tested is of some exact type.
Then, any rule signatures that are implied by the calling signature are
considered "applicable".

So, ``istype(aType, True)`` (the default) must always imply the same type or
class, or any parent class thereof::

    >>> from peak.rules import istype

    >>> implies(istype(int), int)
    True
    >>> implies(istype(int), object)
    True
    >>> implies(istype(X), InstanceType)
    True
    >>> implies(istype(X), object)
    True

But not the other way around::

    >>> implies(int, istype(int))
    False
    >>> implies(object, istype(int))
    False
    >>> implies(InstanceType, istype(X))
    False
    >>> implies(object, istype(X))
    False

An exact type will also imply any exclusion of a *different* exact type::

    >>> implies(istype(int), istype(str, False))
    True

In other words, if ``type(x) is int``, that implies ``type(x) is not str``.
But of course, that doesn't work the other way around::

    >>> implies(istype(str, False), istype(int))
    False

These implication rules are sufficient to bootstrap the basic types-only
rules engine; additional rules for ``istype`` behavior -- such as the
intersection of ``istype`` criteria, and the other more-advanced criteria
manipulation used in the full predicate rules engine -- are explained in
``Criteria.txt``.

Action Types
============

Method
------

The default action type (for rules with no specified action type) is
``Method``.
A ``Method`` combines a body, a signature, a definition-order serial number, and an optional "chained" action that it can fall back to. All of these values are optional, except for the body:: >>> from peak.rules.core import Method, overrides >>> def dummy(*args, **kw): ... print "called with", args, kw >>> meth = Method.make(dummy, (object,), 1) >>> meth Method(<...dummy...>, (,), 1, None) Calling a ``Method`` invokes the wrapped body:: >>> meth(1,2,x=3) called with (1, 2) {'x': 3} One ``Method`` overrides another if and only if its signature implies the other's:: >>> overrides(Method.make(dummy,(int,int)), Method.make(dummy,(object,object))) True >>> overrides(Method.make(dummy,(object,object)), Method.make(dummy,(int,int))) False When a method overrides another, you get the overriding method:: >>> meth.override(Method.make(dummy)) Method(<...dummy...>, (,), 1, None) Unless the overriding method's body is a function whose first parameter is named ``next_method``, in which case a chain of methods is created via the "tail" of a copy of the overriding method:: >>> def overriding_fn(next_method, etc): ... print "calling", next_method ... return next_method(etc) >>> chain = Method.make(overriding_fn).override(Method.make(dummy)) >>> chain Method(<...overriding_fn...>, (), 0, Method(<...dummy...>, (), 0, None)) The resulting chain is a callable ``Method``, and the ``next_method`` is passed in to the first function of the chain:: >>> chain(42) calling called with (42,) {} Around ------ ``Around`` methods are identical to normal ``Method`` objects, except that whenever an ``Around`` method and a regular ``Method`` are combined, the ``Around`` method overrides the regular one. This forces all the regular methods to be further down the chain than all of the "around" methods. >>> from peak.rules.core import Around >>> combine_actions(Method.make(dummy), Around(overriding_fn)) Around(<...overriding_fn...>, (), 0, Method(<...dummy...>, (), 0, None)) You will normally only want to use ``Around`` methods with functions that have a ``next_method`` parameter, since their purpose is to wrap "around" the calling of lower-precedence methods. If you don't do this, then the method chain will always end at that ``Around`` instance:: >>> combine_actions(Method.make(overriding_fn), Around(dummy)) Around(<...dummy...>, (), 0, None) NoApplicableMethods ------------------- The simplest possible action type is ``NoApplicableMethods``, meaning that there is no applicable action. When it's overridden by another method, it will of course get chained to the other method's tail (if appropriate). >>> from peak.rules import NoApplicableMethods >>> naf = NoApplicableMethods() >>> meth = Method.make(overriding_fn) >>> combine_actions(naf, meth) Method(<...overriding_fn...>, (), 0, NoApplicableMethods()) >>> combine_actions(meth, naf) Method(<...overriding_fn...>, (), 0, NoApplicableMethods()) Calling a ``NoApplicableMethods`` raises it, displaying the arguments it was called with:: >>> naf(1,2,x="y") Traceback (most recent call last): ... NoApplicableMethods: ((1, 2), {'x': 'y'}) Before, After, and MethodList ----------------------------- ``MethodList`` actions differ from normal method chain actions in a number of ways: * In case of ambiguity, they are ordered according to the sequence they were given in the underlying rule set. * They do not need to inspect or call a ``next_method()``; the next method is always called automatically. The ``Before`` and ``After`` action types are both ``MethodList`` subclasses. 
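Conceptually, the difference from ``next_method``-style chaining can be
pictured with a sketch like this (illustrative only; the real classes share
``MethodList`` machinery for sorting and duplicate elimination)::

    class BeforeSketch:
        # Hypothetical stand-in for a ``Before``-like action: the tail
        # action is always invoked automatically, no ``next_method`` needed
        def __init__(self, body, tail):
            self.body, self.tail = body, tail
        def __call__(self, *args, **kw):
            self.body(*args, **kw)          # run the "before" code first...
            return self.tail(*args, **kw)   # ...then the rest of the chain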
``Before`` actions are invoked before their tail action, and ``After`` actions are invoked afterward:: >>> from peak.rules.core import Before, After >>> def primary(*args,**kw): ... print "primary method called" ... return 99 >>> b = Before.make(dummy).override(Method.make(primary)) >>> a = After.make(dummy).override(Method.make(primary)) >>> b(23) called with (23,) {} primary method called 99 >>> a(42) primary method called called with (42,) {} 99 Notice that to create a ``MethodList`` with only one method, you must use the ``make()`` classmethod. ``Method`` also has this classmethod, but it has the same signature as the main constructor. The main constructor for ``MethodList`` has a different signature for its internal use. The combination of before, after, primary, and around methods is as shown:: >>> b = Before.make(dummy) >>> a = After.make(dummy) >>> p = Method.make(primary) >>> o = Around.make(overriding_fn) >>> combine_actions(b, combine_actions(a, combine_actions(p, o)))(17) calling called with (17,) {} primary method called called with (17,) {} 99 ``Around`` methods take precedence over all other method types, so the around method's tail is a ``Before`` that wraps the ``After`` that wraps the primary method. Within a ``MethodList``, methods are ordered by signature implication first, and then by definition order within groups of ambiguous signatures:: >>> b1 = Before.make("b1", (), 1) >>> b2 = Before.make("b2", (), 2) >>> b3 = Before.make("b3", (int,), 3) >>> combine_actions(b2, b3).sorted() [((,), 'b3'), ((), 'b2')] >>> combine_actions(b2, b1).sorted() [((), 'b1'), ((), 'b2')] >>> combine_actions(b3, combine_actions(b1,b2)).sorted() [((,), 'b3'), ((), 'b1'), ((), 'b2')] ``After`` methods sort the opposite way:: >>> a1 = After.make("a1", (), 1) >>> a2 = After.make("a2", (), 2) >>> a3 = After.make("a3", (int,), 3) >>> combine_actions(a2, a3).sorted() [((), 'a2'), ((,), 'a3')] >>> combine_actions(a2, a1).sorted() [((), 'a2'), ((), 'a1')] >>> combine_actions(a3, combine_actions(a1,a2)).sorted() [((), 'a2'), ((), 'a1'), ((,), 'a3')] And lower-precedence duplicate bodies are automatically eliminated from the results:: >>> combine_actions(a1,a1).sorted() [((), 'a1')] >>> combine_actions(b1,b1).sorted() [((), 'b1')] >>> combine_actions(b1, Before.make("b1", (int,), 1)).sorted() [((,), 'b1')] AmbiguousMethods ---------------- When you combine actions whose signatures are ambiguous (i.e. 
identical, overlapping, or mutually exclusive), you end up with an ``AmbiguousMethods`` object containing the ambiguous methods:: >>> am = combine_actions(meth, meth) >>> am AmbiguousMethods([Method(...), Method(...)]) Ambiguous methods can be overridden by an action that would override all of the ambiguous actions:: >>> m1 = Method.make(dummy, (int,)) >>> combine_actions(am, m1) is m1 True >>> combine_actions(m1, am) is m1 True And if appropriate, the ``AmbiguousMethods`` will end up chained to the overriding method:: >>> m2 = Method.make(overriding_fn, (str,)) >>> combine_actions(am, m2) Method(<...overriding_fn...>, (,), 0, AmbiguousMethods(...)) >>> combine_actions(m2, am) Method(<...overriding_fn...>, (,), 0, AmbiguousMethods(...)) Ambiguous methods override and ignore anything that would be overridden by any of their members:: >>> am = combine_actions(m1, m1) >>> combine_actions(am, meth) is am True >>> combine_actions(meth, am) is am True But anything that overlaps just results in a bigger ``AmbiguousMethods``:: >>> combine_actions(m2,am) AmbiguousMethods([Method(...), Method(...), Method(...)]) >>> combine_actions(am,m2) AmbiguousMethods([Method(...), Method(...), Method(...)]) And invoking an ``AmbiguousMethods`` instance just outputs diagnostic info:: >>> am(1,2,x="y") Traceback (most recent call last): ... AmbiguousMethods: ([Method(...), Method(...)], (1, 2), {'x': 'y'}) Custom Method Types and Compilation ----------------------------------- Custom method types can be defined by subclassing ``Method``, and used as a generic function's default method type by setting the functions' rules' ``default_actiontype``:: >>> class MyMethod(Method): ... def __call__(self, *args, **kw): ... print "calling!" ... return self.body(*args, **kw) >>> from peak.rules import when, abstract >>> from peak.rules.core import rules_for >>> tmp = lambda foo: 42 >>> def func_with(mtype): ... abstract() ... def f(foo): """dummy""" ... rules_for(f).default_actiontype = mtype ... when(f, (object,))(tmp) ... return f >>> f = func_with(MyMethod) >>> f(1) calling! 42 The ``compile_method(action, engine)`` function takes a method and a dispatch engine, and returns a compiled version of the action:: >>> from peak.rules.core import compile_method, Dispatching >>> engine = Dispatching(f).engine >>> compile_method(Method(tmp, ()), engine) is tmp True However, for our newly defined method type, there is no compilation:: >>> m = MyMethod(tmp, ()) >>> compile_method(m, engine) is tmp False >>> compile_method(m, engine) is m True This is because our method type redefined ``__call__()`` but did not include its own ``compiled()`` method. The ``compiled()`` method of a ``Method`` subclass takes an ``Engine`` as its argument, and should return a callable to be used in place of directly calling the method itself. It should pass any objects it plans to call (e.g. its tail or individual submethods) through ``compile_method(ob, engine)``, in order to ensure that those objects are also compiled:: >>> class MyMethod2(Method): ... def compiled(self, engine): ... print "compiling" ... return compile_method(self.body, engine) >>> m = MyMethod2(tmp) >>> compile_method(m, engine) is tmp compiling True As you can see, ``compile_method()`` invokes our new ``compiled()`` method, which ends up returning the original function. 
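Roughly speaking, then, you can picture ``compile_method()`` as doing
something like this (a loose sketch; the real version is a generic function
that also handles engine defaults and other action types)::

    def sketch_compile_method(action, engine=None):
        if isinstance(action, Method):
            # ``Method`` subclasses decide their own compiled form
            return action.compiled(engine)
        return action   # plain callables pass through unchanged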
And, if we don't define a ``__call__()`` method of our own, we end up inheriting one from ``Method`` that compiles the method and invokes it for us:: >>> m(1) compiling 42 However, if we use this method type in a generic function, then the generic function will cache the compiled version of its methods so they don't have to be compiled every time they're called:: >>> f = func_with(MyMethod2) >>> f(1) compiling 42 >>> f(1) 42 (Note: what caching is done, and when the cache is reset is heavily dependent on the specific dispatching engine in use; it can also be the case that a similar-looking method object will be compiled more than once, because in each case it has a different tail or match signature.) Now, ``Method`` subclasses do NOT inherit their ``compiled()`` method from their base classes, unless they are *also* inheriting ``__call__``. This prevents you from ending up with strangely-broken code in the event you redefine ``__call__()``, but forget to redefine ``compiled()``:: >>> class MyMethod3(MyMethod2): ... def __call__(self, *args, **kw): ... print "calling!" ... return self.body(*args, **kw) >>> f = func_with(MyMethod3) >>> f(1) calling! 42 >>> f(1) calling! 42 As you can see, the new subclass *works*, but doesn't get compiled. So, you can do your initial debugging and development without compilation by defining ``__call__()``, and then switch over to ``compiled()`` once you're happy with your prototype. Now, let's define a method type that works like ``MyMethod3``, but is compiled using a template:: >>> class NoisyMethod(Method): ... def compiled(self, engine): ... print "compiling" ... body = compile_method(self.body, engine) ... return engine.apply_template(noisy_template, body) So far, it looks a little like our earlier compilation. We compile the body like before, but then, what's that ``apply_template`` stuff? The ``apply_template()`` method of engine objects takes a "template" function and one or more arguments representing values that need to be accessible in our compiled function. Let's go ahead and define ``noisy_template`` now:: >>> def noisy_template(__func, __body): ... return """ ... print "calling!" ... return __body($args) ... """ Template functions are defined using the conventions of DecoratorTools's ``@template_function`` decorator, only without the decorator. The first positional argument is the generic function the compiled method is being used with, and any others are up to you. Any use of ``$args`` is replaced with the correct calling signature for invoking a method of the corresponding generic function, and you *must* name all of your arguments and local variables such that they won't conflict with any actual argument names. (In practice, this means you want to use ``__``-prefixed names, which is why we're defining the template outside the class, to prevent Python from mangling our parameter names and messing up the template.) Note, too, that all the other caveats regarding ``@template_function`` functions apply, including the fact that the function cannot actually *use* any of its arguments (or any variables from its containing scope) to determine the return string -- it must simply return a constant string. (It can, however, refer to globals in its defining module, as long as they're not shadowed by the generic function's argument names.) Okay, let's see our new method type in action:: >>> f = func_with(NoisyMethod) >>> f(1) compiling calling! 42 >>> f(1) calling! 42 As you can see, the method is still compiled just once, but still prints "calling!" 
every time it's invoked, as the compiled form of the method is a purpose-built wrapper function. To save time and memory, the ``engine.apply_template()`` tries to memoize calls so that it will return the same function, given the same inputs, so long as the function still exists:: >>> from peak.rules import value >>> m = NoisyMethod((), value(42)) >>> m1 = compile_method(m) compiling >>> m2 = compile_method(m) compiling >>> m1 is m2 True This will only work, however, if all the arguments passed to ``apply_template`` are usable as dictionary keys. So, it's best to use tuples instead of lists, frozensets instead of sets, etc. (Also, this means you can't pass in keyword arguments.) Defining Method Precedence -------------------------- You can define one method type's precedence relative to another using the ``>>`` operator (which always returns its right-side operand):: >>> NoisyMethod >> Method You can also chain ``>>`` operators to define overall method precedence between multiple types, e.g.:: >>> Around >> NoisyMethod >> Method As long as you don't try to introduce a precedence cycle:: >>> NoisyMethod >> MyMethod2 >> Around Traceback (most recent call last): ... TypeError: already overrides Decorators ========== XXX decorators and how to create them: when, around, before, after:: >>> from peak.rules import before, after >>> def p(x): print x >>> def f(): p("yo!") Rule decorators return the function they are decorating, unless the function's name is also the name of the generic function they're adding to:: >>> before(f)(lambda: p("before")) at ...> >>> after(f)(lambda: p("after")) at ...> >>> f() before yo! after Decorators can accept an entry point string in place of an actual function, provided that the PEAK "Importing" package (``peak.util.imports``) is available. In that case, the registration is deferred until the named module is imported:: >>> before('some.module:somefunc')(lambda: p("before")) at ...> If the named module is already imported, the registration takes place immediately, otherwise it is deferred until the named module is actually imported. This allows you to provide optional integration with modules that might or might not be used by a given application, without creating a dependency between your code and that package. Note, however, that if the named function doesn't exist when the module is imported, then an attribute error will occur at import time. The syntax of the target name is lightly checked at call time, however:: >>> before('foo.bar')(lambda: p("before")) Traceback (most recent call last): ... TypeError: Function specifier 'foo.bar' is not in 'module.name:attrib.name' format >>> before('foo: bar')(lambda: p("before")) Traceback (most recent call last): ... TypeError: Function specifier 'foo: bar' is not in 'module.name:attrib.name' format (This is just a sanity check, though, just to make sure you didn't accidentally put some other string first (like the criteria). It won't detect a string that points to a non-existent module, or various other possible errors, so you should still verify that your code gets run when the target module is imported and the relevant conditions apply.) Creating Custom Combinations ============================ XXX custom combination demo from RuleDispatch (compute upcharges+tax) ---------------- Rules Management ---------------- Rules ===== Rules are currently implemented as 3-item tuples comprising a predicate, a body, and an action type that will be used as a factory to create the actions for the rule. 
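For instance, a rule applying a hypothetical ``my_func`` whenever its
argument is an ``int`` could be pictured as::

    Rule(my_func, (int,), Method, 0)

(the trailing ``0`` being the definition-order sequence number used in the
examples below).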
At minimum, all a rule needs is a body, so there's a convenience constructor (``Rule``) that allows you to create a rule with defaults. The predicate and action type default to ``()`` and ``None`` if not specified:: >>> from peak.rules.core import Rule >>> def dummy(): pass >>> r = Rule(dummy, sequence=0) >>> r Rule(, (), None, 0) An action type of ``None`` (or any false value) means that the ruleset should decide what action type to use. Actually, it can decide anyway, since the rule set is always responsible for creating action objects; the rule's action type is really just advisory to begin with. RuleSet ======= ``RuleSet`` objects hold the rules and policy information for a generic function, including the default action type and optional optimization hints. Iterating over a ruleset yields its actions:: >>> from peak.rules.core import RuleSet >>> rs = RuleSet() >>> list(rs) [] And rules can be added and removed with the ``add()`` and ``remove()`` methods:: >>> r = Rule(dummy, sequence=42) >>> rs.add(r) >>> list(rs) [Rule(, (), <...Method...>, 42)] >>> rs.remove(r) >>> list(rs) [] Observers can be added with the ``subscribe()`` and ``unsubscribe()`` methods. Observers have their ``actions_changed`` method called with an "added" set and a "removed" set of action definitions. (An action definition is a tuple of the form ``(actiontype, body, signature, serial)``, and can thus be used to create action objects.) :: >>> class DummyObserver: ... def actions_changed(self, added, removed): ... for a in added: print "Add:", a ... for a in removed: print "Remove:", a >>> do = DummyObserver() >>> rs.subscribe(do) >>> rs.add(r) Add: Rule(, (), <...Method...>, 42) >>> rs.remove(r) Remove: Rule(, (), <...Method...>, 42) >>> rs.unsubscribe(do) When an observer is first added, it's notified of the current contents of the ``RuleSet``, if any. As a result, observers don't need to do any special case handling for their initial setup. Everything can be handled via the normal operation of the ``actions_changed()`` method:: >>> rs.add(r) >>> rs.subscribe(do) Add: Rule(, (), <...Method...>, 42) Unsubscribing, however, does not send any removal messages:: >>> rs.unsubscribe(do) ------------------ Criteria and Logic ------------------ This section is currently just design notes to myself, hopefully to grow into a more thorough discussion and doctests of the relevant sub-frameworks. DNF Logic ========= :: # These 2 funcs must skip dupes and return the item alone if only 1 found disj(*items) = Or( [i for item in items for i in disjuncts(item)] ) conj(items) = And([i for item in items for i in conjuncts(item)] ) def combinatorial(seq, *tail): if tail: return ((item,)+t for item in seq for t in combinatorial(*tail)) else: return ((item,) for item in seq) # this func would be more efficient if 'conj' were moved inside 'combinatorial' # especially if conj were a binary operation, and the results of each nested # loop were reduced to a unique set... # intersect(*items) = Or( map(conj, combinatorial(*map(disjuncts,items))) ) # simplified, but still needs dupe skipping/flattening of the Or intersect(i1, i2) = Or( *[conj((a,b)) for a in set(disjuncts(i1)) for b in set(disjuncts(i2))] ) disjuncts(Or) = Or.items disjuncts(Not) = map(negate, conjuncts(Not.expr)) disjuncts(*) = [*] conjuncts(And) = And.items conjuncts(Not) = map(negate, disjuncts(Not.expr)) conjuncts(*) = [*] negate(And) = Or(map(negate,And.items)) negate(Or) = And(map(negate,Or.items)) negate(Not) = Not.expr negate(Compare) = reverse comparison sense ? 
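# e.g. the negation of '>' is '<=', of '==' is '!=', and so on; criteria
# with no more specific negation rule just get wrapped in Not (next line)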
negate(*) = Not(*) Not-methods and negate() function could be eliminated by CriteriaBuilder tracking negation during build. implies(Or, *) iff all Or.items imply * implies(And, *) iff any And.items imply * implies(*, *) iff equal items [need to define for struct/struct, too!] implies(Range, Range) by range overlap implies(IsInstance, IsInstance) by subclass relationships/truth map to_logic(Call) -> via function mapping for Call(Const(),...) to_logic(Compare) -> Identity, IsInstance, Range, etc.? to_logic(*) -> Truth(expr, mode) Criteria/Indexing ================= :: dispatch_table(*, Identity, cases) -> {seed: bitmap} where bitmap = inclusions[seed] | (inclusions[None] - exclusions[seed]) | (cases - known_cases) PEAK-Rules-0.5a1.dev-r2713/Criteria.txt0000755000076600007660000011425411433706265016421 0ustar telec3telec3================ Logical Criteria ================ In order to process arbitrary expression-based rules, PEAK-Rules needs to "understand" the way that conditions logically relate to each other. This document describes the design (and tests the implementation) of its logical criteria management. You do not need to read this unless you are extending or interfacing with this subsystem directly, or just want to understand how this stuff actually works! The most important ideas here are implication, intersection, and disjunctive normal form. But don't panic if you don't know what those terms mean! They're really quite simple. Implication means that if one thing is true, then so is another. A implies B if B is always true whenever A is true. It doesn't matter what B is when A is not true, however. It could be true or false, we don't care. Implication is important for prioritizing which rules are "more specific" than others. Intersection just means that both things have to be true for a condition to be true - it's like the "and" of two conditions. But rather than performing an actual "and", we're creating a *new condition* that will only be true when the two original conditions would be true. And finally, disjunctive normal form (DNF) means "an OR of ANDs". For example, this expression is in DNF:: (A and C) or (B and C) or (A and D) or (B and D) But this equivalent expression is **not** in DNF:: (A or B) and (C or D) The criteria used to define generic function methods are likely to look more like this, than they are to be in disjunctive normal form. Therefore, we must convert them in order to implement the Chambers & Chen dispatch algorithm correctly (see `Indexing.txt`_). .. _Indexing.txt: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Indexing We do this using the ``DisjunctionSet`` and ``OrElse`` classes to represent overall expressions (sets or sequences of "ors"), and the ``Signature`` and ``Conjunction`` classes to represent sequences or sets of "and"-ed conditions. Within a ``Signature``, the things that are "and"-ed together are a sequence of ``Test`` instances. A ``Test`` pairs a "dispatch expression" with a "criterion". For example, this expression:: isinstance(x, Y) would be represented internally as a ``Test`` instance like this:: Test(IsInstance(Local('x')), Class(Y)) ``Conjunction`` instances, on the other hand, are used to "and" together criteria that apply to the same dispatch expression. 
For example, this expression:: isinstance(x, Y) and isinstance(x, Z) would be represented internally like this:: Test(IsInstance(Local('x')), Conjunction([Class(Y), Class(Z)])) The rest of this document describes how predicates, signatures, tests, dispatch expressions, and criteria work together to create expressions in disjunctive normal form, and whose implication of other expressions can be determined. The basic logical functions we will use are ``implies()``, ``intersect()``, ``disjuncts()``, and ``negate()``, all of which are defined in ``peak.rules.core``:: >>> from peak.rules.core import implies, intersect, disjuncts, negate ---------------------------------------- Boolean Conditions and Logical Operators ---------------------------------------- The most fundamental conditions are simply ``True`` and ``False``. ``True`` represents a rule that *always* applies, while ``False`` represents a rule that *never* applies. Therefore, the result of intersecting ``True`` and any other object, always returns that object, while intersecting ``False`` with any other object returns ``False``:: >>> intersect(False, False) False >>> intersect(False, True) False >>> intersect(True, False) False >>> intersect(True, True) True >>> intersect(object(), True) >>> intersect(True, object()) >>> intersect(object(), False) False >>> intersect(False, object()) False Because ``True`` means "condition that always applies", *everything* implies ``True``, but ``True`` only implies itself:: >>> implies(object(), True) True >>> implies(True, object()) False >>> implies(True, True) True On the other hand, because ``False`` means "condition that never applies", ``False`` implies *everything*. (Because if you start from a false premise, you can arrive at any conclusion!):: >>> implies(False, True) True >>> implies(False, object()) True However, no condition other than ``False`` can ever imply ``False``, because all other conditions can *sometimes* apply:: >>> implies(object(), False) False >>> implies(True, False) False >>> implies(False, False) True Notice, by the way, a few important differences between ``implies()`` and ``intersect()``. ``implies()`` *always* returns a boolean value, ``True`` or ``False``, because it's an immediate answer to the question of, "does the second condition always apply if the first condition applies?" ``intersect()``, on the other hand, returns a *condition* that will always be true when the original conditions apply. So, if it returns a boolean value, that's just an indication that the intersection of the two input conditions would always apply or never apply. Also, ``intersect()`` is logically symmetrical, in that it doesn't matter what order the arguments are in, whereas the order is critically important for ``implies()``. However, ``intersect()`` methods must be order *preserving*, because the order in which logical "and" operations occur is important. Consider, for example, the condition ``"y!=0 and z>x/y"``, in which it would be a bad thing to skip the zero check before the division! So, as we will see later on, when working with more complex conditions, ``intersect()`` methods must ensure that the subparts of the output condition are in the same relative order as they were in the input. (Also, note that in general, when you intersect two conditions, if one condition implies the other, the result of the intersection is the implying condition. 
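For instance, an order-preserving intersection over signature-like sequences
might be sketched like this (purely illustrative -- the real ``intersect()``
methods are defined per criterion type, as shown throughout this document)::

    def sketch_intersect(s1, s2):
        # Keep s1's parts in their original order, then append each part
        # of s2 that isn't already implied by something we have; never
        # reorder, so checks like "y!=0" stay ahead of "z>x/y"
        result = list(s1)
        for item in s2:
            if not [r for r in result if implies(r, item)]:
                result.append(item)
        return tuple(result)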
This general rule greatly simplifies the implementation of most intersect operations, since as long as there is an implication relationship defined between conditions, many common cases of intersection can be handled automatically.) In contrast to both ``implies()`` and ``intersects()``, the ``disjuncts()`` function takes only a single argument, and returns a list of the "disjuncts" (or-ed-together conditions) of its argument. More precisely, it returns a list of conditions that each imply the original condition. That is, if any of the disjuncts were true, then the original condition would also be true. Thus, the ``disjuncts()`` of an arbitrary object will normally be a list containing just that object:: >>> disjuncts(object()) [] >>> disjuncts(True) [True] But ``False`` is a special case; ``False`` has *no* disjuncts, since no other condition can ever imply ``False``:: >>> disjuncts(False) [] As a result, "or"-ing ``False`` with other conditions will simply remove the ``False`` from the resulting predicate, and conditions that can never be true are not used for indexing or dispatching. Another special case is tuples containing nested tuples:: >>> disjuncts( (float, (int, str)) ) [(, ), (, )] >>> disjuncts( ((int, str), object) ) [(, ), (, )] >>> disjuncts( (object, (int, str), float) ) [(, , ), (, , )] >>> disjuncts( ((int, str), (int, str)) ) [(, ), (, ), (, ), (, )] This lets you avoid writing lots of decorators for the cases where you want more than one type (or ``istype()`` instance) to match in a given argument position. (As you can see, it's equivalent to specifying all the individual combinations of specified types.) Finally, the ``negate()`` function inverts the truth of a condition, e.g.:: >>> negate(True) False >>> negate(False) True Of course, it also applies to criteria other than pure boolean values, as we'll see in the upcoming sections. ------------------- "Criterion" Objects ------------------- A criterion object describes a set of possible values for a dispatch expression. There are several criterion types supplied with PEAK-Rules, but you can also add your own, as long as they can be tested for implication with ``implies()``, and intersected with ``intersect()``. (And if they represent an "or" of sub-criteria, they should be able to provide their list of ``disjuncts()``. They'll also need to be indexable, but more on that later in other documents!) "And"-ed Criteria ================= Sometimes, more than one criterion is applied to the same dispatch expression. For example in the expression ``x is not y and x is not z``, two criteria are being applied to the identity of ``x``. To represent this, we need a way to represent a set of "and-ed" criteria. ``peak.rules.criteria`` provides a base class for this, called ``Conjunction``:: >>> from peak.rules.criteria import Conjunction This class is a subclass of ``frozenset``, but has a few additional features. First, a ``Conjunction`` never contains redundant (implied) items. For example, the conjunction of the classes ``object`` and ``int`` is ``int``, because ``int`` already implies ``object``:: >>> Conjunction([int, object]) >>> Conjunction([object, int]) Notice also that instead of getting back a set with one member, we got back the item that would have been in the set. This helps to simplify the expression structure. 
As a further simplification, creating an empty conjunction returns ``True``, because "no conditions required" is the same as "always true":: >>> Conjunction([]) True A conjunction implies a condition, if any condition in the conjunction implies the other condition:: >>> implies(Conjunction([str, int]), str) True >>> implies(Conjunction([str, int]), int) True >>> implies(Conjunction([str, int]), object) True >>> implies(Conjunction([str, int]), float) False A condition implies a conjunction, however, only if the condition implies every part of the conjunction:: >>> class a: pass >>> class b: pass >>> class c(a,b): pass >>> class d(a, int): pass >>> implies(c, Conjunction([a, b])) True >>> implies(a, Conjunction([a, b])) False >>> implies(Conjunction([c,d]), Conjunction([a, int])) True >>> implies(Conjunction([c, int]), Conjunction([a, int])) True >>> implies(Conjunction([a, int]), Conjunction([c, int])) False (By the way, on a more sophisticated level of reasoning, you could say that ``Conjunction([str, int])`` should have equalled ``False`` above, since there's no way for an object to be both an ``int`` and a ``str`` at the same time. But that would be an excursion into semantics and outside the bounds of what PEAK-Rules can "reason" about using only logical implication as defined by the ``implies()`` generic function.) ``Conjunction`` objects can be intersected with one another, or with additional conditions, and the result is another ``Conjunction`` of the same type as the leftmost set. So, if we use subclasses of our own, the result of intersecting them will be a conjunction of the correct subclass:: >>> class MySet(Conjunction): pass >>> type(intersect(MySet([int, str]), float)) >>> intersect(MySet([int, str]), float) == MySet([int, str, float]) True >>> intersect(float, MySet([int, str])) == MySet([float, int, str]) True >>> intersect(MySet([d, c]), MySet([int, str])) == MySet([d,c,str]) True If you want to ensure that all items in a set are of appropriate type or value, you can override ``__init__`` to do the checking, and raise an appropriate error. PEAK-Rules does this for its specialized conjunction classes, but uses ``if __debug__:`` and ``assert`` statements to avoid the extra overhead when run with ``python -O``. You may wish to do the same for your subclasses. "Or"-ed Criteria ================ The ``DisjunctionSet`` and ``OrElse`` classes are used to represent sets and sequences of "or"-ed criteria:: >>> from peak.rules.criteria import DisjunctionSet, OrElse Both types automatically exclude redundant (i.e. more-specific) criteria, and can never contain less than 2 entries. For example, "or"-ing ``object`` and ``int`` always returns ``object``, because ``object`` is implied by ``int``:: >>> DisjunctionSet([int, object]) >>> DisjunctionSet([object, int]) >>> OrElse([int, object]) >>> OrElse([object, int]) Notice that instead of getting back a set or sequence with one member, we got back the item that would have been in the set. This helps to simplify the expression structure. 
As a further simplification, creating an empty disjunction returns ``False``, because "no conditions are sufficient" is the same as "always false":: >>> DisjunctionSet([]) False >>> OrElse([]) False In addition to eliminating redundancy, disjunction *sets* also flatten any nested disjunctions:: >>> DisjunctionSet([DisjunctionSet([1, 2]), DisjunctionSet([3, 4])]) DisjunctionSet([1, 2, 3, 4]) This is because it uses the ``disjuncts()`` generic function to determine whether any of the items it was given are "or"-ed conditions of some kind. And the ``disjuncts()`` of a ``DisjunctionSet`` are a list of its contents:: >>> disjuncts(DisjunctionSet([1, 2, 3, 4])) [1, 2, 3, 4] But ``OrElse`` sequences do not do this flattening, in order to avoid imposing an arbitrary sequence on their contents:: >>> OrElse([DisjunctionSet([1, 2]), DisjunctionSet([3, 4])]) OrElse([DisjunctionSet([1, 2]), DisjunctionSet([3, 4])]) (The ``disjuncts()`` of an ``OrElse`` are much more complicated, as the disjuncts of a Python expression like ``"a or b or c"`` reduce to ``"a"``, ``"(not a) and b"``, and ``"(not a and not b) and c"``! We'll talk more about this later, in the section on `Predicates`_ below.) A disjunction only implies a condition if *all* conditions in the disjunction imply the other condition:: >>> implies(DisjunctionSet([str, int]), str) False >>> implies(DisjunctionSet([str, int]), int) False >>> implies(DisjunctionSet([str, int]), float) False >>> implies(DisjunctionSet([str, int]), object) True >>> implies(OrElse([str, int]), str) False >>> implies(OrElse([str, int]), int) False >>> implies(OrElse([str, int]), float) False >>> implies(OrElse([str, int]), object) True A condition implies a disjunction, however, if the condition implies any part of the disjunction:: >>> class a: pass >>> class b: pass >>> class c(a,b): pass >>> class d(a, int): pass >>> implies(c, DisjunctionSet([a, b])) True >>> implies(a, DisjunctionSet([a, b])) True >>> implies(a, DisjunctionSet([int, str])) False >>> implies(DisjunctionSet([c,d]), DisjunctionSet([a, int])) True >>> implies(DisjunctionSet([c,int]), DisjunctionSet([a, int])) True >>> implies(DisjunctionSet([c, int]), True) True >>> implies(False, DisjunctionSet([c, int])) True >>> implies(c, OrElse([a, b])) True >>> implies(a, OrElse([a, b])) True >>> implies(a, OrElse([int, str])) False >>> implies(OrElse([c,d]), OrElse([a, int])) True >>> implies(OrElse([c,int]), OrElse([a, int])) True >>> implies(OrElse([c, int]), True) True >>> implies(False, OrElse([c, int])) True The intersection of a disjunction and any other object is a disjunction containing the intersection of that object with the original disjunctions' contents. In other words:: >>> int_or_str = DisjunctionSet([int, str]) >>> long_or_float = DisjunctionSet([long, float]) >>> intersect(int_or_str, float) == DisjunctionSet([ ... Conjunction([int, float]), Conjunction([str, float]) ... ]) True >>> intersect(long, int_or_str) == DisjunctionSet([ ... Conjunction([long, int]), Conjunction([long, str]) ... ]) True >>> intersect(int_or_str, long_or_float) == DisjunctionSet([ ... Conjunction([int,long]), Conjunction([int, float]), ... Conjunction([str,long]), Conjunction([str, float]), ... ]) True >>> intersect(int_or_str, Conjunction([long, float])) == \ ... DisjunctionSet( ... [Conjunction([int, long, float]), ... Conjunction([str, long, float])] ... ) True >>> intersect(Conjunction([int, str]), long_or_float) == \ ... DisjunctionSet( ... [Conjunction([int, str, long]), Conjunction([int, str, float])] ... 
)
True

As you can see, this is the heart of the process that allows expressions like
``(A or B) and (C or D)`` to be transformed into their disjunctive normal
form (i.e. ``(A and C) or (A and D) or (B and C) or (B and D)``).  (In other
words, by using ``DisjunctionSet()`` as an "or" operator and ``intersect()``
as the "and" operator, we always end up with a DNF result!)


Object Identity
===============

The ``IsObject`` criterion type represents the set of objects which either
are -- or are *not* -- one specific object instance.  ``IsObject(x)`` (or
``IsObject(x, True)``) represents the set of objects ``y`` for which the
``y is x`` condition would be true.  Conversely, ``IsObject(x, False)``
represents the set of objects ``y`` for which ``y is not x``::

    >>> from peak.rules.criteria import IsObject, Conjunction

    >>> o = object()
    >>> is_o = IsObject(o)
    >>> is_not_o = IsObject(o, False)

    >>> is_o
    IsObject(<object object at 0x...>, True)

    >>> is_not_o
    IsObject(<object object at 0x...>, False)

    >>> is_not_o == negate(is_o)
    True
    >>> is_o == negate(is_not_o)
    True

The intersection of two different ``is`` identities is ``False``, since an
object cannot be both itself and another object::

    >>> intersect(is_o, IsObject("foo"))
    False
    >>> implies(is_o, IsObject("foo"))
    False

Similarly, an object can't be both itself, and not itself::

    >>> intersect(is_o, is_not_o)
    False
    >>> intersect(is_not_o, is_o)
    False
    >>> implies(is_o, is_not_o)
    False

But it *can* be itself and itself::

    >>> intersect(is_o, is_o) == is_o
    True
    >>> implies(is_o, is_o)
    True

Or not itself and not itself::

    >>> intersect(is_not_o, is_not_o) == is_not_o
    True
    >>> implies(is_not_o, is_not_o)
    True

And an object can be itself, while not being something else::

    >>> intersect(is_o, IsObject("foo", False)) == is_o
    True
    >>> intersect(IsObject("foo", False), is_o) == is_o
    True
    >>> implies(is_o, IsObject("foo", False))
    True

But just because an object is not something, doesn't mean it's something
else::

    >>> implies(is_not_o, IsObject("foo"))
    False

And the intersection of multiple ``is not`` conditions produces a
``Conjunction``::

    >>> not_foo = IsObject("foo", False)
    >>> not_bar = IsObject("bar", False)
    >>> not_foobar = intersect(not_foo, not_bar)
    >>> not_foobar
    Conjunction([IsObject('foo', False), IsObject('bar', False)])

Which of course then implies each of the individual "not" conditions::

    >>> implies(not_foobar, not_bar)
    True
    >>> implies(not_foobar, not_foo)
    True

But not their opposites::

    >>> implies(not_foobar, IsObject("bar"))
    False

Oh, and an ``is`` condition implies any ``Conjunction`` that doesn't contain
its opposite::

    >>> implies(is_o, not_foobar)
    True

But not the other way around::

    >>> implies(not_foobar, is_o)
    False

Finally, negating a ``Conjunction`` of is-nots returns a disjunction of true
``IsObject`` tests, and vice versa::

    >>> negate(not_foobar)
    DisjunctionSet([IsObject('foo', True), IsObject('bar', True)])

    >>> negate(DisjunctionSet([IsObject('foo'), IsObject('bar')]))
    Conjunction([IsObject('foo', False), IsObject('bar', False)])

Values and Ranges
=================

``Value`` objects are used to represent ``==`` and ``!=`` comparisons.
``Value(x)`` represents ``==x`` and ``Value(x, False)`` represents ``!=x``.
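In a rule expression, these criteria typically arise from tests like
``x == 42`` or ``x != 42``, i.e.::

    Value(42)          # the set of values equal to 42
    Value(42, False)   # the set of values not equal to 42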
A ``Value`` implies another ``Value`` if the two are identical:: >>> from peak.rules.criteria import Value, Range, Min, Max >>> implies(Value(27), Value(42)) False >>> implies(Value(27, False), Value(42)) False >>> implies(Value(27), Value(27)) True >>> implies(Value(99), Value(99, False)) False >>> implies(Value(99, False), Value(99, False)) True Or, if they have different target values, but the first is an ``==`` comparison, and the second is a ``!=`` comparison:: >>> implies(Value(27), Value(99, False)) True >>> intersect(Value(27), Value(99, False)) Value(27, True) The negation of a ``Value`` is of course another ``Value`` of the same target but the reverse operator:: >>> negate(Value(27)) Value(27, False) >>> negate(Value(99, False)) Value(99, True) The intersection of two different ``==`` values, or a ``!=`` and ``==`` of the same value, is ``False`` (i.e., no possible match):: >>> intersect(Value(27), Value(42)) False >>> intersect(Value(27), Value(27, False)) False But the intersection of two different ``!=`` values produces a disjunction of three ``Range()`` objects:: >>> one_two = intersect(Value(1, False), Value(2, False)) >>> one_two == DisjunctionSet([ ... Range((Min, -1), (1, -1)), ... Range((1, 1), (2, -1)), ... Range((2, 1), (Max, 1)) ... ]) True >>> intersect(one_two, Value(3,False)) == DisjunctionSet([ ... Range((Min, -1), (1, -1)), ... Range((1, 1), (2, -1)), ... Range((2, 1), (3, -1)), ... Range((3, 1), (Max, 1)) ... ]) True The ``Range()`` criterion type represents an inequality such as ``lo < x < hi`` or ``x >= lo``. The lows and highs given have to be 2-tuples, each consisting of a value and a "direction". The direction is an integer (either -1 or 1) that indicates whether the edge is on the low or high side of the target value. Thus, a tuple ``(27, -1)`` means "the low edge of 27", while ``(99, 1)`` means "the high edge of 99". In this way, any simple inequality or range can be represented by a pair of edges. Thus, the intersection of two different ``!=`` values produces a disjunction of three ``Range()`` objects, representing the intervals that "surround" the original ``!=`` values:: >>> from peak.rules.criteria import Range >>> intersect(Value(27, False), Value(42, False)) == DisjunctionSet([ ... Range(hi=(27, -1)), # below Min ... below 27 ... Range((27,1), (42,-1)), # above 27 ... below 42 ... Range(lo=(42, 1)), # above 42 ... above Max ... ]) True Notice that if we omit the ``hi`` or ``lo`` end of the range, it's replaced with "below ``Min``" or "above ``Max``", as appropriate. (The ``Min`` and ``Max`` values are special objects that compare below or above any other object.) When creating range and value objects, it can be useful to use the ``Inequality`` constructor, which takes a comparison operator and a value:: >>> from peak.rules.criteria import Inequality >>> Inequality('>=', 27) # >=27 : below 27 ... above Max Range((27, -1), (Max, 1)) >>> negate(Inequality('<', 27)) Range((27, -1), (Max, 1)) >>> Inequality('>', 27) # > 27 : above 27 ... above Max Range((27, 1), (Max, 1)) >>> Inequality('<', 99) # < 99 : below Min ... below 99 Range((Min, -1), (99, -1)) >>> Inequality('<=',99) # <=99 : below Min ...
above 99 Range((Min, -1), (99, 1)) >>> negate(Inequality('>', 99)) Range((Min, -1), (99, 1)) >>> Inequality('==', 66) Value(66, True) >>> Inequality('!=', 77) Value(77, False) Intersecting two ranges (or a range and an ``==`` value) produces a smaller range or value, or ``False`` if there is no overlap:: >>> intersect(Inequality('<', 27), Inequality(">",19)) Range((19, 1), (27, -1)) >>> intersect(Inequality('>=', 27), Inequality("<=",19)) False >>> intersect(Value(27), Inequality('>=', 27)) Value(27, True) >>> intersect(Inequality('<=', 27), Value(27)) Value(27, True) >>> intersect(Value(27), Inequality('<',27)) False >>> intersect(Inequality('>',27), Value(27)) False Last, but not least, a range (or value) implies another range or value if it lies entirely within it:: >>> implies(Range((42,-1), (42,1)), Value(42)) True >>> implies(Range((27,-1), (42,1)), Range((15,1),(99,-1))) True >>> implies(Range((27,-1), (42,1)), Value(99, False)) True But not if it overlaps or lies outside of it:: >>> implies(Range((15,-1),(42,1)), Range((15,1),(99,-1))) False >>> implies(Range((27,-1), (42,1)), Value(99)) False Class/Type Tests ================ ``Class`` objects represent ``issubclass()`` or ``isinstance()`` sets. ``Class(x)`` is an instance/subclass match, while ``Class(x, False)`` is a non-match. Implication, negation, and intersection are defined accordingly:: >>> from peak.rules.criteria import Class >>> implies(Class(int), Class(object)) True >>> implies(Class(object, False), Class(int, False)) True >>> negate(Class(int)) Class(<type 'int'>, False) >>> negate(Class(object, False)) Class(<type 'object'>, True) >>> implies(Class(int), Class(str)) False >>> implies(Class(object), Class(int, False)) False >>> implies(Class(object), Class(int)) False >>> implies(Class(int), Class(int)) True >>> intersect(Class(int), Class(object)) Class(<type 'int'>, True) >>> intersect(Class(object), Class(int)) Class(<type 'int'>, True) The intersection of two or more unrelated ``Class`` criteria is represented by a ``Conjunction``:: >>> from peak.rules.criteria import Conjunction >>> intersect(Class(int, False), Class(str, False)) == Conjunction( ... [Class(int, False), Class(str, False)] ... ) True Exact-Type and Type-Exclusion Tests =================================== Exact type tests are expressed using ``istype(x)``, and type exclusion tests are represented as ``istype(x, False)``:: >>> from peak.rules import istype >>> negate(istype(int)) istype(<type 'int'>, False) >>> negate(istype(object, False)) istype(<type 'object'>, True) One ``istype()`` test implies another only if they're equal:: >>> implies(istype(int), istype(int)) True >>> implies(istype(int, False), istype(int, False)) True >>> implies(istype(int, False), istype(int)) False Or if the first is an exact match, and the second is an exclusion test for a different type:: >>> implies(istype(int), istype(str, False)) True Thus, the intersection of two ``istype()`` tests will be either one of the input tests, or ``False`` (meaning no overlap):: >>> intersect(istype(int), istype(int)) istype(<type 'int'>, True) >>> intersect(istype(int), istype(str, False)) istype(<type 'int'>, True) >>> intersect(istype(int, False), istype(int, False)) istype(<type 'int'>, False) >>> intersect(istype(int), istype(str)) False Unless both are exclusion tests on different types, in which case their intersection is a ``Conjunction`` of the two:: >>> intersect(istype(str, False), istype(int, False)) == Conjunction([ ... istype(int, False), istype(str, False) ...
]) True An ``istype(x)`` implies ``Class(y)`` only if x is y or a subtype thereof:: >>> implies(istype(int), Class(str)) False >>> implies(istype(int), Class(object)) True And it implies ``Class(y, False)`` only if x is *not* y or a subtype thereof:: >>> implies(istype(int), Class(str, False)) True >>> implies(istype(int), Class(object, False)) False But ``istype(x, False)`` implies nothing about any ``Class`` test, since it refers to exactly one type, while the ``Class`` may refer to infinitely many types:: >>> implies(istype(int, False), Class(int, False)) False >>> implies(istype(int, False), Class(object)) False Meanwhile, ``Class(x)`` tests can only imply ``istype(y, False)``, where y is a *superclass* of x:: >>> implies(Class(int), istype(int)) False >>> implies(Class(int), istype(object)) False >>> implies(Class(int), istype(object, False)) True And ``Class(x, False)`` cannot imply anything about any ``istype()`` test, whether true or false:: >>> implies(Class(int, False), istype(int)) False >>> implies(Class(int, False), istype(int, False)) False When ``Class()`` is intersected with an exact type test, the result is either the exact type test, or ``False``:: >>> intersect(Class(int), istype(int)) istype(<type 'int'>, True) >>> intersect(istype(int), Class(int)) istype(<type 'int'>, True) >>> intersect(Class(int), istype(object)) False >>> intersect(istype(object), Class(int)) False >>> intersect(Class(int, False), istype(object)) istype(<type 'object'>, True) >>> intersect(istype(object), Class(int, False)) istype(<type 'object'>, True) But when it's intersected with a type exclusion test, the result is a ``Conjunction``:: >>> intersect(istype(int, False), Class(str)) == Conjunction([ ... istype(int, False), Class(str, True) ... ]) True >>> s = intersect(Class(str), istype(int, False)) >>> s == Conjunction([istype(int, False), Class(str, True)]) True >>> intersect(s, istype(int)) False >>> intersect(s, istype(int, False)) == s True >>> intersect(s, istype(str)) istype(<type 'str'>, True) -------------------- Tests and Signatures -------------------- ``Test`` Objects ================ A ``Test`` is the combination of a "dispatch expression" and a criterion to be applied to it:: >>> from peak.rules.criteria import Test >>> x_isa_int = Test("x", Class(int)) (Note that although these examples use strings, actual dispatch expressions will be AST-like structures.) Creating a test with disjunct criteria actually returns a set of tests:: >>> Test("x", DisjunctionSet([int, str])) == \ ... DisjunctionSet([Test('x', int), Test('x', str)]) True So the ``disjuncts()`` of a test will always just be the test itself:: >>> disjuncts(x_isa_int) [Test('x', Class(<type 'int'>, True))] Negating a test usually just negates its criterion, leaving the expression intact:: >>> negate(x_isa_int) Test('x', Class(<type 'int'>, False)) But if the test criterion is a conjunction or range, negating it can produce a disjunction of tests:: >>> negate( ... Test('x', ... Conjunction([IsObject('foo',False), IsObject('bar',False)]) ... ) ... ) == DisjunctionSet( ... [Test('x', IsObject('foo', True)), ... Test('x', IsObject('bar', True))]) True Intersecting two tests for the same dispatch expression returns a test whose criterion is the intersection of the original tests' criteria:: >>> intersect(x_isa_int, Test("x", Class(str))) == Test( ... 'x', Conjunction([Class(int), Class(str)]) ...
) True And similarly, a test only implies another test if they have equal dispatch expressions, and the second test's criterion is implied by the first's:: >>> implies(x_isa_int, Test("x", Class(str))) False >>> implies(x_isa_int, Test("x", Class(object))) True >>> implies(x_isa_int, Test("y", Class(int))) False But the intersection of two tests with *different* dispatch expressions produces a ``Signature`` object:: >>> y_isa_str = Test("y", Class(str)) >>> x_int_y_str = intersect(x_isa_int, y_isa_str) >>> x_int_y_str Signature([Test('x', Class(...int...)), Test('y', Class(...str...))]) ``Signature`` Objects ===================== ``Signature`` objects are similar to ``Conjunction`` objects, except for three important differences. First, signatures are sequences, not sets. They preserve the ordering they were created with:: >>> intersect(x_isa_int, y_isa_str) Signature([Test('x', Class(...int...)), Test('y', Class(...str...))]) >>> intersect(y_isa_str, x_isa_int) Signature([Test('y', Class(...str...)), Test('x', Class(...int...))]) and their negations preserve the order as well (using ``OrElse`` instances):: >>> negate(intersect(x_isa_int, y_isa_str)) OrElse([Test('x', Class(...int..., False)), Test('y', Class(...str..., False))]) >>> negate(intersect(y_isa_str, x_isa_int)) OrElse([Test('y', Class(...str..., False)), Test('x', Class(...int..., False))]) Second, signatures can only contain ``Test`` instances, and they automatically ``intersect()`` any tests that apply to the same dispatch expressions:: >>> from peak.rules.criteria import Signature >>> intersect(x_int_y_str, Test("y", Class(float))) == Signature([ ... Test('x', Class(int)), ... Test('y', Conjunction([Class(str), Class(float)])) ... ]) True >>> intersect(x_int_y_str, Test("x", Class(float))) == Signature([ ... Test('x', Conjunction([Class(int), Class(float)])), ... Test('y', Class(str)) ... ]) True >>> intersect(Test("x", Class(float)), x_int_y_str) == Signature([ ... Test('x', Conjunction([Class(int), Class(float)])), ... Test('y', Class(str)) ... ]) True But, as with conjunctions, you can't create a signature with fewer than two items in it:: >>> Signature([Test("x",1)]) Test('x', 1) >>> Signature([True]) True >>> Signature([False]) False >>> Signature([]) True ---------- Predicates ---------- Now that we've got all the basic pieces in place, we can operationally define predicates for the Chambers & Chen dispatch algorithm. Specifically, a predicate can be any of the following: * ``True`` (meaning a condition that always applies) * ``False`` (meaning a condition that *never* applies) * A ``Test`` or ``Signature`` instance * A ``DisjunctionSet`` or ``OrElse`` containing two or more ``Test`` or ``Signature`` instances In each case, invoking ``disjuncts()`` on the object in question will return a list of objects suitable for constructing dispatch "cases" -- i.e., sets of simple "and-ed" criteria that can easily be indexed. The ``tests_for()`` function can then be used to yield the component tests of each case signature. When called on a ``Test``, it yields the given test:: >>> from peak.rules.criteria import tests_for >>> list(tests_for(Test('y',42))) [Test('y', 42)] But called on a ``Signature``, it yields the tests contained within:: >>> list(tests_for(x_int_y_str)) [Test('x', Class(...int...)), Test('y', Class(...str...))] And called on ``True``, it yields nothing:: >>> list(tests_for(True)) [] ``tests_for(False)``, however, is undefined, because ``False`` cannot be represented as a conjunction of tests.
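To make that processing model concrete before we go on, here's a minimal sketch (the helper name ``cases_of`` is hypothetical, not part of PEAK-Rules) of how a dispatcher can enumerate the indexable "cases" of a predicate, using only the operations described in this section and the paragraph that follows::

    >>> def cases_of(pred):
    ...     # each disjunct is one "and-ed" case; tests_for() lists its tests
    ...     return [list(tests_for(d)) for d in disjuncts(pred)]

    >>> cases_of(x_int_y_str)
    [[Test('x', Class(...int...)), Test('y', Class(...str...))]]

    >>> cases_of(False)     # an empty disjunction: no cases at all
    []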
``False`` is still a valid predicate, of course, because it represents an empty disjunction. In normal predicate processing, one loops over the ``disjuncts()`` of a predicate, and only then uses ``tests_for()`` to inspect the individual items. But since ``disjuncts(False)`` is an empty list, it should never be necessary to invoke ``tests_for(False)``. There is an important distinction, however, in how ``disjuncts()`` works on ``OrElse`` objects, compared to all other kinds of predicates. ``disjuncts()`` is used to obtain the *unordered* disjunctions of a logical condition, but ``OrElse`` is *ordered*, because it represents a series of applications of the Python "or" operator. In Python, a condition on the right-hand side of an "or" operator is not tested *unless the condition on the left is false*. PEAK-Rules, however, tests the ``disjuncts()`` of a predicate *independently*. Thus, in order to properly translate "or" conditions in a predicate, the ``disjuncts()`` of an ``OrElse`` must include additional and-ed conditions to force them to be tested in order. Specifically, the ``disjuncts()`` of ``OrElse([a, b, c])`` will be: * ``a``, * ``intersect(negate(a), b)``, and * ``intersect(intersect(negate(a), negate(b)), c)``! This expansion ensures that ``b`` will never be tested unless ``a`` is false, and ``c`` will never be tested unless ``a`` and ``b`` are both false, just like in a regular Python expression. Observe:: >>> DisjunctionSet([OrElse([Class(a), Class(b)])]) == DisjunctionSet([ ... Class(a, True), ... Conjunction([Class(a, False), Class(b, True)]) ... ]) True Also, because ``OrElse`` objects don't expand their contents' disjuncts at creation time, they must be expanded as part of the ``disjuncts()`` operation:: >>> a_or_b = DisjunctionSet([Class(a), Class(b)]) >>> try: ... set = set ... except NameError: ... from sets import Set as set # 2.3, ugh >>> set(disjuncts(OrElse([istype(int), a_or_b]))) == set([ ... istype(int), ... Conjunction([istype(int, False), Class(b)]), ... Conjunction([istype(int, False), Class(a)]) ... ]) True This delayed expansion "preserves the unorderedness" of the contents, by not forcing them to be evaluated in any specific sequence, apart from the requirements imposed by their position within the ``OrElse``. We'll do one more test, to show that the disjuncts of the negated portions of the ``OrElse`` are also expanded:: >>> a_and_b = Conjunction([Class(a), Class(b)]) >>> not_a = Class(a, False) >>> not_b = Class(b, False) >>> int_or_str = DisjunctionSet([Class(int), Class(str)]) >>> set(disjuncts(OrElse([a_and_b, int_or_str]))) == set([ ... a_and_b, Conjunction([not_a, Class(int)]), Conjunction([not_a, Class(str)]), ... Conjunction([not_b, Class(int)]), Conjunction([not_b, Class(str)]) ... 
]) True per this expansion logic (using ``|`` for "symmetric or"):: (a and b) or (int|str) => (a and b) | not (a and b) and (int|str) not (a and b) and (int|str) => (not a | not b) and (int|str) (not a | not b) and (int|str) => ( (not a and int) | (not a and str) | (not b and int) | (not b and str) ) PEAK-Rules-0.5a1.dev-r2713/setup.py0000755000076600007660000000232711631103276015616 0ustar telec3telec3#!/usr/bin/env python """Distutils setup file""" import ez_setup ez_setup.use_setuptools() from setuptools import setup PACKAGE_NAME = "PEAK-Rules" PACKAGE_VERSION = "0.5a1" PACKAGES = ['peak', 'peak.rules'] def get_description(): # Get our long description from the documentation f = file('README.txt') lines = [] for line in f: if not line.strip(): break # skip to first blank line for line in f: if line.startswith('.. contents::'): break # read to table of contents lines.append(line) f.close() return ''.join(lines) setup( name=PACKAGE_NAME, version=PACKAGE_VERSION, description="Generic functions and business rules support systems", long_description = get_description(), install_requires=[ 'BytecodeAssembler>=0.6','DecoratorTools>=1.7dev-r2450', 'AddOns>=0.6', 'Extremes>=1.1', ], author="Phillip J. Eby", author_email="peak@eby-sarna.com", license="ZPL 2.1", url="http://pypi.python.org/pypi/PEAK-Rules", download_url = "http://peak.telecommunity.com/snapshots/", test_suite = 'test_rules', tests_require='Importing', packages = PACKAGES, namespace_packages = ['peak'], ) PEAK-Rules-0.5a1.dev-r2713/PKG-INFO0000644000076600007660000000770312506203664015205 0ustar telec3telec3Metadata-Version: 1.0 Name: PEAK-Rules Version: 0.5a1.dev-r2713 Summary: Generic functions and business rules support systems Home-page: http://pypi.python.org/pypi/PEAK-Rules Author: Phillip J. Eby Author-email: peak@eby-sarna.com License: ZPL 2.1 Download-URL: http://peak.telecommunity.com/snapshots/ Description: PEAK-Rules is a highly-extensible framework for creating and using generic functions, from the very simple to the very complex. Out of the box, it supports multiple-dispatch on positional arguments using tuples of types, full predicate dispatch using strings containing Python expressions, and CLOS-like method combining. (But the framework allows you to mix and match dispatch engines and custom method combinations, if you need or want to.) Basic usage:: >>> from peak.rules import abstract, when, around, before, after >>> @abstract() ... def pprint(ob): ... """A pretty-printing generic function""" >>> @when(pprint, (list,)) ... def pprint_list(ob): ... print "pretty-printing a list" >>> @when(pprint, "isinstance(ob,list) and len(ob)>50") ... def pprint_long_list(ob): ... print "pretty-printing a long list" >>> pprint([1,2,3]) pretty-printing a list >>> pprint([42]*1000) pretty-printing a long list >>> pprint(42) Traceback (most recent call last): ... NoApplicableMethods: ... PEAK-Rules works with Python 2.3 and up -- just omit the ``@`` signs if your code needs to run under 2.3. Also, note that with PEAK-Rules, *any* function can be generic: you don't have to predeclare a function as generic. (The ``abstract`` decorator is used to declare a function with no *default* method; i.e., one that will raise ``NoApplicableMethods`` instead of executing a default implementation, if no rules match the arguments it's invoked with.) PEAK-Rules is still under development; it lacks much in the way of error checking, so if you mess up your rules, it may not be obvious where or how you did. 
User documentation is also lacking, although there are extensive doctests describing and testing most of its internals, including: * `Introduction`_ (Method combination, porting from RuleDispatch) * `Core Design Overview`_ (Terminology, method precedence, etc.) * The `Basic AST Builder`_ and advanced `Code Generation`_ * `Criteria`_, `Indexing`_, and `Predicates`_ * `Syntax pattern matching`_ (Please note that these documents are still in a state of flux and some may still be incomplete or disorganized, prior to the first official release.) Source distribution snapshots are generated daily, but you can also update directly from the `development version`_ in SVN. .. _development version: svn://svn.eby-sarna.com/svnroot/PEAK-Rules#egg=PEAK_Rules-dev .. _Introduction: http://peak.telecommunity.com/DevCenter/PEAK-Rules#toc .. _Core Design Overview: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Design .. _Predicates: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Predicates .. _Basic AST Builder: http://peak.telecommunity.com/DevCenter/PEAK-Rules/AST-Builder .. _Code Generation: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Code-Generation .. _Criteria: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Criteria .. _Indexing: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Indexing .. _Predicates: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Predicates .. _Syntax pattern matching: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Syntax-Matching .. _toc: Platform: UNKNOWN PEAK-Rules-0.5a1.dev-r2713/peak/0000755000076600007660000000000012506203664015021 5ustar telec3telec3PEAK-Rules-0.5a1.dev-r2713/peak/__init__.py0000775000076600007660000000007110761563121017133 0ustar telec3telec3__import__('pkg_resources').declare_namespace(__name__) PEAK-Rules-0.5a1.dev-r2713/peak/rules/0000755000076600007660000000000012506203664016153 5ustar telec3telec3PEAK-Rules-0.5a1.dev-r2713/peak/rules/ast_builder.py0000644000076600007660000002475711633340477021045 0ustar telec3telec3from token import tok_name, NAME, NUMBER, STRING, ISNONTERMINAL, EQUAL from symbol import sym_name from new import instancemethod import token, symbol, parser, sys __all__ = [ 'parse_expr', 'build' ] _name = lambda builder,nodelist: builder.Name(nodelist[1]) _const = lambda builder,nodelist: builder.Const(eval(nodelist[1])) production = { NAME: _name, NUMBER: _const, STRING: _const, } ops = { token.LEFTSHIFT: 'LeftShift', token.RIGHTSHIFT: 'RightShift', token.PLUS: 'Add', token.MINUS: 'Sub', token.STAR: 'Mul', token.SLASH: 'Div', token.PERCENT: 'Mod', token.DOUBLESLASH: 'FloorDiv', } def left_assoc(builder, nodelist): return getattr(builder,ops[nodelist[-2][0]])(nodelist[:-2],nodelist[-1]) def curry(f,*args): for arg in args: f = instancemethod(f,arg,type(arg)) return f def com_binary(opname, builder,nodelist): "Compile 'NODE (OP NODE)*' into (type, [ node1, ..., nodeN ])." 
items = [nodelist[i] for i in range(1,len(nodelist),2)] return getattr(builder,opname)(items) # testlist: test (',' test)* [','] # exprlist: expr (',' expr)* [','] # subscriptlist: subscript (',' subscript)* [','] testlist = testlist1 = exprlist = subscriptlist = curry(com_binary, 'Tuple') # test: and_test ('or' and_test)* | lambdef test = curry(com_binary, 'Or') if sys.version>='2.5': or_test = test # test: or_test ['if' or_test 'else' test] | lambdef def test(builder, nodelist): return builder.IfElse(nodelist[1], nodelist[3], nodelist[5]) # and_test: not_test ('and' not_test)* and_test = curry(com_binary, 'And') # not_test: 'not' not_test | comparison def not_test(builder, nodelist): return builder.Not(nodelist[2])
# comparison: expr (comp_op expr)*
def comparison(builder, nodelist):
    if len(nodelist)>4 and builder.simplify_comparisons:
        # Reduce (x < y < z ...) to (x<y) and (y<z) and ...
        # NB: the body of this branch was garbled in this copy of the file;
        # the pairwise expansion below is a reconstruction (an assumption,
        # not verified against the original source)
        return builder.And([
            (symbol.comparison, nodelist[i-1], nodelist[i], nodelist[i+1])
            for i in range(2, len(nodelist), 2)
        ])
    results = []
    for i in range(2, len(nodelist), 2):
        nl = nodelist[i-1]
        # comp_op: '<' | '>' | '>=' | '<=' | '<>' | '!=' | '=='
        #          | 'in' | 'not' 'in' | 'is' | 'is' 'not'
        n = nl[1]
        if n[0] == token.NAME:
            type = n[1]
            if len(nl) == 3:
                if type == 'not': type = 'not in'
                else: type = 'is not'
        else:
            type = n[1]
        results.append((type, nodelist[i]))
    return builder.Compare(nodelist[1], results)
# expr: xor_expr ('|' xor_expr)* expr = curry(com_binary, 'Bitor') # xor_expr: and_expr ('^' and_expr)* xor_expr = curry(com_binary, 'Bitxor') # and_expr: shift_expr ('&' shift_expr)* and_expr = curry(com_binary, 'Bitand') # shift_expr: arith_expr ('<<'|'>>' arith_expr)* # arith_expr: term (('+'|'-') term)* # term: factor (('*'|'/'|'%'|'//') factor)* shift_expr = arith_expr = term = left_assoc unary_ops = { token.PLUS: 'UnaryPlus', token.MINUS: 'UnaryMinus', token.TILDE: 'Invert', } # factor: ('+'|'-'|'~') factor | power def factor(builder, nodelist): return getattr(builder,unary_ops[nodelist[1][0]])(nodelist[2]) # power: atom trailer* ['**' factor] def power(builder, nodelist): if nodelist[-2][0]==token.DOUBLESTAR: return builder.Power(nodelist[:-2], nodelist[-1]) node = nodelist[-1] nodelist = nodelist[:-1] t = node[1][0] # trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.'
NAME if t == token.LPAR: return com_call_function(builder,nodelist,node[2]) elif t == token.DOT: return builder.Getattr(nodelist, node[2][1]) elif t == token.LSQB: item = node[2] while len(item)==2: item = item[1] if item[0]==token.COLON: lineno = item[2] return builder.Subscript( nodelist, (symbol.subscript, item, None) # XXX optimization bypass ) return builder.Subscript(nodelist, item) raise AssertionError("Unknown power", nodelist) # atom: '(' [yield_expr|testlist_gexp] ')' | # '[' [listmaker] ']' | # '{' [dictmaker] '}' | # '`' testlist1 '`' | # NAME | NUMBER | STRING+ def atom(builder, nodelist): t = nodelist[1][0] if t == token.LPAR: if nodelist[2][0] == token.RPAR: return builder.Tuple(()) return build(builder,nodelist[2]) elif t==token.LSQB: if nodelist[2][0] == token.RSQB: return builder.List(()) return listmaker(builder,nodelist[2]) elif t==token.LBRACE: if nodelist[2][0] == token.RBRACE: items = () else: dm = nodelist[2] items = [(dm[i],dm[i+2]) for i in range(1,len(dm),4)] return builder.Dict(items) elif t==token.BACKQUOTE: return builder.Backquote(nodelist[2]) elif t==token.STRING: return builder.Const(eval(' '.join([n[1] for n in nodelist[1:]]))) raise AssertionError("Unknown atom", nodelist) # arglist: (argument ',')* (argument [',']| '*' test [',' '**' test] | '**' test) def com_call_function(builder, primaryNode, nodelist): if nodelist[0] == token.RPAR: return builder.CallFunc(primaryNode,(),(),None,None) args = []; kw = [] len_nodelist = len(nodelist) for i in range(1, len_nodelist, 2): node = nodelist[i] if node[0] == token.STAR or node[0] == token.DOUBLESTAR: break iskw, result = com_argument(node, kw) if iskw: kw.append(result) else: args.append(result) else: # No broken by star arg, so skip the last one we processed. i = i + 1 if i < len_nodelist and nodelist[i][0] == token.COMMA: # need to accept an application that looks like "f(a, b,)" i = i + 1 star_node = dstar_node = None while i < len_nodelist: tok = nodelist[i] ch = nodelist[i+1] i = i + 3 if tok[0]==token.STAR: star_node = ch elif tok[0]==token.DOUBLESTAR: dstar_node = ch else: raise AssertionError, 'unknown node type: %s' % (tok,) return builder.CallFunc(primaryNode, args, kw, star_node, dstar_node) # argument: [test '='] test [gen_for] (Really [keyword '='] test) # argument: test [gen_for] | test '=' test # Really [keyword '='] test def com_argument(nodelist, kw): if len(nodelist) == 2: if kw: raise SyntaxError, "non-keyword arg after keyword arg" return 0, nodelist[1] if nodelist[2][0] != token.EQUAL and len(nodelist)==3: return 0, (testlist_gexp.symbol, nodelist[1], nodelist[2]) elif len(nodelist) !=4: raise AssertionError n = nodelist[1] while len(n) == 2 and n[0] != token.NAME: n = n[1] if n[0] != token.NAME: raise SyntaxError, "keyword can't be an expression (%r)" % (n,) return 1, ((token.STRING,`n[1]`,n[2]), nodelist[3]) # listmaker: test ( list_for | (',' test)* [','] ) def listmaker(builder, nodelist): values = [] for i in range(1, len(nodelist)): if nodelist[i][0] == symbol.list_for: assert i==2 and len(nodelist)==3 and len(values)==1 return com_iterator(builder.ListComp,values[0],nodelist[i]) if nodelist[i][0] == token.COMMA: continue values.append(nodelist[i]) return builder.List(values) # list_iter: list_for | list_if # list_for: 'for' exprlist 'in' testlist_safe [list_iter] # list_if: 'if' old_test [list_iter] # # gen_iter: gen_for | gen_if # gen_for: 'for' exprlist 'in' or_test [gen_iter] # gen_if: 'if' old_test [gen_iter] def com_iterator(method, value, nodelist): clauses = [] nodelist = 
nodelist[1:] # skip the symbol while nodelist: while len(nodelist)==1: nodelist, = nodelist nodelist = nodelist[1:] # skip the symbol tok, val, line = nodelist[0] assert tok==NAME assert val in ('for', 'in', 'if') clauses.append((val, nodelist[1])) nodelist = nodelist[2:] return method(value, clauses) # testlist_gexp test (gen_for | (',' test)* [',']) def testlist_gexp(builder, nodelist): if nodelist[2][0] == token.COMMA: return testlist(builder, nodelist) else: value, nodelist = nodelist[1:] return com_iterator(builder.GenExpr, value, nodelist) testlist_comp = testlist_gexp # Python 2.7 if hasattr(symbol, 'testlist_comp'): testlist_comp.symbol = symbol.testlist_comp else: testlist_gexp.symbol = getattr(symbol, 'testlist_gexp', None) # subscript: '.' '.' '.' | test | [test] ':' [test] [sliceop] # sliceop: ':' [test] def subscript(builder, nodelist): if nodelist[1][0]==token.DOT: return builder.Const(Ellipsis) item = nodelist; nl = len(nodelist) while type(item[1]) is tuple: item=item[1] # find token, to get a line# lineno = item[-1] have_stride = nodelist[-1] and nodelist[-1][0]==symbol.sliceop if have_stride: start = stop = stride = None#; stride = (token.STRING, 'None', lineno) if len(nodelist[-1])==3: stride = nodelist[-1][2] else: start = None #(token.NUMBER,`0`,lineno) stop = stride = None #(token.NUMBER,`sys.maxint`,lineno) if nl==5: start,stop = nodelist[1],nodelist[3] # test : test sliceop elif nl==4: if nodelist[1][0]==token.COLON: # : test sliceop stop = nodelist[2] elif have_stride: # test : sliceop start = nodelist[1] else: start, stop = nodelist[1], nodelist[3] # test : test elif nl==3: if nodelist[1][0]==token.COLON: if not have_stride: stop = nodelist[2] # : test else: start = nodelist[1] # test : else: raise AssertionError("Unrecognized subscript", nodelist) if have_stride: return builder.Slice3(start,stop,stride) return builder.Slice2(start,stop) for sym,name in sym_name.items(): if name in globals(): production[sym] = globals()[name] def build(builder, nodelist): while len(nodelist)==2: nodelist = nodelist[1] return production[nodelist[0]](builder,nodelist) def parse_expr(expr,builder): # include line numbers in parse data so valid symbols are never of length 2 return build(builder, parser.expr(expr).totuple(1)[1]) PEAK-Rules-0.5a1.dev-r2713/peak/rules/codegen.py0000755000076600007660000003303312041201063020117 0ustar telec3telec3from peak.util.assembler import * from peak.util.symbols import Symbol from peak.rules.core import gen_arg, clone_function from ast_builder import build, parse_expr from types import ModuleType import sys try: set except NameError: from sets import Set as set __all__ = [ 'GetSlice', 'BuildSlice', 'Dict', 'ExprBuilder', 'IfElse', 'CSECode', 'UnaryOp', 'BinaryOp', 'ListOp', ] nodetype(fmt='%s[%s:%s]') def GetSlice(expr, start=Pass, stop=Pass, code=None): if code is None: if expr is not Pass: return fold_args(GetSlice, expr, start, stop) return expr, start, stop code(expr) if start is not Pass: code(start) if stop is not Pass: return code(stop, Code.SLICE_3) return code.SLICE_1() elif stop is not Pass: code(stop) return code.SLICE_2() return code.SLICE_0() def module_to_ns(ns): if isinstance(ns, ModuleType): return ns.__dict__ return ns nodetype(fmt="%s:%s:%s") def BuildSlice(start=Pass, stop=Pass, stride=Pass, code=None): if code is None: return fold_args(BuildSlice, start, stop, stride) if start is Pass: start = None if stop is Pass: stop = None code(start, stop, stride) if stride is not Pass: return code.BUILD_SLICE(3) return code.BUILD_SLICE(2) 
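# Illustrative note, not part of the original source: ExprBuilder (defined
# later in this module) maps two-element slices to GetSlice and extended,
# three-element slices to BuildSlice.  For example, parsing "x[1:2]" yields
# GetSlice(Local('x'), Const(1), Const(2)), compiled via the SLICE_* opcodes,
# while "x[1:2:3]" yields Getitem(Local('x'), BuildSlice(Const(1), Const(2),
# Const(3))) -- see the Subscript, Slice2 and Slice3 methods of ExprBuilder.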
nodetype() def Dict(items, code=None): if code is None: return fold_args(Dict, tuple(map(tuple,items))) code.BUILD_MAP(0) for k,v in items: code.DUP_TOP() code(k, v) code.ROT_THREE() code.STORE_SUBSCR() nodetype(fmt='%s if %s else %s') def IfElse(tval, cond, fval, code=None): if code is None: return fold_args(IfElse, tval, cond, fval) else_clause, end_if = Label(), Label() code(cond) if tval != cond: code(else_clause.JUMP_IF_FALSE_OR_POP, tval) if code.stack_size is not None: code(end_if.JUMP_FORWARD) elif fval != cond: code(end_if.JUMP_IF_TRUE) if fval !=cond: return code(else_clause, Code.POP_TOP, fval, end_if) else: return code(else_clause, end_if) def unaryOp(name, (fmt, opcode)): if '%' not in fmt: fmt += '%s' nodetype(UnaryOp, fmt=fmt) def tmp(expr, code=None): if code is None: return fold_args(tmp, expr) return code(expr, opcode) tmp.__name__ = name return tmp def binaryOp(name, (fmt, opcode)): if '%' not in fmt: fmt = '%s' + fmt + '%s' nodetype(BinaryOp, fmt=fmt) def tmp(left, right, code=None): if code is None: return fold_args(tmp, left, right) return code(left, right, opcode) tmp.__name__ = name return tmp def listOp(name, (fmt, opcode)): nodetype(ListOp, fmt=fmt) def tmp(items, code=None): if code is None: return fold_args(tmp, tuple(items)) code(*items) return opcode(code, len(items)) tmp.__name__ = name return tmp def mkOps(optype, **ops): return dict([(name,optype(name, op)) for (name, op) in ops.items()]) def globalOps(optype, **ops): __all__.extend(ops) localOps(globals(), optype, **ops) def localOps(ns, optype, **ops): ns.update(mkOps(optype, **ops)) class UnaryOp(object): __slots__ = () class BinaryOp(object): __slots__ = () class ListOp(object): __slots__ = () globalOps( unaryOp, Not = ('not ', Code.UNARY_NOT), Plus = ('+', Code.UNARY_POSITIVE), Minus = ('-', Code.UNARY_NEGATIVE), Repr = ('`%s`', Code.UNARY_CONVERT), Invert = ('~', Code.UNARY_INVERT), ) globalOps( binaryOp, Add = ('+', Code.BINARY_ADD), Sub = ('-', Code.BINARY_SUBTRACT), Mul = ('*', Code.BINARY_MULTIPLY), Div = ('/', Code.BINARY_DIVIDE), Mod = ('%s%%%s', Code.BINARY_MODULO), FloorDiv = ('//', Code.BINARY_FLOOR_DIVIDE), Power = ('**', Code.BINARY_POWER), LeftShift = ('<<', Code.BINARY_LSHIFT), RightShift = ('>>', Code.BINARY_RSHIFT), Getitem = ('%s[%s]', Code.BINARY_SUBSCR), Bitor = ('|', Code.BINARY_OR), Bitxor = ('^', Code.BINARY_XOR), Bitand = ('&', Code.BINARY_AND), ) globalOps( listOp, Tuple = ('(%s)', Code.BUILD_TUPLE), List = ('[%s]', Code.BUILD_LIST) ) class SMIGenerator: """State Machine Interpreter Generator""" ARG = Local('$Arg') SET_ARG = lambda self, code: code.STORE_FAST(self.ARG.name) WHY_CONTINUE = {'2.3':5}.get(sys.version[:3], 32) def __init__(self, func): import inspect self.code = code = CSECode.from_function(func) #, copy_lineno=True) self.actions = {} self.func = func self.first_action, loop_top, exit, bad_action = Label(), Label(), Label(), Label() args, star, dstar, defaults = inspect.getargspec(func) actions, self.actions_const = self.make_const({}) start_node, self.startnode_const = self.make_const(object()) code.cache(None) # force CSE preamble here code(start_node, loop_top) code.UNPACK_SEQUENCE(2) # action, argument code( exit.JUMP_IF_FALSE, Compare(Code.DUP_TOP, (('in', actions),)), bad_action.JUMP_IF_FALSE_OR_POP, Code.ROT_TWO, # argument, action self.SET_ARG, # action self.dispatch_handler, exit, Code.POP_TOP, # drop action, leaving argument Return( Call(Pass, map(gen_arg, args),(),gen_arg(star), gen_arg(dstar)) ), bad_action, Code.POP_TOP, 
Return(Call(Const(self.bad_action),(Code.ROT_THREE, Code.ROT_TWO))), self.first_action, ) self.NEXT_STATE = loop_top.JUMP_ABSOLUTE self.maybe_cache = code.maybe_cache def generate(self, start_node): func = clone_function(self.func) self.code.co_consts[self.startnode_const] = start_node self.code.co_consts[self.actions_const] = dict.fromkeys( self.actions.values() ) func.func_code = self.code.code() return func def make_const(self, value): self.code.co_consts.append(value) return Const(value), len(self.code.co_consts)-1 def action_id(self, expression): try: return self.actions[expression] except KeyError: action = self.actions[expression] = self.code.here() self.code_action(action, expression) return action def bad_action(self, action, argument): raise AssertionError("Invalid action: %s, %s" % (action, argument)) def next_state(self, code): if code.stack_size is not None: code(self.NEXT_STATE) def dispatch_handler(self, code): fake = Label() code( fake.SETUP_LOOP, self.WHY_CONTINUE, Code.END_FINALLY, Code.POP_BLOCK, fake, Return(Pass), # <- all dead code, never runs ) def code_action(self, action, expression): self.code.stack_size = 0 self.code(expression, self.next_state) if '__pypy__' in sys.builtin_module_names: # Use a linear search in place of computed goto - PyPy handles # finally: blocks in a way that prevents computed goto from working def dispatch_handler(self, code): code(self.first_action.JUMP_FORWARD) def code_action(self, action, expression): self.code.stack_size = 1 next_action = Label() self.code( Compare(Code.DUP_TOP, (('==', action),)), next_action.JUMP_IF_FALSE_OR_POP, Code.POP_TOP, expression, self.next_state,next_action, Code.POP_TOP ) CACHE = Local('$CSECache') SET_CACHE = lambda code: code.STORE_FAST(CACHE.name) class CSETracker(Code): """Helper object that tracks common sub-expressions""" def __init__(self): super(CSETracker, self).__init__() self.cse_depends = {} def track(self, expr): self.track_stack = [None, 0] self.to_cache = [] try: self(expr) return self.to_cache finally: del self.track_stack, self.to_cache def __call__(self, *args): scall = super(CSETracker, self).__call__ ts = self.track_stack for ob in args: ts[-1] += 1 ts.append(ob) ts.append(0) try: before = self.stack_size scall(ob) finally: count = ts.pop() ts.pop() if count and callable(ob) and self.stack_size==before+1: # Only consider non-leaf callables for caching top = tuple(ts[-2:]) if self.cse_depends.setdefault(ob, top) != top: if ob not in self.to_cache: self.to_cache.append(ob) class CSECode(Code): """Code object with common sub-expression caching support""" def __init__(self): super(CSECode, self).__init__() self.expr_cache = {} self.tracker = CSETracker() def cache(self, expr): if not self.expr_cache: self.LOAD_CONST(None) self.STORE_FAST(CACHE.name) self.expr_cache.setdefault( expr, "%s #%d" % (expr, len(self.expr_cache)+1) ) def maybe_cache(self, expr): map(self.cache, self.tracker.track(expr)) def __call__(self, *args): scall = super(CSECode, self).__call__ for ob in args: if callable(ob) and ob in self.expr_cache: key = self.expr_cache[ob] def calculate(code): scall(ob, Code.DUP_TOP, CACHE, Const(key), Code.STORE_SUBSCR) cache = IfElse( CACHE, CACHE, lambda c: scall({}, Code.DUP_TOP, SET_CACHE) ) scall( IfElse( Getitem(CACHE, Const(key)), Compare(Const(key), [('in', cache)]), calculate ) ) else: scall(ob) class ExprBuilder: """Expression builder returning bytecode-able AST nodes""" def __init__(self,arguments,*namespaces): self.bindings = [ dict([(k,self.Const(v)) for k,v in 
module_to_ns(ns).iteritems()]) for ns in namespaces ] self.push(arguments); self.push() def push(self, ns={}): self.bindings.insert(0, {}); self.bind(ns) def bind(self, ns): self.bindings[0].update(ns) def pop(self): return self.bindings.pop(0) def parse(self, expr): return parse_expr(expr, self) def Const(self,value): return Const(value) def Name(self,name): for ns in self.bindings: if name in ns: return ns[name] raise NameError(name) def Subscript(self, left, right): expr = build(self,left) key = build(self,right) if isinstance(key, GetSlice): return GetSlice(expr, key.start, key.stop) return Getitem(expr, key) def Slice2(self, start, stop): start = start and build(self, start) or Pass stop = stop and build(self, stop ) or Pass return GetSlice(Pass, start, stop) def Slice3(self, start, stop, stride): start = start and build(self, start ) or Pass stop = stop and build(self, stop ) or Pass stride = stride and build(self, stride) or Pass return BuildSlice(start, stop, stride) def Getattr(self, expr, attr): return Getattr(build(self,expr), attr) simplify_comparisons = False def Compare(self, expr, ops): return Compare( build(self, expr), [(op=='<>' and '!=' or op, build(self,arg)) for op, arg in ops] ) def _unaryOp(name, nt): def method(self, expr): return nt(build(self,expr)) return method localOps(locals(), _unaryOp, UnaryPlus = Plus, UnaryMinus = Minus, Invert = Invert, Backquote = Repr, Not = Not, ) del _unaryOp def _mkBinOp(name, nt): def method(self, left, right): return nt(build(self,left), build(self,right)) return method localOps(locals(), _mkBinOp, Add = Add, Sub = Sub, Mul = Mul, Div = Div, Mod = Mod, FloorDiv = FloorDiv, Power = Power, LeftShift = LeftShift, RightShift = RightShift, ) del _mkBinOp def _multiOp(name, nt): def method(self, items): result = build(self,items[0]) for item in items[1:]: result = nt(result, build(self,item)) return result return method localOps(locals(), _multiOp, Bitor = Bitor, Bitxor = Bitxor, Bitand = Bitand, ) del _multiOp def _listOp(name, op): def method(self,items): return op(map(build.__get__(self), items)) return method localOps(locals(), _listOp, And = And, Or = Or, Tuple = Tuple, List = List, ) def Dict(self, items): b = build.__get__(self) return Dict([(b(k),b(v)) for k,v in items]) def CallFunc(self, func, args, kw, star_node, dstar_node): b = build.__get__(self) return Call( b(func), map(b,args), [(b(k),b(v)) for k,v in kw], star_node and b(star_node), dstar_node and b(dstar_node) ) def IfElse(self, tval, cond, fval): return IfElse(build(self,tval), build(self,cond), build(self,fval)) PEAK-Rules-0.5a1.dev-r2713/peak/rules/predicates.py0000755000076600007660000005240212041201063020641 0ustar telec3telec3from peak.util.assembler import * from core import * from core import class_or_type_of, call_thru, flatten from criteria import * from indexing import * from codegen import SMIGenerator, ExprBuilder, Getitem, IfElse, Tuple from peak.util.decorators import decorate, synchronized, decorate_assignment from types import InstanceType, ClassType from ast_builder import build, parse_expr import inspect, new, codegen, parser __all__ = [ 'IsInstance', 'IsSubclass', 'Truth', 'Identity', 'Comparison', 'priority', 'IndexedEngine', 'predicate_node_for', 'meta_function', 'expressionSignature', ] abstract() def predicate_node_for(builder, expr, cases, remaining_exprs, memo): """Return a dispatch tree node argument appropriate for the expr""" def value_check(val, (exact, ranges)): if val in exact: return exact[val] lo = 0 hi = len(ranges) while lo<hi: mid = (lo+hi)//2 (tl,th), node = ranges[mid] if val<tl: hi = mid elif val>th: lo =
mid+1 else: return node raise AssertionError("Should never get here") nodetype() def IsInstance(expr, code=None): if code is None: return expr, return IsSubclass(class_or_type_of(expr), code) _unpack = lambda c: c.UNPACK_SEQUENCE(2) subclass_check = TryExcept( Suite([ Code.DUP_TOP, SMIGenerator.ARG, _unpack, Code.ROT_THREE, Code.POP_TOP, Code.BINARY_SUBSCR, Code.ROT_TWO, Code.POP_TOP ]), [(Const(KeyError), Suite([ SMIGenerator.ARG, _unpack, Code.POP_TOP, Call(Code.ROT_TWO, (Pass,)), ]))] ) nodetype() def IsSubclass(expr, code=None): if code is None: return expr, code(expr, subclass_check) identity_check = IfElse( Getitem(SMIGenerator.ARG, Code.ROT_TWO), Compare(Code.DUP_TOP, [('in', SMIGenerator.ARG)]), Suite([Code.POP_TOP, Getitem(SMIGenerator.ARG, None)]) ) nodetype() def Identity(expr, code=None): if code is None: return expr, code(Call(Const(id), (expr,), fold=False), identity_check) nodetype() def Comparison(expr, code=None): if code is None: return expr, code.LOAD_CONST(value_check) Call(Pass, (expr, SMIGenerator.ARG), code=code) nodetype() def Truth(expr, code=None): if code is None: return expr, skip = Label() code(SMIGenerator.ARG); code.UNPACK_SEQUENCE(2) code(expr, skip.JUMP_IF_TRUE, Code.ROT_THREE, skip, Code.POP_TOP, Code.ROT_TWO, Code.POP_TOP) class priority(int): """An integer priority for manually resolving a rule ambiguity""" when(implies, (priority, priority))(lambda p1,p2: p1>=p2) class ExprBuilder(ExprBuilder): """Extended expression builder with support for meta-functions""" def Backquote(self, expr): raise SyntaxError("backquotes are not allowed in predicates") meta_functions = {} def meta_function(*stub, **parsers): """Declare a meta-function and its argument parsers""" stub, = stub def callback(frame, name, func, old_locals): for name in inspect.getargs(func.func_code)[0]: if not isinstance(name, basestring): raise TypeError( "Meta-functions cannot have packed-tuple arguments" ) what = func, parsers, inspect.getargspec(func) meta_functions[stub] = ( lambda builder, *args: apply_meta(builder, what, *args) ) return func return decorate_assignment(callback) def expressionSignature(expr): """Convert raw Python code into logical conditions""" # Default is to simply test the truth of the expression return Test(Truth(expr), Value(True)) when(expressionSignature, (Const,))(lambda expr: bool(expr.value)) when(expressionSignature, ((bool, Test, Signature, Disjunction),)) def pass_through(expr): return expr class CriteriaBuilder(ExprBuilder): simplify_comparisons = True def build_with(self, expr): self.push() try: return build(self, expr) finally: self.pop() def Not(self, expr): return codegen.Not(self.build_with(expr)) def Or(self, items): return codegen.Or(map(self.build_with, items)) def CallFunc(self, func, args, kw, star_node, dstar_node): b = build.__get__(self) target = b(func) if isinstance(target, Const) and target.value in meta_functions: return meta_functions[target.value]( self, args, kw, star_node, dstar_node ) return Call( target, map(b,args), [(b(k),b(v)) for k,v in kw], star_node and b(star_node), dstar_node and b(dstar_node) ) def parse(self, expr): return expressionSignature(ExprBuilder.parse(self, expr)) when(expressionSignature, codegen.And) def do_intersect(expr): return reduce(intersect, map(expressionSignature, expr.values), True) when(expressionSignature, codegen.Or) def do_union(expr): return OrElse(map(expressionSignature, expr.values)) when(expressionSignature, codegen.Not) def do_negate(expr): return negate(expressionSignature(expr.expr)) 
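# Illustrative note, not part of the original source: the three rules above
# translate a predicate's boolean structure bottom-up -- And nodes are folded
# together with intersect(), Or nodes become ordered OrElse() alternatives,
# and Not nodes are negate()d.  So a predicate like "a and not b" becomes,
# roughly, intersect(expressionSignature(a), negate(expressionSignature(b))).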
_mirror_ops = { '>': '<', '>=': '<=', '=>':'<=', '<': '>', '<=': '>=', '=<':'>=', '<>': '<>', '!=': '<>', '==':'==', 'is': 'is', 'is not': 'is not' } when(expressionSignature, codegen.Compare) def do_compare(expr): left = expr.expr (op, right), = expr.ops if isinstance(left, Const) and op in _mirror_ops: left, right, op = right, left, _mirror_ops[op] if isinstance(right, Const): if op=='in' or op=='not in': cond = compileIn(left, right.value) if cond is not None: return maybe_invert(cond, op=='in') elif op=='is' or op=='is not': return maybe_invert(compileIs(left, right.value), op=='is') else: return Test(Comparison(left), Inequality(op, right.value)) # Both sides involve variables or an un-optimizable constant, # so it's a generic boolean criterion :( return Test(Truth(expr), Value(True)) def apply_meta(builder, (func, parsers, (argnames, varargs, varkw, defaults)), args, kw, star, dstar ): # NB: tuple-args not allowed! def parse(arg, node): if not node: return None return parsers.get(arg, build)(builder, node) data = {} extra = [] offset = 0 for name in argnames: if name=='__builder__': data[name] = builder elif name=='__star__': data[name] = parse(name, star) elif name=='__dstar__': data[name] = parse(name, dstar) else: break offset += 1 for k, v in zip(argnames[offset:], args): data[k] = parse(k, v) varargpos = len(argnames)-offset if len(args)> varargpos: if not varargs: raise TypeError("Too many arguments for %r" % (func,)) extra.extend([parse(varargs, node) for node in args[varargpos:]]) for k,v in kw: k = build(builder, k) assert type(k) is Const and isinstance(k.value, basestring) k = k.value if k in data: raise TypeError("Duplicate keyword %s for %r" % (k,func)) if varkw and k not in argnames and k not in parsers: data[k] = parse(varkw, v) else: data[k] = parse(k, v) if star and '__star__' not in data: raise TypeError("%r does not support parsing *args" % (func,)) if dstar and '__dstar__' not in data: raise TypeError("%r does not support parsing **kw" % (func,)) if defaults: for k,v in zip(argnames[-len(defaults):], defaults): data.setdefault(k, v) try: args = map(data.pop, argnames)+extra except KeyError, e: raise TypeError( "Missing positional argument %s for %r"%(e.args[0], func) ) return func(*args, **data) def compile_let(builder, args, kw, star, dstar): """Compile the let() function""" if args or star or dstar: raise TypeError("let() only accepts inline keyword arguments") for k,v in kw: k = build(builder, k) assert type(k) is Const and isinstance(k.value, basestring) k = k.value v = build(builder, v) builder.bind({k:v}) return True from peak.rules import let meta_functions[let] = compile_let def _expand_as(func, predicate_string, *namespaces): """Pre-parse predicate string and register meta function""" args, varargs, kw, defaults = arginfo = inspect.getargspec(func) argnames = list(flatten(filter(None, [args, varargs, kw]))) parsed = parser.expr(predicate_string).totuple(1)[1] builder = CriteriaBuilder( dict([(arg,Local(arg)) for arg in argnames]), *namespaces ) bindings = {} for b in builder.bindings[-len(namespaces):][::-1]: bindings.update(b) # Make a function that just gets the arguments we want c = Code.from_function(func) c.return_(Call(Const(locals),fold=False)) getargs = new.function( c.code(), func.func_globals, func.func_name, func.func_defaults, func.func_closure ) def expand(builder, *args): builder.push(bindings) # globals, locals, etc. 
builder.bind(apply_meta(builder, (getargs, {}, arginfo), *args)) # build in the newly-isolated namespace result = build(builder, parsed) builder.pop() return result meta_functions[func] = expand func.__doc__ # workaround for PyPy issue #1293 c = Code.from_function(func) c.return_() if func.func_code.co_code == c.code().co_code: # function body is empty c = Code.from_function(func) c.return_(build(builder, parsed)) func.func_code = c.code() return func def compileIs(expr, criterion): """Return a signature or predicate for 'expr is criterion'""" #if criterion is None: # XXX this should be smarter # return Test(IsInstance(expr), istype(NoneType)) #else: return Test(Identity(expr), IsObject(criterion)) def maybe_invert(cond, truth): if not truth: return negate(cond) return cond def compileIn(expr, criterion): """Return a signature or predicate (or None) for 'expr in criterion'""" try: iter(criterion) except TypeError: pass # treat the in condition as a truth expression else: expr = Comparison(expr) return Test(expr, Disjunction([Value(v) for v in criterion])) when(compileIn, (object, (type, ClassType))) def compileInClass(expr, criterion): warn_parse("'x in SomeClass' syntax is deprecated; use 'isinstance(x,SomeClass)'") return Test(IsInstance(expr), Class(criterion)) when(compileIn, (object, istype)) def compileInIsType(expr, criterion): warn_parse("'x in istype(y)' syntax is deprecated; use 'type(x) is y'") return Test(IsInstance(expr), criterion) def warn_parse(message, category=DeprecationWarning): """Issue a warning about a parsed string""" from warnings import warn, warn_explicit # Find the original call to _parse_string() to get its ParseContext import sys frame = sys._getframe(3) code = _parse_string.func_code ct = 2 while frame is not None and frame.f_code is not code: frame = frame.f_back ct += 1 if frame is None: # XXX direct use of expressionSignature; can't pinpoint a location return warn(message, category, ct) ctx = frame.f_locals['ctx'] # Issue a warning against the method body g = ctx.globaldict #lineno = getattr(getattr(ctx.body, 'func_code', None), 'co_firstlineno', 2) module = g.get('__name__', "") filename = g.get('__file__') if filename: fnl = filename.lower() if fnl.endswith(".pyc") or fnl.endswith(".pyo"): filename = filename[:-1] else: if module == "__main__": filename = sys.argv[0] if not filename: filename = module return warn_explicit( message, category, filename, ctx.lineno, g.setdefault("__warningregistry__", {}) ) class IndexedEngine(Engine, TreeBuilder): """A dispatching engine that builds trees using bitmap indexes""" def __init__(self, disp): self.signatures = [] self.all_exprs = {} super(IndexedEngine, self).__init__(disp) self.arguments = dict([(arg,Local(arg)) for arg in self.argnames]) def _add_method(self, signature, rule): signature = Signature(tests_for(signature, self)) if signature not in self.registry: case_id = len(self.signatures) self.signatures.append(signature) requires = [] exprs = self.all_exprs for _t, expr, criterion in tests_for(signature, self): Ordering(self, expr).requires(requires) requires.append(expr) index_type = bitmap_index_type(self, expr) if index_type is not None: if expr not in exprs: exprs[expr] = 1 if always_testable(expr): Ordering(self, expr).requires([]) index_type(self, expr).add_case(case_id, criterion) return super(IndexedEngine, self)._add_method(signature, rule) def _generate_code(self): smig = SMIGenerator(self.function) all_exprs = map(self.to_expression, self.all_exprs) for expr in all_exprs: smig.maybe_cache(expr) 
memo = dict([(expr, smig.action_id(expr)) for expr in all_exprs]) return smig.generate(self.build_root(memo)).func_code def _full_reset(self): # Replace the entire engine with a new one Dispatching(self.function).create_engine(self.__class__) synchronized() def seed_bits(self, expr, cases): return BitmapIndex(self, expr).seed_bits(cases) synchronized() def reseed(self, expr, criterion): return BitmapIndex(self, expr).reseed(criterion) # Make build() a synchronized method build = synchronized(TreeBuilder.build.im_func) def build_root(self, memo): return self.build( to_bits([len(self.signatures)])-1, frozenset(self.all_exprs), memo ) def best_expr(self, cases, exprs): return super(IndexedEngine, self).best_expr( list(from_bits(cases)), exprs ) def build_node(self, expr, cases, remaining_exprs, memo): return memo[expr], predicate_node_for( self, expr, cases, remaining_exprs, memo ) def selectivity(self, expr, cases): return BitmapIndex(self, expr).selectivity(cases) def to_expression(self, expr): return expr def build_leaf(self, cases, memo): action = self.rules.default_action signatures = self.signatures registry = self.registry for case_no in from_bits(cases): action = combine_actions(action, registry[signatures[case_no]]) # No need to memoize here, since the combined action probably isn't # a meaningful key, and template-compiled methods are memoized at a # lower level anyway. return (0, compile_method(action, self)) when(bitmap_index_type, (IndexedEngine, Truth))(lambda en,ex:TruthIndex) when(predicate_node_for, (IndexedEngine, Truth)) def truth_node(builder, expr, cases, remaining_exprs, memo): dont_cares, seedmap = builder.seed_bits(expr, cases) return ( # True/false tuple for Truth builder.build(seedmap[True][0] | dont_cares, remaining_exprs, memo), builder.build(seedmap[False][0] | dont_cares, remaining_exprs, memo) ) when(bitmap_index_type, (IndexedEngine, Identity))(lambda en,ex:PointerIndex) when(predicate_node_for, (IndexedEngine, Identity)) def identity_node(builder, expr, cases, remaining_exprs, memo): dont_cares, seedmap = builder.seed_bits(expr, cases) return dict( [(seed, builder.build(inc|dont_cares, remaining_exprs, memo)) for seed, (inc, exc) in seedmap.iteritems()] ) when(bitmap_index_type, (IndexedEngine, Comparison))(lambda en,ex:RangeIndex) when(predicate_node_for, (IndexedEngine, Comparison)) def range_node(builder, expr, cases, remaining_exprs, memo): dontcares, seedmap = builder.seed_bits(expr, cases) return split_ranges( dontcares, seedmap, lambda cases: builder.build(cases, remaining_exprs, memo) ) try: frozenset except NameError: from core import frozenset when(bitmap_index_type, (IndexedEngine, type(None)))(value(None)) when(bitmap_index_type, (IndexedEngine, (IsInstance, IsSubclass)))( value(TypeIndex) ) when(predicate_node_for, (IndexedEngine, (IsInstance, IsSubclass))) def class_node(builder, expr, cases, remaining_exprs, memo): dontcares, seedmap = builder.seed_bits(expr, cases) cache = {} def lookup_fn(cls): try: inc, exc = seedmap[cls] except KeyError: builder.reseed(expr, Class(cls)) seedmap.update(builder.seed_bits(expr, cases)[1]) inc, exc = seedmap[cls] cbits = dontcares | inc cbits ^= (exc & cbits) return cache.setdefault(cls, builder.build(cbits,remaining_exprs,memo)) return cache, lookup_fn abstract() def type_to_test(typ, expr, engine): """Convert `typ` to a ``Test()`` of `expr` for `engine`""" when(type_to_test, (type,)) when(type_to_test, (ClassType,)) def std_type_to_test(typ, expr, engine): return Test(IsInstance(expr), Class(typ)) 
when(type_to_test, (istype,)) def istype_to_test(typ, expr, engine): return Test(IsInstance(expr), typ) when(tests_for, (istype(tuple), Engine)) def tests_for_tuple(ob, engine): for cls, arg in zip(ob, engine.argnames): yield type_to_test(cls, Local(arg), engine) def always_testable(expr): """Is `expr` safe to evaluate in any order?""" return False #when(always_testable, (IsSubclass,)) XXX might not be a class! when(always_testable, ((Identity, Truth, Comparison, IsInstance),)) def testable_criterion(expr): return always_testable(expr.expr) when(always_testable, ((Local, Const),))(value(True)) when(parse_rule, (IndexedEngine, basestring)) def _parse_string(engine, predicate, ctx, cls): b = CriteriaBuilder(engine.arguments, ctx.localdict, ctx.globaldict, __builtins__) expr = b.parse(predicate) bindings = b.bindings[0] if cls is not None and engine.argnames: cls = type_to_test(cls, engine.arguments[engine.argnames[0]], engine) expr = intersect(cls, expr) # XXX rewrap body for bindings here? problem: CSE isn't ready # XXX so we'd have to make a temporary wrapper that gets replaced later. :( # XXX (the wrapper would just always recalc the values) # XXX Ugly bit at that point is that we're (slowly) generating code TWICE return Rule( maybe_bind(ctx.body, bindings), expr, ctx.actiontype, ctx.sequence ) def maybe_bind(func, bindings): """Apply expression bindings to arguments, if applicable""" if not bindings or not hasattr(func, 'func_code'): return func # no bindings or not a function args, varargs, varkw, defaults = inspect.getargspec(func) if not args or isinstance(args[0], basestring): return func # no args or first arg isn't a tuple for arg in args[0]: if not isinstance(arg, basestring): # nested tuple arg, not a binding return func for arg in args[0]: if arg in bindings: for arg in args[0]: if arg not in bindings: raise TypeError("Missing binding for %r" % arg) break else: return func # none of the tuple args are in the binding argtuple = Tuple([bindings[arg] for arg in args[0]]) c = Code.from_spec(func.func_name, args[1:], varargs, varkw) f = new.function( c.code(), func.func_globals, func.func_name, func.func_defaults ) f.func_code = c.code() # make f's signature match func w/out tuple c.return_(call_thru(f, Const(func), [argtuple])) # call to match that f.func_code = c.code() # now include the actual body f.__predicate_bindings__ = bindings, func # mark for later optimization return f # === As of this point, it should be possible to compile expressions! 
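# Illustrative note, not part of the original source: with the meta-functions
# registered below, a predicate such as "isinstance(x, (int, str))" -- where
# the second argument is a constant -- compiles to the indexable criterion
# Test(IsInstance(Local('x')), Disjunction([Class(int), Class(str)]))
# instead of falling back to a generic Truth() test of the whole call.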
# meta_function(isinstance)( lambda __star__, __dstar__, *args, **kw: compileIsXCall(isinstance, IsInstance, args, kw, __star__, __dstar__) ) meta_function(issubclass)( lambda __star__, __dstar__, *args, **kw: compileIsXCall(issubclass, IsSubclass, args, kw, __star__, __dstar__) ) def compileIsXCall(func, test, args, kw, star, dstar): if ( kw or len(args)!=2 or not isinstance(args[1], Const) or isinstance(args[0], Const) or star is not None or dstar is not None ): return expressionSignature( Call(Const(func), args, tuple(kw.items()), star, dstar) ) expr, seq = args return Test(test(expr), Disjunction(map(Class, _yield_tuples(seq.value)))) def _yield_tuples(ob): if type(ob) is tuple: for i1 in ob: for i2 in _yield_tuples(i1): yield i2 else: yield ob when(compileIs, # matches 'type(x) is y' "isinstance(expr,Call) and isinstance(expr.func, Const)" " and (expr.func.value is type) and len(expr.args)==1" ) def compileTypeIsX(expr, criterion): return Test(IsInstance(expr.args[0]), istype(criterion)) when(expressionSignature, "isinstance(expr, Const) and isinstance(expr.value,priority)") def test_for_priority(expr): return Test(None, expr.value) PEAK-Rules-0.5a1.dev-r2713/peak/rules/syntax.py0000775000076600007660000000536311425706217020070 0ustar telec3telec3from peak.util.assembler import * from codegen import * from criteria import * from predicates import * from core import * from token import NAME from ast_builder import build __all__ = ['match', 'Bind', 'match_predicate', 'match_sequence'] nodetype() def Bind(name, code=None): if code is None: return name, raise TypeError("Can't compile Bind expression") def match_predicate(pattern, expr, binds): """Return predicate matching pattern to expr, updating binds w/bindings""" return Test(Comparison(expr), Inequality('==', pattern)) when(match_predicate, (type(None),)) def match_none(pattern, expr, binds): return Test(Identity(expr), IsObject(pattern)) when(match_predicate, (Bind,)) def match_bind(pattern, expr, binds): if pattern.name != '_': vals = binds.setdefault(pattern.name, []) if expr not in vals: vals.append(expr) for old in vals[-2:-1]: return Test(Truth(Compare(expr, (('==', old),))), True) return True class SyntaxBuilder(ExprBuilder): def Backquote(self, expr): while len(expr)==2: expr, expr = expr if expr[0]==NAME: return Bind(expr[1]) raise SyntaxError("backquotes may only be used around an indentifier") def match(expr, pattern): """Match `expr` against inline pattern `pattern` This function can only be called inside a rule string; the second argument is treated as a syntactic match pattern, which can include backquoted locals to be used as bind patterns.""" raise NotImplementedError( "Use match_predicate() to match compiled patterns outside a rule" ) def build_pattern(builder, node): old = builder.__class__ builder.__class__ = SyntaxBuilder try: return build(builder, node) finally: builder.__class__ = old meta_function(match, pattern=build_pattern) def compile_match(__builder__, expr, pattern): binds = {} pred = match_predicate(pattern, expr, binds) __builder__.bind(dict([(k, list(v)[0]) for k, v in binds.items()])) return pred when(match_predicate, (istype(list),)) when(match_predicate, (istype(tuple),)) def match_sequence(pattern, expr, binds): pred = Test(Comparison(Call(Const(len), (expr,))), Value(len(pattern))) for pos, item in enumerate(pattern): pred = intersect( pred, match_predicate(item, Getitem(expr, Const(pos)), binds) ) return pred when(match_predicate, (Node,)) def match_node(pattern, expr, binds): pred = 
    Test(IsInstance(expr), istype(type(pattern)))
    for pos, item in enumerate(pattern):
        if pos:
            pred = intersect(
                pred, match_predicate(item, Getitem(expr, Const(pos)), binds)
            )
    return pred
PEAK-Rules-0.5a1.dev-r2713/peak/rules/indexing.py0000755000076600007660000003322711432714070020337 0ustar telec3telec3
from __future__ import division
import sys
from peak.util.addons import AddOn
from peak.rules.core import *
from peak.rules.criteria import *
from peak.rules.core import sorted, set, frozenset
from peak.util.extremes import Min, Max, Extreme
from peak.util.decorators import decorate
from types import InstanceType

__all__ = [
    'Ordering', 'BitmapIndex', 'TreeBuilder', 'split_ranges',
    'to_bits', 'from_bits', 'RangeIndex', 'TypeIndex', 'TruthIndex',
    'PointerIndex', 'bitmap_index_type'
]

def define_ordering(ob, seq):
    items = []
    for key in seq:
        Ordering(ob, key).requires(items)
        items.append(key)

class Ordering(AddOn):
    """Track inter-expression ordering constraints"""

    def __init__(self, owner, expr):
        self.constraints = set()

    def requires(self, guards):
        c = frozenset(guards)
        cs = self.constraints
        if c in cs:
            return
        for oldc in list(cs):
            if c >= oldc:
                return  # already a less-restrictive condition
            elif oldc >= c:
                cs.remove(oldc)
        cs.add(c)

    def can_precede(self, exprs):
        if not self.constraints:
            return True
        for c in self.constraints:
            for e in c:
                if e in exprs:
                    break
            else:
                return True
        else:
            return False

def to_bits(ints):
    """Return a bitset encoding the numbers contained in sequence `ints`"""
    b = 0
    for i in ints:
        if i>31:
            i = long(i)
        b |= 1<<i   # 1<<i breaks when i>31 unless it's a long
    return b

def from_bits(n):
    """Yield the (ascending) numbers contained in bitset n"""
    b = 0
    while n:
        while not n & 15:
            n >>= 4
            b += 4
        if n & 1:
            yield b
        n >>= 1
        b += 1

class TreeBuilder(object):
    """Template methods for the Chambers&Chen dispatch tree algorithm"""

    def build(self, cases, exprs, memo):
        key = (cases, exprs)
        if key in memo:
            return memo[key]
        if not exprs:
            node = self.build_leaf(cases, memo)
        else:
            best, rest = self.best_expr(cases, exprs)
            assert len(rest) < len(exprs)
            if best is None:
                # No best expression found, recurse
                node = self.build(cases, rest, memo)
            else:
                node = self.build_node(best, cases, rest, memo)
        memo[key] = node
        return node

    def build_node(self, expr, cases, remaining_exprs, memo):
        raise NotImplementedError

    def build_leaf(self, cases, memo):
        raise NotImplementedError

    def selectivity(self, expression, cases):
        """Return (seedcount,totalcases) selectivity statistics"""
        raise NotImplementedError

    def cost(self, expr, remaining_exprs):
        return 1

    def best_expr(self, cases, exprs):
        best_expr = None
        best_spread = None
        to_do = list(exprs)
        remaining = dict.fromkeys(exprs)
        active_cases = len(cases)
        skipped = []
        while to_do:
            expr = to_do.pop()
            if not Ordering(self, expr).can_precede(remaining):
                # Skip criteria that have unchecked prerequisites
                skipped.append(expr)
                continue
            branches, total_cases = self.selectivity(expr, cases)
            if total_cases == active_cases * branches:
                # None of the branches for this expression eliminate any
                # cases, so this expression isn't needed for dispatching
                del remaining[expr]
                # recheck skipped exprs that might be legal now
                to_do.extend(skipped)
                skipped = []
                continue
            spread = total_cases / branches
            if best_expr is None or spread < best_spread:
                best_expr, best_spread = expr, spread
                best_cost = self.cost(expr, remaining)
            elif spread==best_spread:
                cost = self.cost(expr, remaining)
                if cost < best_cost:
                    best_expr, best_cost = expr, cost
        if best_expr is not None:
            del remaining[best_expr]
        return best_expr,
frozenset(remaining) class BitmapIndex(AddOn): """Index that computes selectivity and handles basic caching w/bitmaps""" known_cases = 0 match = True, decorate(staticmethod) def addon_key(*args): # Always use BitmapIndex as the add-on key return (BitmapIndex,)+args def __new__(cls, engine, expr): if cls is BitmapIndex: cls = bitmap_index_type(engine, expr) return super(BitmapIndex, cls).__new__(cls) def __init__(self, engine, expr): self.extra = {} self.null = self.all_seeds = {} # seed -> inc_cri, exc_cri self.criteria_bits = {} # cri -> case bits self.case_seeds = [] # case -> seed set self.criteria_seeds = {} # cri -> seeds def add_case(self, case_id, criterion): if criterion in self.criteria_seeds: seeds = self.criteria_seeds[criterion] else: self.criteria_bits.setdefault(criterion, 0) seeds = self.criteria_seeds[criterion] = self.add_criterion(criterion) case_seeds = self.case_seeds if case_id==len(case_seeds): case_seeds.append(seeds) else: self._extend_cases(case_id) case_seeds[case_id] = seeds bit = to_bits([case_id]) self.known_cases |= bit self.criteria_bits[criterion] |= bit def add_criterion(self, criterion): """Ensure `criterion` is indexed, return its "applicable seeds" set As a side effect, ``self.all_seeds`` must be updated to include any new seeds required by `criterion`. """ raise NotImplementedError def _extend_cases(self, case_id): if case_id >= len(self.case_seeds): self.case_seeds.extend( [self.null]*(case_id+1-len(self.case_seeds)) ) def selectivity(self, cases): if cases and cases[-1] >= len(self.case_seeds): self._extend_cases(cases[-1]) cases = map(self.case_seeds.__getitem__, cases) return (len(self.all_seeds), sum(map(len, cases))) def seed_bits(self, cases): bits = self.criteria_bits return cases ^ (self.known_cases & cases), dict([ (seed, (sum([bits[i] for i in inc]) & cases, sum([bits[e] for e in exc]) & cases)) for seed, (inc, exc) in self.all_seeds.items() ]) def expanded_sets(self): return [ (seed, [list(from_bits(inc)), list(from_bits(exc))]) for seed, (inc, exc) in self.seed_bits(self.known_cases)[1].items() ] def reseed(self, criterion): self.add_criterion(criterion) def include(self, seed, criterion, exclude=False): try: s = self.all_seeds[seed] except KeyError: s = self.all_seeds[seed] = set(), set() s[exclude].add(criterion) self.criteria_bits.setdefault(criterion, 0) def exclude(self, seed, criterion): self.include(seed, criterion, True) def split_ranges(dont_cares, bitmap, node=lambda b:b): """Return (exact, ranges) where `exact` is a dict[value]->node and `ranges` is a sorted list of ``((lo,hi),node)`` tuples expressing non-inclusive ranges. `dont_cares` and `bitmap` should be the return values from a bitmap index's ``seed_bits()`` method. `node(bits)` should return the value to be used as a node in the output; the default is to just return a bitmap of cases. 
""" ranges = [] exact = {} current = dont_cares for (val,d), (inc, exc) in bitmap.iteritems(): if d and not (val is Min and d==-1 and not inc): break # something other than == was used; use full algorithm exact[val] = node(current | inc) else: return exact, [((Min, Max), node(dont_cares))] low = Min for (val,d), (inc, exc) in sorted(bitmap.iteritems()): if val != low: if ranges and ranges[-1][-1]==current: low = ranges.pop()[0][0] ranges.append(((low, val), node(current))) low = val new = current | inc new ^= (new & exc) if not isinstance(val, Extreme): if d==0 or d<0: exact[val] = node(new) elif val not in exact: exact[val] = node(current) if d: current = new if low != Max: if ranges and ranges[-1][-1]==current: low = ranges.pop()[0][0] ranges.append(((low, Max), node(current))) return exact, ranges class PointerIndex(BitmapIndex): """Index for pointer equality""" def add_criterion(self, criterion): if isinstance(criterion, Intersection): match = False items = criterion else: match = criterion.match items = [criterion] for cri in items: seed = id(cri.ref) self.extra[seed] = 1 if match: self.include(seed, criterion) self.exclude(None, criterion) else: self.include(None, criterion) self.exclude(seed, criterion) if match: return self.match return self.extra def seed_bits(self, cases): dontcares, seedmap = BitmapIndex.seed_bits(self, cases) defaults = seedmap[None][0] # default inclusions for seed, (inc, exc) in seedmap.items(): if seed is not None: inc |= defaults ^ (defaults & exc) seedmap[seed] = inc, exc return dontcares, seedmap class TruthIndex(BitmapIndex): """Index for simple boolean expression tests""" def add_criterion(self, criterion): assert isinstance(criterion, Value) and criterion.value is True self.include(criterion.match, criterion) self.include(not criterion.match, not criterion) return self.match abstract() def bitmap_index_type(engine, expr): """Get the BitmapIndex subclass to use for the given engine and `expr`""" raise NotImplementedError(engine, expr) def class_seeds_for(criterion): """Yield class objects that 'criterion' thinks are relevant""" if isinstance(criterion, istype): yield criterion.type elif isinstance(criterion, Class): yield criterion.cls elif isinstance(criterion, Intersection): for c in criterion: for seed in class_seeds_for(c): yield seed class TypeIndex(BitmapIndex): """Index for istype(), Class(), and Classes() criteria""" def add_class(self, cls): t = istype(cls) for criterion, seeds in self.criteria_seeds.iteritems(): if implies(t, criterion): self.include(cls, criterion) seeds.add(cls) self.include(cls, t) # ensure it's in all_seeds def reseed(self, criterion): map(self.add_class, class_seeds_for(criterion)) if object not in self.all_seeds: self.include(object, istype(object)) def add_criterion(self, criterion): my_seeds = self.criteria_seeds.setdefault(criterion, set()) self.reseed(criterion) for seed in self.all_seeds: if implies(istype(seed), criterion): self.include(seed, criterion) my_seeds.add(seed) return my_seeds class RangeIndex(BitmapIndex): def __init__(self, engine, expr): BitmapIndex.__init__(self, engine, expr) self.null = None def add_criterion(self, criterion): if isinstance(criterion, Range): lo = i = criterion.lo hi = e = criterion.hi match = True elif isinstance(criterion, Value): lo = hi = val = (criterion.value, 0) if criterion.match: e = (Min, -1) i = val else: i = (Min, -1) e = val match = criterion.match else: raise TypeError(criterion) if (i not in self.all_seeds or e not in self.all_seeds or (lo,hi,match) not in self.extra 
): # oops, there's a seed we haven't seen before: # make sure the offsets are rebuilt on the next selectivity() call self.extra.clear() self.include(i, criterion) self.exclude(e, criterion) return lo, hi, match def selectivity(self, cases): case_seeds = self.case_seeds if cases and cases[-1] >= len(case_seeds): self._extend_cases(cases[-1]) extras = self.extra if not extras: all_seeds = self.all_seeds offsets = dict([(k,n) for n,k in enumerate(sorted(all_seeds))]) all = extras[None] = len(all_seeds) all_but_one = all - 1 for lo,hi,inc in self.criteria_seeds.values(): if lo==hi: extras[lo,hi,inc] = [all_but_one, 1][inc] else: extras[lo,hi,inc] = offsets[hi] - offsets[lo] cases = map(case_seeds.__getitem__, cases) return (len(self.all_seeds), sum(map(extras.__getitem__, cases))) PEAK-Rules-0.5a1.dev-r2713/peak/rules/core.py0000755000076600007660000007672512041201063017462 0ustar telec3telec3__all__ = [ 'Rule', 'RuleSet', 'Dispatching', 'Engine', 'rules_for', 'compile_method', 'Method', 'Around', 'Before', 'After', 'MethodList', 'value', 'DispatchError', 'AmbiguousMethods', 'NoApplicableMethods', 'abstract', 'when', 'before', 'after', 'around', 'istype', 'parse_rule', 'implies', 'dominant_signatures', 'combine_actions', 'overrides', 'always_overrides', 'merge_by_default', 'intersect', 'disjuncts', 'negate' ] from peak.util.decorators import decorate_assignment, decorate, struct, \ synchronized, frameinfo, decorate_class, classy, apply_template from peak.util.assembler import Code, Const, Call, Local, Getattr, TryExcept, \ Suite, with_name from peak.util.addons import AddOn import inspect, new, itertools, operator, sys try: set, frozenset = set, frozenset except NameError: from sets import Set as set from sets import ImmutableSet class frozenset(ImmutableSet): """Kludge to fix the abomination that is ImmutableSet.__init__""" def __new__(cls, iterable=None): self = ImmutableSet.__new__(cls, iterable) ImmutableSet.__init__(self, iterable) return self def __init__(self, iterable=None): pass # all immutable initialization should be done by __new__! 
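# A brief orientation for the "core logic" generics defined below; these
# doctest-style lines are illustrative only, and assume nothing beyond the
# default method bodies:
#
#     >>> implies(42, 42)     # default: implication is just equality
#     True
#     >>> overrides(1, 2)     # default: no action takes precedence
#     False
#     >>> combine_actions(None, None) is None
#     True
#
# combine_actions() checks overrides() in both directions and falls back
# to merge(), which for two unrelated Method objects yields an
# AmbiguousMethods instance.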
empty = frozenset() try: sorted = sorted except NameError: def sorted(seq,key=None): if key: d = [(key(v),v) for v in seq] else: d = list(seq) d.sort() if key: return [v[1] for v in d] return d # Core logic -- many of these are generics to be specialized w/"when" later def disjuncts(ob): """Return a *list* of the logical disjunctions of `ob`""" # False == no condition is sufficient == no disjuncts if ob is False: return [] if type(ob) is tuple: return _tuple_disjuncts(ob) return [ob] def implies(s1,s2): """Is s2 always true if s1 is true?""" return s1==s2 def overrides(a1, a2): """Does action a1 take precedence over action a2?""" return False def combine_actions(a1,a2): """Return a new action for the combination of a1 and a2""" if a1 is None: return a2 elif a2 is None: return a1 elif overrides(a1,a2): if not overrides(a2,a1): return a1.override(a2) elif overrides(a2,a1): return a2.override(a1) return a1.merge(a2) def rules_for(f): """Return the initialized ruleset for a generic function""" if not Dispatching.exists_for(f): d = Dispatching(f) d.rules.add(Rule(clone_function(f))) return Dispatching(f).rules class Dispatching(AddOn): """Manage a generic function's rules, engine, locking, and code""" engine = None def __init__(self, func): func.__doc__ # workaround for PyPy issue #1293 self.function = func self._regen = self._regen_code() # callback to regenerate code self.rules = RuleSet(self.get_lock()) self.backup = None # allows func to call itself during regeneration self.create_engine(TypeEngine) synchronized() def get_lock(self): return self.__lock__ def create_engine(self, engine_type): """Create a new engine of `engine_type`, unsubscribing old""" if self.engine is not None and self.engine in self.rules.listeners: self.rules.unsubscribe(self.engine) self.engine = engine_type(self) return self.engine synchronized() def request_regeneration(self): """Ensure code regeneration occurs on next call of the function""" if self.backup is None: self.backup = self.function.func_code self.function.func_code = self._regen def _regen_code(self): c = Code.from_function(self.function, copy_lineno=True) c.return_( call_thru( self.function, Call(Getattr( Call(Const(Dispatching), (Const(self.function),), fold=False), '_regenerate' )) ) ) return c.code() synchronized() def as_abstract(self): for action in self.rules: raise AssertionError("Can't make abstract: rules already exist") c = Code.from_function(self.function, copy_lineno=True) c.return_(call_thru(self.function, Const(self.rules.default_action))) if self.backup is None: self.function.func_code = c.code() else: self.backup = c.code() return self.function synchronized() def _regenerate(self): func = self.function assert self.backup is not None func.func_code = self.backup # ensure re-entrant calls work try: # try to replace the code with new code func.func_code = self.engine._generate_code() except: # failure: we'll try to regen again, next time we're called func.func_code = self._regen raise else: # success! 
get rid of the old backup code and return the function self.backup = None return func class DispatchError(Exception): """A dispatch error has occurred""" def __call__(self,*args,**kw): raise self.__class__(*self.args+(args,kw)) # XXX def __repr__(self): # This method is needed so doctests for 2.3/2.4 match 2.5 return self.__class__.__name__+repr(self.args) class MethodType(type): """Metaclass for method types This allows precedence to be declared between method types, and ensures that ``compiled()`` methods aren't inherited when ``__call__`` is redefined in a Method subclass. """ def __init__(cls, name, bases, cdict): if '__call__' in cdict and 'compiled' not in cdict: # Ensure 'compiled()' is not inherited for changed __call__ cls.compiled = lambda self, engine: self return type.__init__(cls, name, bases, cdict) def __rshift__(self, other): if type(other) is tuple: for item in other: always_overrides(self, item) else: always_overrides(self, other) return other def __rrshift__(self, other): if type(other) is tuple: for item in other: always_overrides(item, self) else: always_overrides(other, self) return self class Method(object): """A simple method w/optional chaining""" __metaclass__ = MethodType def __init__(self, body, signature=(), serial=0, tail=None): self.body = body self.signature = signature self.serial = serial self.tail = tail self.can_tail = False try: args = inspect.getargspec(body)[0] except TypeError: pass else: if args and args[0]=='next_method': if getattr(body, 'im_self', None) is None: # already bound? self.can_tail = True decorate(classmethod) def make(cls, body, signature=(), serial=0): return cls(body, signature, serial) def __repr__(self): data = (self.body, self.signature, self.serial, self.tail) return self.__class__.__name__+repr(data) def __call__(self, *args, **kw): """Slow way to call a method -- use compile_method instead!""" return compile_method(self)(*args, **kw) def override(self, other): if not self.can_tail: return self return self.tail_with(combine_actions(self.tail, other)) def tail_with(self, tail): return self.__class__(self.body, self.signature, self.serial, tail) def merge(self, other): #if self.__class__ is other.__class__ and self.body is other.body: # XXX need to merge signatures? # return self.__class__( # self.body, ???, ???, combine_actions(self.tail, other.tail) # ) return AmbiguousMethods([self,other]) decorate(classmethod) def make_decorator(cls, name, doc=None): if doc is None: doc = "Extend a generic function with a method of type ``%s``" \ % cls.__name__ if cls is Method: maker = None # allow gf's to use something else instead of Method else: maker = cls.make def decorate(f, pred=(), depth=2, frame=None): def callback(frame, name, func, old_locals): assert f is not func # XXX kind, module, locals_, globals_ = frameinfo(frame) context = ParseContext(func, maker, locals_, globals_, lineno) def register_for_class(cls, f=f): _register_rule(f, pred, context, cls) return cls if kind=='class': # 'when()' in class body; defer adding the method decorate_class(register_for_class, frame=frame) else: register_for_class(None) if old_locals.get(name) is f: return f # prevent overwriting if name is the same return func rv = decorate_assignment(callback, depth, frame) if frame is None: frame = sys._getframe(depth-1) lineno = frame.f_lineno # this isn't valid w/out active trace! 
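        # (the `lineno` captured above feeds the ParseContext created in
        # callback(), so a parsed rule can report the source line where it
        # was declared; per the preceding comment, frame.f_lineno is only
        # exact while a trace function is active)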
return rv decorate = with_name(decorate, name) decorate.__doc__ = doc return decorate def compiled(self, engine): body = compile_method(self.body, engine) if not self.can_tail: return body else: return new.instancemethod(body, compile_method(self.tail, engine)) when = Method.make_decorator( "when", "Extend a generic function with a new action" ) class NoApplicableMethods(DispatchError): """No applicable action has been defined for the given arguments""" def merge(self, other): return AmbiguousMethods([self,other]) class AmbiguousMethods(DispatchError): """More than one choice of action is possible""" def __init__(self, methods, *args): DispatchError.__init__(self, methods, *args) mine = self.methods = [] for m in methods: if isinstance(m, AmbiguousMethods): mine.extend(m.methods) else: mine.append(m) def merge(self, other): return AmbiguousMethods(self.methods+[other]) def override(self, other): return self def __repr__(self): return "AmbiguousMethods(%s)" % self.methods class RuleSet(object): """An observable, stably-ordered collection of rules""" default_action = NoApplicableMethods() default_actiontype = Method def __init__(self, lock=None): self.rules = [] self.actiondefs = {} self.listeners = [] if lock is not None: self.__lock__ = lock synchronized() def add(self, rule): actiondefs = frozenset(self._actions_for(rule)) self.rules.append( rule ) self.actiondefs[rule] = actiondefs self._notify(added=actiondefs) synchronized() def remove(self, rule): actiondefs = self.actiondefs.pop(rule) self.rules.remove(rule) self._notify(removed=actiondefs) #def changed(self, rule): # sequence, actions = self.actions[rule] # new_actions = frozenset(self._actions_for(rule, sequence)) # self.actions[rule] = sequence, new_actions # self.notify(new_actions-actions, actions-new_actions) synchronized() def clear(self): actiondefs = frozenset(self) del self.rules[:]; self.actiondefs.clear() self._notify(removed=actiondefs) def _notify(self, added=empty, removed=empty): for listener in self.listeners[:]: # must be re-entrant listener.actions_changed(added, removed) synchronized() def __iter__(self): ad = self.actiondefs return iter([a for rule in self.rules for a in ad[rule]]) def _actions_for(self, (na, body, predicate, actiontype, seq)): actiontype = actiontype or self.default_actiontype for signature in disjuncts(predicate): yield Rule(body, signature, actiontype, seq) synchronized() def subscribe(self, listener): self.listeners.append(listener) if self.rules: listener.actions_changed(frozenset(self), empty) synchronized() def unsubscribe(self, listener): self.listeners.remove(listener) def _register_rule(gf, pred, context, cls): """Register a rule for `gf` with possible import-deferring""" if not isinstance(gf, basestring): rules = rules_for(gf) rules.add(parse_rule(Dispatching(gf).engine, pred, context, cls)) return if len(gf.split(':'))<>2 or len(gf.split())>1: raise TypeError( "Function specifier %r is not in 'module.name:attrib.name' format" % (gf,) ) modname, attrname = gf.split(':') from peak.util.imports import whenImported def _delayed_register(module): for attr in attrname.split('.'): module = getattr(module, attr) _register_rule(module, pred, context, cls) whenImported(modname, _delayed_register) class Engine(object): """Abstract base for dispatching engines""" reset_on_remove = True def __init__(self, disp): self.function = disp.function self.registry = {} self.closures = {} self.rules = disp.rules self.__lock__ = disp.get_lock() self.argnames = list( flatten(filter(None, 
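            # getargspec()[:3] is (positional args, *varargs name, **kw name);
            # filter(None, ...) drops whichever of those are absent, and
            # flatten() expands Python 2 tuple arguments into their names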
inspect.getargspec(self.function)[:3])) ) self.rules.subscribe(self) synchronized() def actions_changed(self, added, removed): if removed and self.reset_on_remove: return self._full_reset() for rule in removed: self._remove_method(rule.predicate, rule) for rule in added: self._add_method(rule.predicate, rule) if added or removed: self._changed() def _changed(self): """Some change to the rules has occurred""" Dispatching(self.function).request_regeneration() def _full_reset(self): """Regenerate any code, caches, indexes, etc.""" self.registry.clear() self.actions_changed(self.rules, ()) Dispatching(self.function).request_regeneration() compiled_cache = None def apply_template(self, template, *args): try: return self.compiled_cache[template, args] except (KeyError, TypeError, AttributeError): pass try: closure = self.closures[template] except KeyError: if template.func_closure: raise TypeError("Templates cannot use outer-scope variables") import linecache; from peak.util.decorators import cache_source tmp = apply_template(template, self.function, *args) body = ''.join(linecache.getlines(tmp.func_code.co_filename)) filename = "<%s at 0x%08X wrapping %s at 0x%08X>" % ( template.__name__, id(template), self.function.__name__, id(self) ) d ={} exec compile(body, filename, "exec") in template.func_globals, d tmp, closure = d.popitem() closure.func_defaults = template.func_defaults cache_source(filename, body, closure) self.closures[template] = closure f = closure(self.function, *args) f.func_defaults = self.function.func_defaults try: hash(args) except TypeError: pass else: if self.compiled_cache is None: from weakref import WeakValueDictionary self.compiled_cache = WeakValueDictionary() self.compiled_cache[template, args] = f return f def _add_method(self, signature, rule): """Add a case for the given signature and rule""" registry = self.registry action = rule.actiontype(rule.body, signature, rule.sequence) if signature in registry: registry[signature] = combine_actions(registry[signature], action) else: registry[signature] = action return action def _remove_method(self, signature, rule): """Remove the case for the given signature and rule""" raise NotImplementedError def _generate_code(self): """Return a code object for the current state of the function""" raise NotImplementedError class TypeEngine(Engine): """Simple type-based dispatching""" cache = None def __init__(self, disp): self.static_cache = {} super(TypeEngine, self).__init__(disp) def _changed(self): if self.cache != self.static_cache: Dispatching(self.function).request_regeneration() def _bootstrap(self): # Bootstrap a self-referential generic function by ensuring an exact # list of signatures is always in the function's dispatch cache. # # Only peak.rules.core generic functions used in the implementation of # other generic functions need this; currently that's just implies() # and overrides(), which control method order and combining. 
# cache = self.static_cache for sig, act in self.registry.items(): for key in type_keys(sig): cache[key] = compile_method(act, self) self._changed() def _add_method(self, signature, rule): action = super(TypeEngine, self)._add_method(signature, rule) cache = self.static_cache for key in cache.keys(): if key==signature or implies(key, signature): del cache[key] return action def _generate_code(self): self.cache = cache = self.static_cache.copy() def callback(*args, **kw): types = tuple([getattr(arg,'__class__',type(arg)) for arg in args]) key = tuple(map(istype, types)) self.__lock__.acquire() try: action = self.rules.default_action for sig in self.registry: if implies(key, sig): action = combine_actions(action, self.registry[sig]) f = cache[types] = compile_method(action, self) finally: self.__lock__.release() return f(*args, **kw) c = Code.from_function(self.function, copy_lineno=True) types = [class_or_type_of(Local(name)) for name in flatten(inspect.getargspec(self.function)[0])] target = Call(Const(cache.get), (tuple(types), Const(callback))) c.return_(call_thru(self.function, target)) return c.code() # Handle alternates in tuple signatures # # when(disjuncts, (istype(tuple),)) - this has to be hardcoded for bootstrap def _tuple_disjuncts(ob): for posn, item in enumerate(ob): if type(item) is tuple: head = ob[:posn] return [ head+(mid,)+tail for tail in _tuple_disjuncts(ob[posn+1:]) for mid in item ] return [ob] # no alternates # Code generation stuff def flatten(v): if isinstance(v,basestring): yield v; return for i in v: for ii in flatten(i): yield ii def gen_arg(v): if isinstance(v,basestring): return Local(v) if isinstance(v,list): return tuple(map(gen_arg,v)) def call_thru(sigfunc, target, prefix=()): args, star, dstar, defaults = inspect.getargspec(sigfunc) return Call(target, list(prefix)+map(gen_arg,args), (), gen_arg(star), gen_arg(dstar), fold=False) def class_or_type_of(expr): return Suite([expr, TryExcept( Suite([Getattr(Code.DUP_TOP, '__class__'), Code.ROT_TWO, Code.POP_TOP]), [(Const(AttributeError), Call(Const(type), (Code.ROT_TWO,)))] )]) def clone_function(f): return new.function( f.func_code, f.func_globals, f.func_name, f.func_defaults, f.func_closure ) _default_engine = None def compile_method(action, engine=None): """Convert `action` into an optimized callable for `engine`""" if engine is None: global _default_engine if _default_engine is None: _default_engine = Dispatching(abstract(lambda *a, **k: None)).engine # allow any rules on non-None engines to apply return compile_method(action, _default_engine) if isinstance(action, (Method, value)): return action.compiled(engine) elif action is None: return engine.rules.default_action return action # Rules management def abstract(func=None): """Declare a function to be abstract""" if func is None: return decorate_assignment( lambda f,n,func,old: Dispatching(func).as_abstract() ) else: return Dispatching(func).as_abstract() next_sequence = itertools.count().next struct() def Rule(body, predicate=(), actiontype=None, sequence=None): if sequence is None: sequence = next_sequence() return body, predicate, actiontype, sequence struct() def ParseContext( body, actiontype=None, localdict=(), globaldict=(), lineno=None, sequence=None ): """Hold information needed to parse a predicate""" if sequence is None: sequence = next_sequence() return body, actiontype, dict(localdict), dict(globaldict), lineno, sequence def parse_rule(engine, predicate, context, cls): """Hook for pre-processing predicates, e.g. 
parsing string expressions""" if cls is not None and type(predicate) is tuple: predicate = (cls,) + predicate return Rule(context.body, predicate, context.actiontype, context.sequence) # Class/type rules and implications [struct()] def istype(type, match=True): return type, match def type_key(arg): if isinstance(arg, (type, ClassType)): return arg elif type(arg) is istype and arg.match: return arg.type def type_keys(sig): if type(sig) is tuple: key = tuple(map(type_key, sig)) if None not in key: yield key when(implies, (istype(tuple), istype(tuple))) def tuple_implies(s1,s2): if len(s2)>len(s1): return False # shorter tuple can't imply longer tuple for t1,t2 in zip(s1,s2): if not implies(t1,t2): return False else: return True from types import ClassType, InstanceType when(implies, (type, (ClassType, type) ))(issubclass) when(implies, (ClassType, ClassType ))(issubclass) when(implies, (istype, istype ))(lambda s1,s2: s1==s2 or (s1.type is not s2.type and s1.match and not s2.match)) when(implies, (istype, (ClassType, type) ))(lambda s1,s2: s1.match and implies(s1.type,s2)) # A classic class only implies a new-style one if it's ``object`` # or ``InstanceType``; this is an exception to the general rule that # isinstance(X,Y) implies issubclass(X.__class__,Y) when(implies, (ClassType, type))(lambda s1,s2: s2 is object or s2 is InstanceType) # Rule precedence [struct( __call__ = lambda self,*a,**kw: self.value, compiled = lambda self, engine: engine.apply_template(value_template, self.value), __repr__ = lambda self: 'value(%r)' % self[1:] )] def value(value): """Method body returning a constant value""" return value, def value_template(__func,__value): return "return __value" YES, NO = value(True), value(False) def always_overrides(a, b): """`a` instances always override `b`s; `b` instances never override `a`s""" a,b = istype(a), istype(b) pairs = {} to_add = [(a,b)] for rule in rules_for(overrides): sig = rule.predicate if type(sig) is not tuple or len(sig)!=2 or rule.body is not YES: continue pairs[sig]=1 if sig[0]==b: to_add.append((a, sig[1])) if sig[1]==a: to_add.append((sig[0], b)) for (p1,p2) in to_add: if (p2,p1) in pairs: raise TypeError("%r already overrides %r" % (b.type, a.type)) for (p1,p2) in to_add: if (p1,p2) not in pairs: when(overrides, (p1, p2))(YES) when(overrides, (p2, p1))(NO) def merge_by_default(t): """instances of `t` never imply other instances of `t`""" when(overrides, (t, t))(NO) class MethodList(Method): """A list of related methods""" can_tail = False _sorted_items = None def __init__(self, items=(), tail=None): self.items = list(items) self.tail = tail decorate(classmethod) def make(cls, body, signature=(), serial=0): return cls( [(serial, signature, body)] ) def __repr__(self): data = self.items, self.tail return self.__class__.__name__+repr(data) def tail_with(self, tail): return self.__class__(self.items, tail) def merge(self, other): if other.__class__ is not self.__class__: raise TypeError("Incompatible action types for merge", self, other) return self.__class__( self.items+other.items, combine_actions(self.tail, other.tail) ) def compiled(self, engine): wrappers = tuple(engine.rules.methodlist_wrappers) bodies = [compile_method(body,engine) for sig, body in self.sorted()] return engine.apply_template(list_template, tuple(bodies), wrappers) def sorted(self): if self._sorted_items is not None: return self._sorted_items self.items.sort() rest = [(s,b) for (serial, s, b) in self.items] self._sorted_items = items = [] seen = set() while rest: best = 
dominant_signatures(rest) map(rest.remove, best) for s,b in best: if b not in seen: seen.add(b) items.append((s,b)) return items def list_template(__func, __bodies, __wrappers): return """ def __iterate(): for __body in __bodies: yield __body($args) __result = __iterate() for __wrapper in __wrappers: __result = __wrapper(__result) return __result""" merge_by_default(MethodList) class Around(Method): """'Around' Method (takes precedence over regular methods)""" around = Around.make_decorator('around') class Before(MethodList): """Method(s) to be called before the primary method(s)""" can_tail = True def compiled(self, engine): tail = compile_method(self.tail, engine) bodies = [compile_method(body,engine) for sig, body in self.sorted()] return engine.apply_template(before_template, tail, tuple(bodies)) def before_template(__func, __tail, __bodies): return """ for __body in __bodies: __body($args) return __tail($args)""" before = Before.make_decorator('before') class After(MethodList): """Method(s) to be called after the primary method(s)""" can_tail = True def sorted(self): # Reverse the sorting for after methods if self._sorted_items is not None: return self._sorted_items items = super(After,self).sorted() items.reverse() return items def compiled(self, engine): tail = compile_method(self.tail, engine) bodies = [compile_method(body, engine) for sig, body in self.sorted()] return engine.apply_template(after_template, tail, tuple(bodies)) def after_template(__func, __tail, __bodies): return """ __retval = __tail($args) for __body in __bodies: __body($args) return __retval""" after = After.make_decorator('after') # Define the overall method order Around >> Before >> After >> (Method, MethodList) # These are necessary to ensure that any added Method subclasses will # automatically override NoApplicableMethods (and any subclasses thereof): # when(overrides, (Method, NoApplicableMethods))(YES) when(overrides, (NoApplicableMethods, Method))(NO) when(overrides, (Method,Method)) def method_overrides(a1, a2): if a1.__class__ is a2.__class__: return implies(a1.signature, a2.signature) raise TypeError("Incompatible action types", a1, a2) when(overrides, (AmbiguousMethods, Method)) def ambiguous_overrides(a1, a2): for m in a1.methods: if overrides(m, a2): # if any ambiguous method overrides a2, we can toss it return True return False when(overrides, (Method, AmbiguousMethods)) def override_ambiguous(a1, a2): for m in a2.methods: if not overrides(a1, m): return False return True # can only override if it overrides all the ambiguity # needed to disambiguate the above two methods if combining a pair of AM's: merge_by_default(AmbiguousMethods) # And now we can bootstrap the core! These two functions are used in the # TypeEngine implementation, so we force them to statically cache all their # current methods. 
That way, they'll still work even if they're called during # one of their own cache misses or code regenerations: Dispatching(implies).engine._bootstrap() Dispatching(overrides).engine._bootstrap() when(parse_rule, (TypeEngine, istype(tuple, False))) def parse_upgrade(engine, predicate, context, cls): """Upgrade to predicate dispatch engine when called w/unrecognized args""" if isinstance(predicate, (type, ClassType, istype)): # convert single item to tuple - no need to upgrade engine return parse_rule(engine, (predicate,), context, cls) from peak.rules.predicates import IndexedEngine return parse_rule( Dispatching(engine.function).create_engine(IndexedEngine), predicate, context, cls ) when(rules_for, type(After.sorted))(lambda f: rules_for(f.im_func)) # Logical functions needed for extensions to the core, but that should be # shared by all extensions. abstract() def negate(c): """Return the logical negation of criterion `c`""" when(negate, (bool,) )(operator.not_) when(negate, (istype,))(lambda c: istype(c.type, not c.match)) abstract() def intersect(c1, c2): """Return the logical intersection of two conditions""" around(intersect, (object, object)) def intersect_if_implies(next_method, c1, c2): if implies(c1,c2): return c1 elif implies(c2, c1): return c2 return next_method(c1, c2) # These are needed for boolean intersects to work correctly when(implies, (bool, bool))(lambda c1, c2: c2 or not c1) when(implies, (bool, object))(lambda c1, c2: not c1) when(implies, (object, bool))(lambda c1, c2: c2) def dominant_signatures(cases): """Return the most-specific ``(signature,body)`` pairs from `cases` `cases` is a sequence of ``(signature,body)`` pairs. This routine checks the ``implies()`` relationships between pairs of signatures, and then returns a list of ``(signature,method)`` pairs such that no signature remaining in the original list implies a signature in the new list. The relative order of cases in the new list is preserved. """ if len(cases)==1: # Shortcut for common case return list(cases) best, rest = list(cases[:1]), list(cases[1:]) for new_sig, new_meth in rest: for old_sig, old_meth in best[:]: # copy so we can modify inplace new_implies_old = implies(new_sig, old_sig) old_implies_new = implies(old_sig, new_sig) if new_implies_old: if not old_implies_new: # better, remove the old one best.remove((old_sig, old_meth)) elif old_implies_new: # worse, skip adding the new one break else: # new_sig has passed the gauntlet, as it has not been implied # by any of the current "best" items best.append((new_sig,new_meth)) return best PEAK-Rules-0.5a1.dev-r2713/peak/rules/__init__.py0000755000076600007660000001253311633340477020277 0ustar telec3telec3"""The PEAK Rules Framework""" from peak.rules.core import abstract, when, before, after, around, istype, \ DispatchError, AmbiguousMethods, NoApplicableMethods, value def combine_using(*wrappers): """Designate a generic function that wraps the iteration of its methods Standard "when" methods will be combined by iteration in precedence order, and the resulting iterator will be passed to the supplied wrapper(s), last first. (e.g. ``combine_using(sorted, itertools.chain)`` will chain the sequences supplied by each method into one giant list, and then sort it). As a special case, if you include ``abstract`` in the wrapper list, it will be removed, and the decorated function will be marked as abstract. 
This decorator can only be used once per function, and can't be used if the generic function already has methods (even the default method) or if a custom method type has already been set (e.g. if you already called ``combine_using()`` on it before). """ is_abstract = abstract in wrappers if is_abstract: wrappers = tuple([w for w in wrappers if w is not abstract]) def callback(frame, name, func, old_locals): if core.Dispatching.exists_for(func) and list(core.rules_for(func)): raise RuntimeError("Methods already defined for", func) if is_abstract: func = abstract(func) r = core.Dispatching(func).rules if r.default_actiontype is not core.Method: raise RuntimeError("Method type already defined for", func) r.default_actiontype = core.MethodList.make r.methodlist_wrappers = wrappers[::-1] if not is_abstract: r.add(core.Rule(core.clone_function(func))) return func return core.decorate_assignment(callback) def expand_as(predicate_string): """In rules, use the supplied condition in place of the decorated function Usage:: @expand_as('filter is None or value==filter') def check(filter, value): "Check whether value matches filter" When the above function is used in a rule, the supplied condition will be "inlined", so that PEAK-Rules can expand the function call without losing its ability to index the rule or determine its precedence relative to other rules. When the condition is inlined, it's done in a namespace-safe manner. Names in the supplied condition will refer to either the arguments of the decorated function, or to locals/globals in the frame where ``expand_as`` was called. Names defined in the condition body (e.g. via ``let``) will not be visible to the caller. To prevent needless code duplication, you do not need to provide a body for your function, unless it is to behave differently than the supplied condition when it's called outside a rule. If the decorated function has no body of its own (i.e. it's a ``pass`` or just a docstring), the supplied condition will be compiled to provide one automatically. (That's why the above example usage has no body for the ``check()`` function.) """ def callback(frame, name, func, old_locals): from peak.rules.predicates import _expand_as kind, module, locals_, globals_ = core.frameinfo(frame) return _expand_as( func, predicate_string, locals_, globals_, __builtins__ ) return core.decorate_assignment(callback) def let(**kw): """Define temporary variables for use in rules and methods Usage:: @when(somefunc, "let(x=foo(y), z=y*2) and x>z") def some_method((x,z), next_method, y): # do something here The keywords used in the let() expression become available for use in any part of the rule that is joined to the ``let()`` by an ``and`` expression, but will not be available in expressions joined by ``or`` or ``not`` branches. Any ``let()`` calls at the top level of the expression will also be available for use in the method body, if you place them in a tuple argument in the *very first* argument position -- even before ``next_method`` and ``self``. Note that variables defined by ``let()`` are **lazy** - their values are not computed until/unless they are actually needed by the relevant part of the rule, so it does not slow things down at runtime to list all your variables up front. Likewise, only the variables actually listed in your first-argument tuple are calculated, and only when the method is actually invoked. 
(Currently, this feature is mainly to support easy-to-understand rules, and DRY method bodies, as variables used in the rule's criteria may be calculated a second time when the method is invoked.) Note that while variable calculation is lazy, there *is* an evaluation order *between* variables in a let; you can't use a let-variable before it's been defined; you'll instead get whatever argument, local, or global variable would be shadowed by the as-yet-undefined variable. """ raise NotImplementedError("`let` can only be used in rules, not code!") __all__ = [_k for _k in list(globals()) if not _k.startswith('_') and _k!='core'] # TEMPORARY BACKWARDS COMPATIBILITY - PLEASE IMPORT THIS DIRECTLY FROM CORE # (or better still, use the '>>' operator that method types now have) # from peak.rules.core import * # always_overrides, Method, etc. PEAK-Rules-0.5a1.dev-r2713/peak/rules/debug/0000755000076600007660000000000012506203664017241 5ustar telec3telec3PEAK-Rules-0.5a1.dev-r2713/peak/rules/debug/decompile.py0000755000076600007660000001073711633340477021573 0ustar telec3telec3from peak.rules.core import when, value from peak.util.assembler import * from peak.rules.codegen import * __all__ = ['precedence', 'associativity', 'decompile',] def precedence(ob): """Get subexpression precedence - 0 is closest-binding precedence Subexpressions must be parenthesized when their parent expression has a lower return value from this function than they do. If the parent expression has equal precedence, then grouping is dependent upon associativity. """ return 0 def associativity(ob): """Subexpression's operator associativity (0 for left, 1 for right) If a subexpression and parent expression have equal precedence, parentheses are needed unless the subexpression is in this position among the parent expression's operands. 
""" return 0 when(associativity, (Power,) )(value(1)) when(associativity, (IfElse,))(value(2)) when(associativity, (ListOp,))(value(None)) def needs_parens(parent, child, posn): """Does expression `child` need parens if it's at `posn` in `parent`?""" return False def decompile(ob): return repr(ob) prec_ops = (UnaryOp, BinaryOp, ListOp) for prec, group in enumerate([ (Tuple, List, Dict, Repr, ListComp, BuildSlice), (Getitem, GetSlice, Getattr, Call), (Power,), (Plus, Minus, Invert), (Mul, Div, FloorDiv, Mod), (Add, Sub), (LeftShift, RightShift), (Bitand,), (Bitxor,), (Bitor,), (Compare,), (Not,), (And,), (Or,), (IfElse,), ]): when(precedence, (group,))(value(prec)) for cls in group: if not issubclass(cls, prec_ops): prec_ops += (cls,) when(needs_parens, (prec_ops, prec_ops)) def prec_parens(parent, child, posn): pprec = precedence(parent) pchild = precedence(child) if pprec < pchild: return True passoc = associativity(parent) return pprec==pchild and passoc is not None and posn != passoc when(needs_parens, (Power, (Plus, Minus, Invert))) def prec_power(next_method, parent, child, posn): if posn==1: return False return next_method(parent, child, posn) def decompiled_children(parent, children): for posn, child in enumerate(children): operand = decompile(child) if needs_parens(parent, child, posn): operand = '(%s)' % (operand,) yield operand when(needs_parens, (Getattr, Const)) def int_parens(parent, child, pons): return decompile(child).isdigit() # parenthesize integers when(decompile, ((UnaryOp, BinaryOp, GetSlice, IfElse),)) def decompile_fmt(expr): return expr.fmt % tuple(decompiled_children(expr, expr[1:])) when(decompile, (Local,)) def decompile_local(expr): return expr[1] when(decompile, (Getattr,)) def decompile_getattr(expr): ignore, left, right = expr return '%s.%s' % (list(decompiled_children(expr, [left]))[0], right) when(decompile, (Pass.__class__,)) def decompile_pass(expr): return '' when(decompile, (Const,)) def decompile_const(expr): return decompile(expr.value) when(decompile, (ListOp,)) def decompile_items(expr): return expr.fmt % ', '.join(decompiled_children(expr, expr[1])) And.separator = ' and ' Or.separator = ' or ' when(decompile, ((And, Or),)) def decompile_sep(expr): return expr.separator.join(decompiled_children(expr, expr[1])) when(decompile, (slice,))( lambda e: _decompile_slice(e.start, e.stop, e.step) ) when(decompile, (BuildSlice,))( lambda e: _decompile_slice(e.start, e.stop, e.stride) ) def _decompile_slice(start, stop, step): slice = '' if start is not None and start is not Pass: slice += decompile(start) slice +=':' if stop is not None and stop is not Pass: slice += decompile(stop) if step is not None and step is not Pass: slice += ':' + decompile(step) return slice when(decompile, (Dict,)) def decompile_dict(expr): return '{%s}' % ', '.join( [('%s: %s' % tuple(map(decompile, args))) for args in expr[1]] ) when(decompile, (Call,)) def decompile_dict(expr): func = expr[1] if needs_parens(expr, func, 0): func = '(%s)' % (decompile(func),) else: func = decompile(func) return '%s(%s)' % (func, ', '.join(_decompiled_call_args(expr))) def _decompiled_call_args(call): ignore, ignore, posargs, kwargs, star, dstar, ignore = call for arg in posargs: yield decompile(arg) for kw,arg in kwargs: yield '%s=%s' % (kw.value, decompile(arg)) if star: yield '*' + decompile(star) if dstar: yield '**' + decompile(dstar) PEAK-Rules-0.5a1.dev-r2713/peak/rules/debug/__init__.py0000644000076600007660000000000011633340477021344 0ustar 
telec3telec3PEAK-Rules-0.5a1.dev-r2713/peak/rules/criteria.py0000755000076600007660000003303311433706265020337 0ustar telec3telec3from peak.rules.core import * from peak.rules.core import sorted, frozenset, set from peak.util.decorators import struct from weakref import ref from sys import maxint from peak.util.extremes import Min, Max __all__ = [ 'Range', 'Value', 'IsObject', 'Class', 'tests_for', 'Conjunction', 'Disjunction', 'Test', 'Signature', 'Inequality', 'DisjunctionSet', 'OrElse', 'Intersection', ] class Intersection(object): """Abstract base for conjunctions and signatures""" __slots__ = () class Disjunction(object): """Abstract base for DisjunctionSet and OrElse Note that a Disjunction can never have less than 2 members, as creating a Disjunction with only 1 item returns that item, and creating one with no items returns ``False`` (as no acceptable conditions means "never true"). """ __slots__ = () def __new__(cls, input): if cls is Disjunction: return DisjunctionSet(input) return super(Disjunction, cls).__new__(cls) when(negate, (Disjunction,))(lambda c: reduce(intersect, map(negate, c))) struct() def Range(lo=(Min,-1), hi=(Max,1)): if hi<=lo: return False return lo, hi struct() def Value(value, match=True): return value, match when(implies, (Value, Range))( # ==Value implies Range if Value is within range lambda c1, c2: c1.match and c2.lo <= (c1.value, 0) <= (c2.hi) ) when(implies, (Range, Value))( # Range implies !=Value if Value is out of range lambda c1, c2: not c2.match and not (c1.lo <= (c2.value, 0) <= (c1.hi)) or c2.match and c1==Range((c2.value,-1), (c2.value,1)) ) when(implies, (Range, Range))( # Range1 implies Range2 if both points are within Range2's bounds lambda c1, c2: c1.lo >= c2.lo and c1.hi <= c2.hi ) when(implies, (Value, Value))( lambda c1, c2: c1==c2 or (c1.match and not c2.match and c1.value!=c2.value) ) when(intersect, (Range, Range))( lambda c1, c2: Range(max(c1.lo, c2.lo), min(c1.hi, c2.hi)) ) def to_range(v): lo, hi = (v.value, -1), (v.value, 1) if v.match: return Range(lo, hi) return Disjunction([Range(hi=lo), Range(lo=hi)]) when(intersect, (Value, Value)) def intersect_values(c1, c2): if c1==c2: return c1 return intersect(to_range(c1), to_range(c2)) # if a value is a point inside a range, the implication rules would handle it; # therefore, they are either !=values, or outside the range (and thus empty) when(intersect, (Range, Value))( lambda c1, c2: not c2.match and intersect(c1, to_range(c2)) or False ) when(intersect, (Value, Range))( lambda c1, c2: not c1.match and intersect(to_range(c1), c2) or False ) when(negate, (Value,))(lambda c: Value(c.value, not c.match)) when(negate, (Range,))(lambda c: Disjunction([Range(hi=c.lo), Range(lo=c.hi)])) struct() def Class(cls, match=True): return cls, match when(negate, (Class,))(lambda c: Class(c.cls, not c.match)) when(implies, (Class, Class)) def class_implies(c1, c2): if c1==c2: # not isinstance(x) implies not isinstance(x) always # isinstance(x) implies isintance(x) always return True elif c1.match and c2.match: # isinstance(x) implies isinstance(y) if issubclass(x,y) return implies(c1.cls, c2.cls) elif c1.match or c2.match: # not isinstance(x) implies isinstance(x) never # isinstance(x) implies not isinstance(y) never return False else: # not isinstance(x) implies not isinstance(y) if issubclass(y, x) return implies(c2.cls, c1.cls) when(intersect, (Class, Class))(lambda c1,c2: Conjunction([c1, c2])) when(implies, (istype, Class))(lambda c1,c2: c1.match and (c2.match == implies(c1.type, c2.cls)) # use 
ob/inst rules ) when(implies, (Class, istype))(lambda c1,c2: c1.match and not c2.match and c1.cls is not c2.type and issubclass(c1.cls, c2.type) ) struct() def Test(expr, criterion): d = disjuncts(criterion) if len(d)!=1: return Disjunction([Test(expr, c) for c in d]) return expr, criterion when(negate, (Test,))(lambda c: Test(c.expr, negate(c.criterion))) when(implies, (Test, Test))( lambda c1, c2: c1.expr==c2.expr and implies(c1.criterion, c2.criterion) ) when(intersect, (Test, Test)) def intersect_tests(c1, c2): if c1.expr==c2.expr: return Test(c1.expr, intersect(c1.criterion, c2.criterion)) else: return Signature([c1, c2]) inequalities = { '>': lambda v: Range(lo=(v, 1)), '>=': lambda v: Range(lo=(v,-1)), '<': lambda v: Range(hi=(v,-1)), '<=': lambda v: Range(hi=(v, 1)), '!=': lambda v: Value(v, False), '==': lambda v: Value(v, True), } inequalities['=<'] = inequalities['<='] inequalities['=>'] = inequalities['>='] inequalities['<>'] = inequalities['!='] def Inequality(op, value): return inequalities[op](value) class Signature(Intersection, tuple): """Represent and-ed Tests, in sequence""" __slots__ = () def __new__(cls, input): output = [] index = {} input = iter(input) for new in input: if new is True: continue elif new is False: return False assert isinstance(new, Test), \ "Signatures can only contain ``criteria.Test`` instances" expr = new.expr if expr in index: posn = index[expr] old = output[posn] if old==new: continue # 'new' is irrelevant, skip it new = output[posn] = intersect(old, new) if new is False: return False else: index[expr] = len(output) output.append(new) if not output: return True elif len(output)==1: return output[0] return tuple.__new__(cls, output) def __repr__(self): return "Signature("+repr(list(self))+")" when(negate, (Signature,))(lambda c: OrElse(map(negate, c))) class IsObject(int): """Criterion for 'is' comparisons""" __slots__ = 'ref', 'match' def __new__(cls, ob, match=True): self = IsObject.__base__.__new__(cls, id(ob)&maxint) self.match = match self.ref = ob return self def __eq__(self,other): return self is other or ( isinstance(other, IsObject) and self.match==other.match and self.ref is other.ref ) or (isinstance(other,(int,long)) and int(self)==other and self.match) def __ne__(self,other): return not (self==other) def __repr__(self): return "IsObject(%r, %r)" % (self.ref, self.match) when(negate, (IsObject,))(lambda c: IsObject(c.ref, not c.match)) when(implies, (IsObject, IsObject)) def implies_objects(c1, c2): # c1 implies c2 if it's identical, or if c1=="is x" and c2=="is not y" return c1==c2 or (c1.match and not c2.match and c1.ref is not c2.ref) when(intersect, (IsObject, IsObject)) def intersect_objects(c1, c2): # 'is x and is y' 'is not x and is x' if (c1.match and c2.match) or (c1.ref is c2.ref): return False else: # "is not x and is not y" return Conjunction([c1,c2]) class DisjunctionSet(Disjunction, frozenset): """Set of minimal or-ed conditions (i.e. no redundant/implying items) Note that a Disjunction can never have less than 2 members, as creating a Disjunction with only 1 item returns that item, and creating one with no items returns ``False`` (as no acceptable conditions means "never true"). 
""" def __new__(cls, input): output = [] for item in input: for new in disjuncts(item): for old in output[:]: if implies(new, old): break elif implies(old, new): output.remove(old) else: output.append(new) if not output: return False elif len(output)==1: return output[0] return frozenset.__new__(cls, output) when(implies, (Disjunction, (object, Disjunction))) def union_implies(c1, c2): # Or(...) implies x if all its disjuncts imply x for c in c1: if not implies(c, c2): return False else: return True when(implies, (object, Disjunction)) def ob_implies_union(c1, c2): # x implies Or(...) if it implies any disjunct for c in c2: if implies(c1, c): return True else: return False # We use @around for these conditions in order to avoid redundant implication # testing during intersection, as well as to avoid ambiguity with the # (object, bool) and (bool, object) rules for intersect(). # around(intersect, (Disjunction, object))( lambda c1, c2: type(c1)([intersect(x,c2) for x in c1]) ) around(intersect, (object, Disjunction))( lambda c1, c2: type(c2)([intersect(c1,x) for x in c2]) ) around(intersect, (Disjunction, Disjunction))( lambda c1, c2: type(c1)([intersect(x,y) for x in c1 for y in c2]) ) around(intersect, (Disjunction, Intersection))( lambda c1, c2: type(c1)([intersect(x,c2) for x in c1]) ) around(intersect, (Intersection, Disjunction))( lambda c1, c2: type(c2)([intersect(c1,x) for x in c2]) ) # XXX These rules prevent ambiguity with implies(object, bool) and # (bool, object), at the expense of redundancy. This can be cleaned up later # when we allow cloning of actions for an existing rule. (That is, when we can # say "treat (bool, Disjunction) like (bool, object)" without duplicating the # action.) # when(implies, (bool, Disjunction))(lambda c1, c2: not c1) when(implies, (Disjunction, bool))(lambda c1, c2: c2) # The disjuncts of a Disjunction are a list of its contents: when(disjuncts, (DisjunctionSet,))(list) abstract() def tests_for(ob, engine=None): """Yield the tests composing ob, if any""" when(tests_for, (Test, ))(lambda ob, e: iter([ob])) when(tests_for, (Signature,))(lambda ob, e: iter(ob)) when(tests_for, (bool, ))(lambda ob, e: iter([])) class Conjunction(Intersection, frozenset): """Set of minimal and-ed conditions (i.e. no redundant/implied items) Note that a Conjunction can never have less than 2 members, as creating a Conjunction with only 1 item returns that item, and creating one with no items returns ``True`` (since no required conditions means "always true"). 
""" def __new__(cls, input): output = [] for new in input: for old in output[:]: if implies(old, new): break elif implies(new, old): output.remove(old) elif mutually_exclusive(new, old): return False else: output.append(new) if not output: return True elif len(output)==1: return output[0] return frozenset.__new__(cls, output) around(implies, (Intersection, object)) def set_implies(c1, c2): for c in c1: if implies(c, c2): return True else: return False around(implies, ((Intersection, object), Intersection)) def ob_implies_set(c1, c2): for c in c2: if not implies(c1, c): return False else: return True when(negate, (Conjunction,))(lambda c: Disjunction(map(negate, c))) when(intersect, (istype,Class)) def intersect_type_class(c1, c2): if not c1.match: return Conjunction([c1,c2]) return False when(intersect, (Class, istype)) def intersect_type_class(c1, c2): if not c2.match: return Conjunction([c1,c2]) return False when(intersect, (istype,istype)) def intersect_type_type(c1, c2): # This is only called if there's no implication if c1.match or c2.match: return False # so unless both are False matches, it's an empty set return Conjunction([c1, c2]) def mutually_exclusive(c1, c2): """Is the intersection of c1 and c2 known to be always false?""" return False when(mutually_exclusive, (istype, istype))( lambda c1, c2: c1.match != c2.match and c1.type==c2.type ) # Intersecting an intersection with something else should return a set of the # same type as the leftmost intersection. These methods are declared @around # to avoid redundant implication testing during intersection, as well as to # avoid ambiguity with the (object, bool) and (bool, object) rules for # intersect(). # around(intersect, (Intersection, Intersection))( lambda c1, c2: type(c1)(list(c1)+list(c2)) ) around(intersect, (Intersection, object))( lambda c1, c2: type(c1)(list(c1)+[c2]) ) around(intersect, (object, Intersection))( lambda c1, c2: type(c2)([c1]+list(c2)) ) # Default intersection is a Conjunction when(intersect, (object, object))(lambda c1, c2: Conjunction([c1,c2])) class OrElse(Disjunction, tuple): """SEQUENCE of or-ed conditions (excluding redundant/implying items)""" def __new__(cls, input): output = [] for item in input: for old in output[:]: if implies(item, old): break elif implies(old, item): output.remove(old) else: output.append(item) if not output: return False elif len(output)==1: return output[0] return tuple.__new__(cls, output) def __repr__(self): return "%s(%s)" % (self.__class__.__name__, list(self)) when(disjuncts, (OrElse,)) def sequential_disjuncts(c): pre = True out = set() for cond in c: out.update(disjuncts(intersect(pre, cond))) pre = intersect(pre, negate(cond)) return list(out) PEAK-Rules-0.5a1.dev-r2713/peak/rules/dispatch.py0000755000076600007660000000616111432714070020326 0ustar telec3telec3"""RuleDispatch Emulation API""" from peak.util.decorators import decorate_assignment from peak.rules import core from warnings import warn import new, sys __all__ = [ 'on', 'as', 'generic', 'as_' ] def generic(combiner=None): """Please switch to using @peak.rules.abstract()""" if combiner is not None: raise AssertionError( "Custom combiners are not supported by the compatibility API;" " please use a custom method type instead" ) def callback(frm,name,value,old_locals): value.when = core.when.__get__(value) value.around = core.around.__get__(value) value.before = core.before.__get__(value) value.after = core.after.__get__(value) def clear(): core.rules_for(value).clear() value.clear = clear return 
core.abstract(value) return decorate_assignment(callback) def as_(*decorators): """Please switch to using @peak.util.decorators.decorate""" warn("Please use @peak.util.decorators.decorate instead of dispatch.as()", DeprecationWarning, 2 ) def callback(frame,k,v,old_locals): for d in decorators[::-1]: v = d(v) return v return decorate_assignment(callback) globals()['as'] = as_ def make_module(): def callback(frm, name, value, old_locals): m = new.module('dispatch.'+name) v = value() m.__dict__.update(v) m.__all__ = list(v) sys.modules.setdefault(m.__name__, m) sys.modules.setdefault('peak.rules.'+m.__name__, m) return m return decorate_assignment(callback) make_module() def interfaces(): from peak.rules.core import DispatchError, NoApplicableMethods from peak.rules.core import AmbiguousMethods as AmbiguousMethod __all__.extend(locals()) # add these to the main namespace, too globals().update(locals()) return locals() make_module() def strategy(): default = () from peak.util.extremes import Min, Max return locals() def on(argument_name): """Please switch to using @peak.rules.abstract()""" def callback(frm,name,value,old_locals): import inspect funcname = value.__name__ args, varargs, kwargs, defaults = inspect.getargspec(value) if argument_name not in args: raise NameError("%r does not have an argument named %r" % (value, argument_name)) argpos = args.index(argument_name) func = value def when(cond): """Add following function to this GF, using 'cond' as a guard""" def callback(frm,name,value,old_locals): core.when(func, (object,)*argpos + (cond,))(value) if old_locals.get(name) is func: return func return value return decorate_assignment(callback) def clone(): raise AssertionError("PEAK-Rules can't clone() generic functions") func.when = when func.clone = clone return core.abstract(func) return decorate_assignment(callback) # Install sys.modules.setdefault('dispatch', sys.modules[__name__]) PEAK-Rules-0.5a1.dev-r2713/setup.cfg0000775000076600007660000000010512506203664015723 0ustar telec3telec3[egg_info] tag_build = .dev-r2713 tag_date = 0 tag_svn_revision = 0 PEAK-Rules-0.5a1.dev-r2713/Predicates.txt0000755000076600007660000011112011633340500016713 0ustar telec3telec3================== Predicate Dispatch ================== Here, we document and test the internals of PEAK-Rules' predicate dispatch implementation. Specifically, how it assembles the things we've already seen in the `AST Building`_, `Code Generation`_, `Criteria`_ and `Indexing`_ documents, into a complete implementation. .. _AST Building: http://peak.telecommunity.com/DevCenter/PEAK-Rules/AST-Builder .. _Code Generation: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Code-Generation .. _Criteria: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Criteria .. _Indexing: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Indexing .. contents:: **Table of Contents** Predicate Expression Types ========================== Predicate expression types wrap expressions to specify what kind of dispatching should be done on the base expression. For example, ``predicates.IsInstance`` indicates that an expression is to be looked up by what it's an instance of. There are five built-in expression types:: >>> from peak.rules.predicates import \ ... 
Truth, Identity, Comparison, IsSubclass, IsInstance And we will test them using code objects: >>> from peak.util.assembler import Code, Const, dump >>> from dis import dis Truth ----- The ``Truth`` predicate tests whether its subject expression is true or false, and selects the appropriate sub-node from a ``(true_node, false_node)`` tuple:: >>> c = Code() >>> c(Truth(42)) >>> dump(c.code()) LOAD_FAST 0 ($Arg) UNPACK_SEQUENCE 2 LOAD_CONST 1 (42) JUMP_IF_TRUE L1 ROT_THREE L1: POP_TOP ROT_TWO POP_TOP The generated code unpacks the 2-tuple, and then does a bit of stack manipulation to select the correct subnode. The ``disjuncts()`` of a Truth Test is the Test itself:: >>> from peak.rules.core import disjuncts >>> from peak.rules.criteria import Test, Signature, Value >>> disjuncts(Test(Truth(88), Value(True))) [Test(Truth(88), Value(True, True))] >>> disjuncts(Test(Truth(88), Value(True, False))) [Test(Truth(88), Value(True, False))] Identity -------- The ``Identity`` predicate looks up the ``id()`` of its subject expression in a dictionary of sub-nodes. If the id isn't found, the ``None`` entry is used:: >>> c = Code() >>> c(Identity(99)) >>> dump(c.code()) LOAD_CONST 1 () LOAD_CONST 2 (99) CALL_FUNCTION 1 DUP_TOP LOAD_FAST 0 ($Arg) COMPARE_OP 6 (in) JUMP_IF_FALSE L1 POP_TOP LOAD_FAST 0 ($Arg) ROT_TWO BINARY_SUBSCR JUMP_FORWARD L2 L1: POP_TOP POP_TOP LOAD_FAST 0 ($Arg) LOAD_CONST 0 (None) BINARY_SUBSCR Comparison ---------- The ``Comparison`` predicate expects its "arg" to be an ``(exact, ranges)`` pair, such as might be generated by the ``peak.rules.indexing.split_ranges`` function:: >>> c = Code() >>> c(Comparison(555)) >>> dis(c.code()) 0 0 LOAD_CONST 1 () 3 LOAD_CONST 2 (555) 6 LOAD_FAST 0 ($Arg) 9 CALL_FUNCTION 2 The generated code simply calls a helper function, ``value_check``, with its expression and argument. The helper function looks up and returns the appropriate subnode, first by trying for an exact match, and then looking for a range match if no exact match is found:: >>> from peak.rules.predicates import value_check >>> from peak.util.extremes import Min, Max >>> exact = {'x':1, 'y':2} >>> ranges = [((Min,'x'),42), (('x','y'),99), (('y',Max),88)] >>> for letter in 'wxyz': ... 
print value_check(letter, (exact, ranges)) 42 1 2 88 >>> value_check('xx', (exact, ranges)) 99 IsSubclass ---------- The ``IsSubclass`` predicate uses a ``(cache, lookup)`` node pair, where `cache` is a dictionary from classes to nodes, and `lookup` is a function to call with the class, in the event that the target class isn't found in the cache:: >>> c = Code() >>> c(IsSubclass(Const(int))) >>> dump(c.code()) LOAD_CONST 1 () SETUP_EXCEPT L1 DUP_TOP LOAD_FAST 0 ($Arg) UNPACK_SEQUENCE 2 ROT_THREE POP_TOP BINARY_SUBSCR ROT_TWO POP_TOP POP_BLOCK JUMP_FORWARD L3 L1: DUP_TOP LOAD_CONST 2 (<...KeyError...>) COMPARE_OP 10 (exception match) JUMP_IF_FALSE L2 POP_TOP POP_TOP POP_TOP POP_TOP LOAD_FAST 0 ($Arg) UNPACK_SEQUENCE 2 POP_TOP ROT_TWO CALL_FUNCTION 1 JUMP_FORWARD L3 L2: POP_TOP END_FINALLY IsInstance ---------- The ``IsInstance`` predicate is virtually identical to ``IsSubclass``, except that it first obtains the ``__class__`` or ``type()`` of its target:: >>> c = Code() >>> c(IsInstance(Const(999))) >>> dump(c.code()) LOAD_CONST 1 (999) SETUP_EXCEPT L1 DUP_TOP LOAD_ATTR 0 (__class__) ROT_TWO POP_TOP POP_BLOCK JUMP_FORWARD L3 L1: DUP_TOP LOAD_CONST 2 (<...AttributeError...>) COMPARE_OP 10 (exception match) JUMP_IF_FALSE L2 POP_TOP POP_TOP POP_TOP POP_TOP LOAD_CONST 3 () ROT_TWO CALL_FUNCTION 1 JUMP_FORWARD L3 L2: POP_TOP END_FINALLY L3: SETUP_EXCEPT L4 DUP_TOP LOAD_FAST 0 ($Arg) UNPACK_SEQUENCE 2 ROT_THREE POP_TOP BINARY_SUBSCR ROT_TWO POP_TOP POP_BLOCK JUMP_FORWARD L6 L4: DUP_TOP LOAD_CONST 4 (<...KeyError...>) COMPARE_OP 10 (exception match) JUMP_IF_FALSE L5 POP_TOP POP_TOP POP_TOP POP_TOP LOAD_FAST 0 ($Arg) UNPACK_SEQUENCE 2 POP_TOP ROT_TWO CALL_FUNCTION 1 JUMP_FORWARD L6 L5: POP_TOP END_FINALLY Defining New Predicate Types ----------------------------- A predicate type must be a ``peak.util.assembler.nodetype``, capable of generating its own lookup code. The code will be used in a ``SMIGenerator`` context (see the `Code Generation`_ manual), so ``SMIGenerator.ARG`` will contain a lookup node. Each predicate type must be usable with the ``predicates.predicate_node_for`` function, and the ``predicates.always_testable`` function: predicate_node_for(builder, expr, cases, remaining_exprs, memo) Return a dispatch tree node argument appropriate for the expr. The return value(s) of this function will be in the ``SMIGenerator.ARG`` local variable when the predicate type's bytecode is executed. always_testable(expr) Return true if the expression can always be tested, regardless of its position among the signature condition(s). Most predicate types should just implement this by calling ``always_testable()`` recursively on their target expression, and in fact all of the built-in predicate types do this. For more details, see the section on `Order Independence`_ below. Predicate Parsing ================= The ``CriteriaBuilder`` class can be used to parse Python expressions into tests and signatures. It's initialized using the same arguments as the ``codegen.ExprBuilder`` class:: >>> from peak.rules.predicates import CriteriaBuilder, Comparison, istype >>> from peak.rules.criteria import Disjunction, Value, Test, Range, Class, OrElse >>> from peak.util.assembler import Local >>> builder = CriteriaBuilder( ... dict(x=Local('x'), y=Local('y')), locals(), globals(), __builtins__ ... ) >>> pe = builder.parse >>> pe('x+42 > 23*2') Test(Comparison(Add(Local('x'), Const(42))), Range((46, 1), (Max, 1))) Iterable constants into or-ed equality tests:: >>> pe('x in (1,2,3)') == Disjunction([ ... 
Test(Comparison(Local('x')), Value(2, True)), ... Test(Comparison(Local('x')), Value(3, True)), ... Test(Comparison(Local('x')), Value(1, True)) ... ]) True >>> pe('x not in (1,2,3)') == Test( ... Comparison(Local('x')), ... Disjunction([ ... Range((Min, -1), (1, -1)), Range((1, 1), (2, -1)), ... Range((2, 1), (3, -1)), Range((3, 1), (Max, 1)) ... ]) ... ) True And non-iterable constants into plain expressions:: >>> pe('x in 27') Test(Truth(Compare(Local('x'), (('in', Const(27)),))), Value(True, True)) >>> pe('x not in 27') Test(Truth(Compare(Local('x'), (('not in', Const(27)),))), Value(True, True)) The ``is`` operator produces identity tests, if either side is a constant:: >>> pe('x is 42') Test(Identity(Local('x')), IsObject(42, True)) >>> pe('42 is not x') Test(Identity(Local('x')), IsObject(42, False)) And plain expressions when neither side is constant:: >>> pe('x is y') Test(Truth(Compare(Local('x'), (('is', Local('y')),))), Value(True, True)) >>> pe('x is not y') Test(Truth(Compare(Local('x'), (('is not', Local('y')),))), Value(True, True)) >>> pe('not (x is y)') Test(Truth(Compare(Local('x'), (('is', Local('y')),))), Value(True, False)) >>> pe('not (x is not y)') Test(Truth(Compare(Local('x'), (('is not', Local('y')),))), Value(True, False)) Complex logical expressions are always rendered in disjunctive normal form, with negations simplified away or reduced to match flags on criteria objects:: >>> pe('isinstance(x, int) and isinstance(y, str)') Signature([Test(IsInstance(Local('x')), Class(, True)), Test(IsInstance(Local('y')), Class(, True))]) >>> pe('not(not isinstance(x, int) or not isinstance(y, str))') Signature([Test(IsInstance(Local('x')), Class(, True)), Test(IsInstance(Local('y')), Class(, True))]) >>> pe('isinstance(x,int) and (isinstance(y, str) or isinstance(y, unicode))') OrElse([Signature([Test(IsInstance(Local('x')), Class(, True)), Test(IsInstance(Local('y')), Class(, True))]), Signature([Test(IsInstance(Local('x')), Class(, True)), Test(IsInstance(Local('y')), Class(, True))])]) >>> pe('not (isinstance(x, int) or isinstance(y, str))') Signature([Test(IsInstance(Local('x')), Class(, False)), Test(IsInstance(Local('y')), Class(, False))]) >>> pe('not(not isinstance(x, int) and not isinstance(y, str))') == OrElse([ ... Test(IsInstance(Local('x')), Class(int)), ... Test(IsInstance(Local('y')), Class(str)) ... ]) True >>> pe('not( isinstance(x, int) and isinstance(y, str))') OrElse([Test(IsInstance(Local('x')), Class(, False)), Test(IsInstance(Local('y')), Class(, False))]) And arbitrary expressions are handled as truth tests:: >>> pe('x') Test(Truth(Local('x')), Value(True, True)) >>> pe('not x') Test(Truth(Local('x')), Value(True, False)) Note, by the way, that backquotes are not allowed in predicate expressions, as they are reserved for use by macros or "meta functions" to create specialized syntax:: >>> pe('`x`') Traceback (most recent call last): ... SyntaxError: backquotes are not allowed in predicates Pattern Matching ---------------- Arbitrary expressions can be pattern matched for conversion into signatures. At the moment, the only patterns matched are ``isinstance`` and ``issubclass`` calls where the second argument is a constant, and ``type(x) is y`` expressions where `y` is a constant:: >>> from peak.rules.criteria import Test, Signature, Conjunction >>> pe('isinstance(x,int)') Test(IsInstance(Local('x')), Class(, True)) >>> pe('isinstance(x,(str,unicode))') == Disjunction([ ... Test(IsInstance(Local('x')), Class(str)), ... 
Test(IsInstance(Local('x')), Class(unicode)) ... ]) True >>> pe('type(x) is int') Test(IsInstance(Local('x')), istype(, True)) >>> pe('str is not type(x)') Test(IsInstance(Local('x')), istype(, False)) >>> pe('not isinstance(x,(int,(str,unicode)))') == Test( ... IsInstance(Local('x')), Conjunction([ ... Class(unicode, False), Class(int, False), Class(str, False)]) ... ) True >>> pe('isinstance(x,(int,(str,unicode)))') == Disjunction([ ... Test(IsInstance(Local('x')), Class(str)), ... Test(IsInstance(Local('x')), Class(int)), ... Test(IsInstance(Local('x')), Class(unicode)) ... ]) True >>> pe('issubclass(x,int)') Test(IsSubclass(Local('x')), Class(, True)) >>> pe('issubclass(x,(str,unicode))') == Disjunction([ ... Test(IsSubclass(Local('x')), Class(str)), ... Test(IsSubclass(Local('x')), Class(unicode)) ... ]) True >>> pe('issubclass(x,(int,(str,unicode)))') == Disjunction([ ... Test(IsSubclass(Local('x')), Class(str)), ... Test(IsSubclass(Local('x')), Class(int)), ... Test(IsSubclass(Local('x')), Class(unicode)) ... ]) True >>> pe('not issubclass(x,(int,(str,unicode)))') == Test( ... IsSubclass(Local('x')), Conjunction([ ... Class(unicode, False), Class(int, False), Class(str, False)]) ... ) True >>> pe('issubclass(int, object)') True "Meta Function" Expansion ------------------------- To create special functions with the ability to manipulate the compile-time representation of a rule, you can register "meta functions" with the ``meta_function`` decorator. You begin by defining a stub function which will be imported and used by the caller in their rules:: >>> def let(**kw): ... """This is a function that will have special behavior in rules""" ... raise NotImplementedError("`let` can only be used in rules") Then, you define a "meta function" for this function, that will be called at compile time. The signature of this function must match the signature with which it will be called, except that it can have zero or more extra parameters at the beginning named ``__builder__``, ``__star__`` and/or ``__dstar__``. ``__builder__``, for example, will be the active ``ExpressionBuilder``:: >>> def compile_let(__builder__, **kw): ... __builder__.bind(kw) ... return True (Note: the above is not the actual implementation of the ``peak.rules.let()`` pseudo-function; the actual implementation uses a lower-level interface that allows the keywords to be seen in definition order, so this is just a demo to illustrate the operation of the ``meta_function`` decorator.) To register your "meta function", you use ``@meta_function(stub_function)``:: >>> from peak.rules.predicates import meta_function >>> compile_let = meta_function(let)(compile_let) Then, when the stub function is used in a rule, the meta function is called with the PEAK-Rules AST objects resulting from compiling the invocation of the stub function in the rule:: >>> builder = CriteriaBuilder( ... dict(x=Local('x'), y=Local('y')), locals(), globals(), __builtins__ ... ) >>> pe = builder.parse >>> pe('let(q=x*y) and q>42') Test(Comparison(Mul(Local('x'), Local('y'))), Range((42, 1), (Max, 1))) As you can see, our ``compile_let`` meta-function bound ``q`` to ``Mul(Local('x'), Local('y'))``, which was the compiled form of the keyword argument it received. Notice, by the way, that our meta-function does NOT accept ``**`` arguments:: >>> pe('let(**{"z":x*y}) and z>42') Traceback (most recent call last): ... TypeError: does not support parsing **kw Or ``*`` arguments::: >>> pe('let(*[1,2]) and z>42') Traceback (most recent call last): ... 
TypeError: does not support parsing *args This is because we didn't include ``__star__`` or ``__dstar__`` parameters at the beginning of the ``compile_let()`` parameter list; if we had, the function would have received either the compiled AST for the corresponding part of the call, or ``None`` if no star or double-star arguments were provided. Dynamic Arguments ~~~~~~~~~~~~~~~~~ Notice, by the way, that ``__star__`` and ``__dstar__`` refer to the *caller's* use of ``*`` and ``**`` to make dynamic calls. The meta function can have ``*`` and ``**`` parameters, but these are passed any *static* positional or keyword arguments used by the caller. For example:: >>> def dummy(*args, **kw): ... """Just a dummy""" >>> def compile_dummy(__star__, __dstar__, p1, p2=None, *args, **kw): ... print "p1 =", p1 ... print "p2 =", p2 ... print "args =", args ... print "kw =", kw ... print "__star__ =", __star__ ... print "__dstar__ =", __dstar__ ... return True >>> compile_dummy = meta_function(dummy)(compile_dummy) >>> builder = CriteriaBuilder( ... dict(x=Local('x'), y=Local('y')), locals(), globals(), __builtins__ ... ) >>> pe = builder.parse >>> pe('dummy(x, y, x*x, y*y, k1=x, k2=y, *x+1, **y*2)') p1 = Local('x') p2 = Local('y') args = (Mul(Local('x'), Local('x')), Mul(Local('y'), Local('y'))) kw = {'k2': Local('y'), 'k1': Local('x')} __star__ = Add(Local('x'), Const(1)) __dstar__ = Mul(Local('y'), Const(2)) True >>> pe('dummy(x)') p1 = Local('x') p2 = None args = () kw = {} __star__ = None __dstar__ = None True Argument Errors ~~~~~~~~~~~~~~~ Static argument errors, such as failure to pass the right number of positional arguments, and duplicate keyword arguments that occur in the source (as opposed to runtime ``*`` or ``**`` problems), are detected at compile time:: >>> pe('dummy(x, p1=y)') Traceback (most recent call last): ... TypeError: Duplicate keyword p1 for <... compile_dummy at ...> >>> pe('dummy(p2=x, p2=y)') Traceback (most recent call last): ... TypeError: Duplicate keyword p2 for <... compile_dummy at ...> >>> pe('dummy()') Traceback (most recent call last): ... TypeError: Missing positional argument p1 for <... compile_dummy at ...> >>> pe('let(x)') Traceback (most recent call last): ... TypeError: Too many arguments for <... compile_let at ...> Also, note that meta functions cannot have packed-tuple arguments:: >>> meta_function(lambda x,y:None)(lambda x,(y,z): True) Traceback (most recent call last): ... TypeError: Meta-functions cannot have packed-tuple arguments Custom Argument Compiling ~~~~~~~~~~~~~~~~~~~~~~~~~ On occasion, a meta function may wish to interpret one or more of its arguments using a custom expression builder in place of the standard one, so that instead of a PEAK-Rules AST, it gets some other data structure. You can do this by passing keyword arguments to ``@meta_function()`` that supply a builder function for each argument that needs custom building. A builder function is a 2-argument callable that will be passed the active ``ExpressionBuilder`` instance and the raw Python AST tuples of the argument it is supposed to parse. The function must then return whatever value should be used as the parsed form of the argument supplied to the meta function. For example:: >>> def make_builder(text): ... def builder_function(old_builder, arg_node): ... return text ... return builder_function >>> def dummy2(*args, **kw): ... """Just another dummy""" >>> compile_dummy = meta_function(dummy2, ... p1=make_builder('p1'), p2=make_builder('p2'), ... 
args=make_builder('args'), kw=make_builder('kw'), ... k2=make_builder('k2'), ... __star__ = make_builder('*'), __dstar__=make_builder('**') ... )(compile_dummy) >>> builder = CriteriaBuilder( ... dict(x=Local('x'), y=Local('y')), locals(), globals(), __builtins__ ... ) >>> pe = builder.parse >>> pe('dummy2(x, y, x*x, y*y, k1=x, k2=y, *x+1, **y*2)') p1 = p1 p2 = p2 args = ('args', 'args') kw = {'k2': 'k2', 'k1': 'kw'} __star__ = * __dstar__ = ** True As you can see, build functions are selected on the basis of the argument name they target. If the meta function has a ``*`` parameter, each of the overflow positional arguments is parsed with the builder function of the corresponding name. If a named keyword argument has a build function, that one is used, otherwise any build function for the ``**`` parameter is used. Binding Scope ~~~~~~~~~~~~~ Note that bindings defined by meta-functions (e.g. our ``let`` example) cannot escape "or" or "not" clauses in an expression:: >>> pe('let(q=1) or x>q') Traceback (most recent call last): ... NameError: q >>> pe('not let(q=1) and x>> pe('not (let(q=1) and x>> pe('let(q=2) and (not let(q=3) or x>> from peak.rules import abstract, when, around, let >>> def f(x): pass >>> f = abstract(f) >>> def dummy((q,z), x): ... print "Got q =", q, "and z =", z ... print "x was", x >>> when(f, "let(q=x*2, z='whatever') and True")(dummy) >>> f(42) Got q = 84 and z = whatever x was 42 This even works when you have a ``next_method`` argument after the tuple, and even if your method is defined inside a closure:: >>> def closure(y): ... def dummy2((a,b,c), next_method, x): ... print "a, b, c =", (a,b,c) ... print "y was", y ... return next_method(x) ... return dummy2 >>> around(f, "let(a='a', b=x*3, c=b+23)")(closure(99)) >>> f(42) a, b, c = ('a', 126, 149) y was 99 Got q = 84 and z = whatever x was 42 At the moment, this actually works by recalculating the expressions in a wrapper function that then invokes your original method, so it's more of a DRY thing than an efficiency thing. That is, it keeps you from accidentally getting your rule and your function out of sync, and saves on retyping or copy-pasting. (Future versions of PEAK-Rules, however, may improve this so that the bindings aren't recalculated at every method level, or perhaps aren't recalculated at all. It's tricky, though, because depending on the calculations involved, it might be more efficient to redo them than to do the dynamic stack inspection that would be needed to locate the active expression cache! So, in that event, the main value would be supporting at-most-once execution of expressions with side-effects.) Custom Predicate Functions ========================== The ``@expand_as`` decorator lets you specify a string that will be used in place of a function, when the function is referenced in a condition. Here's a trivial example:: >>> from peak.rules import expand_as, value >>> def just(arg): pass >>> expand_as("arg")(just) >>> def f(x): pass >>> when(f, "x==just(42)")(value(23)) value(23) >>> f(42) 23 In the above, the ``just(arg)`` function is defined as being the same as its argument. So, the ``"x==just(42)`` is treated as though you'd just said ``"x==42"``. And, although we never defined an actual *implementation* of the ``just()`` function, it actually still works:: >>> just(42) 42 This is because if you decorate an empty function with ``@expand_as``, the supplied condition will be compiled and attached to the existing function object for you. (This saves you having to actually write the body.) 
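For instance, a hypothetical ``between()`` helper (shown here only as an illustration; it is not part of PEAK-Rules) can be defined entirely by its expansion, with no hand-written body::

    >>> def between(v, lo, hi): pass
    >>> expand_as("lo <= v and v < hi")(between)
    >>> between(5, 1, 10)   # body was generated from the expansion
    True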
Of course, if you decorate, say, an already-existing function that you want to replace, then nothing happens to that function:: >>> def isint(ob): ... print "called!" ... return isinstance(ob, int) >>> expand_as("isinstance(x, int)")(isint) >>> isint(42) called! True But, the correct expansion still happens when you use that function in a rule:: >>> around(f, "isint(x)")(value(99)) >>> f(42) 99 Note that it's ok to use ``let()`` or other binding-creating expressions inside an expansion string, and they won't interfere with the surrounding conditions:: >>> def oddment(a, b): pass >>> expand_as("let(x=a*2, y=x+b+1) and y")(oddment) >>> oddment(27, 51) # prove bindings work even in a function 106 >>> around(f, "x==oddment(27, 51) and x==106 and isint(x)")(value('yeah!')) value('yeah!') >>> f(106) # prove that x doesn't get redefined after oddment 'yeah!' In the above, temporary variables ``x`` and ``y`` are created in the expansion, but they don't affect the original value of x in the rule where the function is expanded. Of course, this also means that you can't implement something like a pattern-matching feature or the ``let()`` function using ``@expand_as``. It's just an easier way to handle the sort of common cases where meta-functions would be overkill. Expression to Predicate Conversion ================================== Meta-functions are only one way to transform a Python expression into a predicate. It's also possible to add somewhat-arbitrary transformations by registering methods with the ``expressionSignature()`` generic function. In this section, we'll create a simple "priority" predicate that doesn't influence method selection, but affects implication order between predicates. The basic idea is that we'll create a ``priority`` type that's an integer subclass, and use it in expressions of the form ``isinstance(foo, Bar) and priority(3)``, which will then take precedence over an identical expression with a lower priority:: >>> from peak.rules import when >>> from peak.rules.core import implies >>> class priority(int): ... """A simple priority""" >>> when(implies, (priority, priority))(lambda p1,p2: p1>p2) ...> >>> implies(priority(3), priority(2)) True >>> implies(priority(2), priority(3)) False To use our new type, we'll need to implement a conversion from a ``Const(somepriority)`` expression to a ``Test(priority, somepriority)`` condition. Normally, these conversions are handled by the ``expressionSignature()`` generic function in the ``peak.rules.predicates`` module. By default, ``expressionSignature()`` simply takes the expression object it's given, and returns ``Test(Truth(expr), Value(True))`` -- that is, a truth test on the boolean value of the expression. Or, if the value given is a constant, it simply returns an immediate boolean value:: >>> from peak.rules.predicates import expressionSignature >>> expressionSignature(Const(priority(3))) True So, we need to register a method that handles priorities appropriately:: >>> when(expressionSignature, "isinstance(expr, Const) and isinstance(expr.value, priority)")( ... lambda expr: Test(None, expr.value) ...
) ...> Okay, let's try out our new condition:: >>> from peak.rules import value >>> def dummy(arg): return "default" >>> when(dummy, "arg==1 and priority(1)")(value("1 @ 1")) value('1 @ 1') >>> dummy(1) '1 @ 1' >>> dummy(2) 'default' >>> when(dummy, "arg==1 and priority(2)")(value("1 @ 2")) value('1 @ 2') >>> dummy(1) '1 @ 2' >>> dummy(2) 'default' >>> when(dummy, "arg==2 and priority(2)")(value("2 @ 2")) value('2 @ 2') >>> dummy(1) '1 @ 2' >>> dummy(2) '2 @ 2' >>> when(dummy, "arg==2 and priority(1)")(value("2 @ 1")) value('2 @ 1') >>> dummy(1) '1 @ 2' >>> dummy(2) '2 @ 2' Upgrading from ``TypeEngine`` ============================= In order to allow a function to safely upgrade from type-only dispatch to full predicate dispatch, it's necessary for predicate engines to support using type tuples as signatures (since such tuples may already be registered with the function's ``RuleSet``). To support this, the ``tests_for()`` function takes an optional second parameter, representing the engine that "wants" the tests, and whose argument names will be used to accomplish the conversion:: >>> from peak.rules.core import Dispatching, implies >>> from peak.rules.criteria import tests_for >>> engine = Dispatching(implies).engine >>> list(tests_for((int,str), engine)) [Test(IsInstance(Local('s1')), Class(, True)), Test(IsInstance(Local('s2')), Class(, True))] >>> list(tests_for((istype(tuple),), engine)) [Test(IsInstance(Local('s1')), istype(, True))] Each element of the type tuple is converted using a second generic function, ``type_to_test``:: >>> from peak.rules.predicates import type_to_test >>> type_to_test(int, Local('x'), engine) Test(IsInstance(Local('x')), Class(, True)) >>> type_to_test(istype(str), Local('x'), engine) Test(IsInstance(Local('x')), istype(, True)) >>> class x: pass >>> type_to_test(x, Local('x'), engine) Test(IsInstance(Local('x')), Class(, True)) If you implement a new kind of class test for use in type tuples, you'll need to add the appropriate method(s) to ``type_to_test`` if you want it to also work with the predicate engine. So, let's test the actual upgrade process, and also confirm that you can still pass in type tuples (or precomputed tests, signatures, etc.) after upgrading:: >>> def demo(ob): pass >>> tmp = when(demo, (int,))(value('int')) >>> tmp = when(demo, (str,))(value('str')) >>> demo(42) 'int' >>> demo('test') 'str' >>> tmp = when(demo, "isinstance(ob, int) and ob==42")(value('Ultimate answer')) >>> tmp = when(demo, (list,))(value('list')) >>> tmp = when(demo, Test(IsInstance(Local('ob')), Class(tuple)))( ... value('tuple') ... ) >>> demo(42) 'Ultimate answer' >>> demo([]) 'list' >>> demo(()) 'tuple' >>> demo('test'), demo(23) ('str', 'int') And, just for the heck of it, let's make sure that you can upgrade to an IndexedEngine by using any other values on a TypeEngine function:: >>> def demo(ob): pass >>> tmp = when(demo, Test(IsInstance(Local('ob')), Class(tuple)))( ... value('tuple') ... ) >>> demo(()) 'tuple' Criterion Ordering ================== Criterion ordering for a predicate dispatch engine is defined by the ordering of the tests in its signatures. Any test expression that is not defined as ``always_testable``, must not be computed until after any test expressions to its left have been tested. 
But tests whose expression is just a local variable (i.e., a plain function argument) do not have such restrictions:: >>> from peak.rules.predicates import IndexedEngine >>> from peak.rules import abstract, when >>> from peak.rules.indexing import Ordering >>> from peak.rules.codegen import Add >>> def f(a,b): pass >>> f = abstract(f) >>> m = when(f, "isinstance(a, int) and a+b==42")(value(None)) >>> engine = Dispatching(f).engine >>> list(Ordering(engine, IsInstance(Local('a'))).constraints) [frozenset([])] >>> list(Ordering(engine, Comparison(Add(Local('a'),Local('b')))).constraints) [frozenset([IsInstance(Local('a'))])] >>> def f(a,b): pass >>> f = abstract(f) >>> m = when(f, "isinstance(b, str) and a+b==42 and isinstance(a, int)")( ... value(None) ... ) >>> engine = Dispatching(f).engine >>> list(Ordering(engine, IsInstance(Local('a'))).constraints) [frozenset([])] >>> list(Ordering(engine, IsInstance(Local('b'))).constraints) [frozenset([])] >>> list(Ordering(engine, Comparison(Add(Local('a'),Local('b')))).constraints) [frozenset([IsInstance(Local('b'))])] >>> def f(a,b): pass >>> f = abstract(f) >>> m = when(f, "isinstance(a, int) and isinstance(b, str) and a+b==42")( ... value(None) ... ) >>> engine = Dispatching(f).engine >>> list(Ordering(engine, IsInstance(Local('a'))).constraints) [frozenset([])] >>> list(Ordering(engine, IsInstance(Local('b'))).constraints) [frozenset([])] >>> try: frozenset and None ... except NameError: from peak.rules.core import frozenset >>> list( ... Ordering(engine, Comparison(Add(Local('a'),Local('b')))).constraints ... ) == [frozenset([IsInstance(Local('a')), IsInstance(Local('b'))])] True Order Independence ------------------ The determination of whether a test expression can be used in an order-independent way is made via the ``always_testable()`` function:: >>> from peak.rules.predicates import always_testable In general, only locals and constants can have their tests applied independent of signature ordering:: >>> always_testable(Local('x')) True >>> always_testable(Const(99)) True >>> always_testable(Add(Local('a'),Local('b'))) False And predicate test expressions are evaluated according to their tested expression:: >>> always_testable(IsInstance(Local('x'))) True >>> always_testable(IsInstance(Add(Local('a'),Local('b')))) False >>> always_testable(Comparison(Local('x'))) True >>> always_testable(Comparison(Add(Local('a'),Local('b')))) False >>> always_testable(Identity(Local('x'))) True >>> always_testable(Identity(Add(Local('a'),Local('b')))) False >>> always_testable(Truth(Local('x'))) True >>> always_testable(Truth(Add(Local('a'),Local('b')))) False Except for ``IsSubclass()``, which may need to have other tests applied before it:: >>> always_testable(IsSubclass(Local('x'))) False If you create a new predicate type, be sure to define a method for ``always_testable`` that will recursively invoke ``always_testable`` on the predicate's target expression. If you don't do this, then your predicate type will always be treated as order-dependent, even if its target expression is a local or constant.
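For instance, a registration for a hypothetical ``Length()`` predicate node might look like the following sketch (the ``Length`` node type and its ``expr`` field are assumptions for illustration only, not part of PEAK-Rules)::

    >>> from peak.util.assembler import nodetype
    >>> def Length(expr, code=None):
    ...     if code is None: return expr,
    ...     # a real node type would emit its dispatch bytecode here
    >>> Length = nodetype()(Length)
    >>> m = when(always_testable, (Length,))(
    ...     lambda expr: always_testable(expr.expr)
    ... )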
PEAK-Rules-0.5a1.dev-r2713/PEAK_Rules.egg-info/0000755000076600007660000000000012506203664017465 5ustar telec3telec3PEAK-Rules-0.5a1.dev-r2713/PEAK_Rules.egg-info/top_level.txt0000664000076600007660000000000512506203664022214 0ustar telec3telec3peak PEAK-Rules-0.5a1.dev-r2713/PEAK_Rules.egg-info/SOURCES.txt0000664000076600007660000000127012506203664021353 0ustar telec3telec3AST-Builder.txt Code-Generation.txt Criteria.txt DESIGN.txt Indexing.txt Predicates.txt README.txt Syntax-Matching.txt setup.cfg setup.py test_rules.py wikiup.cfg PEAK_Rules.egg-info/PKG-INFO PEAK_Rules.egg-info/SOURCES.txt PEAK_Rules.egg-info/dependency_links.txt PEAK_Rules.egg-info/namespace_packages.txt PEAK_Rules.egg-info/requires.txt PEAK_Rules.egg-info/top_level.txt ez_setup/README.txt ez_setup/__init__.py peak/__init__.py peak/rules/__init__.py peak/rules/ast_builder.py peak/rules/codegen.py peak/rules/core.py peak/rules/criteria.py peak/rules/dispatch.py peak/rules/indexing.py peak/rules/predicates.py peak/rules/syntax.py peak/rules/debug/__init__.py peak/rules/debug/decompile.pyPEAK-Rules-0.5a1.dev-r2713/PEAK_Rules.egg-info/requires.txt0000664000076600007660000000011512506203664022064 0ustar telec3telec3BytecodeAssembler>=0.6 DecoratorTools>=1.7dev-r2450 AddOns>=0.6 Extremes>=1.1PEAK-Rules-0.5a1.dev-r2713/PEAK_Rules.egg-info/PKG-INFO0000664000076600007660000000770312506203664020573 0ustar telec3telec3Metadata-Version: 1.0 Name: PEAK-Rules Version: 0.5a1.dev-r2713 Summary: Generic functions and business rules support systems Home-page: http://pypi.python.org/pypi/PEAK-Rules Author: Phillip J. Eby Author-email: peak@eby-sarna.com License: ZPL 2.1 Download-URL: http://peak.telecommunity.com/snapshots/ Description: PEAK-Rules is a highly-extensible framework for creating and using generic functions, from the very simple to the very complex. Out of the box, it supports multiple-dispatch on positional arguments using tuples of types, full predicate dispatch using strings containing Python expressions, and CLOS-like method combining. (But the framework allows you to mix and match dispatch engines and custom method combinations, if you need or want to.) Basic usage:: >>> from peak.rules import abstract, when, around, before, after >>> @abstract() ... def pprint(ob): ... """A pretty-printing generic function""" >>> @when(pprint, (list,)) ... def pprint_list(ob): ... print "pretty-printing a list" >>> @when(pprint, "isinstance(ob,list) and len(ob)>50") ... def pprint_long_list(ob): ... print "pretty-printing a long list" >>> pprint([1,2,3]) pretty-printing a list >>> pprint([42]*1000) pretty-printing a long list >>> pprint(42) Traceback (most recent call last): ... NoApplicableMethods: ... PEAK-Rules works with Python 2.3 and up -- just omit the ``@`` signs if your code needs to run under 2.3. Also, note that with PEAK-Rules, *any* function can be generic: you don't have to predeclare a function as generic. (The ``abstract`` decorator is used to declare a function with no *default* method; i.e., one that will raise ``NoApplicableMethods`` instead of executing a default implementation, if no rules match the arguments it's invoked with.) PEAK-Rules is still under development; it lacks much in the way of error checking, so if you mess up your rules, it may not be obvious where or how you did. 
User documentation is also lacking, although there are extensive doctests describing and testing most of its internals, including: * `Introduction`_ (Method combination, porting from RuleDispatch) * `Core Design Overview`_ (Terminology, method precedence, etc.) * The `Basic AST Builder`_ and advanced `Code Generation`_ * `Criteria`_, `Indexing`_, and `Predicates`_ * `Syntax pattern matching`_ (Please note that these documents are still in a state of flux and some may still be incomplete or disorganized, prior to the first official release.) Source distribution snapshots are generated daily, but you can also update directly from the `development version`_ in SVN. .. _development version: svn://svn.eby-sarna.com/svnroot/PEAK-Rules#egg=PEAK_Rules-dev .. _Introduction: http://peak.telecommunity.com/DevCenter/PEAK-Rules#toc .. _Core Design Overview: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Design .. _Predicates: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Predicates .. _Basic AST Builder: http://peak.telecommunity.com/DevCenter/PEAK-Rules/AST-Builder .. _Code Generation: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Code-Generation .. _Criteria: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Criteria .. _Indexing: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Indexing .. _Predicates: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Predicates .. _Syntax pattern matching: http://peak.telecommunity.com/DevCenter/PEAK-Rules/Syntax-Matching .. _toc: Platform: UNKNOWN PEAK-Rules-0.5a1.dev-r2713/PEAK_Rules.egg-info/namespace_packages.txt0000664000076600007660000000000512506203664024015 0ustar telec3telec3peak PEAK-Rules-0.5a1.dev-r2713/PEAK_Rules.egg-info/dependency_links.txt0000664000076600007660000000000112506203664023535 0ustar telec3telec3 PEAK-Rules-0.5a1.dev-r2713/test_rules.py0000755000076600007660000006037212041201063016640 0ustar telec3telec3import unittest, sys from peak.rules.core import * x2 = lambda a: a*2 x3 = lambda next_method, a: next_method(a)*3 class TypeEngineTests(unittest.TestCase): def testIntraSignatureCombinationAndRemoval(self): abstract() def f(a): """blah""" rx2 = Rule(x2,(int,), Method) rx3 = Rule(x3,(int,), Around) rules_for(f).add(rx2) self.assertEqual(f(1), 2) rules_for(f).add(rx3) self.assertEqual(f(1), 6) rules_for(f).remove(rx3) self.assertEqual(f(1), 2) def testAroundDecoratorAndRetroactiveCombining(self): def f(a): return a self.assertEqual(f(1), 1) self.assertEqual(f('x'), 'x') when(f, (int,))(x2) self.assertEqual(f(1), 2) self.assertEqual(f('x'), 'x') around(f, (int,))(lambda a:42) self.assertEqual(f(1), 42) self.assertEqual(f('x'), 'x') class MiscTests(unittest.TestCase): def testPointers(self): from peak.rules.indexing import IsObject from sys import maxint anOb = object() ptr = IsObject(anOb) self.assertEqual(id(anOb)&maxint,ptr) self.assertEqual(hash(id(anOb)&maxint),hash(ptr)) self.assertEqual(ptr.match, True) self.assertEqual(IsObject(anOb, False).match, False) self.assertNotEqual(IsObject(anOb, False), ptr) class X: pass anOb = X() ptr = IsObject(anOb) oid = id(anOb)&maxint self.assertEqual(oid,ptr) self.assertEqual(hash(oid),hash(ptr)) del anOb self.assertNotEqual(ptr,"foo") self.assertEqual(ptr,ptr) self.assertEqual(hash(oid),hash(ptr)) def testRuleSetReentrance(self): from peak.rules.core import Rule, RuleSet rs = RuleSet() log = [] class MyListener: def actions_changed(self, added, removed): log.append(1) if self is ml1: rs.unsubscribe(ml2) ml1, ml2 = MyListener(), MyListener() rs.subscribe(ml1) rs.subscribe(ml2) 
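        # Reentrancy check: ml1's callback unsubscribes ml2 while the RuleSet
        # is still notifying listeners; the add() below must tolerate that.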
self.assertEqual(log, []) rs.add(Rule(lambda:None)) self.assertEqual(log, [1, 1]) def testAbstract(self): def f1(x,y=None): raise AssertionError("Should never get here") d = Dispatching(f1) log = [] d.rules.default_action = lambda *args: log.append(args) f1 = abstract(f1) f1(27,42) self.assertEqual(log, [(27,42)]) when(f1, ())(lambda *args: 99) self.assertRaises(AssertionError, abstract, f1) def testAbstractRegeneration(self): def f1(x,y=None): raise AssertionError("Should never get here") d = Dispatching(f1) log = [] d.rules.default_action = lambda *args: log.append(args) d.request_regeneration() f1 = abstract(f1) self.assertNotEqual(d.backup, f1.func_code) self.assertEqual(f1.func_code, d._regen) f1.func_code = d.backup f1(27,42) self.assertEqual(log, [(27,42)]) def testCreateEngine(self): def f1(x,y=None): raise AssertionError("Should never get here") d = Dispatching(f1) old_engine = d.engine self.assertEqual(d.rules.listeners, [old_engine]) from peak.rules.core import TypeEngine class MyEngine(TypeEngine): pass d.create_engine(MyEngine) new_engine = d.engine self.assertNotEqual(new_engine, old_engine) self.failUnless(isinstance(new_engine, MyEngine)) self.assertEqual(d.rules.listeners, [new_engine]) def testIndexClassicMRO(self): class MyEngine: pass eng = MyEngine() from peak.rules.indexing import TypeIndex from peak.rules.criteria import Class from types import InstanceType ind = TypeIndex(eng, 'classes') ind.add_case(0, Class(MyEngine)) ind.add_case(1, Class(object)) ind.reseed(Class(InstanceType)) self.assertEqual( dict(ind.expanded_sets()), {MyEngine: [[0,1],[]], InstanceType: [[1],[]], object: [[1],[]]} ) def testEngineArgnames(self): argnames = lambda func: Dispatching(func).engine.argnames self.assertEqual( argnames(lambda a,b,c=None,*d,**e: None), list('abcde') ) self.assertEqual( argnames(lambda a,b,c=None,*d: None), list('abcd') ) self.assertEqual( argnames(lambda a,b,c=None,**e: None), list('abce') ) self.assertEqual( argnames(lambda a,b,c=None: None), list('abc') ) self.assertEqual( argnames(lambda a,(b,(c,d)), e: None), list('abcde') ) def test_istype_implies(self): self.failUnless(implies(istype(object), object)) self.failUnless(implies(istype(int), object)) self.failIf(implies(istype(object, False), object)) self.failIf(implies(istype(int, False), object)) def testIndexedEngine(self): from peak.rules.predicates import IndexedEngine, Comparison from peak.rules.criteria import Range, Value, Test, Signature from peak.util.assembler import Local from peak.util.extremes import Min, Max abstract() def classify(age): pass Dispatching(classify).create_engine(IndexedEngine) def setup(r, f): when(classify, Signature([Test(Comparison(Local('age')), r)]))(f) setup(Range(hi=( 2,-1)), lambda age:"infant") setup(Range(hi=(13,-1)), lambda age:"preteen") setup(Range(hi=( 5,-1)), lambda age:"preschooler") setup(Range(hi=(20,-1)), lambda age:"teenager") setup(Range(lo=(20,-1)), lambda age:"adult") setup(Range(lo=(55,-1)), lambda age:"senior") setup(Value(16), lambda age:"sweet sixteen") self.assertEqual(classify(0),"infant") self.assertEqual(classify(25),"adult") self.assertEqual(classify(17),"teenager") self.assertEqual(classify(13),"teenager") self.assertEqual(classify(12.99),"preteen") self.assertEqual(classify(4),"preschooler") self.assertEqual(classify(55),"senior") self.assertEqual(classify(54.9),"adult") self.assertEqual(classify(14.5),"teenager") self.assertEqual(classify(16),"sweet sixteen") self.assertEqual(classify(16.5),"teenager") self.assertEqual(classify(99),"senior") 
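        # Min and Max (from peak.util.extremes) sort below and above all other
        # values, so they land in the outermost age ranges.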
self.assertEqual(classify(Min),"infant") self.assertEqual(classify(Max),"senior") def testSignatureOfTruthTests(self): from peak.rules.predicates import Truth from peak.rules.criteria import Test, Signature # this line used to fail with a recursion error: Signature([Test(Truth(99), True), Test(Truth(88), False)]) def testClassBodyRules(self): from peak.rules.core import Local, Rule from peak.rules.criteria import Signature, Test, Class, Value from peak.rules.predicates import IsInstance, Truth abstract() def f1(a): pass abstract() def f2(b): pass # This is to verify that the rules have sequence numbers by definition # order, not reverse definition order, inside a class. num = Rule(None).sequence class T: f1=sys._getframe(1).f_locals['f1'] # ugh when(f1) def f1_(a): pass f2=sys._getframe(1).f_locals['f2'] when(f2, 'b') def f2_(b): pass self.assertEqual( list(rules_for(f1)), [Rule(T.f1_.im_func, (T,), Method, num+1)] ) self.assertEqual( list(rules_for(f2)), [Rule( T.f2_.im_func, Signature([ Test(IsInstance(Local('b')), Class(T)), Test(Truth(Local('b')), Value(True)) ]), Method, num+2) ] ) def testParseInequalities(self): from peak.rules.predicates import CriteriaBuilder, Comparison, Truth from peak.util.assembler import Compare, Local from peak.rules.criteria import Inequality, Test, Value from peak.rules.ast_builder import parse_expr builder = CriteriaBuilder( dict(x=Local('x'), y=Local('y')), locals(), globals(), __builtins__ ) pe = builder.parse x_cmp_y = lambda op, t=True: Test( Truth(Compare(Local('x'), ((op, Local('y')),))), Value(True, t) ) x,y = Comparison(Local('x')), Comparison(Local('y')) for op, mirror_op, not_op, stdop, not_stdop in [ ('>', '<', '<=','>','<='), ('<', '>', '>=','<','>='), ('==','==','!=','==','!='), ('<>','<>','==','!=','=='), ]: fwd_sig = Test(x, Inequality(op, 1)) self.assertEqual(pe('x %s 1' % op), fwd_sig) self.assertEqual(pe('1 %s x' % mirror_op), fwd_sig) rev_sig = Test(x, Inequality(mirror_op, 1)) self.assertEqual(pe('x %s 1' % mirror_op), rev_sig) self.assertEqual(pe('1 %s x' % op), rev_sig) not_sig = Test(x, Inequality(not_op, 1)) self.assertEqual(pe('not x %s 1' % op), not_sig) self.assertEqual(pe('not x %s 1' % not_op), fwd_sig) self.assertEqual(pe('x %s y' % op), x_cmp_y(stdop)) self.assertEqual(pe('x %s y' % not_op), x_cmp_y(not_stdop)) self.assertEqual(pe('not x %s y' % op),x_cmp_y(stdop,False)) self.assertEqual(pe('not x %s y' % not_op),x_cmp_y(not_stdop,False)) def testInheritance(self): class X: pass class Y(X): pass x = Y() f = lambda x: "f" when(f, "isinstance(x, X)")(lambda x: "g") when(f, "type(x) is X")(lambda x: "h") self.assertEqual(f(x), 'g') def testNotInherited(self): f = abstract(lambda x: "f") when(f, "not isinstance(x, int)")(lambda x: "g") when(f, "type(x) is object")(lambda x: "h") self.assertEqual(f(None), 'g') def testTypeImplicationAndIsSubclassOrdering(self): from inspect import isclass class A(object): pass class B(A): pass whats_this = abstract(lambda obj: obj) when(whats_this, "isclass(obj) and issubclass(obj, A)")(lambda o:"A") when(whats_this, "isclass(obj) and issubclass(obj, B)")(lambda o:"B") when(whats_this, (B,))(lambda o:"B()") when(whats_this, (A,))(lambda o:"A()") self.failUnlessEqual(whats_this(B), "B") self.failUnlessEqual(whats_this(A), "A") self.failUnlessEqual(whats_this(B()), "B()") self.failUnlessEqual(whats_this(A()), "A()") def testRuleSetClear(self): from peak.rules.core import Rule, RuleSet rs = RuleSet(); r = Rule(lambda:None,actiontype=Method) rs.add(r) self.assertEqual(list(rs), [r]) rs.clear() 
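        # After clear(), iterating the RuleSet should yield no rules at all.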
self.assertEqual(list(rs), []) def testTypeEngineKeywords(self): def func(x, **k): pass def fstr(x,**k): return x,k abstract(func) when(func,(str,))(fstr) self.assertEqual(func('x',s='7'), ('x',{'s':'7'})) def testFlatPriorities(self): from peak.rules import value, AmbiguousMethods from peak.rules.predicates import priority f = lambda n, m: 0 f1, f2, f3 = value(1), value(2), value(3) when(f, "n==5 and priority(1)")(f1) when(f, "m==5 and priority(1)")(f2) when(f, "n==5 and m==5 and priority(1)")(f3) self.assertEqual(f(5,5), 3) def testIsNot(self): def func(x): pass p,q,r = object(),object(),object() when(func,"x is not p")(value('~p')) when(func,"x is not p and x is not q and x is not r")(value('nada')) self.assertEqual(func(23),'nada') self.assertEqual(func(q),'~p') def testTypeVsIsTypePrecedence(self): def func(x): pass when(func, (int, ))(value(1)) when(func, (istype(int),))(value(2)) self.assertEqual(func(42), 2) def testNamedGFExtension(self): p,q,r = object(),object(),object() when("%s:%s.named_func" % (__name__, self.__class__.__name__), "x is not p")(value('~p')) self.assertEqual(self.named_func(q),'~p') def named_func(x): pass named_func = staticmethod(named_func) class RuleDispatchTests(unittest.TestCase): def testSimplePreds(self): from peak.rules import dispatch [dispatch.generic()] def classify(age): """Stereotype for age""" def defmethod(gf,s,func): gf.when(s)(func) defmethod(classify,'not not age<2', lambda age:"infant") defmethod(classify,'age<13', lambda age:"preteen") defmethod(classify,'age<5', lambda age:"preschooler") defmethod(classify,'20>age', lambda age:"teenager") defmethod(classify,'not age<20',lambda age:"adult") defmethod(classify,'age>=55',lambda age:"senior") defmethod(classify,'age==16',lambda age:"sweet sixteen") self.assertEqual(classify(25),"adult") self.assertEqual(classify(17),"teenager") self.assertEqual(classify(13),"teenager") self.assertEqual(classify(12.99),"preteen") self.assertEqual(classify(0),"infant") self.assertEqual(classify(4),"preschooler") self.assertEqual(classify(55),"senior") self.assertEqual(classify(54.9),"adult") self.assertEqual(classify(14.5),"teenager") self.assertEqual(classify(16),"sweet sixteen") self.assertEqual(classify(16.5),"teenager") self.assertEqual(classify(99),"senior") self.assertEqual(classify(dispatch.strategy.Min),"infant") self.assertEqual(classify(dispatch.strategy.Max),"senior") def testKwArgHandling(self): from peak.rules import dispatch [dispatch.generic()] def f(**fiz): """Test of kw handling""" [f.when("'x' in fiz")] def f(**fiz): return "x" [f.when("'y' in fiz")] def f(**fiz): return "y" self.assertEqual(f(x=1),"x") self.assertEqual(f(y=1),"y") self.assertRaises(dispatch.AmbiguousMethod, f, x=1, y=1) def testVarArgHandling(self): from peak.rules import dispatch [dispatch.generic()] def f(*fiz): """Test of vararg handling""" [f.when("'x' in fiz")] def f(*fiz): return "x" [f.when("'y' in fiz")] def f(*fiz): return "y" self.assertEqual(f("foo","x"),"x") self.assertEqual(f("bar","q","y"),"y") self.assertEqual(f("bar","q","y"),"y") self.assertEqual(f("y","q",),"y") self.assertRaises(dispatch.AmbiguousMethod, f, "x","y") def test_NoApplicableMethods_is_raised(self): from peak.rules import dispatch [dispatch.generic()] def demo_func(number): pass demo_func.when("number < 10")(lambda x: 0) self.assertEqual(demo_func(3),0) self.assertRaises(dispatch.NoApplicableMethods, demo_func, 33) def testSingles(self): from peak.rules import dispatch [dispatch.on('t')] def gm (t) : pass [gm.when(object)] def gm (t) : return 
'default' [gm.when(int)] def gm2 (t) : return 'int' self.assertEqual(gm(42),"int") self.assertEqual(gm("x"),"default") self.assertEqual(gm(42.0),"default") def testSubclassDispatch(self): from peak.rules import dispatch [dispatch.generic()] def gm (t) : pass [gm.when(dispatch.strategy.default)] def gm (t) : return 'default' [gm.when('issubclass(t,int)')] def gm2 (t) : return 'int' self.assertEqual(gm(int),"int") self.assertEqual(gm(object),"default") self.assertEqual(gm(float),"default") def testTrivialities(self): from peak.rules import dispatch [dispatch.on('x')] def f1(x,*y,**z): "foo bar" [dispatch.on('x')] def f2(x,*y,**z): "baz spam" for f,doc in (f1,"foo bar"),(f2,"baz spam"): self.assertEqual(f.__doc__, doc) # Empty generic should raise NoApplicableMethods self.assertRaises(dispatch.NoApplicableMethods, f, 1, 2, 3) self.assertRaises(dispatch.NoApplicableMethods, f, "x", y="z") # Must have at least one argument to do dispatching self.assertRaises(TypeError, f) self.assertRaises(TypeError, f, foo="bar") class MockBuilder: def __init__(self, test, expr, cases, remaining, seeds, index=None): self.test = test self.args = expr, cases, remaining, {} self.seeds = seeds self.index = index def test_func(self, func): return func(self, *self.args) def build(self, cases, remaining_exprs, memo): self.test.failUnless(memo is self.args[-1]) self.test.assertEqual(self.args[-2], remaining_exprs) return cases def seed_bits(self, expr, cases): self.test.assertEqual(self.args[1], cases) if self.index is not None: return self.index.seed_bits(cases) return self.seeds def reseed(self, expr, criterion): self.test.assertEqual(self.args[0], expr) self.index.reseed(criterion) class NodeBuildingTests(unittest.TestCase): def build(self, func, dontcare, seeds, index=None): seedbits = dontcare, seeds builder = MockBuilder( self, 'expr', 'cases', 'remaining', seedbits, index ) return builder.test_func(func) def testTruthNode(self): from peak.rules.predicates import truth_node node = self.build(truth_node, 27, {True: (128,0), False: (64,0)}) self.assertEqual(node, (27|128, 27|64)) def testIdentityNode(self): from peak.rules.predicates import identity_node node = self.build(identity_node, 27, {9127: (128,0), 6499: (64,0), None: (0,0)}) self.assertEqual(node, {None:27, 9127:27|128, 6499: 27|64}) def testRangeNode(self): from peak.rules.indexing import RangeIndex, to_bits from peak.rules.predicates import range_node from peak.rules.criteria import Range, Value, Min, Max ind = RangeIndex(self, 'expr') ind.add_case(0, Value(19)) ind.add_case(1, Value(23)) ind.add_case(2, Value(23, False)) ind.add_case(3, Range(lo=(57,1))) ind.add_case(4, Range(lo=(57,-1))) dontcare, seeds = ind.seed_bits(to_bits(range(6))) exact, ranges = self.build(range_node, dontcare, seeds) self.assertEqual(exact, {19:to_bits([0, 2, 5]), 23:to_bits([1,5]), 57:to_bits([2,4,5])}) self.assertEqual(ranges, [((Min,57), to_bits([2,5])), ((57,Max), to_bits([2,3,4,5]))]) def testClassNode(self): from peak.rules.indexing import TypeIndex, to_bits from peak.rules.predicates import class_node from peak.rules.criteria import Class, Conjunction from types import InstanceType ind = TypeIndex(self, 'expr') class a: pass class b: pass class c(a,b): pass class x(a,b,object): pass ind.add_case(0, Class(InstanceType)) ind.add_case(1, Conjunction([Class(a), Class(b), Class(c,False)])) ind.add_case(2, Class(object)) ind.add_case(3, Conjunction([Class(a), Class(b)])) ind.add_case(4, Class(a)) ind.selectivity(range(6)) cases = to_bits(range(6)) builder = MockBuilder( 
self, 'expr', cases, 'remaining', ind.seed_bits(cases), ind ) cache, lookup = builder.test_func(class_node,) self.assertEqual(cache, {}) data = [ (object, to_bits([2,5])), (InstanceType, to_bits([0,2,5])), (a, to_bits([0,2,4,5])), (b, to_bits([0,2,5])), (c, to_bits([0,2,3,4,5])), (x, to_bits([1,2,3,4,5])) ] for k, v in data: self.assertEqual(lookup(k), v) self.assertEqual(cache, dict(data)) class DecompilationTests(unittest.TestCase): def setUp(self): from peak.rules.ast_builder import parse_expr from peak.rules.codegen import ExprBuilder from peak.util.assembler import Local, Const from peak.rules.debug.decompile import decompile class Const(Const): pass # non-folding class ExprBuilder(ExprBuilder): def Const(self,value): return Const(value) argmap = dict([(name,Local(name)) for name in 'abcdefgh']) builder = ExprBuilder(argmap, locals(), globals(), __builtins__) self.parse = builder.parse self.decompile = decompile def roundtrip(self, s1, s2=None): if s2 is None: s2 = s1 self.check(self.parse(s1), s2) def check(self, expr, s): self.assertEqual(self.decompile(expr), s) def test_basics(self): from peak.util.assembler import Pass, Getattr, Local self.check(Pass, '') self.check(Getattr(Local('a'), 'b'), 'a.b') self.roundtrip('not -+~`a`') self.roundtrip('a+b-c*d/e%f**g//h') self.roundtrip('a and b or not c') self.roundtrip('~a|b^c&d<>f') self.roundtrip('(a, b)') self.roundtrip('[a, b]') self.roundtrip('[a]') self.roundtrip('[]') self.roundtrip('()') self.roundtrip('1') self.roundtrip('a.b.c') # TODO: Compare def test_slices(self): self.roundtrip('a[:]') self.roundtrip('a[1:]') self.roundtrip('a[:2]') self.roundtrip('a[1:2]') self.roundtrip('a[1:2:3]') self.roundtrip('a[::]', 'a[:]') self.roundtrip('a[b:c:d]') self.roundtrip('a[b::d]') self.roundtrip('a[-b:c+d:e*f]') self.roundtrip('a[::d]') def test_paren_precedence(self): self.roundtrip('(1).__class__') self.roundtrip('1.2.__class__') self.roundtrip('`(a, b)`') self.roundtrip('`a,b`', '`(a, b)`') self.roundtrip('(a, [b, c])') self.roundtrip('a,b', '(a, b)') self.roundtrip('a+b*c') self.roundtrip('c*(a+b)') self.roundtrip('a*b*c') self.roundtrip('(a*b)*c', 'a*b*c') self.roundtrip('a*(b*c)') self.roundtrip('(a*b).c') self.roundtrip('(a, b, c)') self.roundtrip('a**b**c') self.roundtrip('a**(b**c)', 'a**b**c') self.roundtrip('a and b or c and not d') self.roundtrip('a and (b or c) and not d') if sys.version>='2.5': # if-else is right-associative self.roundtrip('(a, b, c) if d else e') self.roundtrip('(a if b else c) if d else e') self.roundtrip('a if (b if c else d) else e') self.roundtrip('a if b else (c if d else e)', 'a if b else c if d else e') def test_powers(self): # Exponentiation operator binds less tightly than unary numeric/bitwise # on the right: self.roundtrip('2**-1') self.roundtrip('2**+1') self.roundtrip('2**~1') self.roundtrip('2**1*2') self.roundtrip('2**(1*2)') def test_dicts(self): self.roundtrip('{}') self.roundtrip('{a: b}') self.roundtrip('{a: b, c: d}') def test_calls(self): self.roundtrip('1()') self.roundtrip('a()') self.roundtrip('a(b)') self.roundtrip('a(c=d)') self.roundtrip('a(*e)') self.roundtrip('a(**f)') self.roundtrip('a(b, c=d, *e, **f)') self.roundtrip('a.b(c)') self.roundtrip('a+b(c)') self.roundtrip('(a+b)(c)') if sys.version>='2.5': self.roundtrip('a if b(c) else d') self.roundtrip('a(b if c else d, *e if f else g)') def additional_tests(): import doctest files = [ 'README.txt', 'DESIGN.txt', 'Indexing.txt', 'AST-Builder.txt', 'Code-Generation.txt', 'Syntax-Matching.txt', 'Criteria.txt', 'Predicates.txt', 
][sys.version<'2.4':] # skip README.txt on 2.3 due to @ syntax return doctest.DocFileSuite( optionflags=doctest.ELLIPSIS|doctest.NORMALIZE_WHITESPACE, globs=dict(__name__=None), # workaround PyPy type repr() issue 1292 *files ) PEAK-Rules-0.5a1.dev-r2713/Indexing.txt0000755000076600007660000014237212041201063016404 0ustar telec3telec3================================== Decision Trees and Index Selection ================================== One of the most efficient representations for executing a collection of rules is a decision tree expressed as code. This document describes the design (and tests the implementation) of a decision tree building subsystem and its indexing machinery. You do not need to read this unless you are extending these subsystems, e.g. to support outputting something other than bytecode, or to add specialized indexes, an alternate expression language, etc. This document does not describe in detail how indexes are used to build executable decision trees (e.g. in bytecode or source code form), but focuses instead on the overall process of decision tree building, how indexes are selected to build individual nodes, and how these processes can be customized. The algorithms presented here are based in part on the ones described by Craig Chambers and Weimin Chen in their 1999 paper on `Efficient Multiple and Predicate Dispatching `_. We do, however, introduce an improved approach to managing inter-expression constraints. The new approach is simpler to explain and implement, while being more precise in what it constrains, as it more directly represents the actual constraints supplied by the input rules. .. contents:: **Table of Contents** -------------------- Expression Selection -------------------- An "expression" is an object that is used to build individual decision tree nodes, using statistics about which rules accept what values for a particular argument expression. Different expression types are used to apply different kinds of criteria, such as ``isinstance()`` or ``issubclass()`` tests versus equality or other comparison tests. You can create new expression types to support new kinds of criteria. For example, given rules with these criteria:: isinstance(x,Foo) and y==BAR isinstance(x,Baz) and y>0 and z/y<27 Three expressions would be required: one to handle ``isinstance()`` tests on ``x``, one to handle comparison tests on ``y``, and a third to handle comparison tests on ``z/y``. We would want the resulting decision tree to look something like this (ignoring inheritance and various other issues, of course):: switch type(x): case Foo: if y==BAR: ... case Baz: if y>0: if z/y<27: ... The decision tree must meet two requirements: it must be correct, and it must be as efficient as possible. To be efficient, it must test highly selective expressions first. For example, it would be unwise to first test conditions that apply to only a small number of rules. In the above example, testing ``z/y<27`` first would have been wasteful, because only one of the two rules cares about the value of ``z/y``. To be correct, however, the tree must avoid reordering tests that are guarded by preconditions -- like ``y>0 and z/y<27``, where the ``y>0`` guards against division by zero. Even if it were highly selective, we couldn't use the ``z/y`` comparison index for the root of the decision tree. These two requirements of correctness and efficiency are met by managing inter-expression `ordering constraints`_ and `selectivity statistics`_, as described in the sections below.
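To make the correctness requirement concrete, here's a quick illustration of why a guard can't be reordered. (This is plain Python, not part of the indexing API; the function and values are purely illustrative)::

    >>> def guarded(y, z):
    ...     return y>0 and z/y<27   # y>0 guards the division

    >>> guarded(0, 10)  # short-circuits, so z/y is never evaluated
    False

    >>> 10/0 < 27       # but testing z/y first would blow up
    Traceback (most recent call last):
      ...
    ZeroDivisionError: integer division or modulo by zero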
Ordering Constraints ==================== Inter-expression ordering constraints ensure that evaluation is not reordered in such a way as to test expressions before their guards. Support for tracking these guards is provided by the ``Ordering`` add-on of an engine:: >>> from peak.rules.indexing import Ordering >>> def f(): """An object to attach some orderings to""" ``Ordering`` add-ons have two methods for managing inter-expression constraints: ``requires()`` and ``can_precede()``. The ``requires()`` method adds a "constraint": a set of expressions that must all have been computed *before* the constrained expression can be used. Let's define a constraint for function ``f`` that says ``z/y`` can only be computed after ``x`` and ``y`` have been examined:: >>> Ordering(f, "z/y").requires(["x", "y"]) We won't add constraints for tests on plain arguments like ``x`` and ``y`` themselves, because argument-only tests can be evaluated in any order. (Note, by the way, that although the examples here use strings, the actual keys or expression objects used can be any hashable object.) An expression can't be used in a decision tree node unless at least one of its constraints is met. As a decision tree is built, the tree builder tracks what expressions haven't yet been used -- and if a required expression hasn't been used yet, any constraints that contain it are not met. Initially, all of our indexes will be unused, but only ``x`` and ``y`` will be usable:: >>> unused = ["x", "y", "z/y"] >>> Ordering(f, "x").can_precede(unused) True >>> Ordering(f, "y").can_precede(unused) True >>> Ordering(f, "z/y").can_precede(unused) False Let's say that the tree builder decides to evaluate ``y``, and then removes it from the unused set:: >>> unused.remove("y") >>> Ordering(f, "x").can_precede(unused) True >>> Ordering(f, "z/y").can_precede(unused) False Since ``x`` is still in the unused set, ``z/y`` still can't be used. We have to remove it also:: >>> unused.remove("x") >>> Ordering(f, "z/y").can_precede(unused) True Multiple Constraints -------------------- You can add more than one ordering constraint for an expression, since the same expression may be used in different positions in different rules. For example, suppose we have the following criteria:: E1 and E2 and E3 and E4 E5 and E3 and E2 and E6 E2 and E3 appear in two different places, resulting in the following ordering constraints for the first expression:: >>> Ordering(f, "E2").requires(["E1"]) # E1 and E2 and E3 and E4 >>> Ordering(f, "E3").requires(["E1", "E2"]) >>> Ordering(f, "E4").requires(["E1", "E2", "E3"]) However, we can shortcut this repetitious process by using the ``define_ordering`` function, which automatically adds all the constraints implied by a given expression sequence:: >>> from peak.rules.indexing import define_ordering >>> define_ordering(f, ["E5","E3","E2","E6"]) # E5 and E3 and E2 and E6 All in all, E3 can be computed as long as either E1 and E2 *or* E5 have been computed. 
E2 can be computed as long as either E1 *or* E5 and E3 have been computed:: >>> E2 = Ordering(f, "E2") >>> E2.can_precede(["E1", "E2", "E3", "E4", "E5", "E6"]) False >>> E2.can_precede(["E1", "E5"]) False >>> E2.can_precede(["E5"]) True >>> E2.can_precede(["E1"]) True This is somewhat different from the Chambers & Chen approach, in that their approach reduces the above example to a constraint graph of the form:: (E1 or E5) -> (E2 or E3) -> (E4 or E6) Their approach is a little over-optimistic here, in that it assumes that "E5, E2" is a valid calculation sequence, even though this sequence appears nowhere in the input rules! The approach we use (which tracks the *actual* constraints provided by the rules) will allow any of these sequences to compute E2:: E5, E3, E2 E1, E2 E5, E1, E2 E1, E5, E2 It does not overgeneralize from this to assume that "E5, E2" is valid, the way the Chambers & Chen approach does. Constraint Simplification ------------------------- The active constraints of an index are maintained as a set of frozen sets:: >>> from peak.rules.indexing import set, frozenset # 2.3 compatibility >>> E2.constraints == set([frozenset(["E1"]), frozenset(["E5", "E3"])]) True Because constraints are effectively "or"ed together, adding a more restrictive constraint than an existing constraint is ignored:: >>> E2.requires(["E5", "E3", "E4"]) >>> E2.constraints == set([frozenset(["E1"]), frozenset(["E5", "E3"])]) True And if a less-restrictive constraint is added, it replaces any constraints that it's a subset of:: >>> E2.requires(["E5"]) >>> E2.constraints == set([frozenset(["E1"]), frozenset(["E5"])]) True And of course adding the same constraint more than once has no effect:: >>> E4 = Ordering(f, "E4") >>> E4.requires(["E1", "E2", "E3"]) >>> E4.requires(["E1", "E2", "E3"]) >>> E4.constraints == set([frozenset(["E1", "E2", "E3"])]) True Selectivity Statistics ====================== To maximize efficiency, decision tree nodes should always be built using the "most selective" expression that is currently legal to evaluate. Selectivity isn't an absolute measurement, however. It's based on the cases remaining to be distinguished at the current node. If none of the cases at a node care about a particular expression, that expression shouldn't be used. Likewise, if *all* of the rules care about an expression, but they are all expecting the same class or value, there may be better choices that would narrow down the applicable rules faster. All of these conditions can be determined using two statistics: the number of branches that the expression would produce for the current node, and the average number of cases remaining on each of the branches. If none of the cases care about the expression, then each branch will still have the same number of rules as the current node does. Thus the average is N (the number of cases at the current node.) If all of the cases care about the expression, but they all expect the same class or value, then there would be two branches: one empty, and one with N rules. Thus, its average is N/2: better than the case where none of the rules care, but it could be better still. If each rule expects a different value for the expression, then there will be N branches, each of length 1, resulting in an average of about 1 -- an optimal choice. Decision tree builders must therefore be able to compute an expression's selectivity, as we will see in the next section. 
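As a quick sanity check of the three scenarios above (plain arithmetic, not the actual selectivity API), consider a node with four cases::

    >>> N = 4

    >>> branches, total = 2, 2*N   # nobody cares: every branch keeps all N
    >>> float(total)/branches
    4.0

    >>> branches, total = 2, N     # all expect one value: one branch is empty
    >>> float(total)/branches
    2.0

    >>> branches, total = N, N     # each case expects a distinct value
    >>> float(total)/branches
    1.0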
---------------------- Decision Tree Building ---------------------- The ``TreeBuilder`` class implements the basic algorithm for transforming rules into a decision tree:: >>> from peak.rules.indexing import TreeBuilder A decision tree is built by taking a set of cases (action definitions), and a set of indexes that cover all expressions tested by those cases, using the ``build()`` method: build(`cases`, `exprs`, `memo`) Builds a decision tree to distinguish `cases`, using `exprs`. The "best" expression (based on legality of use and selectivity) is chosen and used to build a dispatch node, with subnodes recursively constructed by calls to ``build()`` with the remaining `cases` and `exprs`. Leaf nodes are constructed whenever either the cases or expressions are exhausted along a particular path. `memo` must be a dictionary. This method is memoized via `memo`, meaning that if it is called more than once with the same cases and indexes, it will return the same result each time. That is, the first invocation's return value is cached in `memo`, and returned by future calls using the same `memo`. This helps cut down on the total decision tree size by eliminating the redundancy that would otherwise occur when more than one path leads to the same remaining options. Because of this, however, the input `cases` and `exprs` must be hashable, because the ``TreeBuilder`` will be using them as dictionary keys. Our examples here will use tuples of cases and indexes (so as to maintain their order when we display them), but immutable sets could also be used, as could integer "bit sets" or strings or anything else that's hashable. The ``build()`` method doesn't care about what the cases or indexes "mean"; it just passes them off to other methods for handling. Subclassing TreeBuilder ======================= The ``TreeBuilder`` base class requires the following methods to be defined in a subclass, in order for the ``build()`` template method to work: build_node(`expr`, `cases`, `remaining_exprs`, `memo`) Build a decision node, using `expr` as the expression to dispatch on. `cases` are the actions to be distinguished, and `remaining_exprs` are the expressions that still need dispatch nodes. This method should call back to the ``build()`` method to obtain subnodes as needed, via e.g. ``self.build(subnode_cases, remaining_indexes, memo)``. This method should then return the switch node it has constructed. build_leaf(`cases`, `memo`) Build a leaf node for `cases`. Usually this means something like picking the most specific case, or producing a method combination for the cases. The return value should be the leaf node it has constructed. For improved efficiency, most ``TreeBuilder`` subclasses may wish to cache these leaf nodes by their value in `memo`, to allow equivalent leaves to be shared even if they were produced by different sets of `cases`. Such caching should be done by this method, if applicable. selectivity(`expr`, `cases`) Estimate the selectivity of a decision tree node that subdivides `cases` using `expr`. The return value must be a tuple of the form (`branches`, `total`), where `branches` is the number of branches that the node would have, and `total` is the total number of rules applying across all of the branches. (In other words, the average number of rules on each branch is ``total/branches``.) If `expr` is not used by any of the `cases`, the returned ``total`` *must* be equal to ``branches * len(cases)``.
If `expr` *is* used, however, the return statistics can be estimated rather than precisely computed, if it improves performance. (Selectivity estimation is done a *lot* during tree building, since each node must choose the "best" expression from all the currently-applicable expressions.) cost(`expr`, `remaining_exprs`) Estimate the cost of computing `expr`, given that `remaining_exprs` have not been computed yet. Lower costs are better than higher ones. The default implementation of this method just returns 1. Expression cost is only used as a tiebreaker when the selectivity of two candidate expressions is the same, however. For our examples, we'll define some of these methods to build interior nodes as dictionaries, and leaf nodes as lists. We'll also print out what we're doing, to show that redundant nodes are cached. And, we'll compute selectivity in a slow but easy way: by building a branch table for the expressions. But first, we'll need an expression class. For simplicity's sake, we'll use a string subclass that uses sequences of name-value pairs as rules, and creates branches for each value whose name equals the expression string:: >>> class DemoExpr(str): ... def branch_table(self, cases): ... branches = {None: []} # "none of the above" branch ... for case in cases: ... care = False ... for name, value in case: ... if name==self: ... branches.setdefault(value, []).append(case) ... care = True ... if not care: ... # "don't care" rules must be added to *every* branch ... for b in branches: ... branches[b].append(case) ... return branches >>> def r(**kw): ... return tuple(kw.items()) >>> x = DemoExpr('x') >>> x.branch_table([r(x=1), r(x=2)]) == { ... None: [], 1: [(('x', 1),)], 2: [(('x', 2),)] ... } True Notice, by the way, that branch tables produced by this expression type contain a ``None`` key, to handle the case where the expression value doesn't match any of the rules. It isn't necessary that this key actually be ``None``, and for many types of expression in PEAK-Rules, it *can't* be ``None``. But the general idea of a "none-of-the-above" branch in tree nodes is nonetheless important. (Note also that expression objects aren't required to have a ``branch_table()`` method; that's just an implementation detail of the demos in this document.) Computing Selectivity and Building Nodes ======================================== Now that we have our expression type and the ability to build simple branch tables, we can define our tree builder, which assumes that each "case" is a tuple of name-value pairs, and that each "expr" is a ``DemoExpr`` instance:: >>> from peak.rules.core import sorted # for older Pythons >>> class DemoBuilder(TreeBuilder): ... ... def build_node(self, expr, cases, remaining_exprs, memo): ... enames = list(remaining_exprs) ... enames.sort() ... enames = ", ".join(enames) ... ... print "building switch for", expr, ... print "with", map(sorted,cases), "and (", enames, ")" ... ... branches = expr.branch_table(cases).items() ... branches.sort() ... return dict( ... [(key, self.build(tuple(values), remaining_exprs, memo)) ... for key,values in branches] ... ) ... ... def build_leaf(self, cases, memo): ... print "building leaf node for", map(sorted,cases) ... return map(sorted,cases) ... ... def selectivity(engine, expr, cases): ... branches = expr.branch_table(cases) ... total = 0 ... for value, rules in branches.items(): ... total += len(rules) ... return len(branches), total Note, by the way, that this demo builder is *very* inefficient. 
A real builder would have other methods to allow cases to be added to it in advance for indexing purposes, and the selectivity and branch table calculations would be done by extracting a subset of precomputed index information. However, for this demo index, clarity and simplicity are more important than performance. In later sections, we'll look at more efficient ways to compute selectivity and build dispatch tables. Here's a quick example to show how selectivity should be calculated for a few simple cases:: >>> db = DemoBuilder() >>> db.selectivity(x, [r(x=1), r(x=2)]) # 3 branches: 1, 2, None (3, 2) >>> db.selectivity(x, [r(x=1), r(x=1)]) # 2 branches: 1, None (2, 2) >>> db.selectivity(x, []) # 1 branch: "None of the above" (1, 0) >>> db.selectivity(x, [r(y=42)]) # 1 branch: "None of the above" (1, 1) >>> db.selectivity(x, [r(x=1), r(y=1)]) # 2 branches: 1, None (2, 3) >>> db.selectivity(x, [r(y=2), r(z=1)]) # 1 branch: "None of the above" (1, 2) Basic Tree-Building =================== Building nothing produces an empty leaf node:: >>> DemoBuilder().build((), (), {}) building leaf node for [] [] In fact, building anything with no remaining indexes produces a leaf node:: >>> DemoBuilder().build((r(x=1),), (), {}) building leaf node for [[('x', 1)]] [[('x', 1)]] Any inapplicable indexes are ignored:: >>> DemoBuilder().build((r(x=1),), (DemoExpr('q'),), {}) building leaf node for [[('x', 1)]] [[('x', 1)]] But applicable indexes are used:: >>> DemoBuilder().build((r(x=1),), (DemoExpr('x'),), {}) == { ... None: [], 1: [[('x', 1)]] ... } building switch for x with [[('x', 1)]] and ( ) building leaf node for [] building leaf node for [[('x', 1)]] True >>> DemoBuilder().build((r(x=1),), (DemoExpr('x'), DemoExpr('q')), {}) == { ... None: [], 1: [[('x', 1)]] ... } building switch for x with [[('x', 1)]] and ( ) building leaf node for [] building leaf node for [[('x', 1)]] True >>> DemoBuilder().build((r(x=1),), (DemoExpr('q'), DemoExpr('x')), {}) == { ... None: [], 1: [[('x', 1)]] ... } building switch for x with [[('x', 1)]] and ( ) building leaf node for [] building leaf node for [[('x', 1)]] True As long as they have no constraints preventing them from being used:: >>> def f(): pass >>> x = DemoExpr('x') >>> y = DemoExpr('y') >>> db = DemoBuilder() >>> Ordering(db, x).requires([y]) >>> db.build((r(x=1, y=2),), (x, y), {}) == { ... None: [], 2: {None: [], 1: [[('x', 1), ('y', 2)]]} ... } building switch for y with [[('x', 1), ('y', 2)]] and ( x ) building leaf node for [] building switch for x with [[('x', 1), ('y', 2)]] and ( ) building leaf node for [[('x', 1), ('y', 2)]] True >>> db.build((r(x=1, y=2),), (y, x), {}) == { ... None: [], 2: {None: [], 1: [[('x', 1), ('y', 2)]]} ... } building switch for y with [[('x', 1), ('y', 2)]] and ( x ) building leaf node for [] building switch for x with [[('x', 1), ('y', 2)]] and ( ) building leaf node for [[('x', 1), ('y', 2)]] True Optimizations ============= If more than one index is applicable, the one with the best selectivity is chosen:: >>> z = DemoExpr('z') >>> rules = r(x=1,y=2,z=3), r(z=4) >>> db.build(rules, (x,y,z), {}) == { # doctest: +NORMALIZE_WHITESPACE ... None: [], ... 3: {None: [], ... 2: {None: [], ... 1: [[('x', 1), ('y', 2), ('z', 3)]]}}, ... 4: [[('z', 4)]] ... 
} building switch for z with [[('x', 1), ('y', 2), ('z', 3)], [('z', 4)]] and ( x, y ) building leaf node for [] building switch for y with [[('x', 1), ('y', 2), ('z', 3)]] and ( x ) building switch for x with [[('x', 1), ('y', 2), ('z', 3)]] and ( ) building leaf node for [[('x', 1), ('y', 2), ('z', 3)]] building leaf node for [[('z', 4)]] True >>> db.build(rules, (z,y,x), {}) == { # doctest: +NORMALIZE_WHITESPACE ... None: [], ... 3: {None: [], ... 2: {None: [], ... 1: [[('x', 1), ('y', 2), ('z', 3)]]}}, ... 4: [[('z', 4)]] ... } building switch for z with [[('x', 1), ('y', 2), ('z', 3)], [('z', 4)]] and ( x, y ) building leaf node for [] building switch for y with [[('x', 1), ('y', 2), ('z', 3)]] and ( x ) building switch for x with [[('x', 1), ('y', 2), ('z', 3)]] and ( ) building leaf node for [[('x', 1), ('y', 2), ('z', 3)]] building leaf node for [[('z', 4)]] True If an index is skipped due to a constraint, its selectivity should still be checked if its constraints go away (due to the blocking index being found inapplicable):: >>> Ordering(db, z).requires([y]) # z now must have y checked first >>> rules = r(x=1,z=3), r(z=4) # but y isn't used by the rules >>> db.build(rules, (x,y,z), {}) == { # so z is the most-selective index ... None: [], 3: {None: [], 1: [[('x', 1), ('z', 3)]]}, 4: [[('z', 4)]] ... } building switch for z with [[('x', 1), ('z', 3)], [('z', 4)]] and ( x ) building leaf node for [] building switch for x with [[('x', 1), ('z', 3)]] and ( ) building leaf node for [[('x', 1), ('z', 3)]] building leaf node for [[('z', 4)]] True >>> db.build(rules, (z,y,x), {}) == { ... None: [], 3: {None: [], 1: [[('x', 1), ('z', 3)]]}, 4: [[('z', 4)]] ... } # try them in another order building switch for z with [[('x', 1), ('z', 3)], [('z', 4)]] and ( x ) building leaf node for [] building switch for x with [[('x', 1), ('z', 3)]] and ( ) building leaf node for [[('x', 1), ('z', 3)]] building leaf node for [[('z', 4)]] True The above examples whose results contain more than one ``[]`` leaf node show that leaf nodes for the same cases are being cached, but let's also show that non-leaf nodes are similarly shared and cached:: >>> rules = (('x',1),('x',2),('y',1),('y',2)), >>> tree = db.build(rules, (x,y,), {}) building switch for y with [[('x', 1), ('x', 2), ('y', 1), ('y', 2)]] and ( x ) building leaf node for [] building switch for x with [[('x', 1), ('x', 2), ('y', 1), ('y', 2)]] and ( ) building leaf node for [[('x', 1), ('x', 2), ('y', 1), ('y', 2)]] >>> tree == { ... None: [], ... 1: {None: [], ... 1: [[('x', 1), ('x', 2), ('y', 1), ('y', 2)]], ... 2: [[('x', 1), ('x', 2), ('y', 1), ('y', 2)]]}, ... 2: {None: [], ... 1: [[('x', 1), ('x', 2), ('y', 1), ('y', 2)]], ... 2: [[('x', 1), ('x', 2), ('y', 1), ('y', 2)]]} ... } True >>> tree[1] is tree[2] True >>> tree[1][1] is tree[1][2] True >>> tree[None] is tree[1][None] True As you can see, the redundant leaf nodes and intermediate nodes are shared due to the caching, and the log shows that only two intermediate nodes and two leaf nodes were even created in the first place. ------------------ Bitmap/Set Indexes ------------------ The basic indexes used by PEAK-Rules use a combination of bitmaps and sets to represent criteria as ranges or regions in a (1-dimensional) logical space. Points that represent region edges are known as "seeds". In the simplest possible indexes, a bitmap of applicable rules (cases, actually) is kept for each seed. 
(For equality or identity testing, this is all that's required, but for more complex criteria, things get a bit more involved.) For simple generic functions with 31 or fewer cases, a single integer is sufficient to represent any set of cases. For more complex generic functions, Python's long integer arithmetic performance scales reasonably well, even up to many hundreds of bits (cases). Plus, both integers and longs can be used as dictionary keys, and thus work well with the ``TreeBuilder`` class (which needs to cache dispatch nodes for sets of applicable cases). Bitmap Operations ================= To work with bitsets, we need to be able to convert an integer sequence to a bitmap (integer or long integer), and vice versa:: >>> from peak.rules.indexing import to_bits, from_bits >>> odds = to_bits([9,11,13,1,3,5,7,15]) >>> print hex(odds) 0xaaaa >>> list(from_bits(odds)) [1, 3, 5, 7, 9, 11, 13, 15] And to handle sets with more than 31 bits, we need these operations to handle long integers:: >>> seven_long = to_bits([32,33,34]) >>> print hex(seven_long) 0x700000000L >>> list(from_bits(seven_long)) [32, 33, 34] The ``BitmapIndex`` Add-on ========================== The ``BitmapIndex`` add-on base class provides basic support for indexing cases identified by sequential integers. Indexes are keyed by expression and attach to an "engine", which can be any object that supports add-ons. The ``BitmapIndex`` add-on provides these methods and attributes: add_case(case_id, criterion) Given an integer `case_id`, update the index by calling the ``.add_criterion()`` method on `criterion` and caching the result (so that multiple cases using the same criterion share the same seeds). add_criterion(criterion) (**abstract method: must be defined in a subclass**) The ``.add_criterion()`` method must return a set or sequence of the "applicable seeds" for which the criterion holds true. It must also ensure that the ``.all_seeds`` index is up to date, usually by calling the ``.include()`` and ``.exclude()`` or ``.add_seed()`` methods. The set or sequence returned by this method is only used for its ``len()`` as a basis for computing selectivity; its contents aren't actually used by the ``BitmapIndex`` base class. Thus, this method can return a specialized dynamic object that simply computes an appropriate length, and optionally updates the ``all_seeds`` sets when it notices new seeds there (e.g. due to ``reseed()`` or the addition of new cases and criteria). include(seed, criterion) Add `criterion` to the "inclusions" for `seed`. You should call this method in ``.add_criterion()`` for each seed that the criterion applies to. Note that there is no requirement that the `criterion` used here must be limited to criteria that were passed to ``.add_criterion()`` -- that method is allowed to pre-calculate inclusions for related criteria that haven't been seen yet. For example, ``TypeIndex`` precalculates many inclusions and exclusions for the base classes of criteria it encounters. exclude(seed, criterion) Add `criterion` to the "exclusions" for `seed`. Like ``.include()``, this is usually called from within ``.add_criterion()``, and may be called on any criterion, whether or not it was passed to ``.add_criterion()``. Simple indexes will probably have no use for this method, as "exclusions" are typically only useful for negated criteria. For example, an ``is not`` condition is false for all seeds but one, so it's not efficient to try to update the inclusions for every seed to list every ``is not`` criterion.
Instead, it's better to record an exclusion for the ``is not`` criterion's seed, and an inclusion for the criterion to a "default" seed, so that by default, the criterion will apply. selectivity(case_id_list) Given a **sorted** sequence of integer case ids, return the `selectivity statistics`_ of the index for those cases. The default implementation simply sums the ``len()`` of the values returned by the appropriate ``.add_criterion()`` calls. It can be overridden in a subclass to implement more sophisticated calculation strategies. seed_bits(cases) Given a bitmap of cases, return a tuple ``(dontcares, seedmap)``, where ``dontcares`` is a bitset of "don't care" cases (i.e., cases that must be included in *every* child node), and ``seedmap`` is a dictionary mapping seeds to ``(include, exclude)`` bitset pairs. reseed(criterion) Handle a reseed request. By default, this is a synonym for ``.add_criterion(criterion)``, but if necessary it can be overridden in a subclass to handle reseeds differently. expanded_sets() Return a list of ``(seed, (inc, exc))`` tuples, where ``inc`` and ``exc`` are integer case lists. This method is for debugging and testing only. all_seeds A dictionary containing information about all seeds seen by the index. The keys are the seeds, and the values are ``(include,exclude)`` tuples of sets of criteria, indicating which criteria include or exclude that key, respectively. Dynamic criteria should update this dictionary when new seeds are discovered (e.g. via ``reseed()`` or new cases being added). criteria_bits A dictionary mapping each known criterion to a bitset representing the cases that use that criterion. criteria_seeds A dictionary that caches the result of the index's calls to the ``.add_criterion()`` method. case_seeds A list mapping case ids to the values returned by ``.add_criterion()`` for the corresponding criterion. Entries for case ids that were not assigned criteria via ``.add_case()`` are filled with ``self.null``. null The value used to pad skipped entries in the ``.case_seeds`` list. By default, this is the same object as ``self.all_seeds``, so that for selectivity-calculation purposes, the case is treated as applying for all seeds. (You may change this in a subclass if you are also overriding the ``selectivity()`` method.) known_cases A bitset representing the case ids of all cases added to the index using the ``add_case()`` method. extra An empty dictionary, made available for the use of ``.add_criterion()`` algorithms that may require additional storage for dependencies or intermediate indexes. match A single-element sequence that can be used as a default return value from ``.add_criterion()``. If your ``.add_criterion()`` method is going to return a set or sequence containing only one element (and it won't be changed later), you should return ``self.match`` instead of making a new sequence. This can save a lot of memory. For our first demo, we'll assume we're indexing a simple equality condition, so every criterion will seed only to itself. We'll create a simple criterion type, and a suitable ``BitmapIndex`` subclass:: >>> from peak.rules.indexing import BitmapIndex >>> from peak.util.decorators import struct >>> from peak.rules import when >>> class DemoEngine: pass >>> def find(val): ... """A structure type used as a criterion""" ... return val, >>> find = struct()(find) >>> class DemoIndex(BitmapIndex): ... def add_criterion(self, criterion): ... print "computing seeds for", criterion ... self.include(criterion.val, criterion) ... 
return [criterion.val] By the way, the ``BitmapIndex`` constructor is an abstract factory: it can automatically create a subclass of the appropriate type, for the given engine and expression:: >>> eng = DemoEngine() >>> ind = BitmapIndex(eng, "some expression") Traceback (most recent call last): ... NoApplicableMethods: ((<...DemoEngine ...>, 'some expression'), {}) Well, it will as long as you register an appropriate method for the ``bitmap_index_type`` generic function:: >>> from peak.rules.indexing import bitmap_index_type >>> f = when(bitmap_index_type, (DemoEngine, str))( ... lambda engine, expr: DemoIndex ... ) Now we should be able to get the right kind of index, just by calling ``BitmapIndex()``:: >>> BitmapIndex(eng, "some expression") We also could have explicitly used ``DemoIndex``, of course. Once created, the same instance will be reused for each call to either the specific constructor or ``BitmapIndex``. To begin with, our index's selectivity is 0/0 because there are no seeds (and therefore no branches):: >>> ind = DemoIndex(eng, "some expression") >>> ind.selectivity([]) (0, 0) >>> ind.all_seeds {} >>> ind.criteria_seeds {} >>> list(from_bits(ind.known_cases)) [] By adding a case, we'll add a criterion (and therefore a seed), which gets cached:: >>> ind.add_case(0, find("x")) computing seeds for find('x',) >>> ind.criteria_seeds {find('x',): ['x']} >>> ind.criteria_bits {find('x',): 1} Now, selectivity will always be at least 1/0, because there's one possible branch:: >>> ind.selectivity([]) (1, 0) >>> ind.selectivity([1]) (1, 1) >>> ind.selectivity([2]) (1, 1) >>> ind.selectivity([1,2]) (1, 2) And the seed bitmaps reflect this:: >>> list(from_bits(ind.known_cases)) [0] >>> ind.seed_bits(ind.known_cases) (0, {'x': (1, 0)}) >>> dict(ind.expanded_sets()) # expanded form of seed_bits(known_cases) {'x': [[0], []]} If we add another case with the same criterion, the number of branches will stay the same. Notice also, that the seeds for a previously-seen criterion are not recalculated:: >>> ind.add_case(1, find("x")) >>> ind.criteria_bits # criterion 'x' was used for cases 0 and 1 {find('x',): 3} >>> dict(ind.expanded_sets()) {'x': [[0, 1], []]} >>> list(from_bits(ind.known_cases)) [0, 1] >>> ind.selectivity([]) (1, 0) >>> ind.selectivity([1]) (1, 1) >>> ind.selectivity([2]) (1, 1) >>> ind.selectivity([1,2]) (1, 2) However, if we add a new case with a *new* criterion, its seeds are computed, and the number of branches increases:: >>> ind.add_case(2, find("y")) computing seeds for find('y',) >>> dict(ind.expanded_sets()) {'y': [[2], []], 'x': [[0, 1], []]} >>> list(from_bits(ind.known_cases)) [0, 1, 2] >>> ind.selectivity([]) (2, 0) >>> ind.selectivity([1]) (2, 1) >>> ind.selectivity([2]) (2, 1) >>> ind.selectivity([1,2]) (2, 2) ----------------- Indexing Criteria ----------------- All of the criterion objects provided by ``peak.rules.criteria`` have ``BitmapIndex`` subclasses suitable for indexing them:: >>> from peak.rules.criteria import Value, Range, IsObject, Class, Conjunction To demonstrate them, we'll use a dummy engine object:: >>> class Engine: pass >>> eng = Engine() Object Identity =============== The ``IsObject`` criterion type implements indexing for the ``is`` and ``is not`` operators. ``IsObject(x)`` represents ``is x``, and ``IsObject(x, False)`` represents ``is not x``. 
The bitmap index seeds for ``IsObject`` objects are the ``id()`` values of the target objects, or ``None`` to represent the "none of the above" cases:: >>> from peak.rules.indexing import PointerIndex >>> p = object() >>> ppeq = IsObject(p) >>> ppne = IsObject(p, False) >>> ind = PointerIndex(eng, "x") >>> ind.add_case(0, ppeq) >>> dict(ind.expanded_sets()) {...: [[0], []], None: [[], [0]]} The selectivity of an ``is`` criterion is 1:: >>> ind.selectivity([0]) (2, 1) And for an ``is not`` criterion, it's always one less than the total number of seeds currently in the index (because an ``is not`` criterion is true for every possible branch *except* its target):: >>> ind.add_case(1, ppne) >>> dict(ind.expanded_sets()) == {id(p): [[0], [1]], None: [[1], [0]]} True >>> ind.selectivity([1]) (2, 1) >>> ind.selectivity([0,1]) (2, 2) >>> q = object() >>> ind.add_case(2, IsObject(q)) >>> ind.selectivity([1]) # now it's (3,2) instead of (2,1) or (3,1) (3, 2) >>> ind.selectivity([0]) # 'is' pointers are always 1 (3, 1) >>> dict(ind.expanded_sets()) == { ... None: [[1], [0, 2]], id(p): [[0], [1]], id(q): [[1, 2], []], ... } True >>> from peak.rules.core import intersect >>> ind.add_case(3, intersect(ppne, IsObject(q,False))) >>> dict(ind.expanded_sets()) == { ... None: [[1, 3], [0, 2]], id(p): [[0], [1, 3]], id(q): [[1, 2], [3]], ... } True >>> r = object() >>> ind.add_case(4, IsObject(r)) >>> dict(ind.expanded_sets()) == { ... None: [[1, 3], [0, 2, 4]], id(p): [[0], [1, 3]], ... id(q): [[1, 2], [3]], id(r): [[1, 3, 4], []] ... } True Ranges and Value Comparisons ============================ ``Range`` Objects ----------------- The ``Range()`` criterion type represents an inequality such as ``lo < x < hi`` or ``x >= lo``. (For more details on their semantics, see Criteria.txt). The bitmap index seeds for ``Range`` objects are edge tuples, and the selectivity of a ``Range`` is the distance between the low and high edges in a sorted list of all the index's seeds:: >>> from peak.rules.indexing import RangeIndex >>> ind = RangeIndex(eng, "y") >>> r = Range((1,-1), (23,1)) >>> ind.add_case(0, r) >>> ind.selectivity([0]) (2, 1) >>> from peak.rules.criteria import sorted >>> sorted(ind.expanded_sets()) [((1, -1), [[0], []]), ((23, 1), [[], [0]])] >>> ind.add_case(1, Range((5,-1), (20,1))) >>> ind.selectivity([1]) (4, 1) >>> sorted(ind.expanded_sets()) [((1, -1), [[0], []]), ((5, -1), [[1], []]), ((20, 1), [[], [1]]), ((23, 1), [[], [0]])] >>> ind.add_case(2, Range((7,-1), (24,1))) >>> ind.selectivity([2]) (6, 3) >>> sorted(ind.expanded_sets()) [((1, -1), [[0], []]), ((5, -1), [[1], []]), ((7, -1), [[2], []]), ((20, 1), [[], [1]]), ((23, 1), [[], [0]]), ((24, 1), [[], [2]])] >>> ind.add_case(3, Range((7,-1), (7,1))) >>> sorted(ind.expanded_sets()) [((1, -1), [[0], []]), ((5, -1), [[1], []]), ((7, -1), [[2, 3], []]), ((7, 1), [[], [3]]), ((20, 1), [[], [1]]), ((23, 1), [[], [0]]), ((24, 1), [[], [2]])] >>> ind.selectivity([0]) (7, 5) >>> ind.selectivity([1]) (7, 3) >>> ind.selectivity([2]) (7, 4) >>> ind.selectivity([3]) (7, 1) ``Value`` Objects ----------------- ``Value`` objects are used to represent ``==`` and ``!=`` comparisons. ``Value(x)`` represents ``==x`` and ``Value(x, False)`` represents ``!=x``. The bitmap index seeds for ``Value`` objects are ``(value, 0)`` tuples, which fall between the "below" and "above" tuples of any ``Range`` objects in the same index. 
And the selectivity of a ``Value`` is either 1 or the number of seeds in the index, minus one:: >>> ind.add_case(4, Value(7, False)) >>> ind.selectivity([4]) (9, 8) >>> sorted(ind.expanded_sets()) [((Min, -1), [[4], []]), ((1, -1), [[0], []]), ((5, -1), [[1], []]), ((7, -1), [[2, 3], []]), ((7, 0), [[], [4]]), ((7, 1), [[], [3]]), ((20, 1), [[], [1]]), ((23, 1), [[], [0]]), ((24, 1), [[], [2]])] >>> ind.add_case(5, Value(7)) >>> ind.selectivity([5]) (9, 1) >>> sorted(ind.expanded_sets()) [((Min, -1), [[4], [5]]), ((1, -1), [[0], []]), ((5, -1), [[1], []]), ((7, -1), [[2, 3], []]), ((7, 0), [[5], [4]]), ((7, 1), [[], [3]]), ((20, 1), [[], [1]]), ((23, 1), [[], [0]]), ((24, 1), [[], [2]])] Notice that the seeds for a ``Value`` always include either an inclusion or exclusion for ``(Min, -1)``, as this is what makes a ``!=`` criterion apply by default to seeds other than its target value (and keeps an ``==`` criterion from doing so). Value Map Generation -------------------- The ``split_ranges`` function splits the result of ``seed_bits()`` into a dictionary of exact values and a list of ``((lo, hi), cases)`` range pairs:: >>> from peak.rules.indexing import split_ranges >>> from peak.util.extremes import Min, Max >>> def dump_ranges(ind, cases): ... exact, ranges = split_ranges(*ind.seed_bits(cases)) ... for k in exact.keys(): ... exact[k] = list(from_bits(exact[k])) ... for n, (k, v) in enumerate(ranges): ... ranges[n] = k, list(from_bits(v)) ... return exact, ranges >>> dump_ranges(ind, ind.known_cases) == ( ... {1: [0, 4], 5: [0, 1, 4], 7: [0, 1, 2, 3, 5], 20: [0, 1, 2, 4], ... 23: [0, 2, 4], 24: [2, 4]}, ... [((Min, 1), [4]), ((1, 5), [0, 4]), ((5, 7), [0, 1, 4]), ... ((7, 20), [0, 1, 2, 4]), ((20, 23), [0, 2, 4]), ((23, 24), [2, 4]), ... ((24, Max), [4])] ... ) True >>> ind = RangeIndex(eng, 'q') >>> dump_ranges(ind, ind.known_cases) == ({}, [((Min, Max), [])]) True >>> ind.add_case(0, Value(19)) >>> dump_ranges(ind, ind.known_cases) == ( ... {Min: [], 19: [0]}, [((Min, Max), [])] ... ) True >>> ind.add_case(1, Value(23)) >>> dump_ranges(ind, ind.known_cases) == ( ... {Min: [], 19: [0], 23: [1]}, [((Min, Max), [])] ... ) True >>> ind.add_case(2, Value(23, False)) >>> dump_ranges(ind, ind.known_cases) == ( ... {19: [0, 2], 23: [1]}, [((Min, Max), [2])] ... ) True >>> ind.add_case(3, Range(lo=(57,1))) >>> dump_ranges(ind, ind.known_cases) == ( ... {57: [2], 19: [0, 2], 23: [1]}, ... [((Min, 57), [2]), ((57, Max), [2, 3])] ... ) True >>> ind.add_case(4, Range(lo=(57,-1))) >>> dump_ranges(ind, ind.known_cases) == ( ... {57: [2, 4], 19: [0, 2], 23: [1]}, ... [((Min, 57), [2]), ((57, Max), [2, 3, 4])] ... ) True Single Classes ============== ``Class`` objects represent ``issubclass()`` or ``isinstance()`` tests. ``Class(x)`` is an instance/subclass match, while ``Class(x, False)`` is a non-match:: >>> from peak.rules.indexing import TypeIndex >>> ind = TypeIndex(eng, 'by class') >>> ind.add_case(0, Class(int)) >>> ind.selectivity([0]) (2, 1) >>> dict(ind.expanded_sets()) == { ... int: [[0], []], object: [[], []] ... } True >>> class myint(int): pass >>> ind.add_case(1, Class(myint)) >>> ind.selectivity([1]) (3, 1) >>> dict(ind.expanded_sets()) == { ... int: [[0], []], object: [[], []], myint: [[0, 1], []] ... } True >>> ind.selectivity([0]) (3, 2) >>> ind.add_case(2, Class(int, False)) >>> ind.selectivity([2]) (3, 1) >>> dict(ind.expanded_sets()) == { ... int: [[0], []], object: [[2], []], myint: [[0, 1], []] ... } True >>> class other(object): pass >>> ind.add_case(3, Class(other)) >>> ind.selectivity([3]) (4, 1) >>> ind.selectivity([2]) (4, 2) >>> dict(ind.expanded_sets()) == { ... int: [[0], []], object: [[2], []], other: [[2, 3], []], ... myint: [[0, 1], []] ...
} True Multiple Classes ================ ``Conjunction`` objects can hold sets of 2 or more ``Class`` criteria, and represent the "and" of those criteria. This is the most complex type of criteria to index, because there's no easy way to incrementally update a set intersection:: >>> class a(object): pass >>> class b(object): pass >>> class c(a,b): pass >>> class d(b): pass >>> class e(a,d): pass >>> class f(e, c): pass >>> class g(c): pass >>> ind = TypeIndex(eng, 'classes') >>> ind.add_case(0, Conjunction([Class(a), Class(b)])) >>> ind.selectivity([0]) (3, 0) >>> dict(ind.expanded_sets()) == { ... a: [[], []], b: [[], []], object: [[], []] ... } True >>> ind.add_case(1, Class(c)) >>> ind.selectivity([0]) # c (4, 1) >>> ind.selectivity([1]) (4, 1) >>> dict(ind.expanded_sets()) == { ... a: [[], []], b: [[], []], c: [[0, 1], []], object: [[], []] ... } True >>> ind.add_case(2, Conjunction([Class(a), Class(b), Class(c, False)])) >>> ind.selectivity([2]) (4, 0) >>> dict(ind.expanded_sets()) == { ... a: [[], []], b: [[], []], c: [[0, 1], []], object: [[], []] ... } True >>> ind.add_case(3, Class(e)) >>> ind.selectivity([0]) # c, e (5, 2) >>> ind.selectivity([2]) (5, 1) >>> ind.selectivity([3]) (5, 1) >>> dict(ind.expanded_sets()) == { ... a: [[], []], b: [[], []], c: [[0, 1], []], object: [[], []], ... e: [[0, 2, 3], []] ... } True >>> ind.add_case(4, Class(f)) >>> ind.selectivity([0]) # c, e, f (6, 3) >>> ind.selectivity([2]) (6, 1) >>> dict(ind.expanded_sets()) == { ... a: [[], []], b: [[], []], c: [[0, 1], []], object: [[], []], ... e: [[0, 2, 3], []], f: [[0, 1, 3, 4], []] ... } True >>> ind.add_case(5, Class(g)) >>> ind.selectivity([0]) # c, e, f, g (7, 4) >>> ind.selectivity([2]) # still just 'e' (7, 1) >>> Conjunction([Class(d, False), Class(e, False)]) Class(, False) >>> ind.add_case(6, Conjunction([Class(d, False), Class(e, False)])) >>> ind.selectivity([6]) # all but d, e, f (8, 5) >>> dict(ind.expanded_sets()) == { ... a: [[6], []], b: [[6], []], c: [[0, 1, 6], []], object: [[6], []], ... d: [[], []], e: [[0, 2, 3], []], f: [[0, 1, 3, 4], []], ... g: [[0, 1, 5, 6], []], ... } True Reseeding ========= Due to the nature of multiple inheritance, it's sometimes necessary to re-seed an index by adding a new criterion, recomputing selectivity, and then getting new seed bits. Let's test an example by making a new class that inherits from ``a`` and ``b``:: >>> class x(a,b): pass This class isn't in the index, because we just created it. If it were only inheriting from one class, we could safely assume that a lookup of the base class would work just as well as looking up the actual class. However, since it inherits from more than one class, we don't know if there are multi-class criteria that might apply to it. (And in fact there are: cases 0 and 2 in our index apply to classes that inherit from both ``a`` and ``b``.) So, to update the index, we use the ``reseed()`` method:: >>> ind.reseed(Class(x)) It takes a criterion and an optional bitset representing the cases for which a new ``seed_bits`` map should be generated. It updates the index by checking the ``len()`` of every "applicable seeds" set, after adding the criterion to the appropriate caches. The result is that the index now includes correct information for the new ``x`` class:: >>> dict(ind.expanded_sets()) == { ... a: [[6], []], b: [[6], []], c: [[0, 1, 6], []], object: [[6], []], ... d: [[], []], e: [[0, 2, 3], []], f: [[0, 1, 3, 4], []], ... g: [[0, 1, 5, 6], []], x: [[0, 2, 6], []] ... 
} True Exact Types =========== Finally, let's try out some ``istype()`` and mixed class/classes/istype criteria, to make sure they interoperate:: >>> from peak.rules import istype >>> ind = TypeIndex(eng, 'types') >>> ind.add_case(0, istype(a)) >>> ind.selectivity([0]) (2, 1) >>> ind.add_case(1, istype(b, False)) >>> ind.selectivity([1]) (3, 2) >>> ind.selectivity([0,1]) (3, 3) >>> ind.add_case(2, Class(a)) >>> ind.selectivity([2]) (3, 1) >>> ind.selectivity([0,1,2]) (3, 4) >>> dict(ind.expanded_sets()) == { ... a:[[0, 1, 2], []], ... b:[[], []], ... object:[[1], []] ... } True >>> ind.add_case(3, Class(a, False)) >>> ind.selectivity([3]) (3, 2) >>> ind.selectivity([0,1,2,3]) (3, 6) >>> dict(ind.expanded_sets()) == { ... a:[[0, 1, 2], []], ... b:[[3], []], ... object:[[1, 3], []] ... } True >>> ind.add_case(4, Conjunction([Class(a), istype(c, False)])) >>> ind.selectivity([4]) (4, 1) >>> dict(ind.expanded_sets()) == { ... a:[[0, 1, 2, 4], []], ... b:[[3], []], ... c:[[1, 2], []], ... object:[[1, 3], []] ... } True >>> ind.reseed(Class(e)) >>> dict(ind.expanded_sets()) == { ... a:[[0, 1, 2, 4], []], ... b:[[3], []], ... c:[[1, 2], []], ... e:[[1, 2, 4], []], ... object:[[1, 3], []] ... } True >>> ind.add_case(5, istype(object)) Truth ===== ``TruthIndex`` treats boolean criteria as logical truth:: >>> from peak.rules.indexing import TruthIndex >>> ind = TruthIndex(eng, "z") >>> ind.add_case(0, Value(True)) >>> ind.selectivity([0]) (2, 1) >>> ind.add_case(1, Value(True, False)) >>> ind.selectivity([0,1]) (2, 2) >>> dict(ind.expanded_sets()) {False: [[1], []], True: [[0], []]} PEAK-Rules-0.5a1.dev-r2713/AST-Builder.txt0000775000076600007660000004405011430374112016655 0ustar telec3telec3============================================= Building Expressions from Python Syntax Trees ============================================= The ``ast_builder`` module allows you to quickly navigate a Python syntax tree and perform operations on it. While Python 2.5 has a new "AST" feature that provides a high-level syntax tree, older Python versions offer a very low-level interface that provides complex tuple trees with lots of redundant information. The ``ast_builder`` module simplifies these trees dramatically, without creating an intermediate AST data structure (the way the stdlib ``compiler`` package does). Instead, it allows you to effectively "visit" a virtual AST structure and generate your desired output directly. In addition, it allows you to skip, delay, or repeat traversals of arbitrary subtrees. This document describes the design (and tests the implementation) of the ``ast_builder`` module. You don't need to read it unless you want to use this module directly in your own programs. If you do want to use it directly, you should keep in mind that it currently only implements a **subset** of Python *expression* syntax: it does not support lambdas, yield expressions, or any kind of statements. .. contents:: **Table of Contents** ------------------------ Parse Trees and Builders ------------------------ ``ast_builder`` operates on parse tuple trees, as created by the standard library ``parser`` module. The two API functions it provides are ``build`` and ``parse_expr``:: >>> from peak.rules.ast_builder import build, parse_expr The ``build()`` function accepts two arguments, a "builder" and a "nodelist". A "builder" is an object that you supply that will perform actions on nodes in the parse tree. The "nodelist" is a parse tuple tree. 
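For instance, one way to obtain such a nodelist by hand is with the stdlib ``parser`` module mentioned above (a sketch for illustration only; the exact tuple contents vary by Python version, which is part of what ``ast_builder`` shields you from)::

    >>> import parser
    >>> nodelist = parser.expr("1+2").totuple()  # nested tuples of grammar codes
    >>> type(nodelist)
    <type 'tuple'>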
As a shortcut, you can use ``parse_expr()`` to parse a string into a nodelist and invoke ``build()`` in one step. A simple example:: >>> class Builder: ... def Const(self, const): ... print const >>> parse_expr("1", Builder()) 1 >>> parse_expr("'foo'", Builder()) foo As you can see, the builder's ``Const()`` method is invoked for integer and string constants, and it receives an actual value. Many other method names are used for more complex operations:: >>> class Builder: ... def Const(self, const): ... return repr(const) ... def Add(self, left, right): ... return "Add(%s, %s)" % (build(self,left), build(self,right)) >>> parse_expr("1+2", Builder()) 'Add(1, 2)' >>> parse_expr("'foo'+'bar'", Builder()) "Add('foo', 'bar')" Most builder methods accept nodelists as arguments. These nodelists can be recursively passed to ``build()`` in order to process expression subtrees. This is not done automatically, because it's possible you might want to skip processing of a particular subtree, or need to process a subtree with a different builder than the one currently in use, or even process a subtree with more than one builder (e.g. a builder that sees what names are bound within a function body, and a second builder to generate code). For convenience in the rest of this document, we'll use a shorthand function to create a ``Builder()``, parse an expression, and print the result:: >>> def pe(expr): ... print parse_expr(expr, Builder()) Tokens ====== The ``Const`` and ``Name`` methods receive token values for constants and names respectively:: >>> class Builder: ... def Const(self, const): ... return repr(const) ... ... def Name(self, name): ... return name >>> pe("a") a >>> pe("b") b >>> pe("123") 123 >>> pe("'xyz'") 'xyz' Note that adjacent string constants are automatically merged:: >>> pe("'abc' 'xyz'") 'abcxyz' Unary Operators =============== There are five "unary" operator methods, that take a single argument: an AST tuple representing the expression the operator is applied to:: >>> class Builder(Builder): ... def unaryOp(fmt): ... def method(self,expr): ... return fmt % build(self,expr) ... return method ... ... UnaryPlus = unaryOp('Plus(%s)') ... UnaryMinus = unaryOp('Minus(%s)') ... Invert = unaryOp('Invert(%s)') ... Backquote = unaryOp('repr(%s)') ... Not = unaryOp('Not(%s)') >>> pe("not - + ~`x`") Not(Minus(Plus(Invert(repr(x))))) Attribute Access ================ The ``Getattr()`` method is called with a node and a string; the node is the base expression, and the string is the attribute that was accessed:: >>> class Builder(Builder): ... def Getattr(self, expr, attr): ... return 'Getattr(%s,%r)' % (build(self,expr), attr) >>> pe("a.b") Getattr(a,'b') >>> pe("a.b.c") Getattr(Getattr(a,'b'),'c') Simple Binary Operators ======================= There are 10 "simple" binary operator methods, that take a pair of left and right nodelists as arguments:: >>> class Builder(Builder): ... def mkBinOp(op): ... pat = op + '(%s,%s)' ... def method(self, left, right): ... return pat % (build(self,left), build(self,right)) ... return method ... ... Add = mkBinOp('Add') ... Sub = mkBinOp('Sub') ... Mul = mkBinOp('Mul') ... Div = mkBinOp('Div') ... Mod = mkBinOp('Mod') ... FloorDiv = mkBinOp('FloorDiv') ... Power = mkBinOp('Power') ... LeftShift = mkBinOp('LeftShift') ... RightShift = mkBinOp('RightShift') ... 
Subscript = mkBinOp('Subscript') Most of these operators correspond to normal Python binary operators:: >>> pe("a+b") Add(a,b) >>> pe("b-a") Sub(b,a) >>> pe("c*d") Mul(c,d) >>> pe("c/d") Div(c,d) >>> pe("c%d") Mod(c,d) >>> pe("c//d") FloorDiv(c,d) >>> pe("a**b") Power(a,b) >>> pe("a<<b") LeftShift(a,b) >>> pe("a>>b") RightShift(a,b) >>> pe("a[1]") Subscript(a,1) >>> pe("a[1][2]") Subscript(Subscript(a,1),2) By the way, ``Ellipsis`` is also handled by the ``Const`` method, in the case where you have an expression like ``foo[...]``:: >>> pe("a[...]") Subscript(a,Ellipsis) "List" Operators ================ The 7 "list operator" methods take a single argument: a sequence of nodes that represent a list of expressions delimited by the corresponding operator:: >>> class Builder(Builder): ... def multiOp(fmt,sep=','): ... def method(self,items): ... return fmt % sep.join([build(self,item) for item in items]) ... return method ... ... And = multiOp('And(%s)') ... Or = multiOp('Or(%s)') ... Tuple = multiOp('Tuple(%s)') ... List = multiOp('List(%s)') ... Bitor = multiOp('Bitor(%s)') ... Bitxor = multiOp('Bitxor(%s)') ... Bitand = multiOp('Bitand(%s)') >>> pe("a and b") And(a,b) >>> pe("a or b") Or(a,b) >>> pe("a and b and c") And(a,b,c) >>> pe("a or b or c") Or(a,b,c) >>> pe("a and b and c and d") And(a,b,c,d) >>> pe("a or b or c or d") Or(a,b,c,d) >>> pe("a&b&c") Bitand(a,b,c) >>> pe("a|b|c") Bitor(a,b,c) >>> pe("a^b^c") Bitxor(a,b,c) >>> pe("a&b&c&d") Bitand(a,b,c,d) >>> pe("a|b|c|d") Bitor(a,b,c,d) >>> pe("a^b^c^d") Bitxor(a,b,c,d) Tuples ------ No parens:: >>> pe("a,") Tuple(a) >>> pe("a,b") Tuple(a,b) >>> pe("a,b,c") Tuple(a,b,c) >>> pe("a,b,c,") Tuple(a,b,c) With parens:: >>> pe("()") Tuple() >>> pe("(a)") a >>> pe("(a,)") Tuple(a) >>> pe("(a,b)") Tuple(a,b) >>> pe("(a,b,)") Tuple(a,b) >>> pe("(a,b,c)") Tuple(a,b,c) >>> pe("(a,b,c,)") Tuple(a,b,c) Lists ----- :: >>> pe("[]") List() >>> pe("[a]") List(a) >>> pe("[a,]") List(a) >>> pe("[a,b]") List(a,b) >>> pe("[a,b,]") List(a,b) >>> pe("[a,b,c]") List(a,b,c) >>> pe("[a,b,c,]") List(a,b,c) Slicing ======= The ``Slice2`` method takes two arguments: a start and stop value. Each is either a node list or ``None`` (in which case there is no expression for that part of the slice):: >>> class Builder(Builder): ... def Slice2(self,start,stop): ... txt = 'Slice(' ... if start: ... txt += build(self, start) ... txt += ':' ... if stop: ... txt += build(self, stop) ... return txt+')' >>> pe("a[:]") Subscript(a,Slice(:)) >>> pe("a[1:2]") Subscript(a,Slice(1:2)) >>> pe("a[1:]") Subscript(a,Slice(1:)) >>> pe("a[:2]") Subscript(a,Slice(:2)) The ``Slice3`` method is similar, but takes three arguments:: >>> class Builder(Builder): ... def Slice3(self,start,stop,stride): ... txt = 'Slice(' ... if start: ... txt += build(self, start) ... txt += ':' ... if stop: ... txt += build(self, stop) ... txt += ':' ... if stride: ... txt += build(self, stride) ... return txt+')' >>> pe("a[::]") Subscript(a,Slice(::)) >>> pe("a[1::]") Subscript(a,Slice(1::)) >>> pe("a[:2:]") Subscript(a,Slice(:2:)) >>> pe("a[1:2:]") Subscript(a,Slice(1:2:)) >>> pe("a[::3]") Subscript(a,Slice(::3)) >>> pe("a[1::3]") Subscript(a,Slice(1::3)) >>> pe("a[:2:3]") Subscript(a,Slice(:2:3)) >>> pe("a[1:2:3]") Subscript(a,Slice(1:2:3)) Comparisons and Conditional Expressions ======================================= The ``Compare`` method receives two arguments: a node for the first expression to be compared, followed by a list of ``(op, expr)`` tuples for subsequent comparisons.
Comparisons and Conditional Expressions
=======================================

The ``Compare`` method receives two arguments: a node for the first
expression to be compared, followed by a list of ``(op, expr)`` tuples for
the subsequent comparisons.  The ``op`` value is a string representing the
comparison operator used, and each ``expr`` is a node::

    >>> class Builder(Builder):
    ...     def Compare(self,initExpr,comparisons):
    ...         data = [build(self,initExpr)]
    ...         for op,val in comparisons:
    ...             data.append(op)
    ...             data.append(build(self,val))
    ...         return 'Compare(%s)' % ' '.join(data)

    >>> pe("a>b")
    Compare(a > b)
    >>> pe("a>=b")
    Compare(a >= b)
    >>> pe("a<b")
    Compare(a < b)
    >>> pe("a<=b")
    Compare(a <= b)
    >>> pe("a<>b")
    Compare(a <> b)
    >>> pe("a!=b")
    Compare(a != b)
    >>> pe("a==b")
    Compare(a == b)
    >>> pe("a in b")
    Compare(a in b)
    >>> pe("a is b")
    Compare(a is b)
    >>> pe("a not in b")
    Compare(a not in b)
    >>> pe("a is not b")
    Compare(a is not b)


N-Way Comparisons
-----------------

If you don't want to have to process N-way comparisons in your builder, you
can set the ``simplify_comparisons`` flag on your class to ``True``, and
N-way comparisons will be converted to ``and`` expressions::

    >>> Builder.simplify_comparisons = True
    >>> pe("1<2<3")
    And(Compare(1 < 2),Compare(2 < 3))

Otherwise, you can set the flag to ``False``, and you will receive N-way
comparisons as additional values in the list of ``(op, expr)`` tuples::

    >>> Builder.simplify_comparisons = False
    >>> pe("1<2<3")
    Compare(1 < 2 < 3)
    >>> pe("a>=b>c<d")
    Compare(a >= b > c < d)

Note that you *must* explicitly set ``simplify_comparisons`` to either a
true or a false value; there is no default.


Conditional Expressions
-----------------------

The ``IfElse`` method receives three arguments: a node for the "true" value,
a node for the condition, and a node for the "false" value::

    >>> class Builder(Builder):
    ...     def IfElse(self, trueVal, condition, falseVal):
    ...         return 'IfElse(%s, %s, %s)' % (
    ...             build(self, trueVal), build(self, condition),
    ...             build(self, falseVal)
    ...         )

    >>> import sys
    >>> if sys.version>='2.5':
    ...     pe("a if b else c")
    ... else:
    ...     print "IfElse(a, b, c)"
    IfElse(a, b, c)


List Comprehensions and Generator Expressions
=============================================

The ``ListComp`` and ``GenExpr`` methods receive two arguments: a node for
the output expression, and a list of ``(op, node)`` tuples, where `op` is
the name of an operator (either "for", "in", or "if"), and `node` is the
node corresponding to the operator's argument::

    >>> class Builder(Builder):
    ...     def ListComp(self, initExpr, clauses):
    ...         data = [build(self,initExpr)]
    ...         for op,val in clauses:
    ...             data.append(op)
    ...             data.append(build(self,val))
    ...         return 'ListComp(%s)' % ' '.join(data)
    ...
    ...     def GenExpr(self, initExpr, clauses):
    ...         data = [build(self,initExpr)]
    ...         for op,val in clauses:
    ...             data.append(op)
    ...             data.append(build(self,val))
    ...         return 'GenExpr(%s)' % ' '.join(data)

    >>> pe("[x for x in y if z]")
    ListComp(x for x in y if z)

    >>> pe("[(x+1, 42) for x in y for y in z if q>z]")
    ListComp(Tuple(Add(x,1),42) for x in y for y in z if Compare(q > z))

    >>> pe("[x+1 for x in y if x in z for y in q if r if p]")
    ListComp(Add(x,1) for x in y if Compare(x in z) for y in q if r if p)

    >>> if sys.version>='2.4':
    ...     pe("(x for x in y if z)")
    ... else:
    ...     print "GenExpr(x for x in y if z)"
    GenExpr(x for x in y if z)

Note, by the way, that when you're building the "for" clause assignments,
you'll need to handle arbitrary assignment targets (e.g. tuple unpacking)::

    >>> pe("[x for y, x in z]")
    ListComp(x for Tuple(y,x) in z)

    >>> pe("[x.y for x.y in z]")
    ListComp(Getattr(x,'y') for Getattr(x,'y') in z)

    >>> pe("[x[y] for x[y] in z]")
    ListComp(Subscript(x,y) for Subscript(x,y) in z)

(Normally, you would handle this by passing the "for" clauses to a different
builder instance that's set up to handle calls to ``Name``, ``Getattr``,
``Tuple``, etc. by generating assignments instead of lookups.)
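As a minimal sketch of that idea, we can subclass the current ``Builder``
and override only ``Name``.  ``TargetBuilder`` is hypothetical (a real code
generator would emit store operations rather than formatted strings), but it
shows how a separate builder instance can reinterpret target expressions::

    >>> class TargetBuilder(Builder):
    ...     def Name(self, name):   # treat names as assignment targets
    ...         return 'Bind(%s)' % name

    >>> print parse_expr("x, y", TargetBuilder())
    Tuple(Bind(x),Bind(y))
    >>> print parse_expr("x.y", TargetBuilder())
    Getattr(Bind(x),'y')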
Dictionaries
============

The ``Dict`` method takes one argument: a list of ``(key, value)`` tuples,
where both the keys and values are expression nodes::

    >>> class Builder(Builder):
    ...     def Dict(self, items):
    ...         return '{%s}' % ','.join([
    ...             '%s:%s' % (build(self,k),build(self,v)) for k,v in items
    ...         ])

    >>> pe("{ (a,b):c+d, e:[f,g] }")
    {Tuple(a,b):Add(c,d),e:List(f,g)}


Calls
=====

The ``CallFunc`` method takes five arguments:

`func`
    The expression to be called.

`args`
    A list of positional argument expression nodes.

`kw`
    A list of ``(argname, value)`` expression node pairs.

`star_node`
    The node for the ``*args`` expression, or ``None`` if there isn't one.

`dstar_node`
    The node for the ``**kw`` expression, or ``None`` if there isn't one.

::

    >>> class Builder(Builder):
    ...     def CallFunc(self, func, args, kw, star_node, dstar_node):
    ...         if star_node:
    ...             star_node = build(self, star_node)
    ...         else:
    ...             star_node = 'None'
    ...         if dstar_node:
    ...             dstar_node = build(self, dstar_node)
    ...         else:
    ...             dstar_node = 'None'
    ...         return 'Call(%s,%s,%s,%s,%s)' % (
    ...             build(self,func), self.Tuple(args), self.Dict(kw),
    ...             star_node, dstar_node
    ...         )

    >>> pe("a()")
    Call(a,Tuple(),{},None,None)
    >>> pe("a(1,2)")
    Call(a,Tuple(1,2),{},None,None)
    >>> pe("a(1,2,)")
    Call(a,Tuple(1,2),{},None,None)
    >>> pe("a(b=3)")
    Call(a,Tuple(),{'b':3},None,None)
    >>> pe("a(1,2,b=3)")
    Call(a,Tuple(1,2),{'b':3},None,None)
    >>> pe("a(*x)")
    Call(a,Tuple(),{},x,None)
    >>> pe("a(1,*x)")
    Call(a,Tuple(1),{},x,None)
    >>> pe("a(b=3,*x)")
    Call(a,Tuple(),{'b':3},x,None)
    >>> pe("a(1,2,b=3,*x)")
    Call(a,Tuple(1,2),{'b':3},x,None)
    >>> pe("a(**y)")
    Call(a,Tuple(),{},None,y)
    >>> pe("a(1,**y)")
    Call(a,Tuple(1),{},None,y)
    >>> pe("a(b=3,**y)")
    Call(a,Tuple(),{'b':3},None,y)
    >>> pe("a(1,2,b=3,**y)")
    Call(a,Tuple(1,2),{'b':3},None,y)
    >>> pe("a(*x,**y)")
    Call(a,Tuple(),{},x,y)
    >>> pe("a(1,*x,**y)")
    Call(a,Tuple(1),{},x,y)
    >>> pe("a(b=3,*x,**y)")
    Call(a,Tuple(),{'b':3},x,y)
    >>> pe("a(1,2,b=3,*x,**y)")
    Call(a,Tuple(1,2),{'b':3},x,y)

    >>> if sys.version>='2.4':
    ...     pe("a(x for x in y if z)")
    ...     pe("a(x for x in y if z, q)")
    ... else:
    ...     print "Call(a,Tuple(GenExpr(x for x in y if z)),{},None,None)"
    ...     print "Call(a,Tuple(GenExpr(x for x in y if z),q),{},None,None)"
    Call(a,Tuple(GenExpr(x for x in y if z)),{},None,None)
    Call(a,Tuple(GenExpr(x for x in y if z),q),{},None,None)
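Note that the ``argname`` in each ``kw`` pair is itself an expression node
(keyword names are parsed as string constants), not a bare string; that's
why the ``CallFunc`` method above can hand the pairs straight to ``Dict``.
A quick way to see this, using a throwaway sketch of our own (``ShowKw`` is
not part of ``ast_builder``)::

    >>> class ShowKw(Builder):
    ...     def CallFunc(self, func, args, kw, star_node, dstar_node):
    ...         # build just the keyword-name nodes; Const repr()s them
    ...         return '|'.join([build(self,name) for name, value in kw])

    >>> print parse_expr("a(b=3, c=4)", ShowKw())
    'b'|'c'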
Miscellaneous Tests
===================

An interesting quirk of the AST module is that it supports parsing some
calls that *should* be syntax errors.  The ``ast_builder`` module thus has
to trap these itself::

    >>> pe("a(1=2)")    # expr as kw
    Traceback (most recent call last):
    ...
    SyntaxError: keyword can't be an expression (...)

    >>> pe("a(b=2,c)")
    Traceback (most recent call last):
    ...
    SyntaxError: non-keyword arg after keyword arg

Most of Python's operator associativity and precedence is grammar-driven,
but certain parts have to be handled by ``ast_builder``.  These are just
some tests to make sure that associativity is correct::

    >>> pe("a+b+c")
    Add(Add(a,b),c)
    >>> pe("a*b*c")
    Mul(Mul(a,b),c)
    >>> pe("a/b/c")
    Div(Div(a,b),c)
    >>> pe("a//b//c")
    FloorDiv(FloorDiv(a,b),c)
    >>> pe("a%b%c")
    Mod(Mod(a,b),c)
    >>> pe("a<<b<<c")
    LeftShift(LeftShift(a,b),c)
    >>> pe("a>>b>>c")
    RightShift(RightShift(a,b),c)
    >>> pe("a()()")
    Call(Call(a,Tuple(),{},None,None),Tuple(),{},None,None)
    >>> pe("a**b**c")   # power is right-associative
    Power(a,Power(b,c))
    >>> pe("5*x**2 + 4*x + -1")
    Add(Add(Mul(5,Power(x,2)),Mul(4,x)),Minus(1))
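Finally, as a closing illustration of how the pieces fit together, here is a
minimal sketch of an arithmetic evaluator built on the same builder
protocol.  ``Calc`` is our own example, not part of ``ast_builder``; it
implements only the handful of methods this expression needs::

    >>> class Calc:
    ...     simplify_comparisons = True  # must be set explicitly; unused here
    ...     def Const(self, const):
    ...         return const
    ...     def Add(self, left, right):
    ...         return build(self,left) + build(self,right)
    ...     def Sub(self, left, right):
    ...         return build(self,left) - build(self,right)
    ...     def Mul(self, left, right):
    ...         return build(self,left) * build(self,right)
    ...     def Power(self, left, right):
    ...         return build(self,left) ** build(self,right)
    ...     def UnaryMinus(self, expr):
    ...         return -build(self,expr)

    >>> parse_expr("5*2**3 + 4*2 + -1", Calc())
    47

Because the parse is driven entirely by callbacks, the same expression
grammar can produce formatted strings, evaluated values, or generated code,
depending solely on which builder you pass in.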