numpysane-0.35/.gitignore:

*.pyc
*~
debian/*.log
dist/
MANIFEST
.pybuild
README
build/
*.egg-info
*.d
*.o
*.i
*.s
*.so
*/*GENERATED.c

numpysane-0.35/Changes:

numpysane (0.35)

  * added broadcast_extra_dims()

 -- Dima Kogan  Wed, 14 Jul 2021 02:41:25 -0700

numpysane (0.34)

  * glue(np.array(()), x) and glue(x, np.array(())) both work

 -- Dima Kogan  Thu, 27 May 2021 17:03:33 -0700

numpysane (0.33)

  * matmult() supports in-place output via the "out" keyword argument

 -- Dima Kogan  Thu, 15 Apr 2021 18:15:46 -0700

numpysane (0.32)

  * glue(): minor bug fix

    Prior to this fix this would happen:

      print( nps.glue(np.array(()), np.arange(5,), axis=-2).shape )
      ---> (5,)

    This is unintuitive because I glued something along dimension -2, but
    the result doesn't even have such a dimension. This patch calls
    atleast_dims() right before glue() returns, so that we get a shape
    (1,5) instead in this case

 -- Dima Kogan  Thu, 25 Mar 2021 18:52:19 -0700

numpysane (0.31)

  * broadcast_define(): better out_kwarg logic. If the output is written
    in-place, the inner function doesn't need to return anything. And the
    broadcast_define()-wrapped function automatically returns the right
    thing

 -- Dima Kogan  Wed, 03 Feb 2021 12:51:54 -0800

numpysane (0.30)

  * numpysane_pywrap can find its C templates even if installed via pip
  * numpysane.mag() can take a dtype keyword argument
  * mag(), inner(), norm2() now use the given dtype for all their
    computations, so selecting an appropriate dtype can prevent overflows

 -- Dima Kogan  Mon, 01 Feb 2021 15:31:03 -0800

numpysane (0.29)

  * numpysane_pywrap: module docstrings can span multiple lines

 -- Dima Kogan  Tue, 17 Nov 2020 12:09:00 -0800

numpysane (0.28)

  * more precise logic for size-0 concatenation. I can once again
    accumulate arrays from np.array(())

 -- Dima Kogan  Wed, 23 Sep 2020 13:06:08 -0700

numpysane (0.27)

  * numpysane_pywrap: the item__...() macro works with non-trivial
    arguments

 -- Dima Kogan  Mon, 21 Sep 2020 14:09:17 -0700

numpysane (0.26)

  * glue() and cat() handle size-0 arrays better
  * numpysane_pywrap: size-0 arrays are always deemed contiguous

 -- Dima Kogan  Sat, 19 Sep 2020 20:13:05 -0700

numpysane (0.25)

  * nps.dummy() supports multiple axes given at once, so I can do
    something like nps.dummy(x, -2, -2)
  * numpysane_pywrap: generated code can use ctype__NAME and item__NAME to
    simplify handling of non-contiguous data

 -- Dima Kogan  Sat, 05 Sep 2020 13:52:19 -0700

numpysane (0.24)

  * C broadcasting: I can pass strings in the extra, non-broadcastable
    arguments
  * C broadcasting: added support for precomputed cookies to do as much of
    the work as possible outside of the slice loop

 -- Dima Kogan  Fri, 19 Jun 2020 10:55:47 -0700

numpysane (0.23)

  * Bug fix: C broadcasting doesn't write to uninitialized memory when
    given a size-0 matrix

 -- Dima Kogan  Fri, 12 Jun 2020 19:16:25 -0700

numpysane (0.22)

  * broadcast_define() and the generated functions check their arguments
    for validity more thoroughly
  * outer() doesn't require identically-dimensioned input
  * mass rewrite of the documentation
  * Added C-level broadcasting
  * License change: any version of the LGPL instead of LGPL-3+

 -- Dima Kogan  Sat, 14 Mar 2020 23:40:29 -0700

numpysane (0.20)

  * nps.matmult(..., out=out) produces in-place results when one of the
    arguments is 1D

 -- Dima Kogan  Sat, 30 Nov 2019 18:20:49 -0800

numpysane (0.19)

  * Added mag() convenience function: mag(x) = sqrt(norm2(x))
  * Initial support for C-level broadcasting

 -- Dima Kogan  Thu, 28 Nov 2019 18:50:02 -0800

numpysane-0.35/LICENSE:

Copyright 2016-2020 Dima Kogan.

This program is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License (any version) as
published by the Free Software Foundation. See
https://www.gnu.org/licenses/lgpl.html

numpysane-0.35/MANIFEST.in:

include README
include test/test-numpysane.py

numpysane-0.35/Makefile:

########## For the test suite.
To build the extension module for testing the ########## broadcasting-in-C # Minimal part of https://github.com/dkogan/mrbuild to provide python Makefile # rules # # I support both python2 and python3, but only one at a time PYTHON_VERSION_FOR_EXTENSIONS := 3 include Makefile.common.header # I build a python extension module called "testlib" from the C library # (testlib) and from the numpysane_pywrap wrapper. The wrapper is generated with # genpywrap.py test/testlib$(PY_EXT_SUFFIX): test/testlib_pywrap_GENERATED.o test/testlib.o $(PY_MRBUILD_LINKER) $(PY_MRBUILD_LDFLAGS) $^ -o $@ test/testlib_pywrap_GENERATED.o: CFLAGS += $(PY_MRBUILD_CFLAGS) CC ?= gcc CFLAGS += -g %.o:%.c $(CC) -Wall -Wextra $(CFLAGS) $(CPPFLAGS) -c -o $@ $< test/testlib_pywrap_GENERATED.c: test/genpywrap.py numpysane_pywrap.py $(wildcard pywrap-templates/*.c) ./$< > $@ # In the python api I have to cast a PyCFunctionWithKeywords to a PyCFunction, # and the compiler complains. But that's how Python does it! So I tell the # compiler to chill test/testlib_pywrap_GENERATED.o: CFLAGS += -Wno-cast-function-type test/testlib_pywrap_GENERATED.o: test/testlib.h CFLAGS += -Wno-missing-field-initializers clean: rm -rf test/*.[do] test/*.o test/*.so test/*.so.* test/testlib_pywrap_GENERATED.c README.org README .PHONY: clean ####### Everything non-extension-module related .DEFAULT_GOAL := all all: README README.org README-pywrap README-pywrap.org # a multiple-target pattern rule means that a single invocation of the command # builds all the targets, which is what I want here %EADME %EADME.org: numpysane.py README.footer.org extract_README.py python3 extract_README.py numpysane README.org README README.footer.org %EADME-pywrap %EADME-pywrap.org: numpysane_pywrap.py README.footer.org extract_README.py python3 extract_README.py numpysane_pywrap README-pywrap.org README-pywrap README.footer.org test: test2 test3 check: check2 check3 check2: test2 check3: test3 test2 test3: test/test-numpysane.py 
test-c-broadcasting python$(patsubst test%,%,$@) test/test-numpysane.py test-c-broadcasting: test/testlib$(PY_EXT_SUFFIX) python${PYTHON_VERSION_FOR_EXTENSIONS} test/test-c-broadcasting.py .PHONY: check check2 check3 test test2 test3 test-c-broadcasting DIST_VERSION := $(or $(shell < numpysane.py perl -ne "if(/__version__ = '(.*)'/) { print \$$1; exit}"), $(error "Couldn't parse the distribution version")) DIST := dist/numpysane-$(DIST_VERSION).tar.gz $(DIST): README # make distribution tarball $(DIST): python3 setup.py sdist .PHONY: $(DIST) # rebuild it unconditionally dist: $(DIST) .PHONY: dist # make and upload the distribution tarball dist_upload: $(DIST) twine upload --verbose $(DIST) .PHONY: dist_upload numpysane-0.35/Makefile.common.header000066400000000000000000000132321407353053200176410ustar00rootroot00000000000000# -*- Makefile -*- # This is a part of the mrbuild project: https://github.com/dkogan/mrbuild # # Released under an MIT-style license. Modify and distribute as you like: # # Copyright 2016-2019 California Institute of Technology # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell # copies of the Software, and to permit persons to whom the Software is # furnished to do so, subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE # SOFTWARE. # This stuff defines variables (PY_EXT_SUFFIX) that could be used by the user # Makefile at parsing time. So this must be included BEFORE the rest of the user # Makefile PYTHON_VERSION_FOR_EXTENSIONS ?= 3 # 2 or 3 # Flags for python extension modules. See # http://notes.secretsauce.net/notes/2017/11/14_python-extension-modules-without-setuptools-or-distutils.html # # I build the python extension module without any setuptools or anything. # Instead I ask python about the build flags it likes, and build the DSO # normally using those flags. # # There's some sillyness in Make I need to work around. First, I produce a # python script to query the various build flags, but replacing all whitespace # with __whitespace__. The string I get when running this script will then have # a number of whitespace-separated tokens, each setting ONE variable # # I set up a number of variables: # # These come from Python queries. I ask Python about XXX and store the result # into PY_XXX # # PY_CC # PY_CFLAGS # PY_CCSHARED # PY_INCLUDEPY # PY_BLDSHARED # PY_LDFLAGS # PY_EXT_SUFFIX # PY_MULTIARCH # # These process the above into a single set of CFLAGS: # # PY_MRBUILD_CFLAGS # # These process the above into a single set of LDFLAGS: # # PY_MRBUILD_LDFLAGS # # These process the above into a DSO-building linker command # # PY_MRBUILD_LINKER # # When the user Makefile evaluates ANY of these variables I query python, and # memoize the results. So the python is invoked at MOST one time. Any Makefiles # that don't touch the PY_... 
variables will not end up invoking the python # thing at all # # Variables to ask Python about _PYVARS_LIST := CC CFLAGS CCSHARED INCLUDEPY BLDSHARED BLDLIBRARY LDFLAGS EXT_SUFFIX MULTIARCH # Python script to query those variables define _PYVARS_SCRIPT from __future__ import print_function import sysconfig import re conf = sysconfig.get_config_vars() for v in ($(foreach v,$(_PYVARS_LIST),"$v",)): if v in conf: print(re.sub("[\t ]+", "__whitespace__", "_PY_{}:={}".format(v, conf[v]))) endef # I eval this to actually invoke the Python and to ingest its results. I only # eval this ONLY when necessary. define query_python_extension_building_flags _PYVARS := $$(shell python$(PYTHON_VERSION_FOR_EXTENSIONS) -c '$$(_PYVARS_SCRIPT)') # I then $(eval) these tokens one at a time, restoring the whitespace $$(foreach setvarcmd,$$(_PYVARS),$$(eval $$(subst __whitespace__, ,$$(setvarcmd)))) # pull out flags from CC, throw out the compiler itself, since I know better _FLAGS_FROM_PYCC := $$(wordlist 2,$$(words $$(_PY_CC)),$$(_PY_CC)) _PY_MRBUILD_CFLAGS := $$(filter-out -O%,$$(_FLAGS_FROM_PYCC) $$(_PY_CFLAGS) $$(_PY_CCSHARED) -I$$(_PY_INCLUDEPY)) # I add an RPATH to the python extension DSO so that it runs in-tree. Will pull # it out at install time _PY_MRBUILD_LDFLAGS := $$(_PY_LDFLAGS) -L$$(abspath .) -Wl,-rpath=$$(abspath .) _PY_MRBUILD_LINKER := $$(_PY_BLDSHARED) $$(_PY_BLDLIBRARY) endef # List of variables a user Makefile could touch _PYVARS_API := $(foreach v,$(_PYVARS_LIST),PY_$v) PY_MRBUILD_CFLAGS PY_MRBUILD_LDFLAGS PY_MRBUILD_LINKER # The first time the user touches these variables, ask Python. Each subsequent # time, use the previously-returned value. So we query Python at most once. If a # project isn't using the Python extension modules, we will not query Python at # all # # I handle all the Python API variables identically, except for PY_EXT_SUFFIX. # If Python gives me a suffix, Iuse it (this is available in python3; it has # ABI, architecture details). 
Otherwise, I try the multiarch suffix, or if even # THAT isn't available, just do .so. I need to handle it specially to make the # self-referential logic work with the memoization logic define _PY_DEFINE_API_VAR $1 = $$(or $$(_$1),$$(eval $$(query_python_extension_building_flags))$$(_$1)) endef define _PY_DEFINE_API_VAR_EXTSUFFIX $1 = $$(or $$(_$1),$$(eval $$(query_python_extension_building_flags))$$(or $$(_$1),$$(if $$(PY_MULTIARCH),.$$(PY_MULTIARCH)).so)) endef $(foreach v,$(filter-out PY_EXT_SUFFIX,$(_PYVARS_API)),$(eval $(call _PY_DEFINE_API_VAR,$v))) $(eval $(call _PY_DEFINE_API_VAR_EXTSUFFIX,PY_EXT_SUFFIX)) # Useful to pull in a local build of some library. Sets the compiler and linker # (runtime and build-time) flags. Invoke like this: # $(eval $(call add_local_library_path,/home/user/library)) define add_local_library_path CFLAGS += -I$1 CXXFLAGS += -I$1 LDFLAGS += -L$1 -Wl,-rpath=$1 endef numpysane-0.35/README-pywrap.org000066400000000000000000000740041407353053200164550ustar00rootroot00000000000000* TALK I just gave a talk about this at [[https://www.socallinuxexpo.org/scale/18x][SCaLE 18x]]. Presentation lives [[https://github.com/dkogan/talk-numpysane-gnuplotlib/raw/master/numpysane-gnuplotlib.pdf][here]]. * NAME numpysane_pywrap: Python-wrap C code with broadcasting awareness * SYNOPSIS Let's implement a broadcastable and type-checked inner product that is - Written in C (i.e. it is fast) - Callable from python using numpy arrays (i.e. it is convenient) We write a bit of python to generate the wrapping code. 
"genpywrap.py":

#+BEGIN_EXAMPLE
import numpy     as np
import numpysane as nps
import numpysane_pywrap as npsp

m = npsp.module( name      = "innerlib",
                 docstring = "An inner product module in C")

m.function( "inner",
            "Inner product pywrapped with npsp",
            args_input       = ('a', 'b'),
            prototype_input  = (('n',), ('n',)),
            prototype_output = (),
            Ccode_slice_eval = \
                {np.float64:
                 r"""
                 double* out = (double*)data_slice__output;
                 const int N = dims_slice__a[0];
                 *out = 0.0;
                 for(int i=0; i<N; i++)
                   *out +=
                     *(const double*)(data_slice__a + i*strides_slice__a[0]) *
                     *(const double*)(data_slice__b + i*strides_slice__b[0]);
                 return true;
                 """})

m.write()
#+END_EXAMPLE

We run this to generate the wrapper sources:

#+BEGIN_EXAMPLE
python3 genpywrap.py > inner_pywrap.c
#+END_EXAMPLE

We build this into a python module:

#+BEGIN_EXAMPLE
COMPILE=(`python3 -c "
import sysconfig
conf = sysconfig.get_config_vars()
print('{} {} {} -I{}'.format(*[conf[x] for x in ('CC',
                                                 'CFLAGS',
                                                 'CCSHARED',
                                                 'INCLUDEPY')]))"`)
LINK=(`python3 -c "
import sysconfig
conf = sysconfig.get_config_vars()
print('{} {} {}'.format(*[conf[x] for x in ('BLDSHARED',
                                            'BLDLIBRARY',
                                            'LDFLAGS')]))"`)
EXT_SUFFIX=`python3 -c "
import sysconfig
print(sysconfig.get_config_vars('EXT_SUFFIX')[0])"`

${COMPILE[@]} -c -o inner_pywrap.o inner_pywrap.c
${LINK[@]} -o innerlib$EXT_SUFFIX inner_pywrap.o
#+END_EXAMPLE

Here we used the build commands directly. This could be done with
setuptools/distutils instead; it's a normal extension module. And now we
can compute broadcasted inner products from a python script "tst.py":

#+BEGIN_EXAMPLE
import numpy as np
import innerlib
print(innerlib.inner( np.arange(4, dtype=float),
                      np.arange(8, dtype=float).reshape( 2,4)))
#+END_EXAMPLE

Running it to compute inner([0,1,2,3],[0,1,2,3]) and
inner([0,1,2,3],[4,5,6,7]):

#+BEGIN_EXAMPLE
$ python3 tst.py
[14. 38.]
#+END_EXAMPLE

* DESCRIPTION

This module provides routines to python-wrap existing C code by generating
C sources that define the wrapper python extension module.

To create the wrappers we

1. Instantiate a new numpysane_pywrap.module class
2. Call module.function() for each wrapper function we want to add to this
   module
3. Call module.write() to write the C sources defining this module to
   standard output

The sources can then be built and executed normally, as any other python
extension module. The resulting functions are called as one would expect:

#+BEGIN_EXAMPLE
output = f_one_output(input0, input1, ...)
(output0, output1, ...) = f_multiple_outputs(input0, input1, ...)
#+END_EXAMPLE

depending on whether we declared a single output or multiple outputs (see
below). It is also possible to pre-allocate the output array(s), and call
the functions like this (see below):

#+BEGIN_EXAMPLE
output = np.zeros(...)
f_one_output(input0, input1, ..., out = output)

output0 = np.zeros(...)
output1 = np.zeros(...)
f_multiple_outputs(input0, input1, ..., out = (output0, output1))
#+END_EXAMPLE

Each wrapped function is broadcasting-aware. The normal numpy broadcasting
rules (as described in 'broadcast_define' and on the numpy website:
http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html) apply. In
summary:

- Dimensions are aligned at the end of the shape list, and must match the
  prototype

- Extra dimensions left over at the front must be consistent for all the
  input arguments, meaning:

  - All dimensions of length != 1 must match
  - Dimensions of length 1 match corresponding dimensions of any length in
    other arrays
  - Missing leading dimensions are implicitly set to length 1

- The output(s) have a shape where

  - The trailing dimensions match the prototype
  - The leading dimensions come from the extra dimensions in the inputs

When we create a wrapper function, we only define how to compute a single
broadcasted slice. If the generated function is called with
higher-dimensional inputs, this slice code will be called multiple times.
This broadcast loop is produced by the numpysane_pywrap generator
automatically.
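A pure-Python model of this shape logic may make it concrete. This is an illustrative sketch only (the function `broadcast_extra_shape` and its interface are hypothetical, not numpysane_pywrap API): it computes the extra leading dims from the input shapes and each input's prototype length, following the alignment rules above.

```python
def broadcast_extra_shape(shapes, proto_lens):
    # Strip each input's prototype (trailing) dims; what's left are the
    # extra leading dims that get broadcast together
    extras = [s[:len(s) - p] for s, p in zip(shapes, proto_lens)]
    n = max(len(e) for e in extras)
    out = [1] * n
    for e in extras:
        # Align at the END of the shape list; missing dims act as length-1
        for i, d in enumerate(reversed(e)):
            j = n - 1 - i
            if d != 1:
                if out[j] not in (1, d):
                    raise ValueError("mismatched broadcast dimensions")
                out[j] = d
    return tuple(out)

# Two inputs, each with a 1-dim prototype such as ('n',):
print(broadcast_extra_shape([(1, 5, 3), (2, 1, 3)], [1, 1]))  # -> (2, 5)
```

The generated C code performs the same alignment, then iterates over every index in this extra shape, invoking the slice code once per iteration.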
The generated code also

- parses the python arguments
- generates python return values
- validates the inputs (and any pre-allocated outputs) to make sure the
  given shapes and types all match the declared shapes and types. For
  instance, computing an inner product of a 5-vector and a 3-vector is
  illegal
- creates the output arrays as necessary

This code-generator module does NOT produce any code to implicitly make
copies of the input. If the inputs fail validation (unknown types given,
contiguity checks failed, etc) then an exception is raised. Copying the
input is potentially slow, so we require the user to do that, if necessary.

** Explicated example

In the synopsis we declared the wrapper module like this:

#+BEGIN_EXAMPLE
m = npsp.module( name      = "innerlib",
                 docstring = "An inner product module in C")
#+END_EXAMPLE

This produces a module named "innerlib". Note that the python importer will
look for this module in a file called "innerlib$EXT_SUFFIX" where
EXT_SUFFIX comes from the python configuration. This is normal behavior for
python extension modules.

A module can contain many wrapper functions. Each one is added by calling
'm.function()'. We did this:

#+BEGIN_EXAMPLE
m.function( "inner",
            "Inner product pywrapped with numpysane_pywrap",
            args_input       = ('a', 'b'),
            prototype_input  = (('n',), ('n',)),
            prototype_output = (),
            Ccode_slice_eval = \
                {np.float64:
                 r"""
                 double* out = (double*)data_slice__output;
                 const int N = dims_slice__a[0];
                 *out = 0.0;
                 for(int i=0; i<N; i++)
                   *out +=
                     *(const double*)(data_slice__a + i*strides_slice__a[0]) *
                     *(const double*)(data_slice__b + i*strides_slice__b[0]);
                 return true;
                 """})
#+END_EXAMPLE

Extra, non-broadcastable arguments may also be declared. The scaled inner
product revisited in the next section declares two of them:

#+BEGIN_EXAMPLE
extra_args = (("double",      "scale",        "1",    "d"),
              ("const char*", "scale_string", "NULL", "s"))
#+END_EXAMPLE

The wrapped function then accepts 'scale' and 'scale_string' as keyword
arguments:

#+BEGIN_EXAMPLE
>>> print(innerlib.inner( np.arange(4, dtype=float),
                          np.arange(8, dtype=float).reshape( 2,4),
                          scale_string = "1.0"))
[14. 38.]

>>> print(innerlib.inner( np.arange(4, dtype=float),
                          np.arange(8, dtype=float).reshape( 2,4),
                          scale        = 2.0,
                          scale_string = "10.0"))
[280. 760.]
#+END_EXAMPLE

** Precomputing a cookie outside the slice computation

Sometimes it is useful to generate some resource once, before any of the
broadcasted slices are evaluated. The slice evaluation code can then make
use of this resource.
Example: allocating memory, opening files. This is supported using a
'cookie'. We define a structure that contains data that will be available
to all the generated functions. This structure is initialized at the
beginning, used by the slice computation functions, and then cleaned up at
the end.

This is most easily described with an example. The scaled inner product
demonstrated immediately above has an inefficiency: we compute
'atof(scale_string)' once for every slice, even though the string does not
change. We should compute the atof() ONCE, and use the resulting value each
time. And we can:

#+BEGIN_EXAMPLE
m.function( "inner",
            "Inner product pywrapped with numpysane_pywrap",
            args_input       = ('a', 'b'),
            prototype_input  = (('n',), ('n',)),
            prototype_output = (),
            extra_args = (("double",      "scale",        "1",    "d"),
                          ("const char*", "scale_string", "NULL", "s")),
            Ccode_cookie_struct = r"""
              double scale; /* from BOTH scale arguments: "scale", "scale_string" */
            """,
            Ccode_validate = r"""
              if(scale_string == NULL)
              {
                  PyErr_Format(PyExc_RuntimeError,
                               "The 'scale_string' argument is required" );
                  return false;
              }
              cookie->scale = *scale * (scale_string ? atof(scale_string) : 1.0);
              return true;
            """,
            Ccode_slice_eval = \
                {np.float64:
                 r"""
                 double* out = (double*)data_slice__output;
                 const int N = dims_slice__a[0];
                 *out = 0.0;
                 for(int i=0; i<N; i++)
                   *out +=
                     *(const double*)(data_slice__a + i*strides_slice__a[0]) *
                     *(const double*)(data_slice__b + i*strides_slice__b[0]);
                 *out *= cookie->scale;
                 return true;
                 """},

            # Cleanup, such as free() or close() goes here
            Ccode_cookie_cleanup = '' )
#+END_EXAMPLE

We defined a cookie structure that contains one element: 'double scale'. We
compute the scale factor (from BOTH of the extra arguments) before any of
the slices are evaluated: in the validation function. Then we apply the
already-computed scale with each slice. Both the validation and slice
computation functions have the whole cookie structure available in
'*cookie'.
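The compute-once/use-many-times flow is easy to mimic in pure Python. The following sketch is only an analogue of the C cookie pattern (`make_scaled_inner` and `inner_slice` are hypothetical names, not numpysane_pywrap API): the "validation" stage runs once and caches the parsed scale, and the per-slice stage reuses it.

```python
def make_scaled_inner(scale, scale_string):
    # "Validation" stage: runs ONCE, like Ccode_validate filling the cookie
    if scale_string is None:
        raise ValueError("the 'scale_string' argument is required")
    cookie_scale = scale * float(scale_string)   # the "cookie": parsed once

    def inner_slice(a, b):
        # Per-slice stage: reuses the precomputed value, like Ccode_slice_eval
        return cookie_scale * sum(x * y for x, y in zip(a, b))

    return inner_slice

f = make_scaled_inner(2.0, "10.0")    # float("10.0") evaluated exactly once
print(f([0, 1, 2, 3], [0, 1, 2, 3]))  # -> 280.0
print(f([0, 1, 2, 3], [4, 5, 6, 7]))  # -> 760.0
```

These match the [280. 760.] output of the scaled inner product above; the string is parsed once no matter how many slices are evaluated.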
It is expected that the validation function will write something to the
cookie, and the slice functions will read it, but this is not enforced:
this structure is not const, and both functions can do whatever they like.

If the cookie initialization did something that must be cleaned up (like a
malloc() for instance), the cleanup code can be specified in the
'Ccode_cookie_cleanup' argument to function(). Note: this cleanup code is
ALWAYS executed, even if there were errors that raise an exception, EVEN if
we haven't initialized the cookie yet. When the cookie object is first
initialized, it is filled with 0, so the cleanup code can detect whether
the cookie has been initialized or not:

#+BEGIN_EXAMPLE
m.function( ...
            Ccode_cookie_struct = r"""
              ...
              bool initialized;
            """,
            Ccode_validate = r"""
              ...
              cookie->initialized = true;
              return true;
            """,
            Ccode_cookie_cleanup = r"""
              if(cookie->initialized) cleanup();
            """ )
#+END_EXAMPLE

** Examples

For some sample usage, see the wrapper-generator used in the test suite:
https://github.com/dkogan/numpysane/blob/master/test/genpywrap.py

** Planned functionality

Currently, each broadcasted slice is computed sequentially. But since the
slices are inherently independent, this is a natural place to add
parallelism. Implementing this with something like OpenMP should be
straightforward. I'll get around to doing this eventually, but in the
meantime, patches are welcome.

* COMPATIBILITY

Python 2 and Python 3 should both be supported. Please report a bug if
either one doesn't work.

* REPOSITORY

https://github.com/dkogan/numpysane

* AUTHOR

Dima Kogan

* LICENSE AND COPYRIGHT

Copyright 2016-2020 Dima Kogan.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License (any version) as published by the Free Software Foundation See https://www.gnu.org/licenses/lgpl.html numpysane-0.35/README.footer.org000066400000000000000000000007641407353053200164340ustar00rootroot00000000000000* COMPATIBILITY Python 2 and Python 3 should both be supported. Please report a bug if either one doesn't work. * REPOSITORY https://github.com/dkogan/numpysane * AUTHOR Dima Kogan * LICENSE AND COPYRIGHT Copyright 2016-2020 Dima Kogan. This program is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License (any version) as published by the Free Software Foundation See https://www.gnu.org/licenses/lgpl.html numpysane-0.35/README.org000066400000000000000000001473251407353053200151440ustar00rootroot00000000000000* TALK I just gave a talk about this at [[https://www.socallinuxexpo.org/scale/18x][SCaLE 18x]]. Here are the [[https://www.youtube.com/watch?v=YOOapXNtUWw][video of the talk]] and the [[https://github.com/dkogan/talk-numpysane-gnuplotlib/raw/master/numpysane-gnuplotlib.pdf]["slides"]]. * NAME numpysane: more-reasonable core functionality for numpy * SYNOPSIS #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(6).reshape(2,3) >>> b = a + 100 >>> row = a[0,:] + 1000 >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[100, 101, 102], [103, 104, 105]]) >>> row array([1000, 1001, 1002]) >>> nps.glue(a,b, axis=-1) array([[ 0, 1, 2, 100, 101, 102], [ 3, 4, 5, 103, 104, 105]]) >>> nps.glue(a,b,row, axis=-2) array([[ 0, 1, 2], [ 3, 4, 5], [ 100, 101, 102], [ 103, 104, 105], [1000, 1001, 1002]]) >>> nps.cat(a,b) array([[[ 0, 1, 2], [ 3, 4, 5]], [[100, 101, 102], [103, 104, 105]]]) >>> @nps.broadcast_define( (('n',), ('n',)) ) ... def inner_product(a, b): ... 
return a.dot(b)

>>> inner_product(a,b)
array([ 305, 1250])
#+END_EXAMPLE

* DESCRIPTION

Numpy is a very widely used toolkit for numerical computation in Python.
Despite its popularity, some of its core functionality is mysterious and/or
incomplete. The numpysane library seeks to fill those gaps by providing its
own replacement routines. Many of the replacement functions are direct
translations from PDL (http://pdl.perl.org), a numerical computation
library for perl.

The functions provided by this module fall into three broad categories:

- Broadcasting support
- Nicer array manipulation
- Basic linear algebra

** Broadcasting

Numpy has limited support for broadcasting
(http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html), a generic
way to vectorize functions. A broadcasting-aware function knows the
dimensionality of its inputs, and any extra dimensions in the input are
automatically used for vectorization.

*** Broadcasting rules

A basic example is an inner product: a function that takes in two
identically-sized 1-dimensional arrays (input prototype (('n',), ('n',)) )
and returns a scalar (output prototype () ). If one calls a
broadcasting-aware inner product with two arrays of shape (2,3,4) as input,
it would compute 6 inner products of length-4 each, and report the output
in an array of shape (2,3).

In short:

- The most significant dimension in a numpy array is the LAST one, so the
  prototype of an input argument must exactly match a given input's
  trailing shape. So a prototype shape of (a,b,c) accepts an argument shape
  of (......, a,b,c), with as many or as few leading dimensions as desired.

- The extra leading dimensions must be compatible across all the inputs.
  This means that each leading dimension must either

  - equal 1
  - be missing (thus assumed to equal 1)
  - equal some positive integer >1, consistent across all arguments

- The output is collected into an array that's sized as a superset of the
  above-prototype shape of each argument

More involved example: A function with input prototype

  ( (3,), ('n',3), ('n',), ('m',) )

given inputs of shape

#+BEGIN_EXAMPLE
(1,5,    3)
(2,1, 8, 3)
(        8)
(5,      9)
#+END_EXAMPLE

will return an output array of shape (2,5, ...), where ... is the shape of
each output slice. Note again that the prototype dictates the TRAILING
shape of the inputs.

*** What about the stock broadcasting support?

The numpy documentation dedicates a whole page explaining the broadcasting
rules, but only a small number of numpy functions provide any broadcasting
support. It's fairly inconsistent, and most functions have no broadcasting
support and no mention of it in the documentation. As a result, this is
not a prominent part of the numpy ecosystem and there's little user
awareness that it exists.

*** What this module provides

This module contains functionality to make any arbitrary function
broadcastable, in either C or Python. In both cases, the input and output
prototypes are declared, and these are used for shape-checking and
vectorization each time the function is called.

The functions can have either

- A single output, returned as a numpy array. The output specification in
  the prototype is a single shape tuple
- Multiple outputs, returned as a tuple of numpy arrays. The output
  specification in the prototype is a tuple of shape tuples

*** Broadcasting in python

This is invoked as a decorator, applied to any function. An example:

#+BEGIN_EXAMPLE
>>> import numpysane as nps

>>> @nps.broadcast_define( (('n',), ('n',)) )
... def inner_product(a, b):
...     return a.dot(b)
#+END_EXAMPLE

Here we have a simple inner product function to compute ONE inner product.
The 'broadcast_define' decorator adds broadcasting-awareness:
'inner_product()' expects two 1D vectors of length 'n' each (the same 'n'
for the two inputs), vectorizing extra dimensions as needed. The inputs are
shape-checked, and incompatible dimensions will trigger an exception.
Example:

#+BEGIN_EXAMPLE
>>> import numpy as np
>>> a = np.arange(6).reshape(2,3)
>>> b = a + 100

>>> a
array([[0, 1, 2],
       [3, 4, 5]])

>>> b
array([[100, 101, 102],
       [103, 104, 105]])

>>> inner_product(a,b)
array([ 305, 1250])
#+END_EXAMPLE

Another related function in this module is broadcast_generate(). It's
similar to broadcast_define(), but instead of adding broadcasting-awareness
to an existing function, it returns a generator that produces tuples from a
set of arguments according to a given prototype. Similarly,
broadcast_extra_dims() is available to report the outer shape of a
potential broadcasting operation.

Stock numpy has some rudimentary support for all this with its vectorize()
function, but it assumes only scalar inputs and outputs, which severely
limits its usefulness. See the docstrings for 'broadcast_define' and
'broadcast_generate' in the INTERFACE section below for usage details.

*** Broadcasting in C

The python broadcasting is useful, but it is a python loop, so the loop
itself is computationally expensive if we have many iterations. If the
function being wrapped is available in C, we can apply broadcasting
awareness in C, which makes a much faster loop.

The "numpysane_pywrap" module generates code to wrap arbitrary C code in a
broadcasting-aware wrapper callable from python. This is an analogue of
PDL::PP (http://pdl.perl.org/PDLdocs/PP.html). This generated code is
compiled and linked into a python extension module, as usual.
This functionality is documented separately:
https://github.com/dkogan/numpysane/blob/master/README-pywrap.org

After I wrote this, I realized there is some support for this in stock
numpy: https://docs.scipy.org/doc/numpy-1.13.0/reference/c-api.ufunc.html
Note: I have not tried using these APIs.

** Nicer array manipulation

Numpy functions that move dimensions around and concatenate matrices are
unintuitive. For instance, a simple concatenation of a row-vector or a
column-vector to a matrix requires arcane knowledge to accomplish reliably.
This module provides new functions that can be used for these basic
operations. These new functions do have well-defined and sensible behavior,
and they largely come from the interfaces in PDL (http://pdl.perl.org).

These all respect the core rules of numpy broadcasting:

- LEADING length-1 dimensions don't affect the meaning of an array, so the
  routines handle missing or extra length-1 dimensions at the front

- The inner-most dimensions of an array are the TRAILING ones, so whenever
  an axis specification is used, it is strongly recommended (sometimes
  required) to count the axes from the back by passing in axis<0

A high level description of the functionality is given here, and each
function is described in detail in the INTERFACE section below. In the
following examples, I use a function "arr" that returns a numpy array with
given dimensions:

#+BEGIN_EXAMPLE
>>> def arr(*shape):
...     product = reduce( lambda x,y: x*y, shape)
...     return numpy.arange(product).reshape(*shape)

>>> arr(1,2,3)
array([[[0, 1, 2],
        [3, 4, 5]]])

>>> arr(1,2,3).shape
(1, 2, 3)
#+END_EXAMPLE

*** Concatenation

This module provides two functions to do this

**** glue

Concatenates some number of arrays along a given axis ('axis' must be given
in a kwarg). Implicit length-1 dimensions are added at the start as needed.
Dimensions other than the gluing axis must match exactly.
Basic usage: #+BEGIN_EXAMPLE >>> row_vector = arr( 3,) >>> col_vector = arr(5,1,) >>> matrix = arr(5,3,) >>> numpysane.glue(matrix, row_vector, axis = -2).shape (6,3) >>> numpysane.glue(matrix, col_vector, axis = -1).shape (5,4) #+END_EXAMPLE **** cat Concatenate some number of arrays along a new leading axis. Implicit length-1 dimensions are added, and the logical shapes of the inputs must match. This function is a logical inverse of numpy array iteration: iteration splits an array over its leading dimension, while cat joins a number of arrays via a new leading dimension. Basic usage: #+BEGIN_EXAMPLE >>> numpysane.cat(arr(5,), arr(5,)).shape (2,5) >>> numpysane.cat(arr(5,), arr(1,1,5,)).shape (2,1,1,5) #+END_EXAMPLE *** Manipulation of dimensions Several functions are available, all being fairly direct ports of their PDL (http://pdl.perl.org) equivalents **** clump Reshapes the array by grouping together 'n' dimensions, where 'n' is given in a kwarg. If 'n' > 0, then n leading dimensions are clumped; if 'n' < 0, then -n trailing dimensions are clumped. Basic usage: #+BEGIN_EXAMPLE >>> numpysane.clump( arr(2,3,4), n = -2).shape (2, 12) >>> numpysane.clump( arr(2,3,4), n = 2).shape (6, 4) #+END_EXAMPLE **** atleast_dims Adds length-1 dimensions at the front of an array so that all the given dimensions are in-bounds. Any axis<0 may expand the shape. Adding new leading dimensions (axis>=0) is never useful, since numpy broadcasts from the end, so atleast_dims() treats axis>=0 as a check only: the requested axis MUST already be in-bounds, or an exception is thrown. This function always preserves the meaning of all the axes in the array: axis=-1 is the same axis before and after the call.
Basic usage: #+BEGIN_EXAMPLE >>> numpysane.atleast_dims(arr(2,3), -1).shape (2, 3) >>> numpysane.atleast_dims(arr(2,3), -2).shape (2, 3) >>> numpysane.atleast_dims(arr(2,3), -3).shape (1, 2, 3) >>> numpysane.atleast_dims(arr(2,3), 0).shape (2, 3) >>> numpysane.atleast_dims(arr(2,3), 1).shape (2, 3) >>> numpysane.atleast_dims(arr(2,3), 2).shape [exception] #+END_EXAMPLE **** mv Moves a dimension from one position to another. Basic usage to move the last dimension (-1) to the front (0) #+BEGIN_EXAMPLE >>> numpysane.mv(arr(2,3,4), -1, 0).shape (4, 2, 3) #+END_EXAMPLE Or to move a dimension -5 (added implicitly) to the end #+BEGIN_EXAMPLE >>> numpysane.mv(arr(2,3,4), -5, -1).shape (1, 2, 3, 4, 1) #+END_EXAMPLE **** xchg Exchanges the positions of two dimensions. Basic usage to move the last dimension (-1) to the front (0), and the front to the back. #+BEGIN_EXAMPLE >>> numpysane.xchg(arr(2,3,4), -1, 0).shape (4, 3, 2) #+END_EXAMPLE Or to swap a dimension -5 (added implicitly) with dimension -2 #+BEGIN_EXAMPLE >>> numpysane.xchg(arr(2,3,4), -5, -2).shape (3, 1, 2, 1, 4) #+END_EXAMPLE **** transpose Reverses the order of the two trailing dimensions in an array. The whole array is seen as being an array of 2D matrices, each matrix living in the 2 most significant dimensions, which implies this definition. Basic usage: #+BEGIN_EXAMPLE >>> numpysane.transpose( arr(2,3) ).shape (3,2) >>> numpysane.transpose( arr(5,2,3) ).shape (5,3,2) >>> numpysane.transpose( arr(3,) ).shape (3,1) #+END_EXAMPLE Note that in the second example we had 5 matrices, and we transposed each one. And in the last example we turned a row vector into a column vector by adding an implicit leading length-1 dimension before transposing. **** dummy Adds length-1 dimensions at the given position or positions. Basic usage: #+BEGIN_EXAMPLE >>> numpysane.dummy(arr(2,3,4), -1).shape (2, 3, 4, 1) #+END_EXAMPLE **** reorder Reorders the dimensions in an array using the given order.
Basic usage: #+BEGIN_EXAMPLE >>> numpysane.reorder( arr(2,3,4), -1, -2, -3 ).shape (4, 3, 2) >>> numpysane.reorder( arr(2,3,4), 0, -1, 1 ).shape (2, 4, 3) >>> numpysane.reorder( arr(2,3,4), -2 , -1, 0 ).shape (3, 4, 2) >>> numpysane.reorder( arr(2,3,4), -4 , -2, -5, -1, 0 ).shape (1, 3, 1, 4, 2) #+END_EXAMPLE ** Basic linear algebra *** inner Broadcast-aware inner product. Identical to numpysane.dot(). Basic usage to compute 4 inner products of length 3 each: #+BEGIN_EXAMPLE >>> numpysane.inner(arr( 3,), arr(4,3,)).shape (4,) >>> numpysane.inner(arr( 3,), arr(4,3,)) array([5, 14, 23, 32]) #+END_EXAMPLE *** dot Broadcast-aware non-conjugating dot product. Identical to numpysane.inner(). *** vdot Broadcast-aware conjugating dot product. Same as numpysane.dot(), except this one conjugates complex input, which numpysane.dot() does not. *** outer Broadcast-aware outer product. Basic usage to compute 4 outer products of length 3 each: #+BEGIN_EXAMPLE >>> numpysane.outer(arr( 3,), arr(4,3,)).shape (4, 3, 3) #+END_EXAMPLE *** norm2 Broadcast-aware 2-norm. numpysane.norm2(x) is identical to numpysane.inner(x,x): #+BEGIN_EXAMPLE >>> numpysane.norm2(arr(4,3)) array([5, 50, 149, 302]) #+END_EXAMPLE *** mag Broadcast-aware vector magnitude. mag(x) is functionally identical to sqrt(numpysane.norm2(x)) and sqrt(numpysane.inner(x,x)) #+BEGIN_EXAMPLE >>> numpysane.mag(arr(4,3)) array([ 2.23606798, 7.07106781, 12.20655562, 17.3781472 ]) #+END_EXAMPLE *** trace Broadcast-aware matrix trace. #+BEGIN_EXAMPLE >>> numpysane.trace(arr(4,3,3)) array([12., 39., 66., 93.]) #+END_EXAMPLE *** matmult Broadcast-aware matrix multiplication. This accepts an arbitrary number of inputs, and adds leading length-1 dimensions as needed.
Multiplying a row-vector by a matrix #+BEGIN_EXAMPLE >>> numpysane.matmult( arr(3,), arr(3,2) ).shape (2,) #+END_EXAMPLE Multiplying a row-vector by 5 different matrices: #+BEGIN_EXAMPLE >>> numpysane.matmult( arr(3,), arr(5,3,2) ).shape (5, 2) #+END_EXAMPLE Multiplying a matrix by a col-vector: #+BEGIN_EXAMPLE >>> numpysane.matmult( arr(3,2), arr(2,1) ).shape (3, 1) #+END_EXAMPLE Multiplying a row-vector by a matrix by a col-vector: #+BEGIN_EXAMPLE >>> numpysane.matmult( arr(3,), arr(3,2), arr(2,1) ).shape (1,) #+END_EXAMPLE ** What's wrong with existing numpy functions? Why did I go through all the trouble to reimplement all this? Doesn't numpy already do all these things? Yes, it does. But in a very nonintuitive way. *** Concatenation **** hstack() hstack() performs a "horizontal" concatenation. When numpy prints an array, this is the last dimension (the most significant dimensions in numpy are at the end). So one would expect that this function concatenates arrays along this last dimension. In the special case of 1D and 2D arrays, one would be right: #+BEGIN_EXAMPLE >>> numpy.hstack( (arr(3), arr(3))).shape (6,) >>> numpy.hstack( (arr(2,3), arr(2,3))).shape (2, 6) #+END_EXAMPLE but in any other case, one would be wrong: #+BEGIN_EXAMPLE >>> numpy.hstack( (arr(1,2,3), arr(1,2,3))).shape (1, 4, 3) <------ I expect (1, 2, 6) >>> numpy.hstack( (arr(1,2,3), arr(1,2,4))).shape [exception] <------ I expect (1, 2, 7) >>> numpy.hstack( (arr(3), arr(1,3))).shape [exception] <------ I expect (1, 6) >>> numpy.hstack( (arr(1,3), arr(3))).shape [exception] <------ I expect (1, 6) #+END_EXAMPLE The above should all succeed, and should produce the shapes as indicated. Cases such as "numpy.hstack( (arr(3), arr(1,3)))" are maybe up for debate, but broadcasting rules allow adding as many extra length-1 dimensions as we want without changing the meaning of the object, so I claim this should work. 
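This claim follows directly from the broadcasting rules themselves; a quick stock-numpy demonstration that leading length-1 dimensions don't change the meaning of an array:

```python
import numpy as np

a = np.arange(3)              # shape (3,)
b = a.reshape(1,1,3)          # same data, with extra leading length-1 dims
c = np.arange(12).reshape(4,3)

# Broadcasting aligns dimensions at the end, so a and b behave identically:
print( (c + a).shape )        # (4, 3)
print( (c + b).shape )        # (1, 4, 3): same values, one extra length-1 dim
print( np.array_equal(c + a, (c + b)[0]) )   # True
```

Since a and b are interchangeable as broadcasting operands, a concatenation routine that refuses to pad length-1 dimensions at the front is being needlessly strict.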
Either way, if you print out the operands for any of the above, you too would expect a "horizontal" stack() to work as stated above. It turns out that normally hstack() concatenates along axis=1, unless the first argument only has one dimension, in which case axis=0 is used. In a system where the most significant dimension is the last one, this is only correct if everyone has only 2D arrays. The correct way to do this is to concatenate along axis=-1. It works for n-dimensional objects, and doesn't require the special case logic for 1-dimensional objects. **** vstack() Similarly, vstack() performs a "vertical" concatenation. When numpy prints an array, this is the second-to-last dimension (remember, the most significant dimensions in numpy are at the end). So one would expect that this function concatenates arrays along this second-to-last dimension. Again, in the special case of 1D and 2D arrays, one would be right: #+BEGIN_EXAMPLE >>> numpy.vstack( (arr(2,3), arr(2,3))).shape (4, 3) >>> numpy.vstack( (arr(3), arr(3))).shape (2, 3) >>> numpy.vstack( (arr(1,3), arr(3))).shape (2, 3) >>> numpy.vstack( (arr(3), arr(1,3))).shape (2, 3) >>> numpy.vstack( (arr(2,3), arr(3))).shape (3, 3) #+END_EXAMPLE Note that this function appears to tolerate some amount of shape mismatches. It does it in a form one would expect, but given the state of the rest of this system, I found it surprising. For instance "numpy.hstack( (arr(1,3), arr(3)))" fails, so one would think that "numpy.vstack( (arr(1,3), arr(3)))" would fail too. And once again, adding more dimensions make it confused, for the same reason: #+BEGIN_EXAMPLE >>> numpy.vstack( (arr(1,2,3), arr(2,3))).shape [exception] <------ I expect (1, 4, 3) >>> numpy.vstack( (arr(1,2,3), arr(1,2,3))).shape (2, 2, 3) <------ I expect (1, 4, 3) #+END_EXAMPLE Similarly to hstack(), vstack() concatenates along axis=0, which is "vertical" only for 2D arrays, but not for any others.
And similarly to hstack(), the 1D case has special-cased logic to make it work properly. The correct way to do this is to concatenate along axis=-2. It works for n-dimensional objects, and doesn't require the special case for 1-dimensional objects. **** dstack() I'll skip the detailed description, since this is similar to hstack() and vstack(). The intent was to concatenate across axis=-3, but the implementation takes axis=2 instead. This is wrong, as before. And I find it strange that these 3 functions even exist, since they are all special-cases: the concatenation axis should be an argument, and at most, the edge special case (hstack()) should exist. This brings us to the next function **** concatenate() This is a more general function, and unlike hstack(), vstack() and dstack(), it takes as input a list of arrays AND the concatenation dimension. It accepts negative concatenation dimensions to allow us to count from the end, so things should work better. And in many cases that failed previously, they do: #+BEGIN_EXAMPLE >>> numpy.concatenate( (arr(1,2,3), arr(1,2,3)), axis=-1).shape (1, 2, 6) >>> numpy.concatenate( (arr(1,2,3), arr(1,2,4)), axis=-1).shape (1, 2, 7) >>> numpy.concatenate( (arr(1,2,3), arr(1,2,3)), axis=-2).shape (1, 4, 3) #+END_EXAMPLE But many things still don't work as I would expect: #+BEGIN_EXAMPLE >>> numpy.concatenate( (arr(1,3), arr(3)), axis=-1).shape [exception] <------ I expect (1, 6) >>> numpy.concatenate( (arr(3), arr(1,3)), axis=-1).shape [exception] <------ I expect (1, 6) >>> numpy.concatenate( (arr(1,3), arr(3)), axis=-2).shape [exception] <------ I expect (3, 3) >>> numpy.concatenate( (arr(3), arr(1,3)), axis=-2).shape [exception] <------ I expect (2, 3) >>> numpy.concatenate( (arr(2,3), arr(2,3)), axis=-3).shape [exception] <------ I expect (2, 2, 3) #+END_EXAMPLE This function works as expected only if - All inputs have the same number of dimensions - All inputs have a matching shape, except for the dimension along which we're
concatenating - All inputs HAVE the dimension along which we're concatenating A common use case that violates these conditions: I have an object that contains N 3D vectors, and I want to add another 3D vector to it. This is essentially the first failing example above. **** stack() The name makes it sound exactly like concatenate(), and it takes the same arguments, but it is very different. stack() requires that all inputs have EXACTLY the same shape. It then concatenates all the inputs along a new dimension, and places that dimension in the location given by the 'axis' input. If this is the exact type of concatenation you want, this function works fine. But it's one of many things a user may want to do. **** Thoughts on concatenation This module introduces numpysane.glue() and numpysane.cat() to replace all the above functions. These do not refer to anything being "horizontal" or "vertical", nor do they talk about "rows" or "columns": these concepts simply don't apply in a generic N-dimensional system. These functions are very explicit about the dimensionality of the inputs/outputs, and fit well into a broadcasting-aware system. Since these functions assume that broadcasting is an important concept in the system, the given axis indices should be counted from the most significant dimension: the last dimension in numpy. This means that where an axis index is specified, negative indices are encouraged. glue() forbids axis>=0 outright. ***** Example for further justification An array containing N 3D vectors would have shape (N,3). Another array containing a single 3D vector would have shape (3,). Counting the dimensions from the end, each vector is indexed in dimension -1. However, counting from the front, the vector is indexed in dimension 0 or 1, depending on which of the two arrays we're looking at. If we want to add the single vector to the array containing the N vectors, and we mistakenly try to concatenate along the first dimension, it would fail if N != 3. 
But if we're unlucky, and N=3, then we'd get a nonsensical output array of shape (3,4). Why would an array of N 3D vectors have shape (N,3) and not (3,N)? Because if we apply python iteration to it, we'd expect to get N iterates of arrays with shape (3,) each, and numpy iterates from the first dimension: #+BEGIN_EXAMPLE >>> a = numpy.arange(2*3).reshape(2,3) >>> a array([[0, 1, 2], [3, 4, 5]]) >>> [x for x in a] [array([0, 1, 2]), array([3, 4, 5])] #+END_EXAMPLE *** Manipulation of dimensions **** atleast_xd() Numpy has 3 special-case functions atleast_1d(), atleast_2d() and atleast_3d(). For 4d and higher, you need to do something else. These do surprising things: #+BEGIN_EXAMPLE >>> numpy.atleast_3d(arr(3)).shape (1, 3, 1) #+END_EXAMPLE **** transpose() Given a matrix (a 2D array), numpy.transpose() swaps the two dimensions, as expected. Given anything else, it does not do what is expected: #+BEGIN_EXAMPLE >>> numpy.transpose(arr(3, )).shape (3,) >>> numpy.transpose(arr(3,4, )).shape (4, 3) >>> numpy.transpose(arr(3,4,5,6,)).shape (6, 5, 4, 3) #+END_EXAMPLE I.e. numpy.transpose() reverses the order of ALL dimensions in the array. So if we have N 2D matrices in a single array, numpy.transpose() doesn't transpose each matrix. *** Basic linear algebra **** inner() and dot() numpy.inner() and numpy.dot() are strange. In a real-valued n-dimensional Euclidean space, a "dot product" is just another name for an "inner product". Numpy disagrees. It looks like numpy.dot() is matrix multiplication, with some wonky behaviors when given higher-dimension objects, and with some special-case behaviors for 1-dimensional and 0-dimensional objects: #+BEGIN_EXAMPLE >>> numpy.dot( arr(4,5,2,3), arr(3,5)).shape (4, 5, 2, 5) <--- expected result for a broadcasted matrix multiplication >>> numpy.dot( arr(3,5), arr(4,5,2,3)).shape [exception] <--- numpy.dot() is not commutative. 
Expected for matrix multiplication, but not for a dot product >>> numpy.dot( arr(4,5,2,3), arr(1,3,5)).shape (4, 5, 2, 1, 5) <--- don't know where this came from at all >>> numpy.dot( arr(4,5,2,3), arr(3)).shape (4, 5, 2) <--- 1D special case. This is a dot product. >>> numpy.dot( arr(4,5,2,3), 3).shape (4, 5, 2, 3) <--- 0D special case. This is a scaling. #+END_EXAMPLE It looks like numpy.inner() is some sort of quasi-broadcastable inner product, also with some funny dimensioning rules. In many cases it looks like numpy.dot(a,b) is the same as numpy.inner(a, transpose(b)) where transpose() swaps the last two dimensions: #+BEGIN_EXAMPLE >>> numpy.inner( arr(4,5,2,3), arr(5,3)).shape (4, 5, 2, 5) <--- All the length-3 inner products collected into a shape with not-quite-broadcasting rules >>> numpy.inner( arr(5,3), arr(4,5,2,3)).shape (5, 4, 5, 2) <--- numpy.inner() is not commutative. Unexpected for an inner product >>> numpy.inner( arr(4,5,2,3), arr(1,5,3)).shape (4, 5, 2, 1, 5) <--- No idea >>> numpy.inner( arr(4,5,2,3), arr(3)).shape (4, 5, 2) <--- 1D special case. This is a dot product. >>> numpy.inner( arr(4,5,2,3), 3).shape (4, 5, 2, 3) <--- 0D special case. This is a scaling. #+END_EXAMPLE * INTERFACE ** broadcast_define() Vectorizes an arbitrary function, expecting input as in the given prototype. SYNOPSIS #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> @nps.broadcast_define( (('n',), ('n',)) ) ... def inner_product(a, b): ... return a.dot(b) >>> a = np.arange(6).reshape(2,3) >>> b = a + 100 >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[100, 101, 102], [103, 104, 105]]) >>> inner_product(a,b) array([ 305, 1250]) #+END_EXAMPLE The prototype defines the dimensionality of the inputs. In the inner product example above, the input is two 1D n-dimensional vectors. In particular, the 'n' is the same for the two inputs. This function is intended to be used as a decorator, applied to a function defining the operation to be vectorized. 
Each element in the prototype list refers to each input, in order. In turn, each such element is a list that describes the shape of that input. Each of these shape descriptors can be any of - a positive integer, indicating an input dimension of exactly that length - a string, indicating an arbitrary, but internally consistent dimension The normal numpy broadcasting rules (as described elsewhere) apply. In summary: - Dimensions are aligned at the end of the shape list, and must match the prototype - Extra dimensions left over at the front must be consistent for all the input arguments, meaning: - All dimensions of length != 1 must match - Dimensions of length 1 match corresponding dimensions of any length in other arrays - Missing leading dimensions are implicitly set to length 1 - The output(s) have a shape where - The trailing dimensions are whatever the function being broadcasted returns - The leading dimensions come from the extra dimensions in the inputs Calling a function wrapped with broadcast_define() with extra arguments (either positional or keyword) passes these verbatim to the inner function. Only the arguments declared in the prototype are broadcast. Scalars are represented as 0-dimensional numpy arrays: arrays with shape (), and these broadcast as one would expect: #+BEGIN_EXAMPLE >>> @nps.broadcast_define( (('n',), ('n',), ())) ... def scaled_inner_product(a, b, scale): ... return a.dot(b)*scale >>> a = np.arange(6).reshape(2,3) >>> b = a + 100 >>> scale = np.array((10,100)) >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[100, 101, 102], [103, 104, 105]]) >>> scale array([ 10, 100]) >>> scaled_inner_product(a,b,scale) array([[ 3050], [125000]]) #+END_EXAMPLE Let's look at a more involved example. Let's say we have a function that takes a set of points in R^2 and a single center point in R^2, and finds a best-fit least-squares line that passes through the given center point.
Let it return a 3D vector containing the slope, y-intercept and the RMS residual of the fit. This broadcasting-enabled function can be defined like this: #+BEGIN_EXAMPLE import numpy as np import numpysane as nps @nps.broadcast_define( (('n',2), (2,)) ) def fit(xy, c): # line-through-origin-model: y = m*x # E = sum( (m*x - y)**2 ) # dE/dm = 2*sum( (m*x-y)*x ) = 0 # ----> m = sum(x*y)/sum(x*x) x,y = (xy - c).transpose() m = np.sum(x*y) / np.sum(x*x) err = m*x - y err **= 2 rms = np.sqrt(err.mean()) # I return m,b because I need to translate the line back b = c[1] - m*c[0] return np.array((m,b,rms)) #+END_EXAMPLE And I can use broadcasting to compute a number of these fits at once. Let's say I want to compute 4 different fits of 5 points each. I can do this: #+BEGIN_EXAMPLE n = 5 m = 4 c = np.array((20,300)) xy = np.arange(m*n*2, dtype=np.float64).reshape(m,n,2) + c xy += np.random.rand(*xy.shape)*5 res = fit( xy, c ) mb = res[..., 0:2] rms = res[..., 2] print "RMS residuals: {}".format(rms) #+END_EXAMPLE Here I had 4 different sets of points, but a single center point c. If I wanted 4 different center points, I could pass c as an array of shape (4,2). I can use broadcasting to plot all the results (the points and the fitted lines): #+BEGIN_EXAMPLE import gnuplotlib as gp gp.plot( *nps.mv(xy,-1,0), _with='linespoints', equation=['{}*x + {}'.format(mb_single[0], mb_single[1]) for mb_single in mb], unset='grid', square=1) #+END_EXAMPLE The examples above all create a separate output array for each broadcasted slice, and copy the contents from each such slice into the larger output array that contains all the results. This is inefficient, and it is possible to pre-allocate an array to forgo these extra allocation and copy operations. There are several settings to control this. If the function being broadcasted can write its output to a given array instead of creating a new one, most of the inefficiency goes away. 
broadcast_define() supports the case where this function takes this array in a kwarg: the name of this kwarg can be given to broadcast_define() like so: #+BEGIN_EXAMPLE @nps.broadcast_define( ....., out_kwarg = "out" ) def func( ....., out): ..... out[:] = result #+END_EXAMPLE When used this way, the return value of the broadcasted function is ignored. In order for broadcast_define() to pass such an output array to the inner function, this output array must be available, which means that it must be given to us somehow, or we must create it. The most efficient way to make a broadcasted call is to create the full output array beforehand, and to pass that to the broadcasted function. In this case, nothing extra will be allocated, and no unnecessary copies will be made. This can be done like this: #+BEGIN_EXAMPLE @nps.broadcast_define( (('n',), ('n',)), ....., out_kwarg = "out" ) def inner_product(a, b, out): ..... out.setfield(a.dot(b), out.dtype) out = np.empty((2,4), float) inner_product( np.arange(3), np.arange(2*4*3).reshape(2,4,3), out=out) #+END_EXAMPLE In this example, the caller knows that it's calling an inner_product function, and that the shape of each output slice would be (). The caller also knows the input dimensions and that we have an extra broadcasting dimension (2,4), so the output array will have shape (2,4) + () = (2,4). With this knowledge, the caller preallocates the array, and passes it to the broadcasted function call. Furthermore, in this case the inner function will be called with an output array EVERY time, and this is the only mode the inner function needs to support. If the caller doesn't know (or doesn't want to pre-compute) the shape of the output, it can let the broadcasting machinery create this array for them.
In order for this to be possible, the shape of the output should be pre-declared, and the dtype of the output should be known: #+BEGIN_EXAMPLE @nps.broadcast_define( (('n',), ('n',)), (), out_kwarg = "out" ) def inner_product(a, b, out): ..... out.setfield(a.dot(b), out.dtype) out = inner_product( np.arange(3), np.arange(2*4*3).reshape(2,4,3), dtype=int) #+END_EXAMPLE Note that the caller didn't need to specify the prototype of the output or the extra broadcasting dimensions (output prototype is in the broadcast_define() call, but not the inner_product() call). Specifying the dtype here is optional: it defaults to float if omitted. If we want the output array to be pre-allocated, the output prototype (it is () in this example) is required: we must know the shape of the output array in order to create it. Without a declared output prototype, we can still make mostly- efficient calls: the broadcasting mechanism can call the inner function for the first slice as we showed earlier, by creating a new array for the slice. This new array required an extra allocation and copy, but it contains the required shape information. This information will be used to allocate the output, and the subsequent calls to the inner function will be efficient: #+BEGIN_EXAMPLE @nps.broadcast_define( (('n',), ('n',)), out_kwarg = "out" ) def inner_product(a, b, out=None): ..... if out is None: return a.dot(b) out.setfield(a.dot(b), out.dtype) return out out = inner_product( np.arange(3), np.arange(2*4*3).reshape(2,4,3)) #+END_EXAMPLE Here we were slightly inefficient, but the ONLY required extra specification was out_kwarg: that's all you need. Also it is important to note that in this case the inner function is called both with passing it an output array to fill in, and with asking it to create a new one (by passing out=None to the inner function). This inner function then must support both modes of operation.
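The mechanism just described (the first slice allocates and reveals the output shape; subsequent slices write in-place) can be sketched in stock numpy, without numpysane. This is a simplified illustration, not the actual broadcast_define() implementation: mult and the explicit driver loop are hypothetical stand-ins for the inner function and the broadcasting machinery.

```python
import numpy as np

def mult(a, b, out=None):
    # Dual-mode inner function: allocates a new array if no output array is
    # given, and fills in the given array otherwise
    if out is None:
        return a * b
    out[:] = a * b
    return out

a = np.arange(3)
B = np.arange(2*4*3).reshape(2,4,3)

# First slice: no output array exists yet, so the inner function allocates
# one. Its shape and dtype tell us how to allocate the full output
first = mult(a, B[0,0])
out   = np.empty((2,4) + first.shape, dtype=first.dtype)
out[0,0] = first

# Remaining slices write directly into views of the preallocated output
for i,j in np.ndindex(2,4):
    if (i,j) != (0,0):
        mult(a, B[i,j], out=out[i,j])

print( np.array_equal(out, a * B) )   # True
```

Only the first slice pays for an extra allocation and copy; every other slice writes into the preallocated array.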
If the inner function does not support filling in an output array, none of these efficiency improvements are possible. It is possible for a function to return more than one output, and this is supported by broadcast_define(). This case works exactly like the one-output case, except the output prototype is REQUIRED, and this output prototype contains multiple tuples, one for each output. The inner function must return the outputs in a tuple, and each individual output will be broadcasted as expected. broadcast_define() is analogous to thread_define() in PDL. ** broadcast_generate() A generator that produces broadcasted slices SYNOPSIS #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(6).reshape(2,3) >>> b = a + 100 >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[100, 101, 102], [103, 104, 105]]) >>> for s in nps.broadcast_generate( (('n',), ('n',)), (a,b)): ... print "slice: {}".format(s) slice: (array([0, 1, 2]), array([100, 101, 102])) slice: (array([3, 4, 5]), array([103, 104, 105])) #+END_EXAMPLE The broadcasting operation of numpysane is described in detail in the numpysane.broadcast_define() docstring and in the main README of numpysane. This function can be used as a Python generator to produce each broadcasted slice one by one Since Python generators are inherently 1-dimensional, this function effectively flattens the broadcasted results. If the correct output shape needs to be reconstituted, the leading shape is available by calling numpysane.broadcast_extra_dims() with the same arguments as this function. ** broadcast_extra_dims() Report the extra leading dimensions a broadcasted call would produce SYNOPSIS #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(6). 
reshape( 2,3) >>> b = np.arange(15).reshape(5,1,3) >>> print(nps.broadcast_extra_dims((('n',), ('n',)), (a,b))) [5,2] #+END_EXAMPLE The broadcasting operation of numpysane is described in detail in the numpysane.broadcast_define() docstring and in the main README of numpysane. This function applies the broadcasting rules to report the leading dimensions of a broadcasted result if a broadcasted function was called with the given arguments. This is most useful to reconstitute the desired shape from flattened output produced by numpysane.broadcast_generate() ** glue() Concatenates a given list of arrays along the given 'axis' keyword argument. SYNOPSIS #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(6).reshape(2,3) >>> b = a + 100 >>> row = a[0,:] + 1000 >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[100, 101, 102], [103, 104, 105]]) >>> row array([1000, 1001, 1002]) >>> nps.glue(a,b, axis=-1) array([[ 0, 1, 2, 100, 101, 102], [ 3, 4, 5, 103, 104, 105]]) # empty arrays ignored when glueing. Useful for initializing an accumulation >>> nps.glue(a,b, np.array(()), axis=-1) array([[ 0, 1, 2, 100, 101, 102], [ 3, 4, 5, 103, 104, 105]]) >>> nps.glue(a,b,row, axis=-2) array([[ 0, 1, 2], [ 3, 4, 5], [ 100, 101, 102], [ 103, 104, 105], [1000, 1001, 1002]]) >>> nps.glue(a,b, axis=-3) array([[[ 0, 1, 2], [ 3, 4, 5]], [[100, 101, 102], [103, 104, 105]]]) #+END_EXAMPLE The 'axis' must be given in a keyword argument. In order to count dimensions from the inner-most outwards, this function accepts only negative axis arguments. This is because numpy broadcasts from the last dimension, and the last dimension is the inner-most in the (usual) internal storage scheme. Allowing glue() to look at dimensions at the start would allow it to unalign the broadcasting dimensions, which is never what you want. To glue along the last dimension, pass axis=-1; to glue along the second-to-last dimension, pass axis=-2, and so on. 
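The core of this negative-axis behavior can be emulated in stock numpy by padding implicit leading length-1 dimensions before concatenating. A minimal sketch (glue_neg is a hypothetical helper; unlike nps.glue() it does no shape validation beyond what np.concatenate itself enforces, and it doesn't handle empty arrays):

```python
import numpy as np

def glue_neg(*arrays, axis):
    # Negative axes keep the broadcasting dimensions aligned at the end
    assert axis < 0
    # Pad length-1 dimensions at the front so that every input has enough
    # dimensions for the requested axis, and the same number of dimensions
    ndim   = max(-axis, *(a.ndim for a in arrays))
    padded = [a.reshape( (1,)*(ndim - a.ndim) + a.shape ) for a in arrays]
    return np.concatenate(padded, axis=axis)

a   = np.arange(6).reshape(2,3)
row = np.arange(3)
print( glue_neg(a, row, axis=-2).shape )   # (3, 3)
print( glue_neg(a, a,   axis=-3).shape )   # (2, 2, 3)
```

The padding is what lets a shape-(3,) row glue onto a shape-(2,3) array along axis=-2: the row is first viewed as shape (1,3), which is the same object under the broadcasting rules.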
Unlike in PDL, this function refuses to create duplicated data to make the shapes fit. In my experience, this isn't what you want, and can create bugs. For instance, PDL does this: #+BEGIN_EXAMPLE pdl> p sequence(3,2) [ [0 1 2] [3 4 5] ] pdl> p sequence(3) [0 1 2] pdl> p PDL::glue( 0, sequence(3,2), sequence(3) ) [ [0 1 2 0 1 2] <--- Note the duplicated "0,1,2" [3 4 5 0 1 2] ] #+END_EXAMPLE while numpysane.glue() does this: #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(6).reshape(2,3) >>> b = a[0:1,:] >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[0, 1, 2]]) >>> nps.glue(a,b,axis=-1) [exception] #+END_EXAMPLE Finally, this function adds as many length-1 dimensions at the front as required. Note that this does not create new data, just new degenerate dimensions. Example: #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(6).reshape(2,3) >>> b = a + 100 >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[100, 101, 102], [103, 104, 105]]) >>> res = nps.glue(a,b, axis=-5) >>> res array([[[[[ 0, 1, 2], [ 3, 4, 5]]]], [[[[100, 101, 102], [103, 104, 105]]]]]) >>> res.shape (2, 1, 1, 2, 3) #+END_EXAMPLE In numpysane older than 0.10 the semantics were slightly different: the axis kwarg was optional, and glue(*args) would glue along a new leading dimension, and thus would be equivalent to cat(*args). This resulted in very confusing error messages if the user accidentally omitted the kwarg. To request the legacy behavior, do #+BEGIN_EXAMPLE nps.glue.legacy_version = '0.9' #+END_EXAMPLE ** cat() Concatenates a given list of arrays along a new first (outer) dimension. 
SYNOPSIS #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(6).reshape(2,3) >>> b = a + 100 >>> c = a - 100 >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[100, 101, 102], [103, 104, 105]]) >>> c array([[-100, -99, -98], [ -97, -96, -95]]) >>> res = nps.cat(a,b,c) >>> res array([[[ 0, 1, 2], [ 3, 4, 5]], [[ 100, 101, 102], [ 103, 104, 105]], [[-100, -99, -98], [ -97, -96, -95]]]) >>> res.shape (3, 2, 3) >>> [x for x in res] [array([[0, 1, 2], [3, 4, 5]]), array([[100, 101, 102], [103, 104, 105]]), array([[-100, -99, -98], [ -97, -96, -95]])] ### Note that this is the same as [a,b,c]: cat is the reverse of ### iterating on an array #+END_EXAMPLE This function concatenates the input arrays into an array shaped like the highest-dimensioned input, but with a new outer (at the start) dimension. The concatenation axis is this new dimension. As usual, the dimensions are aligned along the last one, so broadcasting will continue to work as expected. Note that this is the opposite operation from iterating a numpy array: iteration splits an array over its leading dimension, while cat joins a number of arrays via a new leading dimension. ** clump() Groups the given n dimensions together. SYNOPSIS #+BEGIN_EXAMPLE >>> import numpysane as nps >>> nps.clump( arr(2,3,4), n = -2).shape (2, 12) #+END_EXAMPLE Reshapes the array by grouping together 'n' dimensions, where 'n' is given in a kwarg. If 'n' > 0, then n leading dimensions are clumped; if 'n' < 0, then -n trailing dimensions are clumped So for instance, if x.shape is (2,3,4) then nps.clump(x, n = -2).shape is (2,12) and nps.clump(x, n = 2).shape is (6, 4) In numpysane older than 0.10 the semantics were different: n > 0 was required, and we ALWAYS clumped the trailing dimensions. Thus the new clump(-n) is equivalent to the old clump(n). 
To request the legacy behavior, do #+BEGIN_EXAMPLE nps.clump.legacy_version = '0.9' #+END_EXAMPLE ** atleast_dims() Returns an array with extra length-1 dimensions to contain all given axes. SYNOPSIS #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(6).reshape(2,3) >>> a array([[0, 1, 2], [3, 4, 5]]) >>> nps.atleast_dims(a, -1).shape (2, 3) >>> nps.atleast_dims(a, -2).shape (2, 3) >>> nps.atleast_dims(a, -3).shape (1, 2, 3) >>> nps.atleast_dims(a, 0).shape (2, 3) >>> nps.atleast_dims(a, 1).shape (2, 3) >>> nps.atleast_dims(a, 2).shape [exception] >>> l = [-3,-2,-1,0,1] >>> nps.atleast_dims(a, l).shape (1, 2, 3) >>> l [-3, -2, -1, 1, 2] #+END_EXAMPLE If the given axes already exist in the given array, the given array itself is returned. Otherwise length-1 dimensions are added to the front until all the requested dimensions exist. The given axis>=0 dimensions MUST all be in-bounds from the start, otherwise the most-significant axis becomes unaligned; an exception is thrown if this is violated. The given axis<0 dimensions that are out-of-bounds result in new dimensions added at the front. If new dimensions need to be added at the front, then any axis>=0 indices become offset. For instance: #+BEGIN_EXAMPLE >>> x.shape (2, 3, 4) >>> [x.shape[i] for i in (0,-1)] [2, 4] >>> x = nps.atleast_dims(x, 0, -1, -5) >>> x.shape (1, 1, 2, 3, 4) >>> [x.shape[i] for i in (0,-1)] [1, 4] #+END_EXAMPLE Before the call, axis=0 refers to the length-2 dimension and axis=-1 refers to the length=4 dimension. After the call, axis=-1 refers to the same dimension as before, but axis=0 now refers to a new length=1 dimension. If it is desired to compensate for this offset, then instead of passing the axes as separate arguments, pass in a single list of the axes indices. This list will be modified to offset the axis>=0 appropriately. Ideally, you only pass in axes<0, and this does not apply. 
Doing this in the above example: #+BEGIN_EXAMPLE >>> l [0, -1, -5] >>> x.shape (2, 3, 4) >>> [x.shape[i] for i in (l[0],l[1])] [2, 4] >>> x=nps.atleast_dims(x, l) >>> x.shape (1, 1, 2, 3, 4) >>> l [2, -1, -5] >>> [x.shape[i] for i in (l[0],l[1])] [2, 4] #+END_EXAMPLE We passed the axis indices in a list, and this list was modified to reflect the new indices: The original axis=0 becomes known as axis=2. Again, if you pass in only axis<0, then you don't need to care about this. ** mv() Moves a given axis to a new position. Similar to numpy.moveaxis(). SYNOPSIS #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(24).reshape(2,3,4) >>> a.shape (2, 3, 4) >>> nps.mv( a, -1, 0).shape (4, 2, 3) >>> nps.mv( a, -1, -5).shape (4, 1, 1, 2, 3) >>> nps.mv( a, 0, -5).shape (2, 1, 1, 3, 4) #+END_EXAMPLE New length-1 dimensions are added at the front, as required, and any axis>=0 that are passed in refer to the array BEFORE these new dimensions are added. ** xchg() Exchanges the positions of the two given axes. Similar to numpy.swapaxes() SYNOPSIS #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(24).reshape(2,3,4) >>> a.shape (2, 3, 4) >>> nps.xchg( a, -1, 0).shape (4, 3, 2) >>> nps.xchg( a, -1, -5).shape (4, 1, 2, 3, 1) >>> nps.xchg( a, 0, -5).shape (2, 1, 1, 3, 4) #+END_EXAMPLE New length-1 dimensions are added at the front, as required, and any axis>=0 that are passed in refer to the array BEFORE these new dimensions are added. ** transpose() Reverses the order of the last two dimensions. SYNOPSIS #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(24).reshape(2,3,4) >>> a.shape (2, 3, 4) >>> nps.transpose(a).shape (2, 4, 3) >>> nps.transpose( np.arange(3) ).shape (3, 1) #+END_EXAMPLE A "matrix" is generally seen as a 2D array that we can transpose by looking at the 2 dimensions in the opposite order. 
Here we treat an n-dimensional array as an n-2 dimensional object containing 2D matrices. As usual, the last two dimensions contain the matrix. New length-1 dimensions are added at the front, as required, meaning that 1D input of shape (n,) is interpreted as a 2D input of shape (1,n), and the transpose is a 2D output of shape (n,1). ** dummy() Adds length-1 dimensions at the given positions. SYNOPSIS #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(24).reshape(2,3,4) >>> a.shape (2, 3, 4) >>> nps.dummy(a, 0).shape (1, 2, 3, 4) >>> nps.dummy(a, 1).shape (2, 1, 3, 4) >>> nps.dummy(a, -1).shape (2, 3, 4, 1) >>> nps.dummy(a, -2).shape (2, 3, 1, 4) >>> nps.dummy(a, -2, -2).shape (2, 3, 1, 1, 4) >>> nps.dummy(a, -5).shape (1, 1, 2, 3, 4) #+END_EXAMPLE This is similar to numpy.expand_dims(), but handles out-of-bounds dimensions better. New length-1 dimensions are added at the front, as required, and any axis>=0 that are passed in refer to the array BEFORE these new dimensions are added. ** reorder() Reorders the dimensions of an array. SYNOPSIS #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(24).reshape(2,3,4) >>> a.shape (2, 3, 4) >>> nps.reorder( a, 0, -1, 1 ).shape (2, 4, 3) >>> nps.reorder( a, -2 , -1, 0 ).shape (3, 4, 2) >>> nps.reorder( a, -4 , -2, -5, -1, 0 ).shape (1, 3, 1, 4, 2) #+END_EXAMPLE This is very similar to numpy.transpose(), but handles out-of-bounds dimensions much better. New length-1 dimensions are added at the front, as required, and any axis>=0 that are passed in refer to the array BEFORE these new dimensions are added. ** dot() Non-conjugating dot product of two 1-dimensional n-long vectors. SYNOPSIS #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(3) >>> b = a+5 >>> a array([0, 1, 2]) >>> b array([5, 6, 7]) >>> nps.dot(a,b) 20 #+END_EXAMPLE This is identical to numpysane.inner(). For a conjugating version of this function, use nps.vdot().
Note that the stock numpy dot() has some special handling when it is given more-than-1-dimensional input. THIS function has no special handling: normal broadcasting rules are applied, as expected. In-place operation is available with the "out" kwarg. The output dtype can be selected with the "dtype" kwarg. If omitted, the dtype of the input is used. ** vdot() Conjugating dot product of two 1-dimensional n-long vectors. vdot(a,b) is equivalent to dot(np.conj(a), b) SYNOPSIS #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.array(( 1 + 2j, 3 + 4j, 5 + 6j)) >>> b = a+5 >>> a array([ 1.+2.j, 3.+4.j, 5.+6.j]) >>> b array([ 6.+2.j, 8.+4.j, 10.+6.j]) >>> nps.vdot(a,b) array((136-60j)) >>> nps.dot(a,b) array((24+148j)) #+END_EXAMPLE For a non-conjugating version of this function, use nps.dot(). Note that the numpy vdot() has some special handling when it is given more-than-1-dimensional input. THIS function has no special handling: normal broadcasting rules are applied. ** outer() Outer product of two 1-dimensional n-long vectors. SYNOPSIS #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(3) >>> b = a+5 >>> a array([0, 1, 2]) >>> b array([5, 6, 7]) >>> nps.outer(a,b) array([[ 0, 0, 0], [ 5, 6, 7], [10, 12, 14]]) #+END_EXAMPLE This function is broadcast-aware through numpysane.broadcast_define(). The expected inputs have input prototype: #+BEGIN_EXAMPLE (('n',), ('m',)) #+END_EXAMPLE and output prototype #+BEGIN_EXAMPLE ('n', 'm') #+END_EXAMPLE The first 2 positional arguments will broadcast. The trailing shape of those arguments must match the input prototype; the leading shape must follow the standard broadcasting rules. Positional arguments past the first 2 and all the keyword arguments are passed through untouched. ** norm2() Broadcast-aware 2-norm.
norm2(x) is identical to inner(x,x) SYNOPSIS #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(3) >>> a array([0, 1, 2]) >>> nps.norm2(a) 5 #+END_EXAMPLE This is a convenience function to compute a 2-norm ** mag() Magnitude of a vector. mag(x) is functionally identical to sqrt(inner(x,x)) SYNOPSIS #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(3) >>> a array([0, 1, 2]) >>> nps.mag(a) 2.23606797749979 #+END_EXAMPLE This is a convenience function to compute a magnitude of a vector, with full broadcasting support. In-place operation is available with the "out" kwarg. The output dtype can be selected with the "dtype" kwarg. If omitted, dtype=float is selected. ** trace() Broadcast-aware trace SYNOPSIS #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(3*4*4).reshape(3,4,4) >>> a array([[[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15]], [[16, 17, 18, 19], [20, 21, 22, 23], [24, 25, 26, 27], [28, 29, 30, 31]], [[32, 33, 34, 35], [36, 37, 38, 39], [40, 41, 42, 43], [44, 45, 46, 47]]]) >>> nps.trace(a) array([ 30, 94, 158]) #+END_EXAMPLE This function is broadcast-aware through numpysane.broadcast_define(). The expected inputs have input prototype: #+BEGIN_EXAMPLE (('n', 'n'),) #+END_EXAMPLE and output prototype #+BEGIN_EXAMPLE () #+END_EXAMPLE The first 1 positional arguments will broadcast. The trailing shape of those arguments must match the input prototype; the leading shape must follow the standard broadcasting rules. Positional arguments past the first 1 and all the keyword arguments are passed through untouched. 
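As a numpy-only aside (not part of the original docstring): if numpysane is unavailable, the trailing-axes behavior of nps.trace() shown above can be approximated by pointing np.trace() at the last two axes explicitly, since its defaults trace the LEADING axes instead.

```python
import numpy as np

a = np.arange(3*4*4).reshape(3,4,4)

# nps.trace()-style behavior: treat the TRAILING two axes as the matrix,
# broadcasting over all the leading dimensions
print(np.trace(a, axis1=-2, axis2=-1))   # [ 30  94 158]

# np.trace() defaults to axis1=0, axis2=1: it traces the LEADING axes,
# producing a shape-(4,) result here instead
print(np.trace(a).shape)                 # (4,)
```

The axis1/axis2 arguments are stock numpy; only the framing as a substitute for nps.trace() is this aside's assumption.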
** matmult2() Multiplication of two matrices SYNOPSIS #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(6) .reshape(2,3) >>> b = np.arange(12).reshape(3,4) >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]]) >>> nps.matmult2(a,b) array([[20, 23, 26, 29], [56, 68, 80, 92]]) #+END_EXAMPLE This function is exposed publicly mostly for legacy compatibility. Use numpysane.matmult() instead. ** matmult() Multiplication of N matrices SYNOPSIS #+BEGIN_EXAMPLE >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(6) .reshape(2,3) >>> b = np.arange(12).reshape(3,4) >>> c = np.arange(4) .reshape(4,1) >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]]) >>> c array([[0], [1], [2], [3]]) >>> nps.matmult(a,b,c) array([[162], [504]]) >>> abc = np.zeros((2,1), dtype=float) >>> nps.matmult(a,b,c, out=abc) >>> abc array([[162], [504]]) #+END_EXAMPLE This multiplies N matrices together by repeatedly calling matmult2() for each adjacent pair. In-place output is supported with the 'out' keyword argument. This function supports broadcasting fully, in C internally. * COMPATIBILITY Python 2 and Python 3 should both be supported. Please report a bug if either one doesn't work. * REPOSITORY https://github.com/dkogan/numpysane * AUTHOR Dima Kogan * LICENSE AND COPYRIGHT Copyright 2016-2020 Dima Kogan. This program is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License (any version) as published by the Free Software Foundation. See https://www.gnu.org/licenses/lgpl.html numpysane-0.35/extract_README.py #!/usr/bin/python r'''Constructs README, README.org files The README files are generated by this script.
They are made from: - The main module docstring, with some org markup applied to the README.org, but not to the README - The docstrings from each API function in the module, with some org markup applied to the README.org, but not to the README - README.footer.org, copied verbatim The main module name must be passed in as the first cmdline argument. ''' import sys try: modname,readmeorg,readme,readmefooterorg = sys.argv[1:] except: raise Exception("Usage: {} modulename readmeorg readme readmefooterorg".format(sys.argv[0])) exec( 'import {} as mod'.format(modname) ) import inspect import re try: from StringIO import StringIO ## for Python 2 except ImportError: from io import StringIO ## for Python 3 def dirmod(): r'''Same as dir(mod), but returns only functions, in the definition order''' with open('{}.py'.format(modname), 'r') as f: for l in f: m = re.match(r'def +([a-zA-Z0-9_]+)\(', l) if m: yield m.group(1) with open(readmeorg, 'w') as f_target_org: with open(readme, 'w') as f_target: f_target_org.write(r'''* TALK I just gave a talk about this at [[https://www.socallinuxexpo.org/scale/18x][SCaLE 18x]]. Here are the [[https://www.youtube.com/watch?v=YOOapXNtUWw][video of the talk]] and the [[https://github.com/dkogan/talk-numpysane-gnuplotlib/raw/master/numpysane-gnuplotlib.pdf]["slides"]]. ''') def write(s): f_target. write(s) f_target_org.write(s) def write_orgized(s): r'''Writes the given string, reformatted slightly with org in mind. The only change this function applies is to look for indented blocks (signifying examples) and to wrap them in a #+BEGIN_SRC or #+BEGIN_EXAMPLE. ''' # the non-org version is written as is f_target.write(s) # the org version needs massaging f = f_target_org in_quote = False queued_blanks = 0 indent_size = 4 prev_indented = False sio = StringIO(s) for l in sio: if not in_quote: if len(l) <= 1: # blank line f.write(l) continue if not re.match(' '*indent_size, l): # don't have full indent.
not quote start prev_indented = re.match(' ', l) f.write(l) continue if re.match(' '*indent_size + '-', l): # Start of indented list. not quote start prev_indented = re.match(' ', l) f.write(l) continue if prev_indented: # prev line(s) were indented, so this can't start a quote f.write(l) continue # start of quote. What kind? in_quote = True f.write('#+BEGIN_EXAMPLE\n') f.write(l[indent_size:]) continue # we're in a quote. Skip blank lines for now if len(l) <= 1: queued_blanks = queued_blanks+1 continue if re.match(' '*indent_size, l): # still in quote. Write it out f.write( '\n'*queued_blanks) queued_blanks = 0 f.write(l[indent_size:]) continue # not in quote anymore f.write('#+END_EXAMPLE\n') f.write( '\n'*queued_blanks) f.write(l) queued_blanks = 0 in_quote = False prev_indented = False f.write('\n') if in_quote: f.write('#+END_EXAMPLE\n') header = '* NAME\n{}: '.format(modname) write( header ) write_orgized(inspect.getdoc(mod)) write( '\n' ) def getdoc(func): if re.match('_', func): return None if not inspect.isfunction(mod.__dict__[func]): return None doc = inspect.getdoc(mod.__dict__[func]) if not doc: return None return doc funcs_docs = [(f,getdoc(f)) for f in dirmod()] funcs_docs = [(f,d) for f,d in funcs_docs if d is not None] if len(funcs_docs): write('* INTERFACE\n') for func,doc in funcs_docs: write('** {}()\n'.format(func)) write_orgized( doc ) write( '\n' ) with open(readmefooterorg, 'r') as f_footer: write( f_footer.read() ) numpysane-0.35/numpysane.py #!/usr/bin/python r'''more-reasonable core functionality for numpy * SYNOPSIS >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(6).reshape(2,3) >>> b = a + 100 >>> row = a[0,:] + 1000 >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[100, 101, 102], [103, 104, 105]]) >>> row array([1000, 1001, 1002]) >>> nps.glue(a,b, axis=-1) array([[ 0, 1, 2, 100, 101, 102], [ 3, 4, 5, 103, 104, 105]]) >>>
nps.glue(a,b,row, axis=-2) array([[ 0, 1, 2], [ 3, 4, 5], [ 100, 101, 102], [ 103, 104, 105], [1000, 1001, 1002]]) >>> nps.cat(a,b) array([[[ 0, 1, 2], [ 3, 4, 5]], [[100, 101, 102], [103, 104, 105]]]) >>> @nps.broadcast_define( (('n',), ('n',)) ) ... def inner_product(a, b): ... return a.dot(b) >>> inner_product(a,b) array([ 305, 1250]) * DESCRIPTION Numpy is a very widely used toolkit for numerical computation in Python. Despite its popularity, some of its core functionality is mysterious and/or incomplete. The numpysane library seeks to fill those gaps by providing its own replacement routines. Many of the replacement functions are direct translations from PDL (http://pdl.perl.org), a numerical computation library for perl. The functions provided by this module fall into three broad categories: - Broadcasting support - Nicer array manipulation - Basic linear algebra ** Broadcasting Numpy has a limited support for broadcasting (http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html), a generic way to vectorize functions. A broadcasting-aware function knows the dimensionality of its inputs, and any extra dimensions in the input are automatically used for vectorization. *** Broadcasting rules A basic example is an inner product: a function that takes in two identically-sized 1-dimensional arrays (input prototype (('n',), ('n',)) ) and returns a scalar (output prototype () ). If one calls a broadcasting-aware inner product with two arrays of shape (2,3,4) as input, it would compute 6 inner products of length-4 each, and report the output in an array of shape (2,3). In short: - The most significant dimension in a numpy array is the LAST one, so the prototype of an input argument must exactly match a given input's trailing shape. So a prototype shape of (a,b,c) accepts an argument shape of (......, a,b,c), with as many or as few leading dimensions as desired. - The extra leading dimensions must be compatible across all the inputs. 
This means that each leading dimension must either - equal 1 - be missing (thus assumed to equal 1) - equal to some positive integer >1, consistent across all arguments - The output is collected into an array that's sized as a superset of the above-prototype shape of each argument More involved example: A function with input prototype ( (3,), ('n',3), ('n',), ('m',) ) given inputs of shape (1,5, 3) (2,1, 8,3) ( 8) ( 5, 9) will return an output array of shape (2,5, ...), where ... is the shape of each output slice. Note again that the prototype dictates the TRAILING shape of the inputs. *** What about the stock broadcasting support? The numpy documentation dedicates a whole page explaining the broadcasting rules, but only a small number of numpy functions provide any broadcasting support. It's fairly inconsistent, and most functions have no broadcasting support and no mention of it in the documentation. And as a result, this is not a prominent part of the numpy ecosystem and there's little user awareness that it exists. *** What this module provides This module contains functionality to make any arbitrary function broadcastable, in either C or Python. In both cases, the input and output prototypes are declared, and these are used for shape-checking and vectorization each time the function is called. The functions can have either - A single output, returned as a numpy array. The output specification in the prototype is a single shape tuple - Multiple outputs, returned as a tuple of numpy arrays. The output specification in the prototype is a tuple of shape tuples *** Broadcasting in python This is invoked as a decorator, applied to any function. An example: >>> import numpysane as nps >>> @nps.broadcast_define( (('n',), ('n',)) ) ... def inner_product(a, b): ... return a.dot(b) Here we have a simple inner product function to compute ONE inner product. 
The 'broadcast_define' decorator adds broadcasting-awareness: 'inner_product()' expects two 1D vectors of length 'n' each (same 'n' for the two inputs), vectorizing extra dimensions, as needed. The inputs are shape-checked, and incompatible dimensions will trigger an exception. Example: >>> import numpy as np >>> a = np.arange(6).reshape(2,3) >>> b = a + 100 >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[100, 101, 102], [103, 104, 105]]) >>> inner_product(a,b) array([ 305, 1250]) Another related function in this module is broadcast_generate(). It's similar to broadcast_define(), but instead of adding broadcasting-awareness to an existing function, it returns a generator that produces tuples from a set of arguments according to a given prototype. Similarly, broadcast_extra_dims() is available to report the outer shape of a potential broadcasting operation. Stock numpy has some rudimentary support for all this with its vectorize() function, but it assumes only scalar inputs and outputs, which severely limits its usefulness. See the docstrings for 'broadcast_define' and 'broadcast_generate' in the INTERFACE section below for usage details. *** Broadcasting in C The python broadcasting is useful, but it is a python loop, so the loop itself is computationally expensive if we have many iterations. If the function being wrapped is available in C, we can apply broadcasting awareness in C, which makes a much faster loop. The "numpysane_pywrap" module generates code to wrap arbitrary C code in a broadcasting-aware wrapper callable from python. This is an analogue of PDL::PP (http://pdl.perl.org/PDLdocs/PP.html). This generated code is compiled and linked into a python extension module, as usual.
This functionality is documented separately: https://github.com/dkogan/numpysane/blob/master/README-pywrap.org After I wrote this, I realized there is some support for this in stock numpy: https://docs.scipy.org/doc/numpy-1.13.0/reference/c-api.ufunc.html Note: I have not tried using these APIs. ** Nicer array manipulation Numpy functions that move dimensions around and concatenate matrices are unintuitive. For instance, a simple concatenation of a row-vector or a column-vector to a matrix requires arcane knowledge to accomplish reliably. This module provides new functions that can be used for these basic operations. These new functions do have well-defined and sensible behavior, and they largely come from the interfaces in PDL (http://pdl.perl.org). These all respect the core rules of numpy broadcasting: - LEADING length-1 dimensions don't affect the meaning of an array, so the routines handle missing or extra length-1 dimensions at the front - The inner-most dimensions of an array are the TRAILING ones, so whenever an axis specification is used, it is strongly recommended (sometimes required) to count the axes from the back by passing in axis<0 A high level description of the functionality is given here, and each function is described in detail in the INTERFACE section below. In the following examples, I use a function "arr" that returns a numpy array with given dimensions: >>> def arr(*shape): ... product = reduce( lambda x,y: x*y, shape) ... return numpy.arange(product).reshape(*shape) >>> arr(1,2,3) array([[[0, 1, 2], [3, 4, 5]]]) >>> arr(1,2,3).shape (1, 2, 3) *** Concatenation This module provides two functions to do this. **** glue Concatenates some number of arrays along a given axis ('axis' must be given in a kwarg). Implicit length-1 dimensions are added at the start as needed. Dimensions other than the glueing axis must match exactly.
Basic usage: >>> row_vector = arr( 3,) >>> col_vector = arr(5,1,) >>> matrix = arr(5,3,) >>> numpysane.glue(matrix, row_vector, axis = -2).shape (6,3) >>> numpysane.glue(matrix, col_vector, axis = -1).shape (5,4) **** cat Concatenate some number of arrays along a new leading axis. Implicit length-1 dimensions are added, and the logical shapes of the inputs must match. This function is a logical inverse of numpy array iteration: iteration splits an array over its leading dimension, while cat joins a number of arrays via a new leading dimension. Basic usage: >>> numpysane.cat(arr(5,), arr(5,)).shape (2,5) >>> numpysane.cat(arr(5,), arr(1,1,5,)).shape (2,1,1,5) *** Manipulation of dimensions Several functions are available, all being fairly direct ports of their PDL (http://pdl.perl.org) equivalents. **** clump Reshapes the array by grouping together 'n' dimensions, where 'n' is given in a kwarg. If 'n' > 0, then n leading dimensions are clumped; if 'n' < 0, then -n trailing dimensions are clumped. Basic usage: >>> numpysane.clump( arr(2,3,4), n = -2).shape (2, 12) >>> numpysane.clump( arr(2,3,4), n = 2).shape (6, 4) **** atleast_dims Adds length-1 dimensions at the front of an array so that all the given dimensions are in-bounds. Any axis<0 may expand the shape. Adding new leading dimensions (axis>=0) is never useful, since numpy broadcasts from the end, so atleast_dims() treats axis>=0 as a check only: the requested axis MUST already be in-bounds, or an exception is thrown. This function always preserves the meaning of all the axes in the array: axis=-1 is the same axis before and after the call.
Basic usage: >>> numpysane.atleast_dims(arr(2,3), -1).shape (2, 3) >>> numpysane.atleast_dims(arr(2,3), -2).shape (2, 3) >>> numpysane.atleast_dims(arr(2,3), -3).shape (1, 2, 3) >>> numpysane.atleast_dims(arr(2,3), 0).shape (2, 3) >>> numpysane.atleast_dims(arr(2,3), 1).shape (2, 3) >>> numpysane.atleast_dims(arr(2,3), 2).shape [exception] **** mv Moves a dimension from one position to another. Basic usage to move the last dimension (-1) to the front (0) >>> numpysane.mv(arr(2,3,4), -1, 0).shape (4, 2, 3) Or to move a dimension -5 (added implicitly) to the end >>> numpysane.mv(arr(2,3,4), -5, -1).shape (1, 2, 3, 4, 1) **** xchg Exchanges the positions of two dimensions. Basic usage to move the last dimension (-1) to the front (0), and the front to the back. >>> numpysane.xchg(arr(2,3,4), -1, 0).shape (4, 3, 2) Or to swap a dimension -5 (added implicitly) with dimension -2 >>> numpysane.xchg(arr(2,3,4), -5, -2).shape (3, 1, 2, 1, 4) **** transpose Reverses the order of the two trailing dimensions in an array. The whole array is seen as being an array of 2D matrices, each matrix living in the 2 most significant dimensions, which implies this definition. Basic usage: >>> numpysane.transpose( arr(2,3) ).shape (3,2) >>> numpysane.transpose( arr(5,2,3) ).shape (5,3,2) >>> numpysane.transpose( arr(3,) ).shape (3,1) Note that in the second example we had 5 matrices, and we transposed each one. And in the last example we turned a row vector into a column vector by adding an implicit leading length-1 dimension before transposing. **** dummy Adds a single length-1 dimension at the given position. Basic usage: >>> numpysane.dummy(arr(2,3,4), -1).shape (2, 3, 4, 1) **** reorder Reorders the dimensions in an array using the given order. 
Basic usage: >>> numpysane.reorder( arr(2,3,4), -1, -2, -3 ).shape (4, 3, 2) >>> numpysane.reorder( arr(2,3,4), 0, -1, 1 ).shape (2, 4, 3) >>> numpysane.reorder( arr(2,3,4), -2 , -1, 0 ).shape (3, 4, 2) >>> numpysane.reorder( arr(2,3,4), -4 , -2, -5, -1, 0 ).shape (1, 3, 1, 4, 2) ** Basic linear algebra *** inner Broadcast-aware inner product. Identical to numpysane.dot(). Basic usage to compute 4 inner products of length 3 each: >>> numpysane.inner(arr( 3,), arr(4,3,)).shape (4,) >>> numpysane.inner(arr( 3,), arr(4,3,)) array([5, 14, 23, 32]) *** dot Broadcast-aware non-conjugating dot product. Identical to numpysane.inner(). *** vdot Broadcast-aware conjugating dot product. Same as numpysane.dot(), except this one conjugates complex input, which numpysane.dot() does not. *** outer Broadcast-aware outer product. Basic usage to compute 4 outer products of length 3 each: >>> numpysane.outer(arr( 3,), arr(4,3,)).shape (4, 3, 3) *** norm2 Broadcast-aware 2-norm. numpysane.norm2(x) is identical to numpysane.inner(x,x): >>> numpysane.norm2(arr(4,3)) array([5, 50, 149, 302]) *** mag Broadcast-aware vector magnitude. mag(x) is functionally identical to sqrt(numpysane.norm2(x)) and sqrt(numpysane.inner(x,x)): >>> numpysane.mag(arr(4,3)) array([ 2.23606798, 7.07106781, 12.20655562, 17.3781472 ]) *** trace Broadcast-aware matrix trace. >>> numpysane.trace(arr(4,3,3)) array([12., 39., 66., 93.]) *** matmult Broadcast-aware matrix multiplication. This accepts an arbitrary number of inputs, and adds leading length-1 dimensions as needed.
Multiplying a row-vector by a matrix >>> numpysane.matmult( arr(3,), arr(3,2) ).shape (2,) Multiplying a row-vector by 5 different matrices: >>> numpysane.matmult( arr(3,), arr(5,3,2) ).shape (5, 2) Multiplying a matrix by a col-vector: >>> numpysane.matmult( arr(3,2), arr(2,1) ).shape (3, 1) Multiplying a row-vector by a matrix by a col-vector: >>> numpysane.matmult( arr(3,), arr(3,2), arr(2,1) ).shape (1,) ** What's wrong with existing numpy functions? Why did I go through all the trouble to reimplement all this? Doesn't numpy already do all these things? Yes, it does. But in a very nonintuitive way. *** Concatenation **** hstack() hstack() performs a "horizontal" concatenation. When numpy prints an array, this is the last dimension (the most significant dimensions in numpy are at the end). So one would expect that this function concatenates arrays along this last dimension. In the special case of 1D and 2D arrays, one would be right: >>> numpy.hstack( (arr(3), arr(3))).shape (6,) >>> numpy.hstack( (arr(2,3), arr(2,3))).shape (2, 6) but in any other case, one would be wrong: >>> numpy.hstack( (arr(1,2,3), arr(1,2,3))).shape (1, 4, 3) <------ I expect (1, 2, 6) >>> numpy.hstack( (arr(1,2,3), arr(1,2,4))).shape [exception] <------ I expect (1, 2, 7) >>> numpy.hstack( (arr(3), arr(1,3))).shape [exception] <------ I expect (1, 6) >>> numpy.hstack( (arr(1,3), arr(3))).shape [exception] <------ I expect (1, 6) The above should all succeed, and should produce the shapes as indicated. Cases such as "numpy.hstack( (arr(3), arr(1,3)))" are maybe up for debate, but broadcasting rules allow adding as many extra length-1 dimensions as we want without changing the meaning of the object, so I claim this should work. Either way, if you print out the operands for any of the above, you too would expect a "horizontal" stack() to work as stated above. 
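A quick numpy-only demonstration of the claim (an aside, not part of the original docstring): on 3D input, hstack() does not glue the last axis, while concatenating along axis=-1 gives the "horizontal" result one would expect. The arr() helper is the same one used throughout this document.

```python
import numpy as np
from functools import reduce

def arr(*shape):
    # same helper as used throughout this document
    return np.arange(reduce(lambda x, y: x*y, shape)).reshape(*shape)

# hstack() concatenates along axis=1 here: NOT the last axis
print(np.hstack((arr(1,2,3), arr(1,2,3))).shape)                # (1, 4, 3)

# axis=-1 is "horizontal" for any dimensionality
print(np.concatenate((arr(1,2,3), arr(1,2,3)), axis=-1).shape)  # (1, 2, 6)
```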
It turns out that normally hstack() concatenates along axis=1, unless the first argument only has one dimension, in which case axis=0 is used. In a system where the most significant dimension is the last one, this is only correct if everyone has only 2D arrays. The correct way to do this is to concatenate along axis=-1. It works for n-dimensional objects, and doesn't require the special case logic for 1-dimensional objects. **** vstack() Similarly, vstack() performs a "vertical" concatenation. When numpy prints an array, this is the second-to-last dimension (remember, the most significant dimensions in numpy are at the end). So one would expect that this function concatenates arrays along this second-to-last dimension. Again, in the special case of 1D and 2D arrays, one would be right: >>> numpy.vstack( (arr(2,3), arr(2,3))).shape (4, 3) >>> numpy.vstack( (arr(3), arr(3))).shape (2, 3) >>> numpy.vstack( (arr(1,3), arr(3))).shape (2, 3) >>> numpy.vstack( (arr(3), arr(1,3))).shape (2, 3) >>> numpy.vstack( (arr(2,3), arr(3))).shape (3, 3) Note that this function appears to tolerate some amount of shape mismatches. It does it in a form one would expect, but given the state of the rest of this system, I found it surprising. For instance "numpy.hstack( (arr(1,3), arr(3)))" fails, so one would think that "numpy.vstack( (arr(1,3), arr(3)))" would fail too. And once again, adding more dimensions makes it confused, for the same reason: >>> numpy.vstack( (arr(1,2,3), arr(2,3))).shape [exception] <------ I expect (1, 4, 3) >>> numpy.vstack( (arr(1,2,3), arr(1,2,3))).shape (2, 2, 3) <------ I expect (1, 4, 3) Similarly to hstack(), vstack() concatenates along axis=0, which is "vertical" only for 2D arrays, but not for any others. And similarly to hstack(), the 1D case has special-cased logic to make it work properly. The correct way to do this is to concatenate along axis=-2. It works for n-dimensional objects, and doesn't require the special case for 1-dimensional objects.
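The same comparison for the "vertical" case, again as a numpy-only aside not in the original docstring: vstack() glues axis=0, while axis=-2 behaves as described for any dimensionality.

```python
import numpy as np
from functools import reduce

def arr(*shape):
    # same helper as used throughout this document
    return np.arange(reduce(lambda x, y: x*y, shape)).reshape(*shape)

# vstack() concatenates along axis=0, which is "vertical" only for 2D input
print(np.vstack((arr(1,2,3), arr(1,2,3))).shape)                # (2, 2, 3)

# axis=-2 is "vertical" for any dimensionality
print(np.concatenate((arr(1,2,3), arr(1,2,3)), axis=-2).shape)  # (1, 4, 3)
```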
**** dstack() I'll skip the detailed description, since this is similar to hstack() and vstack(). The intent was to concatenate across axis=-3, but the implementation takes axis=2 instead. This is wrong, as before. And I find it strange that these 3 functions even exist, since they are all special-cases: the concatenation axis should be an argument, and at most, the edge special case (hstack()) should exist. This brings us to the next function **** concatenate() This is a more general function, and unlike hstack(), vstack() and dstack(), it takes as input a list of arrays AND the concatenation dimension. It accepts negative concatenation dimensions to allow us to count from the end, so things should work better. And in many cases that failed previously, they do: >>> numpy.concatenate( (arr(1,2,3), arr(1,2,3)), axis=-1).shape (1, 2, 6) >>> numpy.concatenate( (arr(1,2,3), arr(1,2,4)), axis=-1).shape (1, 2, 7) >>> numpy.concatenate( (arr(1,2,3), arr(1,2,3)), axis=-2).shape (1, 4, 3) But many things still don't work as I would expect: >>> numpy.concatenate( (arr(1,3), arr(3)), axis=-1).shape [exception] <------ I expect (1, 6) >>> numpy.concatenate( (arr(3), arr(1,3)), axis=-1).shape [exception] <------ I expect (1, 6) >>> numpy.concatenate( (arr(1,3), arr(3)), axis=-2).shape [exception] <------ I expect (3, 3) >>> numpy.concatenate( (arr(3), arr(1,3)), axis=-2).shape [exception] <------ I expect (2, 3) >>> numpy.concatenate( (arr(2,3), arr(2,3)), axis=-3).shape [exception] <------ I expect (2, 2, 3) This function works as expected only if - All inputs have the same number of dimensions - All inputs have a matching shape, except for the dimension along which we're concatenating - All inputs HAVE the dimension along which we're concatenating A common use case that violates these conditions: I have an object that contains N 3D vectors, and I want to add another 3D vector to it. This is essentially the first failing example above. 
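To make that failing use case concrete (a numpy-only sketch, not part of the original docstring): appending a single 3D vector to an array of N of them requires manually adding the missing length-1 dimension before concatenate() will accept the inputs.

```python
import numpy as np

vecs = np.arange(15).reshape(5,3)    # N=5 vectors of length 3
v    = np.array([100, 101, 102])     # one more length-3 vector

# np.concatenate((vecs, v), axis=-2) raises: the inputs must have the
# same number of dimensions. The length-1 dimension must be added by hand:
print(np.concatenate((vecs, v[np.newaxis, :]), axis=-2).shape)   # (6, 3)
```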
**** stack() The name makes it sound exactly like concatenate(), and it takes the same arguments, but it is very different. stack() requires that all inputs have EXACTLY the same shape. It then concatenates all the inputs along a new dimension, and places that dimension in the location given by the 'axis' input. If this is the exact type of concatenation you want, this function works fine. But it's one of many things a user may want to do. **** Thoughts on concatenation This module introduces numpysane.glue() and numpysane.cat() to replace all the above functions. These do not refer to anything being "horizontal" or "vertical", nor do they talk about "rows" or "columns": these concepts simply don't apply in a generic N-dimensional system. These functions are very explicit about the dimensionality of the inputs/outputs, and fit well into a broadcasting-aware system. Since these functions assume that broadcasting is an important concept in the system, the given axis indices should be counted from the most significant dimension: the last dimension in numpy. This means that where an axis index is specified, negative indices are encouraged. glue() forbids axis>=0 outright. ***** Example for further justification An array containing N 3D vectors would have shape (N,3). Another array containing a single 3D vector would have shape (3,). Counting the dimensions from the end, each vector is indexed in dimension -1. However, counting from the front, the vector is indexed in dimension 0 or 1, depending on which of the two arrays we're looking at. If we want to add the single vector to the array containing the N vectors, and we mistakenly try to concatenate along the first dimension, it would fail if N != 3. But if we're unlucky, and N=3, then we'd get a nonsensical output array of shape (3,4). Why would an array of N 3D vectors have shape (N,3) and not (3,N)? 
Because if we apply python iteration to it, we'd expect to get N iterates of arrays with shape (3,) each, and numpy iterates from the first dimension: >>> a = numpy.arange(2*3).reshape(2,3) >>> a array([[0, 1, 2], [3, 4, 5]]) >>> [x for x in a] [array([0, 1, 2]), array([3, 4, 5])] *** Manipulation of dimensions **** atleast_xd() Numpy has 3 special-case functions atleast_1d(), atleast_2d() and atleast_3d(). For 4d and higher, you need to do something else. These do surprising things: >>> numpy.atleast_3d(arr(3)).shape (1, 3, 1) **** transpose() Given a matrix (a 2D array), numpy.transpose() swaps the two dimensions, as expected. Given anything else, it does not do what is expected: >>> numpy.transpose(arr(3, )).shape (3,) >>> numpy.transpose(arr(3,4, )).shape (4, 3) >>> numpy.transpose(arr(3,4,5,6,)).shape (6, 5, 4, 3) I.e. numpy.transpose() reverses the order of ALL dimensions in the array. So if we have N 2D matrices in a single array, numpy.transpose() doesn't transpose each matrix. *** Basic linear algebra **** inner() and dot() numpy.inner() and numpy.dot() are strange. In a real-valued n-dimensional Euclidean space, a "dot product" is just another name for an "inner product". Numpy disagrees. It looks like numpy.dot() is matrix multiplication, with some wonky behaviors when given higher-dimension objects, and with some special-case behaviors for 1-dimensional and 0-dimensional objects: >>> numpy.dot( arr(4,5,2,3), arr(3,5)).shape (4, 5, 2, 5) <--- expected result for a broadcasted matrix multiplication >>> numpy.dot( arr(3,5), arr(4,5,2,3)).shape [exception] <--- numpy.dot() is not commutative. Expected for matrix multiplication, but not for a dot product >>> numpy.dot( arr(4,5,2,3), arr(1,3,5)).shape (4, 5, 2, 1, 5) <--- don't know where this came from at all >>> numpy.dot( arr(4,5,2,3), arr(3)).shape (4, 5, 2) <--- 1D special case. This is a dot product. >>> numpy.dot( arr(4,5,2,3), 3).shape (4, 5, 2, 3) <--- 0D special case. This is a scaling. 
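A broadcasting-aware "dot product" has no need for any of these special cases. A minimal sketch (inner_sane() and this arr() helper are illustrative stand-ins, not numpy functions; this is essentially the behavior this module's inner() provides):

```python
import numpy as np

def arr(*shape):
    # helper matching the examples above: arange(), reshaped
    return np.arange(np.prod(shape)).reshape(shape)

def inner_sane(a, b):
    # multiply elementwise (normal broadcasting aligns the trailing
    # dimensions), then sum over the last axis
    return np.sum(np.asarray(a) * np.asarray(b), axis=-1)

print(inner_sane(arr(4,5,2,3), arr(3)).shape)    # (4, 5, 2)
print(inner_sane(arr(2,3),     arr(2,3)).shape)  # (2,)
```

With these semantics, inner_sane(arr(4,5,2,3), arr(3,5)) would simply be an error: the trailing dimensions don't match, and there is no wonky fallback behavior.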
It looks like numpy.inner() is some sort of quasi-broadcastable inner product, also with some funny dimensioning rules. In many cases it looks like numpy.dot(a,b) is the same as numpy.inner(a, transpose(b)) where transpose() swaps the last two dimensions: >>> numpy.inner( arr(4,5,2,3), arr(5,3)).shape (4, 5, 2, 5) <--- All the length-3 inner products collected into a shape with not-quite-broadcasting rules >>> numpy.inner( arr(5,3), arr(4,5,2,3)).shape (5, 4, 5, 2) <--- numpy.inner() is not commutative. Unexpected for an inner product >>> numpy.inner( arr(4,5,2,3), arr(1,5,3)).shape (4, 5, 2, 1, 5) <--- No idea >>> numpy.inner( arr(4,5,2,3), arr(3)).shape (4, 5, 2) <--- 1D special case. This is a dot product. >>> numpy.inner( arr(4,5,2,3), 3).shape (4, 5, 2, 3) <--- 0D special case. This is a scaling. ''' import numpy as np from functools import reduce import itertools import types import inspect from distutils.version import StrictVersion # setup.py assumes the version is a simple string in '' quotes __version__ = '0.35' def _product(l): r'''Returns product of all values in the given list''' return reduce( lambda a,b: a*b, l ) def _clone_function(f, name): r'''Returns a clone of a given function. This is useful to copy a function, updating its metadata, such as the documentation, name, etc. There are also differences here between python 2 and python 3 that this function handles. ''' def get(f, what): what2 = 'func_{}'.format(what) what3 = '__{}__' .format(what) try: return getattr(f, what2) except: try: return getattr(f, what3) except: pass return None return types.FunctionType(get(f, 'code'), get(f, 'globals'), name, get(f, 'defaults'), get(f, 'closure')) class NumpysaneError(Exception): def __init__(self, err): self.err = err def __str__(self): return self.err def _eval_broadcast_dims( args, prototype ): r'''Helper function to evaluate a given list of arguments in respect to a given broadcasting prototype. 
This function will flag any errors in the dimensionality of the inputs. If no errors are detected, it returns dims_extra,dims_named where dims_extra is the outer shape of the broadcast This is a list: the union of all the leading shapes of all the arguments, after the trailing shapes of the prototype have been stripped dims_named is the sizes of the named dimensions This is a dict mapping dimension names to their sizes ''' # First I initialize dims_extra: the array containing the broadcasted # slices. Each argument calls for some number of extra dimensions, and the # overall array is as large as the biggest one of those Ndims_extra = 0 for i_arg in range(len(args)): Ndims_extra_here = len(args[i_arg].shape) - len(prototype[i_arg]) if Ndims_extra_here > Ndims_extra: Ndims_extra = Ndims_extra_here dims_extra = [1] * Ndims_extra def parse_dim( name_arg, shape_prototype, shape_arg, dims_named ): def range_rev(n): r'''Returns a range from -1 to -n. Useful to index variable-sized lists while aligning their ends.''' return range(-1, -n-1, -1) # first, I make sure the input is at least as dimension-ful as the # prototype. I do this by prepending dummy dimensions of length-1 as # necessary if len(shape_prototype) > len(shape_arg): ndims_missing_here = len(shape_prototype) - len(shape_arg) shape_arg = (1,) * ndims_missing_here + shape_arg # MAKE SURE THE PROTOTYPE DIMENSIONS MATCH (the trailing dimensions) # # Loop through the dimensions. Set the dimensionality of any new named # argument to whatever the current argument has. Any already-known # argument must match for i_dim in range_rev(len(shape_prototype)): dim_prototype = shape_prototype[i_dim] if not isinstance(dim_prototype, int): # This is a named dimension. 
These can have any value, but ALL # dimensions of the same name must have the SAME value # EVERYWHERE if dim_prototype not in dims_named: dims_named[dim_prototype] = shape_arg[i_dim] dim_prototype = dims_named[dim_prototype] # The prototype dimension (named or otherwise) now has a numeric # value. Make sure it matches what I have if dim_prototype != shape_arg[i_dim]: raise NumpysaneError("Argument {} dimension '{}': expected {} but got {}". format(name_arg, shape_prototype[i_dim], dim_prototype, shape_arg[i_dim])) # I now know that this argument matches the prototype. I look at the # extra dimensions to broadcast, and make sure they match with the # dimensions I saw previously Ndims_extra_here = len(shape_arg) - len(shape_prototype) # MAKE SURE THE BROADCASTED DIMENSIONS MATCH (the leading dimensions) # # This argument has Ndims_extra_here dimensions to broadcast. The # current shape to broadcast must be at least as large, and must match for i_dim in range_rev(Ndims_extra_here): dim_arg = shape_arg[i_dim - len(shape_prototype)] if dim_arg != 1: if dims_extra[i_dim] == 1: dims_extra[i_dim] = dim_arg elif dims_extra[i_dim] != dim_arg: raise NumpysaneError("Argument {} prototype {} extra broadcast dim {} mismatch: previous arg set this to {}, but this arg wants {}". format(name_arg, shape_prototype, i_dim, dims_extra[i_dim], dim_arg)) dims_named = {} # parse_dim() adds to this for i_arg in range(len(args)): parse_dim( i_arg, prototype[i_arg], args[i_arg].shape, dims_named ) return dims_extra,dims_named def _broadcast_iter_dim( args, prototype, dims_extra ): r'''Generator to iterate through all the broadcasting slices. ''' # pad the dimension of each arg with ones. This lets me use the full # dims_extra index on each argument, without worrying about overflow args = [ atleast_dims(args[i], -(len(prototype[i])+len(dims_extra)) ) for i in range(len(args)) ] # per-arg dims_extra indexing varies: len-1 dimensions always index at 0.
I # make a mask that I apply each time idx_slice_mask = np.ones( (len(args), len(dims_extra)), dtype=int) for i in range(len(args)): idx_slice_mask[i, np.array(args[i].shape,dtype=int)[:len(dims_extra)]==1] = 0 for idx_slice in itertools.product( *(range(x) for x in dims_extra) ): # tuple(idx) because of wonky behavior differences: # >>> a # array([[0, 1, 2], # [3, 4, 5]]) # # >>> a[tuple((1,1))] # 4 # # >>> a[list((1,1))] # array([[3, 4, 5], # [3, 4, 5]]) yield tuple( args[i][tuple(idx_slice * idx_slice_mask[i])] for i in range(len(args)) ) def broadcast_define(prototype, prototype_output=None, out_kwarg=None): r'''Vectorizes an arbitrary function, expecting input as in the given prototype. SYNOPSIS >>> import numpy as np >>> import numpysane as nps >>> @nps.broadcast_define( (('n',), ('n',)) ) ... def inner_product(a, b): ... return a.dot(b) >>> a = np.arange(6).reshape(2,3) >>> b = a + 100 >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[100, 101, 102], [103, 104, 105]]) >>> inner_product(a,b) array([ 305, 1250]) The prototype defines the dimensionality of the inputs. In the inner product example above, the input is two 1D n-dimensional vectors. In particular, the 'n' is the same for the two inputs. This function is intended to be used as a decorator, applied to a function defining the operation to be vectorized. Each element in the prototype list refers to each input, in order. In turn, each such element is a list that describes the shape of that input. Each of these shape descriptors can be any of - a positive integer, indicating an input dimension of exactly that length - a string, indicating an arbitrary, but internally consistent dimension The normal numpy broadcasting rules (as described elsewhere) apply. 
In summary: - Dimensions are aligned at the end of the shape list, and must match the prototype - Extra dimensions left over at the front must be consistent for all the input arguments, meaning: - All dimensions of length != 1 must match - Dimensions of length 1 match corresponding dimensions of any length in other arrays - Missing leading dimensions are implicitly set to length 1 - The output(s) have a shape where - The trailing dimensions are whatever the function being broadcasted returns - The leading dimensions come from the extra dimensions in the inputs Calling a function wrapped with broadcast_define() with extra arguments (either positional or keyword), passes these verbatim to the inner function. Only the arguments declared in the prototype are broadcast. Scalars are represented as 0-dimensional numpy arrays: arrays with shape (), and these broadcast as one would expect: >>> @nps.broadcast_define( (('n',), ('n',), ())) ... def scaled_inner_product(a, b, scale): ... return a.dot(b)*scale >>> a = np.arange(6).reshape(2,3) >>> b = a + 100 >>> scale = np.array((10,100)) >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[100, 101, 102], [103, 104, 105]]) >>> scale array([ 10, 100]) >>> scaled_inner_product(a,b,scale) array([[ 3050], [125000]]) Let's look at a more involved example. Let's say we have a function that takes a set of points in R^2 and a single center point in R^2, and finds a best-fit least-squares line that passes through the given center point. Let it return a 3D vector containing the slope, y-intercept and the RMS residual of the fit. 
This broadcasting-enabled function can be defined like this: import numpy as np import numpysane as nps @nps.broadcast_define( (('n',2), (2,)) ) def fit(xy, c): # line-through-origin-model: y = m*x # E = sum( (m*x - y)**2 ) # dE/dm = 2*sum( (m*x-y)*x ) = 0 # ----> m = sum(x*y)/sum(x*x) x,y = (xy - c).transpose() m = np.sum(x*y) / np.sum(x*x) err = m*x - y err **= 2 rms = np.sqrt(err.mean()) # I return m,b because I need to translate the line back b = c[1] - m*c[0] return np.array((m,b,rms)) And I can use broadcasting to compute a number of these fits at once. Let's say I want to compute 4 different fits of 5 points each. I can do this: n = 5 m = 4 c = np.array((20,300)) xy = np.arange(m*n*2, dtype=np.float64).reshape(m,n,2) + c xy += np.random.rand(*xy.shape)*5 res = fit( xy, c ) mb = res[..., 0:2] rms = res[..., 2] print("RMS residuals: {}".format(rms)) Here I had 4 different sets of points, but a single center point c. If I wanted 4 different center points, I could pass c as an array of shape (4,2). I can use broadcasting to plot all the results (the points and the fitted lines): import gnuplotlib as gp gp.plot( *nps.mv(xy,-1,0), _with='linespoints', equation=['{}*x + {}'.format(mb_single[0], mb_single[1]) for mb_single in mb], unset='grid', square=1) The examples above all create a separate output array for each broadcasted slice, and copy the contents from each such slice into the larger output array that contains all the results. This is inefficient, and it is possible to pre-allocate an array to forgo these extra allocation and copy operations. There are several settings to control this. If the function being broadcasted can write its output to a given array instead of creating a new one, most of the inefficiency goes away. broadcast_define() supports the case where this function takes this array in a kwarg: the name of this kwarg can be given to broadcast_define() like so: @nps.broadcast_define( ....., out_kwarg = "out" ) def func( ....., out): .....
out[:] = result When used this way, the return value of the broadcasted function is ignored. In order for broadcast_define() to pass such an output array to the inner function, this output array must be available, which means that it must be given to us somehow, or we must create it. The most efficient way to make a broadcasted call is to create the full output array beforehand, and to pass that to the broadcasted function. In this case, nothing extra will be allocated, and no unnecessary copies will be made. This can be done like this: @nps.broadcast_define( (('n',), ('n',)), ....., out_kwarg = "out" ) def inner_product(a, b, out): ..... out.setfield(a.dot(b), out.dtype) out = np.empty((2,4), float) inner_product( np.arange(3), np.arange(2*4*3).reshape(2,4,3), out=out) In this example, the caller knows that it's calling an inner_product function, and that the shape of each output slice would be (). The caller also knows the input dimensions and that we have an extra broadcasting dimension (2,4), so the output array will have shape (2,4) + () = (2,4). With this knowledge, the caller preallocates the array, and passes it to the broadcasted function call. Furthermore, in this case the inner function will be called with an output array EVERY time, and this is the only mode the inner function needs to support. If the caller doesn't know (or doesn't want to pre-compute) the shape of the output, it can let the broadcasting machinery create this array for them. In order for this to be possible, the shape of the output should be pre-declared, and the dtype of the output should be known: @nps.broadcast_define( (('n',), ('n',)), (), out_kwarg = "out" ) def inner_product(a, b, out): .....
out.setfield(a.dot(b), out.dtype) out = inner_product( np.arange(3), np.arange(2*4*3).reshape(2,4,3), dtype=int) Note that the caller didn't need to specify the prototype of the output or the extra broadcasting dimensions (output prototype is in the broadcast_define() call, but not the inner_product() call). Specifying the dtype here is optional: it defaults to float if omitted. If we want the output array to be pre-allocated, the output prototype (it is () in this example) is required: we must know the shape of the output array in order to create it. Without a declared output prototype, we can still make mostly-efficient calls: the broadcasting mechanism can call the inner function for the first slice as we showed earlier, by creating a new array for the slice. This new array required an extra allocation and copy, but it contains the required shape information. This information will be used to allocate the output, and the subsequent calls to the inner function will be efficient: @nps.broadcast_define( (('n',), ('n',)), out_kwarg = "out" ) def inner_product(a, b, out=None): ..... if out is None: return a.dot(b) out.setfield(a.dot(b), out.dtype) return out out = inner_product( np.arange(3), np.arange(2*4*3).reshape(2,4,3)) Here we were slightly inefficient, but the ONLY required extra specification was out_kwarg: that's all you need. Also it is important to note that in this case the inner function is called both with passing it an output array to fill in, and with asking it to create a new one (by passing out=None to the inner function). This inner function then must support both modes of operation. If the inner function does not support filling in an output array, none of these efficiency improvements are possible. It is possible for a function to return more than one output, and this is supported by broadcast_define().
This case works exactly like the one-output case, except the output prototype is REQUIRED, and this output prototype contains multiple tuples, one for each output. The inner function must return the outputs in a tuple, and each individual output will be broadcasted as expected. broadcast_define() is analogous to thread_define() in PDL. ''' def inner_decorator_for_some_reason(func): # args broadcast, kwargs do not. All auxiliary data should go into the # kwargs def broadcast_loop(*args, **kwargs): if len(args) < len(prototype): raise NumpysaneError("Mismatched number of input arguments. Wanted at least {} but got {}". \ format(len(prototype), len(args))) args_passthru = args[ len(prototype):] args = args[0:len(prototype) ] # make sure all the arguments are numpy arrays args = tuple(np.asarray(arg) for arg in args) # dims_extra: extra dimensions to broadcast through # dims_named: values of the named dimensions dims_extra,dims_named = \ _eval_broadcast_dims( args, prototype) # If None, the single output is either returned, or stored into # out_kwarg. If an integer, then a tuple is returned (or stored into # out_kwarg). If Noutputs==1 then we return a TUPLE of length 1 Noutputs = None # substitute named variable values into the output prototype prototype_output_expanded = None if prototype_output is not None: # If a single prototype_output is given, wrap it in a tuple to indicate # that we only have one output if all( type(o) is int or type(o) is str for o in prototype_output ): prototype_output_expanded = \ [d if type(d) is int else dims_named[d] \ for d in prototype_output] else: Noutputs = len(prototype_output) prototype_output_expanded = \ [ [d if type(d) is int else dims_named[d] \ for d in _prototype_output] for \ _prototype_output in prototype_output ] # I checked all the dimensions and aligned everything. I have my # to-broadcast dimension counts.
Iterate through all the broadcasting # output, and gather the results output = None i_slice = 0 if Noutputs is None: # We expect a SINGLE output # if the output was supposed to go to a particular place, set that if out_kwarg is not None and out_kwarg in kwargs: output = kwargs[out_kwarg] if prototype_output_expanded is not None: expected_shape = dims_extra + prototype_output_expanded if output.shape != tuple(expected_shape): raise NumpysaneError("Inconsistent output shape: expected {}, but got {}".format(expected_shape, output.shape)) # if we know enough to allocate the output, do that elif prototype_output_expanded is not None: kwargs_dtype = {} if 'dtype' in kwargs: kwargs_dtype['dtype'] = kwargs['dtype'] output = np.empty(dims_extra + prototype_output_expanded, **kwargs_dtype) # else: # We don't have an output and we don't know its dimensions, so # we can't allocate an array for it. Leave output as None. I # will allocate it as soon as I get the first slice; this will let # me know how large the whole thing should be # if no broadcasting involved, just call the function if not dims_extra: # if the function knows how to write directly to an array, # request that if output is not None and out_kwarg is not None: kwargs[out_kwarg] = output sliced_args = args + args_passthru result = func( *sliced_args, **kwargs ) if out_kwarg is not None and \ kwargs.get(out_kwarg) is not None: # We wrote the output in-place. Return the output array return kwargs.get(out_kwarg) # Using the returned output. Run some checks, and return the # returned value if isinstance(result, tuple): raise NumpysaneError("Only a single output expected, but a tuple was returned!") if prototype_output_expanded is not None and \ np.array(result).shape != tuple(prototype_output_expanded): raise NumpysaneError("Inconsistent slice output shape: expected {}, but got {}".format(prototype_output_expanded, np.array(result).shape)) return result # reshaped output.
I write to this array if output is not None: output_flattened = clump(output, n=len(dims_extra)) for x in _broadcast_iter_dim( args, prototype, dims_extra ): # if the function knows how to write directly to an array, # request that if output is not None and out_kwarg is not None: kwargs[out_kwarg] = output_flattened[i_slice, ...] sliced_args = x + args_passthru result = func( *sliced_args, **kwargs ) if output is None or out_kwarg is None: # We weren't writing directly into the output, so check # the output for validity if isinstance(result, tuple): raise NumpysaneError("Only a single output expected, but a tuple was returned!") if not isinstance(result, np.ndarray): result = np.array(result) if prototype_output_expanded is None: prototype_output_expanded = result.shape else: if result.shape != tuple(prototype_output_expanded): raise NumpysaneError("Inconsistent slice output shape: expected {}, but got {}".format(prototype_output_expanded, result.shape)) if output is None: # I didn't already have an output array because I didn't # know how large it should be. But I now have the first # slice, and I know how big the whole output should be. # I create it output = np.empty( dims_extra + list(result.shape), dtype = result.dtype) output_flattened = output.reshape( (_product(dims_extra),) + result.shape) output_flattened[i_slice, ...] = result elif out_kwarg is None: output_flattened[i_slice, ...] 
= result # else: # I was writing directly to the output, so I don't need to # manually populate the slice i_slice = i_slice+1 else: # We expect MULTIPLE outputs: a tuple of length Noutputs # if the output was supposed to go to a particular place, set that if out_kwarg is not None and out_kwarg in kwargs: output = kwargs[out_kwarg] if prototype_output_expanded is not None: for i in range(Noutputs): expected_shape = dims_extra + prototype_output_expanded[i] if output[i].shape != tuple(expected_shape): raise NumpysaneError("Inconsistent output shape for output {}: expected {}, but got {}". \ format(i, expected_shape, output[i].shape)) # if we know enough to allocate the output, do that elif prototype_output_expanded is not None: kwargs_dtype = {} if 'dtype' in kwargs: kwargs_dtype['dtype'] = kwargs['dtype'] output = [np.empty(dims_extra + prototype_output_expanded[i], **kwargs_dtype) for i in range(Noutputs)] # else: # We don't have an output and we don't know its dimensions, so # we can't allocate an array for it. Leave output as None. I # will allocate it as soon as I get the first slice; this will let # me know how large the whole thing should be # if no broadcasting involved, just call the function if not dims_extra: # if the function knows how to write directly to an array, # request that if output is not None and out_kwarg is not None: kwargs[out_kwarg] = tuple(output) sliced_args = args + args_passthru result = func( *sliced_args, **kwargs ) if out_kwarg is not None and \ kwargs.get(out_kwarg) is not None: # We wrote the output in-place. Return the output array return kwargs.get(out_kwarg) if not isinstance(result, tuple): raise NumpysaneError("A tuple of {} outputs is expected, but an object of type {} was returned". \ format(Noutputs, type(result))) if len(result) != Noutputs: raise NumpysaneError("A tuple of {} outputs is expected, but a length-{} tuple was returned".
\ format(Noutputs, len(result))) if prototype_output_expanded is not None: for i in range(Noutputs): if np.array(result[i]).shape != tuple(prototype_output_expanded[i]): raise NumpysaneError("Inconsistent output {} shape: expected {}, but got {}". \ format(i, prototype_output_expanded[i], np.array(result[i]).shape)) return result # reshaped output. I write to this array if output is not None: output_flattened = [clump(output[i], n=len(dims_extra)) for i in range(Noutputs)] for x in _broadcast_iter_dim( args, prototype, dims_extra ): # if the function knows how to write directly to an array, # request that if output is not None and out_kwarg is not None: kwargs[out_kwarg] = tuple(o[i_slice, ...] for o in output_flattened) sliced_args = x + args_passthru result = func( *sliced_args, **kwargs ) if output is None or out_kwarg is None: # We weren't writing directly into the output, so check # the output for validity if not isinstance(result, tuple): raise NumpysaneError("A tuple of {} outputs is expected, but an object of type {} was returned". \ format(Noutputs, type(result))) if len(result) != Noutputs: raise NumpysaneError("A tuple of {} outputs is expected, but a length-{} tuple was returned". \ format(Noutputs, len(result))) result = [x if isinstance(x, np.ndarray) else np.array(x) for x in result] if prototype_output_expanded is None: prototype_output_expanded = [result[i].shape for i in range(Noutputs)] else: for i in range(Noutputs): if result[i].shape != tuple(prototype_output_expanded[i]): raise NumpysaneError("Inconsistent slice output {} shape: expected {}, but got {}". \ format(i, prototype_output_expanded[i], result[i].shape)) if output is None: # I didn't already have an output array because I didn't # know how large it should be. But I now have the first # slice, and I know how big the whole output should be. 
# I create it output = [np.empty( dims_extra + list(result[i].shape), dtype = result[i].dtype) for i in range(Noutputs)] output_flattened = [output[i].reshape( (_product(dims_extra),) + result[i].shape) for i in range(Noutputs)] for i in range(Noutputs): output_flattened[i][i_slice, ...] = result[i] elif out_kwarg is None: for i in range(Noutputs): output_flattened[i][i_slice, ...] = result[i] # else: # I was writing directly to the output, so I don't need to # manually populate the slice i_slice = i_slice+1 return output if out_kwarg is not None and not isinstance(out_kwarg, str): raise NumpysaneError("out_kwarg must be a string") # Make sure all dimensions are >=0 and that named output dimensions are # known from the input known_named_dims = set() if not isinstance(prototype, tuple): raise NumpysaneError("Input prototype must be given as a tuple") for dims_input in prototype: if not isinstance(dims_input, tuple): raise NumpysaneError("Input prototype dims must be given as a tuple") for dim in dims_input: if type(dim) is not int: if type(dim) is not str: raise NumpysaneError("Prototype dimensions must be integers > 0 or strings. Got '{}' of type '{}'". \ format(dim, type(dim))) known_named_dims.add(dim) else: if dim < 0: raise NumpysaneError("Prototype dimensions must be > 0. Got '{}'". \ format(dim)) if prototype_output is not None: if not isinstance(prototype_output, tuple): raise NumpysaneError("Output prototype dims must be given as a tuple") # If a single prototype_output is given, wrap it in a tuple to indicate # that we only have one output if all( type(o) is int or type(o) is str for o in prototype_output ): prototype_outputs = (prototype_output, ) else: prototype_outputs = prototype_output if not all( isinstance(p,tuple) for p in prototype_outputs ): raise NumpysaneError("Output dimensions must be integers > 0 or strings. Each output must be a tuple. Some given output aren't tuples: {}". 
\ format(prototype_outputs)) for dims_output in prototype_outputs: for dim in dims_output: if type(dim) is not int: if type(dim) is not str: raise NumpysaneError("Output dimensions must be integers > 0 or strings. Got '{}' of type '{}'". \ format(dim, type(dim))) if dim not in known_named_dims: raise NumpysaneError("Output prototype has named dimension '{}' not seen in the input prototypes". \ format(dim)) else: if dim < 0: raise NumpysaneError("Prototype dimensions must be > 0. Got '{}'". \ format(dim)) func_out = _clone_function( broadcast_loop, func.__name__ ) func_out.__doc__ = inspect.getdoc(func) if func_out.__doc__ is None: func_out.__doc__ = '' func_out.__doc__+= \ '''\n\nThis function is broadcast-aware through numpysane.broadcast_define(). The expected inputs have input prototype: {prototype} {output_prototype_text} The first {nargs} positional arguments will broadcast. The trailing shape of those arguments must match the input prototype; the leading shape must follow the standard broadcasting rules. Positional arguments past the first {nargs} and all the keyword arguments are passed through untouched.'''. \ format(prototype = prototype, output_prototype_text = 'No output prototype is defined.' if prototype_output is None else 'and output prototype\n\n {}'.format(prototype_output), nargs = len(prototype)) return func_out return inner_decorator_for_some_reason def broadcast_generate(prototype, args): r'''A generator that produces broadcasted slices SYNOPSIS >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(6).reshape(2,3) >>> b = a + 100 >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[100, 101, 102], [103, 104, 105]]) >>> for s in nps.broadcast_generate( (('n',), ('n',)), (a,b)): ... 
print("slice: {}".format(s)) slice: (array([0, 1, 2]), array([100, 101, 102])) slice: (array([3, 4, 5]), array([103, 104, 105])) The broadcasting operation of numpysane is described in detail in the numpysane.broadcast_define() docstring and in the main README of numpysane. This function can be used as a Python generator to produce each broadcasted slice one by one. Since Python generators are inherently 1-dimensional, this function effectively flattens the broadcasted results. If the correct output shape needs to be reconstituted, the leading shape is available by calling numpysane.broadcast_extra_dims() with the same arguments as this function. ''' if len(args) != len(prototype): raise NumpysaneError("Mismatched number of input arguments. Wanted {} but got {}". \ format(len(prototype), len(args))) # make sure all the arguments are numpy arrays args = tuple(np.asarray(arg) for arg in args) # dims_extra: extra dimensions to broadcast through # dims_named: values of the named dimensions dims_extra,dims_named = \ _eval_broadcast_dims( args, prototype ) # I checked all the dimensions and aligned everything. I have my # to-broadcast dimension counts. Iterate through all the broadcasting # output, and gather the results for x in _broadcast_iter_dim( args, prototype, dims_extra ): yield x def broadcast_extra_dims(prototype, args): r'''Report the extra leading dimensions a broadcasted call would produce SYNOPSIS >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(6). reshape( 2,3) >>> b = np.arange(15).reshape(5,1,3) >>> print(nps.broadcast_extra_dims((('n',), ('n',)), (a,b))) [5, 2] The broadcasting operation of numpysane is described in detail in the numpysane.broadcast_define() docstring and in the main README of numpysane. This function applies the broadcasting rules to report the leading dimensions of a broadcasted result if a broadcasted function was called with the given arguments.
This is most useful to reconstitute the desired shape from flattened output produced by numpysane.broadcast_generate() ''' if len(args) != len(prototype): raise NumpysaneError("Mismatched number of input arguments. Wanted {} but got {}". \ format(len(prototype), len(args))) # make sure all the arguments are numpy arrays args = tuple(np.asarray(arg) for arg in args) # dims_extra: extra dimensions to broadcast through # dims_named: values of the named dimensions dims_extra,dims_named = \ _eval_broadcast_dims( args, prototype ) return dims_extra def glue(*args, **kwargs): r'''Concatenates a given list of arrays along the given 'axis' keyword argument. SYNOPSIS >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(6).reshape(2,3) >>> b = a + 100 >>> row = a[0,:] + 1000 >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[100, 101, 102], [103, 104, 105]]) >>> row array([1000, 1001, 1002]) >>> nps.glue(a,b, axis=-1) array([[ 0, 1, 2, 100, 101, 102], [ 3, 4, 5, 103, 104, 105]]) # empty arrays ignored when glueing. Useful for initializing an accumulation >>> nps.glue(a,b, np.array(()), axis=-1) array([[ 0, 1, 2, 100, 101, 102], [ 3, 4, 5, 103, 104, 105]]) >>> nps.glue(a,b,row, axis=-2) array([[ 0, 1, 2], [ 3, 4, 5], [ 100, 101, 102], [ 103, 104, 105], [1000, 1001, 1002]]) >>> nps.glue(a,b, axis=-3) array([[[ 0, 1, 2], [ 3, 4, 5]], [[100, 101, 102], [103, 104, 105]]]) The 'axis' must be given in a keyword argument. In order to count dimensions from the inner-most outwards, this function accepts only negative axis arguments. This is because numpy broadcasts from the last dimension, and the last dimension is the inner-most in the (usual) internal storage scheme. Allowing glue() to look at dimensions at the start would allow it to unalign the broadcasting dimensions, which is never what you want. To glue along the last dimension, pass axis=-1; to glue along the second-to-last dimension, pass axis=-2, and so on. 
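The dimension-extension behavior described above can be sketched in a few lines (glue_sketch() is an illustrative simplification, not the real implementation, which additionally handles scalars, empty arrays and the legacy mode):

```python
import numpy as np

def glue_sketch(*args, axis):
    # minimal sketch of glue()'s dimension handling: prepend length-1
    # dimensions until every array (and the requested axis) fits, then
    # let np.concatenate() do the actual work
    assert axis < 0, "glue() forbids axis >= 0"
    ndim = max(max(x.ndim for x in args), -axis)
    args = [x.reshape((1,) * (ndim - x.ndim) + x.shape) for x in args]
    return np.concatenate(args, axis=axis)

# appending a single 3D vector to an array of two 3D vectors
print(glue_sketch(np.arange(6).reshape(2,3), np.arange(3), axis=-2).shape)  # (3, 3)

# glueing along a not-yet-existing dimension
print(glue_sketch(np.arange(6).reshape(2,3), np.arange(6).reshape(2,3), axis=-3).shape)  # (2, 2, 3)
```

Because the length-1 dimensions are prepended, the trailing (broadcasting-aligned) dimensions stay put, which is the whole point of counting the axis from the end.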
Unlike in PDL, this function refuses to create duplicated data to make the shapes fit. In my experience, this isn't what you want, and can create bugs. For instance, PDL does this: pdl> p sequence(3,2) [ [0 1 2] [3 4 5] ] pdl> p sequence(3) [0 1 2] pdl> p PDL::glue( 0, sequence(3,2), sequence(3) ) [ [0 1 2 0 1 2] <--- Note the duplicated "0,1,2" [3 4 5 0 1 2] ] while numpysane.glue() does this: >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(6).reshape(2,3) >>> b = a[0:1,:] >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[0, 1, 2]]) >>> nps.glue(a,b,axis=-1) [exception] Finally, this function adds as many length-1 dimensions at the front as required. Note that this does not create new data, just new degenerate dimensions. Example: >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(6).reshape(2,3) >>> b = a + 100 >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[100, 101, 102], [103, 104, 105]]) >>> res = nps.glue(a,b, axis=-5) >>> res array([[[[[ 0, 1, 2], [ 3, 4, 5]]]], [[[[100, 101, 102], [103, 104, 105]]]]]) >>> res.shape (2, 1, 1, 2, 3) In numpysane older than 0.10 the semantics were slightly different: the axis kwarg was optional, and glue(*args) would glue along a new leading dimension, and thus would be equivalent to cat(*args). This resulted in very confusing error messages if the user accidentally omitted the kwarg. 
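The padding trick described above (prepend length-1 dims until all arrays have the same ndim, then concatenate) can be sketched with stock numpy. glue_sketch is a hypothetical stand-in that skips glue()'s validation and empty-array special cases:

```python
import numpy as np

def glue_sketch(*args, axis):
    # Core of the approach: bring all arrays to a common ndim by
    # prepending length-1 dims; after that np.concatenate() knows
    # what to do. Only axis<0 keeps the trailing dims aligned.
    assert axis < 0
    max_ndim = max(max(x.ndim for x in args), -axis)
    args = [x.reshape((1,) * (max_ndim - x.ndim) + x.shape) for x in args]
    return np.concatenate(args, axis=axis)

a = np.arange(6).reshape(2, 3)
print(glue_sketch(a, a + 100, axis=-1).shape)   # (2, 6)
print(glue_sketch(a, a + 100, axis=-3).shape)   # (2, 2, 3)
```

The real glue() pads with newaxis indexing rather than reshape(), which is guaranteed never to copy data.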
To request the legacy behavior, do nps.glue.legacy_version = '0.9' ''' legacy = \ hasattr(glue, 'legacy_version') and \ StrictVersion(glue.legacy_version) <= StrictVersion('0.9') axis = kwargs.get('axis') if legacy: if axis is not None and axis >= 0: raise NumpysaneError("axis >= 0 can make broadcasting dimensions inconsistent, and is thus not allowed") else: if axis is None: raise NumpysaneError("glue() requires the axis to be given in the 'axis' kwarg") if axis >= 0: raise NumpysaneError("axis >= 0 can make broadcasting dimensions inconsistent, and is thus not allowed") # deal with scalar (non-ndarray) args args = [ np.asarray(x) for x in args ] # Special case to support this common idiom: # # accum = np.array(()) # while ...: # x = ... # accum = nps.glue(accum, x, axis = -2) # # Without special logic, this would throw an error since accum.shape starts # at (0,), which is almost certainly not compatible with x.shape. I support # both glue(empty,x) and glue(x,empty) if len(args) == 2: if args[0].shape == (0,) and args[1].size != 0: return atleast_dims(args[1], axis) if args[1].shape == (0,) and args[0].size != 0: return atleast_dims(args[0], axis) # Legacy behavior: if no axis is given, add a new axis at the front, and # glue along it max_ndim = max( x.ndim for x in args ) if axis is None: axis = -1 - max_ndim # if we're glueing along a dimension beyond what we already have, expand the # target dimension count if max_ndim < -axis: max_ndim = -axis # Now I add dummy dimensions at the front of each array, to bring the source # arrays to the same dimensionality. After this is done, ndims for all the # matrices will be the same, and np.concatenate() should know what to do. args = [ x[(np.newaxis,)*(max_ndim - x.ndim) + (Ellipsis,)] for x in args ] return atleast_dims(np.concatenate( args, axis=axis ), axis) def cat(*args): r'''Concatenates a given list of arrays along a new first (outer) dimension. 
SYNOPSIS >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(6).reshape(2,3) >>> b = a + 100 >>> c = a - 100 >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[100, 101, 102], [103, 104, 105]]) >>> c array([[-100, -99, -98], [ -97, -96, -95]]) >>> res = nps.cat(a,b,c) >>> res array([[[ 0, 1, 2], [ 3, 4, 5]], [[ 100, 101, 102], [ 103, 104, 105]], [[-100, -99, -98], [ -97, -96, -95]]]) >>> res.shape (3, 2, 3) >>> [x for x in res] [array([[0, 1, 2], [3, 4, 5]]), array([[100, 101, 102], [103, 104, 105]]), array([[-100, -99, -98], [ -97, -96, -95]])] ### Note that this is the same as [a,b,c]: cat is the reverse of ### iterating on an array This function concatenates the input arrays into an array shaped like the highest-dimensioned input, but with a new outer (at the start) dimension. The concatenation axis is this new dimension. As usual, the dimensions are aligned along the last one, so broadcasting will continue to work as expected. Note that this is the opposite operation from iterating a numpy array: iteration splits an array over its leading dimension, while cat joins a number of arrays via a new leading dimension. ''' if len(args) == 0: return np.array(()) max_ndim = max( x.ndim for x in args ) return glue(*args, axis = -1 - max_ndim) def clump(x, **kwargs): r'''Groups the given n dimensions together. SYNOPSIS >>> import numpysane as nps >>> nps.clump( arr(2,3,4), n = -2).shape (2, 12) Reshapes the array by grouping together 'n' dimensions, where 'n' is given in a kwarg. If 'n' > 0, then n leading dimensions are clumped; if 'n' < 0, then -n trailing dimensions are clumped So for instance, if x.shape is (2,3,4) then nps.clump(x, n = -2).shape is (2,12) and nps.clump(x, n = 2).shape is (6, 4) In numpysane older than 0.10 the semantics were different: n > 0 was required, and we ALWAYS clumped the trailing dimensions. Thus the new clump(-n) is equivalent to the old clump(n). 
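The shape arithmetic above fits in a tiny helper. clump_shape is illustrative only (the real clump() also performs the reshape, handles legacy semantics, and clamps out-of-range n); it assumes abs(n) <= len(shape):

```python
import math

def clump_shape(shape, n):
    # n < 0: clump the -n trailing dims; n > 0: clump the n leading dims
    if n < 0:
        return shape[:n] + (math.prod(shape[n:]),)
    return (math.prod(shape[:n]),) + shape[n:]

print(clump_shape((2, 3, 4), -2))   # (2, 12)
print(clump_shape((2, 3, 4),  2))   # (6, 4)
```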
To request the legacy behavior, do nps.clump.legacy_version = '0.9' ''' legacy = \ hasattr(clump, 'legacy_version') and \ StrictVersion(clump.legacy_version) <= StrictVersion('0.9') n = kwargs.get('n') if n is None: raise NumpysaneError("clump() requires a dimension count in the 'n' kwarg") if legacy: # old PDL-like clump(). Takes positive dimension counts, and acts from # the most-significant dimension (from the back) if n < 0: raise NumpysaneError("clump() requires n > 0") if n <= 1: return x if x.ndim < n: n = x.ndim s = list(x.shape[:-n]) + [ _product(x.shape[-n:]) ] return x.reshape(s) if -1 <= n and n <= 1: return x if x.ndim < n: n = x.ndim if -x.ndim > n: n = -x.ndim if n < 0: s = list(x.shape[:n]) + [ _product(x.shape[n:]) ] else: s = [ _product(x.shape[:n]) ] + list(x.shape[n:]) return x.reshape(s) def atleast_dims(x, *dims): r'''Returns an array with extra length-1 dimensions to contain all given axes. SYNOPSIS >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(6).reshape(2,3) >>> a array([[0, 1, 2], [3, 4, 5]]) >>> nps.atleast_dims(a, -1).shape (2, 3) >>> nps.atleast_dims(a, -2).shape (2, 3) >>> nps.atleast_dims(a, -3).shape (1, 2, 3) >>> nps.atleast_dims(a, 0).shape (2, 3) >>> nps.atleast_dims(a, 1).shape (2, 3) >>> nps.atleast_dims(a, 2).shape [exception] >>> l = [-3,-2,-1,0,1] >>> nps.atleast_dims(a, l).shape (1, 2, 3) >>> l [-3, -2, -1, 1, 2] If the given axes already exist in the given array, the given array itself is returned. Otherwise length-1 dimensions are added to the front until all the requested dimensions exist. The given axis>=0 dimensions MUST all be in-bounds from the start, otherwise the most-significant axis becomes unaligned; an exception is thrown if this is violated. The given axis<0 dimensions that are out-of-bounds result in new dimensions added at the front. If new dimensions need to be added at the front, then any axis>=0 indices become offset. 
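The padding rule is compact enough to state directly. pad_count is a hypothetical helper that mirrors the need_ndim computation in the function body above:

```python
def pad_count(ndim, dims):
    # How many length-1 dims atleast_dims() must prepend so every
    # requested axis exists. Only axis<0 can trigger padding; axis>=0
    # must already be in-bounds.
    need = -min(d if d < 0 else -1 for d in dims)
    return max(0, need - ndim)

print(pad_count(2, [-1, -2]))     # 0  (both axes already exist)
print(pad_count(2, [-3]))         # 1  (need ndim 3, have 2)
print(pad_count(3, [0, -1, -5]))  # 2  (need ndim 5, have 3)
```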
For instance: >>> x.shape (2, 3, 4) >>> [x.shape[i] for i in (0,-1)] [2, 4] >>> x = nps.atleast_dims(x, 0, -1, -5) >>> x.shape (1, 1, 2, 3, 4) >>> [x.shape[i] for i in (0,-1)] [1, 4] Before the call, axis=0 refers to the length-2 dimension and axis=-1 refers to the length=4 dimension. After the call, axis=-1 refers to the same dimension as before, but axis=0 now refers to a new length=1 dimension. If it is desired to compensate for this offset, then instead of passing the axes as separate arguments, pass in a single list of the axes indices. This list will be modified to offset the axis>=0 appropriately. Ideally, you only pass in axes<0, and this does not apply. Doing this in the above example: >>> l [0, -1, -5] >>> x.shape (2, 3, 4) >>> [x.shape[i] for i in (l[0],l[1])] [2, 4] >>> x=nps.atleast_dims(x, l) >>> x.shape (1, 1, 2, 3, 4) >>> l [2, -1, -5] >>> [x.shape[i] for i in (l[0],l[1])] [2, 4] We passed the axis indices in a list, and this list was modified to reflect the new indices: The original axis=0 becomes known as axis=2. Again, if you pass in only axis<0, then you don't need to care about this. ''' if any( not isinstance(d, int) for d in dims ): if len(dims) == 1 and isinstance(dims[0], list): dims = dims[0] else: raise NumpysaneError("atleast_dims() takes in axes as integers in separate arguments or\n" "as a single modifiable list") if max(dims) >= x.ndim: raise NumpysaneError("Axis {} out of bounds because x.ndim={}.\n" "To keep the last dimension anchored, " "only <0 out-of-bounds axes are allowed".format(max(dims), x.ndim)) need_ndim = -min(d if d<0 else -1 for d in dims) if x.ndim >= need_ndim: return x num_new_axes = need_ndim-x.ndim # apply an offset to any axes that need it if isinstance(dims, list): dims[:] = [ d+num_new_axes if d >= 0 else d for d in dims ] return x[ (np.newaxis,)*(num_new_axes) ] def mv(x, axis_from, axis_to): r'''Moves a given axis to a new position. Similar to numpy.moveaxis(). 
SYNOPSIS >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(24).reshape(2,3,4) >>> a.shape (2, 3, 4) >>> nps.mv( a, -1, 0).shape (4, 2, 3) >>> nps.mv( a, -1, -5).shape (4, 1, 1, 2, 3) >>> nps.mv( a, 0, -5).shape (2, 1, 1, 3, 4) New length-1 dimensions are added at the front, as required, and any axis>=0 that are passed in refer to the array BEFORE these new dimensions are added. ''' axes = [axis_from, axis_to] x = atleast_dims(x, axes) # The below is equivalent to # return np.moveaxis( x, *axes ) # but some older installs have numpy 1.8, where this isn't available axis_from = axes[0] if axes[0] >= 0 else x.ndim + axes[0] axis_to = axes[1] if axes[1] >= 0 else x.ndim + axes[1] # python3 needs the list() cast order = list(range(0, axis_from)) + list(range((axis_from+1), x.ndim)) order.insert(axis_to, axis_from) return np.transpose(x, order) def xchg(x, axis_a, axis_b): r'''Exchanges the positions of the two given axes. Similar to numpy.swapaxes() SYNOPSIS >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(24).reshape(2,3,4) >>> a.shape (2, 3, 4) >>> nps.xchg( a, -1, 0).shape (4, 3, 2) >>> nps.xchg( a, -1, -5).shape (4, 1, 2, 3, 1) >>> nps.xchg( a, 0, -5).shape (2, 1, 1, 3, 4) New length-1 dimensions are added at the front, as required, and any axis>=0 that are passed in refer to the array BEFORE these new dimensions are added. ''' axes = [axis_a, axis_b] x = atleast_dims(x, axes) return np.swapaxes( x, *axes ) def transpose(x): r'''Reverses the order of the last two dimensions. SYNOPSIS >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(24).reshape(2,3,4) >>> a.shape (2, 3, 4) >>> nps.transpose(a).shape (2, 4, 3) >>> nps.transpose( np.arange(3) ).shape (3, 1) A "matrix" is generally seen as a 2D array that we can transpose by looking at the 2 dimensions in the opposite order. Here we treat an n-dimensional array as an n-2 dimensional object containing 2D matrices. 
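In stock-numpy terms this is just a swap of the last two axes, with 1D input promoted to 2D first. A sketch of the equivalent logic (transpose_sketch is illustrative, not the library code):

```python
import numpy as np

def transpose_sketch(x):
    # Promote to at least 2D by prepending length-1 dims, then swap
    # the last two axes.
    x = np.asarray(x)
    if x.ndim < 2:
        x = x.reshape((1,) * (2 - x.ndim) + x.shape)
    return x.swapaxes(-1, -2)

print(transpose_sketch(np.arange(24).reshape(2, 3, 4)).shape)  # (2, 4, 3)
print(transpose_sketch(np.arange(3)).shape)                    # (3, 1)
```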
As usual, the last two dimensions contain the matrix.

New length-1 dimensions are added at the front, as required, meaning that a 1D
input of shape (n,) is interpreted as a 2D input of shape (1,n), and the
transpose is a 2D array of shape (n,1).

    '''
    return xchg( atleast_dims(x, -2), -1, -2)


def dummy(x, axis, *axes_rest):
    r'''Adds length-1 dimensions at the given positions.

SYNOPSIS

    >>> import numpy as np
    >>> import numpysane as nps

    >>> a = np.arange(24).reshape(2,3,4)
    >>> a.shape
    (2, 3, 4)

    >>> nps.dummy(a, 0).shape
    (1, 2, 3, 4)

    >>> nps.dummy(a, 1).shape
    (2, 1, 3, 4)

    >>> nps.dummy(a, -1).shape
    (2, 3, 4, 1)

    >>> nps.dummy(a, -2).shape
    (2, 3, 1, 4)

    >>> nps.dummy(a, -2, -2).shape
    (2, 3, 1, 1, 4)

    >>> nps.dummy(a, -5).shape
    (1, 1, 2, 3, 4)

This is similar to numpy.expand_dims(), but handles out-of-bounds dimensions
better. New length-1 dimensions are added at the front, as required, and any
axis>=0 that are passed in refer to the array BEFORE these new dimensions are
added.

    '''
    def dummy_inner(x, axis):
        need_ndim = axis+1 if axis >= 0 else -axis
        if x.ndim >= need_ndim:
            # referring to an axis that already exists. expand_dims() thus works
            return np.expand_dims(x, axis)

        # referring to a non-existing axis. I simply add sufficient new axes, and
        # I'm done
        return atleast_dims(x, axis)

    axes = (axis,) + axes_rest
    for axis in axes:
        x = dummy_inner(x, axis)
    return x


def reorder(x, *dims):
    r'''Reorders the dimensions of an array.

SYNOPSIS

    >>> import numpy as np
    >>> import numpysane as nps

    >>> a = np.arange(24).reshape(2,3,4)
    >>> a.shape
    (2, 3, 4)

    >>> nps.reorder( a, 0, -1, 1 ).shape
    (2, 4, 3)

    >>> nps.reorder( a, -2 , -1, 0 ).shape
    (3, 4, 2)

    >>> nps.reorder( a, -4 , -2, -5, -1, 0 ).shape
    (1, 3, 1, 4, 2)

This is very similar to numpy.transpose(), but handles out-of-bounds dimensions
much better. New length-1 dimensions are added at the front, as required, and
any axis>=0 that are passed in refer to the array BEFORE these new dimensions
are added.
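In stock-numpy terms, reorder() is np.transpose() plus automatic padding: out-of-bounds axes first force new length-1 dims at the front. A sketch of the equivalent raw calls for the last example above (the explicit permutation tuples are worked out by hand here; the library computes them via atleast_dims()):

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

# reorder(a, 0, -1, 1) is a plain transpose:
print(np.transpose(a, (0, 2, 1)).shape)          # (2, 4, 3)

# reorder(a, -4, -2, -5, -1, 0) first needs ndim=5, i.e. two new
# leading length-1 dims; original axis 0 then lives at index 2:
a5 = a.reshape((1, 1) + a.shape)                 # shape (1, 1, 2, 3, 4)
print(np.transpose(a5, (1, 3, 0, 4, 2)).shape)   # (1, 3, 1, 4, 2)
```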
''' dims = list(dims) x = atleast_dims(x, dims) return np.transpose(x, dims) # Note that this explicitly isn't done with @broadcast_define. Instead I # implement the internals with core numpy routines. The advantage is that these # are some of very few numpy functions that support broadcasting, and they do so # in C, so their broadcasting loop is FAST. Much more so than my current # @broadcast_define loop def dot(a, b, out=None, dtype=None): r'''Non-conjugating dot product of two 1-dimensional n-long vectors. SYNOPSIS >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(3) >>> b = a+5 >>> a array([0, 1, 2]) >>> b array([5, 6, 7]) >>> nps.dot(a,b) 20 This is identical to numpysane.inner(). for a conjugating version of this function, use nps.vdot(). Note that the stock numpy dot() has some special handling when its dot() is given more than 1-dimensional input. THIS function has no special handling: normal broadcasting rules are applied, as expected. In-place operation is available with the "out" kwarg. The output dtype can be selected with the "dtype" kwarg. If omitted, the dtype of the input is used ''' if out is not None and dtype is not None and out.dtype != dtype: raise NumpysaneError("'out' and 'dtype' given explicitly, but the dtypes are mismatched!") if dtype is not None: # Handle overflows. Cases that require this are checked in the tests v = np.sum(a.astype(dtype)*b.astype(dtype), axis=-1, out=out, dtype=dtype ) else: v = np.sum(a*b, axis=-1, out=out, dtype=dtype ) if out is None: return v return out # nps.inner and nps.dot are equivalent. Set the functionality and update the # docstring inner = _clone_function( dot, "inner" ) doc = dot.__doc__ doc = doc.replace("vdot", "aaa") doc = doc.replace("dot", "bbb") doc = doc.replace("inner", "ccc") doc = doc.replace("ccc", "dot") doc = doc.replace("bbb", "inner") doc = doc.replace("aaa", "vdot") inner.__doc__ = doc # Note that this explicitly isn't done with @broadcast_define. 
Instead I # implement the internals with core numpy routines. The advantage is that these # are some of very few numpy functions that support broadcasting, and they do so # on the C level, so their broadcasting loop is FAST. Much more so than my # current @broadcast_define loop def vdot(a, b, out=None, dtype=None): r'''Conjugating dot product of two 1-dimensional n-long vectors. vdot(a,b) is equivalent to dot(np.conj(a), b) SYNOPSIS >>> import numpy as np >>> import numpysane as nps >>> a = np.array(( 1 + 2j, 3 + 4j, 5 + 6j)) >>> b = a+5 >>> a array([ 1.+2.j, 3.+4.j, 5.+6.j]) >>> b array([ 6.+2.j, 8.+4.j, 10.+6.j]) >>> nps.vdot(a,b) array((136-60j)) >>> nps.dot(a,b) array((24+148j)) For a non-conjugating version of this function, use nps.dot(). Note that the numpy vdot() has some special handling when its vdot() is given more than 1-dimensional input. THIS function has no special handling: normal broadcasting rules are applied. ''' return dot(np.conj(a), b, out=out, dtype=dtype) @broadcast_define( (('n',), ('m',)), prototype_output=('n','m'), out_kwarg='out' ) def outer(a, b, out=None): r'''Outer product of two 1-dimensional n-long vectors. SYNOPSIS >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(3) >>> b = a+5 >>> a array([0, 1, 2]) >>> b array([5, 6, 7]) >>> nps.outer(a,b) array([[ 0, 0, 0], [ 5, 6, 7], [10, 12, 14]]) ''' if out is None: return np.outer(a,b) out.setfield(np.outer(a,b), out.dtype) return out # Note that this explicitly isn't done with @broadcast_define. Instead I # implement the internals with core numpy routines. The advantage is that these # are some of very few numpy functions that support broadcasting, and they do so # in C, so their broadcasting loop is FAST. Much more so than my current # @broadcast_define loop def norm2(a, **kwargs): r'''Broadcast-aware 2-norm. 
norm2(x) is identical to inner(x,x) SYNOPSIS >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(3) >>> a array([0, 1, 2]) >>> nps.norm2(a) 5 This is a convenience function to compute a 2-norm ''' return inner(a,a, **kwargs) def mag(a, out=None, dtype=None): r'''Magnitude of a vector. mag(x) is functionally identical to sqrt(inner(x,x)) SYNOPSIS >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(3) >>> a array([0, 1, 2]) >>> nps.mag(a) 2.23606797749979 This is a convenience function to compute a magnitude of a vector, with full broadcasting support. In-place operation is available with the "out" kwarg. The output dtype can be selected with the "dtype" kwarg. If omitted, dtype=float is selected. ''' if out is None: if dtype is None: dtype = float out = inner(a,a, dtype=dtype) if not isinstance(out, np.ndarray): # given two vectors, and without and 'out' array, inner() produces a # scalar, not an array. So I can't updated it inplace, and just # return a copy return np.sqrt(out) else: if dtype is None: dtype = out.dtype inner(a,a, out=out, dtype=dtype) # in-place sqrt np.sqrt.at(out,()) return out @broadcast_define( (('n','n',),), prototype_output=() ) def trace(a): r'''Broadcast-aware trace SYNOPSIS >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(3*4*4).reshape(3,4,4) >>> a array([[[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15]], [[16, 17, 18, 19], [20, 21, 22, 23], [24, 25, 26, 27], [28, 29, 30, 31]], [[32, 33, 34, 35], [36, 37, 38, 39], [40, 41, 42, 43], [44, 45, 46, 47]]]) >>> nps.trace(a) array([ 30, 94, 158]) ''' return np.trace(a) # Could be implemented with a simple loop around np.dot(): # # @broadcast_define( (('n', 'm'), ('m', 'l')), prototype_output=('n','l'), out_kwarg='out' ) # def matmult2(a, b, out=None): # return np.dot(a,b) # # but this would produce a python broadcasting loop, which is potentially slow. # Instead I'm using the np.matmul() primitive to get C broadcasting loops. 
This # function has stupid special-case rules for low-dimensional arrays, so I make # sure to do the normal broadcasting thing in those cases def matmult2(a, b, out=None): r'''Multiplication of two matrices SYNOPSIS >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(6) .reshape(2,3) >>> b = np.arange(12).reshape(3,4) >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]]) >>> nps.matmult2(a,b) array([[20, 23, 26, 29], [56, 68, 80, 92]]) This function is exposed publically mostly for legacy compatibility. Use numpysane.matmult() instead ''' if not isinstance(a, np.ndarray) and not isinstance(b, np.ndarray): # two non-arrays (assuming two scalars) if out is not None: o = a*b out.setfield(o, out.dtype) out.resize([]) return out return a*b if not isinstance(a, np.ndarray) or len(a.shape) == 0: # one non-array (assuming one scalar) if out is not None: out.setfield(a*b, out.dtype) out.resize(b.shape) return out return a*b if not isinstance(b, np.ndarray) or len(b.shape) == 0: # one non-array (assuming one scalar) if out is not None: out.setfield(a*b, out.dtype) out.resize(a.shape) return out return a*b if len(b.shape) == 1: b = b[np.newaxis, :] o = np.matmul(a,b, out) return o def matmult( a, *rest, **kwargs ): r'''Multiplication of N matrices SYNOPSIS >>> import numpy as np >>> import numpysane as nps >>> a = np.arange(6) .reshape(2,3) >>> b = np.arange(12).reshape(3,4) >>> c = np.arange(4) .reshape(4,1) >>> a array([[0, 1, 2], [3, 4, 5]]) >>> b array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]]) >>> c array([[0], [1], [2], [3]]) >>> nps.matmult(a,b,c) array([[162], [504]]) >>> abc = np.zeros((2,1), dtype=float) >>> nps.matmult(a,b,c, out=abc) >>> abc array([[162], [504]]) This multiplies N matrices together by repeatedly calling matmult2() for each adjacent pair. 
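The fold over adjacent pairs amounts to a reduce(). A stock-numpy sketch (matmult_sketch is illustrative, without matmult()'s scalar special cases or 'out' handling):

```python
import numpy as np
from functools import reduce

def matmult_sketch(*m):
    # Chain N matrix products by folding matmul over adjacent pairs:
    # matmult_sketch(a,b,c) == (a @ b) @ c
    return reduce(np.matmul, m)

a = np.arange(6) .reshape(2, 3)
b = np.arange(12).reshape(3, 4)
c = np.arange(4) .reshape(4, 1)
print(matmult_sketch(a, b, c))   # [[162], [504]]
```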
In-place output is supported with the 'out' keyword argument This function supports broadcasting fully, in C internally ''' if len(rest) == 0: raise Exception("Need at least two terms to multiply") out = None if len(kwargs.keys()) > 1: raise Exception("Only ONE kwarg is supported: 'out'") if kwargs: # have exactly one kwarg if 'out' not in kwargs: raise Exception("The only supported kwarg is 'out'") out = kwargs['out'] return matmult2(a,reduce(matmult2, rest), out=out) # I use np.matmul at one point. This was added in numpy 1.10.0, but # apparently I want to support even older releases. I thus provide a # compatibility function in that case. This is slower (python loop instead of C # loop), but at least it works if not hasattr(np, 'matmul'): @broadcast_define( (('n','m'),('m','o')), ('n','o')) def matmul(a,b, out=None): return np.dot(a,b,out) np.matmul = matmul numpysane-0.35/numpysane_pywrap.py000066400000000000000000002014071407353053200174610ustar00rootroot00000000000000r'''Python-wrap C code with broadcasting awareness * SYNOPSIS Let's implement a broadcastable and type-checked inner product that is - Written in C (i.e. it is fast) - Callable from python using numpy arrays (i.e. it is convenient) We write a bit of python to generate the wrapping code. 
"genpywrap.py": import numpy as np import numpysane as nps import numpysane_pywrap as npsp m = npsp.module( name = "innerlib", docstring = "An inner product module in C") m.function( "inner", "Inner product pywrapped with npsp", args_input = ('a', 'b'), prototype_input = (('n',), ('n',)), prototype_output = (), Ccode_slice_eval = \ {np.float64: r""" double* out = (double*)data_slice__output; const int N = dims_slice__a[0]; *out = 0.0; for(int i=0; i inner_pywrap.c We build this into a python module: COMPILE=(`python3 -c " import sysconfig conf = sysconfig.get_config_vars() print('{} {} {} -I{}'.format(*[conf[x] for x in ('CC', 'CFLAGS', 'CCSHARED', 'INCLUDEPY')]))"`) LINK=(`python3 -c " import sysconfig conf = sysconfig.get_config_vars() print('{} {} {}'.format(*[conf[x] for x in ('BLDSHARED', 'BLDLIBRARY', 'LDFLAGS')]))"`) EXT_SUFFIX=`python3 -c " import sysconfig print(sysconfig.get_config_vars('EXT_SUFFIX')[0])"` ${COMPILE[@]} -c -o inner_pywrap.o inner_pywrap.c ${LINK[@]} -o innerlib$EXT_SUFFIX inner_pywrap.o Here we used the build commands directly. This could be done with setuptools/distutils instead; it's a normal extension module. And now we can compute broadcasted inner products from a python script "tst.py": import numpy as np import innerlib print(innerlib.inner( np.arange(4, dtype=float), np.arange(8, dtype=float).reshape( 2,4))) Running it to compute inner([0,1,2,3],[0,1,2,3]) and inner([0,1,2,3],[4,5,6,7]): $ python3 tst.py [14. 38.] * DESCRIPTION This module provides routines to python-wrap existing C code by generating C sources that define the wrapper python extension module. To create the wrappers we 1. Instantiate a new numpysane_pywrap.module class 2. Call module.function() for each wrapper function we want to add to this module 3. Call module.write() to write the C sources defining this module to standard output The sources can then be built and executed normally, as any other python extension module. 
The resulting functions are called as one would expect: output = f_one_output (input0, input1, ...) (output0, output1, ...) = f_multiple_outputs(input0, input1, ...) depending on whether we declared a single output, or multiple outputs (see below). It is also possible to pre-allocate the output array(s), and call the functions like this (see below): output = np.zeros(...) f_one_output (input0, input1, ..., out = output) output0 = np.zeros(...) output1 = np.zeros(...) f_multiple_outputs(input0, input1, ..., out = (output0, output1)) Each wrapped function is broadcasting-aware. The normal numpy broadcasting rules (as described in 'broadcast_define' and on the numpy website: http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html) apply. In summary: - Dimensions are aligned at the end of the shape list, and must match the prototype - Extra dimensions left over at the front must be consistent for all the input arguments, meaning: - All dimensions of length != 1 must match - Dimensions of length 1 match corresponding dimensions of any length in other arrays - Missing leading dimensions are implicitly set to length 1 - The output(s) have a shape where - The trailing dimensions match the prototype - The leading dimensions come from the extra dimensions in the inputs When we create a wrapper function, we only define how to compute a single broadcasted slice. If the generated function is called with higher-dimensional inputs, this slice code will be called multiple times. This broadcast loop is produced by the numpysane_pywrap generator automatically. The generated code also - parses the python arguments - generates python return values - validates the inputs (and any pre-allocated outputs) to make sure the given shapes and types all match the declared shapes and types. 
For instance, computing an inner product of a 5-vector and a 3-vector is illegal - creates the output arrays as necessary This code-generator module does NOT produce any code to implicitly make copies of the input. If the inputs fail validation (unknown types given, contiguity checks failed, etc) then an exception is raised. Copying the input is potentially slow, so we require the user to do that, if necessary. ** Explicated example In the synopsis we declared the wrapper module like this: m = npsp.module( name = "innerlib", docstring = "An inner product module in C") This produces a module named "innerlib". Note that the python importer will look for this module in a file called "innerlib$EXT_SUFFIX" where EXT_SUFFIX comes from the python configuration. This is normal behavior for python extension modules. A module can contain many wrapper functions. Each one is added by calling 'm.function()'. We did this: m.function( "inner", "Inner product pywrapped with numpysane_pywrap", args_input = ('a', 'b'), prototype_input = (('n',), ('n',)), prototype_output = (), Ccode_slice_eval = \ {np.float64: r""" double* out = (double*)data_slice__output; const int N = dims_slice__a[0]; *out = 0.0; for(int i=0; i>> print(innerlib.inner( np.arange(4, dtype=float), np.arange(8, dtype=float).reshape( 2,4)), scale_string = "1.0") [14. 38.] >>> print(innerlib.inner( np.arange(4, dtype=float), np.arange(8, dtype=float).reshape( 2,4), scale = 2.0, scale_string = "10.0")) [280. 760.] ** Precomputing a cookie outside the slice computation Sometimes it is useful to generate some resource once, before any of the broadcasted slices were evaluated. The slice evaluation code could then make use of this resource. Example: allocating memory, opening files. This is supported using a 'cookie'. We define a structure that contains data that will be available to all the generated functions. 
This structure is initialized at the beginning, used by the slice computation functions, and then cleaned up at the end. This is most easily described with an example. The scaled inner product demonstrated immediately above has an inefficiency: we compute 'atof(scale_string)' once for every slice, even though the string does not change. We should compute the atof() ONCE, and use the resulting value each time. And we can: m.function( "inner", "Inner product pywrapped with numpysane_pywrap", args_input = ('a', 'b'), prototype_input = (('n',), ('n',)), prototype_output = (), extra_args = (("double", "scale", "1", "d"), ("const char*", "scale_string", "NULL", "s")), Ccode_cookie_struct = r""" double scale; /* from BOTH scale arguments: "scale", "scale_string" */ """, Ccode_validate = r""" if(scale_string == NULL) { PyErr_Format(PyExc_RuntimeError, "The 'scale_string' argument is required" ); return false; } cookie->scale = *scale * (scale_string ? atof(scale_string) : 1.0); return true; """, Ccode_slice_eval = \ {np.float64: r""" double* out = (double*)data_slice__output; const int N = dims_slice__a[0]; *out = 0.0; for(int i=0; iscale; return true;""" }, // Cleanup, such as free() or close() goes here Ccode_cookie_cleanup = '' ) We defined a cookie structure that contains one element: 'double scale'. We compute the scale factor (from BOTH of the extra arguments) before any of the slices are evaluated: in the validation function. Then we apply the already-computed scale with each slice. Both the validation and slice computation functions have the whole cookie structure available in '*cookie'. It is expected that the validation function will write something to the cookie, and the slice functions will read it, but this is not enforced: this structure is not const, and both functions can do whatever they like. 
If the cookie initialization did something that must be cleaned up (like a malloc() for instance), the cleanup code can be specified in the 'Ccode_cookie_cleanup' argument to function(). Note: this cleanup code is ALWAYS executed, even if there were errors that raise an exception, EVEN if we haven't initialized the cookie yet. When the cookie object is first initialized, it is filled with 0, so the cleanup code can detect whether the cookie has been initialized or not: m.function( ... Ccode_cookie_struct = r""" ... bool initialized; """, Ccode_validate = r""" ... cookie->initialized = true; return true; """, Ccode_cookie_cleanup = r""" if(cookie->initialized) cleanup(); """ ) ** Examples For some sample usage, see the wrapper-generator used in the test suite: https://github.com/dkogan/numpysane/blob/master/test/genpywrap.py ** Planned functionality Currently, each broadcasted slice is computed sequentially. But since the slices are inherently independent, this is a natural place to add parallelism. And implemention this with something like OpenMP should be straightforward. I'll get around to doing this eventually, but in the meantime, patches are welcome. ''' import sys import time import numpy as np from numpysane import NumpysaneError import os import re # Technically I'm supposed to use some "resource extractor" something to unbreak # setuptools. But I'm instead assuming that this was installed via Debian or by # using the eager_resources tag in setup(). This allows files to remain files, # and to appear in a "normal" directory, where this script can grab them and use # them # # And I try two different directories, in case I'm running in-place # # And pip does something yet different, which I support in a hacky way. This is # a mess _pywrap_path = [ # in-place: running from the source tree os.path.dirname( __file__ ) + '/pywrap-templates', # distro: /usr/share/... sys.prefix + '/share/python-numpysane/pywrap-templates' ] # pip: /home/whoever/.local/share/... 
_m = re.match("(/home/[^/]+/\.local)/lib/", __file__) if _m is not None: _local_prefix = _m.group(1) _pywrap_path.append( _local_prefix + '/share/python-numpysane/pywrap-templates') for p in _pywrap_path: _module_header_filename = p + '/pywrap_module_header.c' _module_footer_filename = p + '/pywrap_module_footer_generic.c' _function_filename = p + '/pywrap_function_generic.c' if os.path.exists(_module_header_filename): break else: raise NumpysaneError("Couldn't find pywrap templates! Looked in {}".format(_pywrap_path)) def _quote(s, convert_newlines=False): r'''Quote string for inclusion in C code There should be a library for this. Hopefuly this is correct. ''' s = s.replace('\\', '\\\\') # Pass all \ through verbatim if convert_newlines: s = s.replace('\n', '\\n') # All newlines -> \n s = s.replace('"', '\\"') # Quote all " return s def _substitute(s, **kwargs): r'''format() with specific semantics - {xxx} substitutions found in kwargs are made - {xxx} expressions not found in kwargs are left as is - {{ }} escaping is not respected: any '{xxx}' is replaced Otherwise they're left alone (useful for C code) ''' for k in kwargs.keys(): s = s.replace('{' + k + '}', kwargs[k]) return s class module: def __init__(self, name, docstring, header=''): r'''Initialize the python-wrapper-generator SYNOPSIS import numpysane_pywrap as npsp m = npsp.module( name = "wrapped_library", docstring = r"""wrapped by numpysane_pywrap Does this thing and does that thing""", header = '#include "library.h"') ARGUMENTS - name The name of the python module we're creating - docstring The docstring for this module - header Optional, defaults to ''. C code to include verbatim. 
Any #includes or utility functions can go here ''' with open( _module_header_filename, 'r') as f: self.module_header = f.read() + "\n" + header + "\n" with open( _module_footer_filename, 'r') as f: self.module_footer = _substitute(f.read(), MODULE_NAME = name, MODULE_DOCSTRING = _quote(docstring, convert_newlines=True)) self.functions = [] def function(self, name, docstring, args_input, prototype_input, prototype_output, Ccode_slice_eval, Ccode_validate = None, Ccode_cookie_struct = '', Ccode_cookie_cleanup = '', extra_args = ()): r'''Add a wrapper function to the module we're creating SYNOPSIS We can wrap a C function inner() like this: m.function( "inner", "Inner product pywrapped with npsp", args_input = ('a', 'b'), prototype_input = (('n',), ('n',)), prototype_output = (), Ccode_slice_eval = \ {np.float64: r""" double* out = (double*)data_slice__output; const int N = dims_slice__a[0]; *out = 0.0; for(int i=0; i=0. Got '{}'". \ format(i_dim, args_input[i_arg], dim)) elif isinstance(dim, str): if dim not in named_dims: named_dims[dim] = -1-len(named_dims) else: raise NumpysaneError("Dimension {} in argument '{}' must be a string (named dimension) or an integer>=0. Got '{}' (type '{}')". \ format(i_dim, args_input[i_arg], dim, type(dim))) # The output is allowed to have named dimensions, but ONLY those that # appear in the input. The output may be a single tuple (describing the # one output) or it can be a tuple of tuples (describing multiple # outputs) if not isinstance(prototype_output, tuple): raise NumpysaneError("Output prototype dims must be given as a tuple") # If a single prototype_output is given, wrap it in a tuple to indicate # that we only have one output # If None, the single output is returned. If an integer, then a tuple is # returned. 
If Noutputs==1 then we return a TUPLE of length 1 Noutputs = None if all( type(o) is int or type(o) is str for o in prototype_output ): prototype_outputs = (prototype_output, ) else: prototype_outputs = prototype_output if not all( isinstance(p,tuple) for p in prototype_outputs ): raise NumpysaneError("Output dimensions must be integers >= 0 or strings. Each output must be a tuple. Some given outputs aren't tuples: {}". \ format(prototype_outputs)) Noutputs = len(prototype_outputs) for i_output in range(len(prototype_outputs)): dims_output = prototype_outputs[i_output] for i_dim in range(len(dims_output)): dim = dims_output[i_dim] if isinstance(dim,int): if dim < 0: raise NumpysaneError("Output {} dimension {} must be a string (named dimension) or an integer>=0. Got '{}'". \ format(i_output, i_dim, dim)) elif isinstance(dim, str): if dim not in named_dims: # This output is a new named dimension. Output matrices must be passed in to define it named_dims[dim] = -1-len(named_dims) else: raise NumpysaneError("Dimension {} in output {} must be a string (named dimension) or an integer>=0. Got '{}' (type '{}')". \ format(i_dim, i_output, dim, type(dim))) def expand_prototype(shape): r'''Produces a shape string for each argument These are the dimensions passed into this function, except named dimensions are consolidated and set to -1,-2,..., and the whole thing is stringified and joined ''' shape = [ dim if isinstance(dim,int) else named_dims[dim] for dim in shape ] return ','.join(str(dim) for dim in shape) PROTOTYPE_DIM_DEFS = '' for i_arg_input in range(Ninputs): PROTOTYPE_DIM_DEFS += " const npy_intp PROTOTYPE_{}[{}] = {{{}}};\n". \ format(args_input[i_arg_input], len(prototype_input[i_arg_input]), expand_prototype(prototype_input[i_arg_input])); if Noutputs is None: PROTOTYPE_DIM_DEFS += " const npy_intp PROTOTYPE_{}[{}] = {{{}}};\n". 
\ format("output", len(prototype_output), expand_prototype(prototype_output)); else: for i_output in range(Noutputs): PROTOTYPE_DIM_DEFS += " const npy_intp PROTOTYPE_{}{}[{}] = {{{}}};\n". \ format("output", i_output, len(prototype_outputs[i_output]), expand_prototype(prototype_outputs[i_output])); PROTOTYPE_DIM_DEFS += " int Ndims_named = {};\n". \ format(len(named_dims)) # Output handling. We unpack each output array into a separate variable. # And if we have multiple outputs, we make sure that each one is # passed as a pre-allocated array # # At the start __py__output__arg has no reference if Noutputs is None: # Just one output. The argument IS the output array UNPACK_OUTPUTS = r''' int populate_output_tuple__i = -1; if(__py__output__arg == Py_None) __py__output__arg = NULL; if(__py__output__arg == NULL) { // One output, not given. Leave everything at NULL (it already is). // Will be allocated later } else { // Argument given. Treat it as an array Py_INCREF(__py__output__arg); if(!PyArray_Check(__py__output__arg)) { PyErr_SetString(PyExc_RuntimeError, "Could not interpret given argument as a numpy array"); goto done; } __py__output = (PyArrayObject*)__py__output__arg; Py_INCREF(__py__output); } ''' else: # Multiple outputs. Unpack, make sure we have pre-made arrays UNPACK_OUTPUTS = r''' int populate_output_tuple__i = -1; if(__py__output__arg == Py_None) __py__output__arg = NULL; if(__py__output__arg == NULL) { __py__output__arg = PyTuple_New({Noutputs}); if(__py__output__arg == NULL) { PyErr_Format(PyExc_RuntimeError, "Could not allocate output tuple of length %d", {Noutputs}); goto done; } // I made a tuple, but I don't yet have arrays to populate it with. I'll make // those later, and I'll fill the tuple later populate_output_tuple__i = 0; } else { Py_INCREF(__py__output__arg); if( !PySequence_Check(__py__output__arg) ) { PyErr_Format(PyExc_RuntimeError, "Have multiple outputs. 
The given 'out' argument is expected to be a sequence of length %d, but a non-sequence was given", {Noutputs}); goto done; } if( PySequence_Size(__py__output__arg) != {Noutputs} ) { PyErr_Format(PyExc_RuntimeError, "Have multiple outputs. The given 'out' argument is expected to be a sequence of length %d, but a sequence of length %d was given", {Noutputs}, PySequence_Size(__py__output__arg)); goto done; } #define PULL_OUT_OUTPUT_ARRAYS(name) \ __py__ ## name = (PyArrayObject*)PySequence_GetItem(__py__output__arg, i++); \ if(__py__ ## name == NULL || !PyArray_Check(__py__ ## name)) \ { \ PyErr_SetString(PyExc_RuntimeError, \ "Have multiple outputs. The given 'out' array MUST contain pre-allocated arrays, but " #name " is not an array"); \ goto done; \ } int i=0; OUTPUTS(PULL_OUT_OUTPUT_ARRAYS) #undef PULL_OUT_OUTPUT_ARRAYS } '''.replace('{Noutputs}', str(Noutputs)) # The keys of Ccode_slice_eval are either: # - a type: all inputs, outputs MUST have this type # - a list of types: the types in this list correspond to the inputs and # outputs of the call, in that order. # # I convert the known-type list to the one-type-per-element form for # consistent processing known_types = list(Ccode_slice_eval.keys()) Ninputs_and_outputs = Ninputs + (1 if Noutputs is None else Noutputs) for i in range(len(known_types)): if isinstance(known_types[i], type): known_types[i] = (known_types[i],) * Ninputs_and_outputs elif hasattr(known_types[i], '__iter__') and \ len(known_types[i]) == Ninputs_and_outputs and \ all(isinstance(t,type) for t in known_types[i]): # already a list of types. 
we're good pass else: raise NumpysaneError("Each of Ccode_slice_eval.keys() MUST be either a type, or a list of types (one for each input followed by one for each output in order; {} + {} = {} total)".format(Ninputs, Ninputs_and_outputs-Ninputs, Ninputs_and_outputs)) # {TYPESETS} is _(11, 15, 17, 0) _(13, 15, 17, 1) # {TYPESET_MATCHES_ARGLIST} is t0,t1,t2 # {TYPESETS_NAMES} is # " (float32,int32)\n" # " (float64,int32)\n" TYPESETS = ' '.join( ("_(" + ','.join(tuple(str(np.dtype(t).num) for t in known_types[i]) + (str(i),)) + ')') \ for i in range(len(known_types))) TYPESET_MATCHES_ARGLIST = ','.join(('t' + str(i)) for i in range(Ninputs_and_outputs)) def parened_type_list(l, Ninputs): r'''Converts list of types to string Like "(inputs: float32,int32 outputs: float32, float32)" ''' si = 'inputs: ' + ','.join( np.dtype(t).name for t in l[:Ninputs]) so = 'outputs: ' + ','.join( np.dtype(t).name for t in l[Ninputs:]) return '(' + si + ' ' + so + ')' TYPESETS_NAMES = ' '.join(('" ' + parened_type_list(s,Ninputs) +'\\n"') \ for s in known_types) ARGUMENTS_LIST = ['#define ARGUMENTS(_)'] for i_arg_input in range(Ninputs): ARGUMENTS_LIST.append( '_({})'.format(args_input[i_arg_input]) ) OUTPUTS_LIST = ['#define OUTPUTS(_)'] if Noutputs is None: OUTPUTS_LIST.append( '_({})'.format("output") ) else: for i_output in range(Noutputs): OUTPUTS_LIST.append( '_({}{})'.format("output", i_output) ) if not hasattr(self, 'function_body'): with open(_function_filename, 'r') as f: self.function_body = f.read() function_template = r''' static bool {FUNCTION_NAME}({ARGUMENTS}) { {FUNCTION_BODY} } ''' if Noutputs is None: slice_args = ("output",) + args_input else: slice_args = tuple("output{}".format(i) for i in range(Noutputs))+args_input slice_args_ndims = \ [ len(prototype) for prototype in prototype_outputs ] + \ [ len(prototype) for prototype in prototype_input ] EXTRA_ARGUMENTS_ARG_DEFINE = '' EXTRA_ARGUMENTS_NAMELIST = '' EXTRA_ARGUMENTS_PARSECODES = '' 
EXTRA_ARGUMENTS_ARGLIST_PARSE_PYARG = [] EXTRA_ARGUMENTS_ARGLIST_CALL_C = [] EXTRA_ARGUMENTS_ARGLIST_DEFINE = [] for c_type, arg_name, default_value, parse_arg in extra_args: # I strip from the c_type any leading "const " and any trailing "*". # The intent is that I can take "const char*" strings, and pass them # onto the inner functions using the same pointer, without # dereferencing it a second time m = re.match(r"\s*const\s+(.*?)$", c_type) if m is not None: c_type_no_leading_const = m.group(1) else: c_type_no_leading_const = c_type m = re.search(r"(.*)\*\s*$", c_type_no_leading_const) if m is not None: c_type_did_strip_pointer = True c_type_no_leading_const_no_pointer = m.group(1) else: c_type_did_strip_pointer = False c_type_no_leading_const_no_pointer = c_type_no_leading_const EXTRA_ARGUMENTS_ARGLIST_DEFINE.append('const {}* {} __attribute__((unused))'. \ format(c_type_no_leading_const_no_pointer, arg_name)) EXTRA_ARGUMENTS_ARG_DEFINE += "{} {} = {};\n".format(c_type, arg_name, default_value) EXTRA_ARGUMENTS_NAMELIST += '"{}",'.format(arg_name) EXTRA_ARGUMENTS_PARSECODES += '"{}"'.format(parse_arg) EXTRA_ARGUMENTS_ARGLIST_PARSE_PYARG.append('&' + arg_name) if c_type_did_strip_pointer: EXTRA_ARGUMENTS_ARGLIST_CALL_C.append(arg_name) else: EXTRA_ARGUMENTS_ARGLIST_CALL_C.append('&' + arg_name) EXTRA_ARGUMENTS_ARGLIST_DEFINE.append('__{FUNCTION_NAME}__cookie_t* cookie __attribute__((unused))'.format(FUNCTION_NAME=name)) EXTRA_ARGUMENTS_ARGLIST_CALL_C.append('cookie') EXTRA_ARGUMENTS_SLICE_ARG = ','.join(EXTRA_ARGUMENTS_ARGLIST_DEFINE) EXTRA_ARGUMENTS_ARGLIST_PARSE_PYARG = ''.join([s+',' for s in EXTRA_ARGUMENTS_ARGLIST_PARSE_PYARG]) EXTRA_ARGUMENTS_ARGLIST_CALL_C = ','.join(EXTRA_ARGUMENTS_ARGLIST_CALL_C) def ctype_from_dtype(t): r'''Get the corresponding C type from a numpy dtype If one does not exist, return None''' t = np.dtype(t) if not t.isnative or not (t.isbuiltin==1): return None nbits = str(t.itemsize * 8) if t.kind == 'f': return "npy_float" + nbits if 
t.kind == 'c': return "npy_complex" + nbits if t.kind == 'i': return "npy_int" + nbits if t.kind == 'u': return "npy_uint" + nbits return None text = '' contiguous_macro_template = r''' #define _CHECK_CONTIGUOUS__{name}(seterror) \ ({ \ bool result = true; \ bool have_dim_0 = false; \ /* If I have no data, just call the thing contiguous. This is useful */ \ /* because np.ascontiguousarray doesn't set contiguous alignment */ \ /* for empty arrays */ \ for(int i=0; i<Ndims_slice__{name}; i++) \ if(dims_slice__{name}[i] == 0) \ { \ have_dim_0 = true; \ break; \ } \ if(!have_dim_0) \ { \ int Nelems_slice = 1; \ for(int i=-1; i>=-Ndims_slice__{name}; i--) \ { \ if(strides_slice__{name}[i+Ndims_slice__{name}] != sizeof_element__{name}*Nelems_slice) \ { \ result = false; \ if(seterror) \ PyErr_Format(PyExc_RuntimeError, \ "Variable '{name}' must be contiguous in memory, and it isn't in (at least) dimension %d", i); \ break; \ } \ Nelems_slice *= dims_slice__{name}[i+Ndims_slice__{name}]; \ } \ } \ result; \ }) #define CHECK_CONTIGUOUS__{name}() _CHECK_CONTIGUOUS__{name}(false) #define CHECK_CONTIGUOUS_AND_SETERROR__{name}() _CHECK_CONTIGUOUS__{name}(true) ''' for n in slice_args: text += contiguous_macro_template.replace("{name}", n) text += \ '\n' + \ '#define CHECK_CONTIGUOUS_ALL() ' + \ ' && '.join( "CHECK_CONTIGUOUS__"+n+"()" for n in slice_args) + \ '\n' + \ '#define CHECK_CONTIGUOUS_AND_SETERROR_ALL() ' + \ ' && '.join( "CHECK_CONTIGUOUS_AND_SETERROR__"+n+"()" for n in slice_args) + \ '\n' text += _substitute(''' typedef struct { {COOKIE_STRUCT_CONTENTS} } __{FUNCTION_NAME}__cookie_t; ''', FUNCTION_NAME = name, COOKIE_STRUCT_CONTENTS = Ccode_cookie_struct) # The user provides two sets of C code that we include verbatim in # static functions: # # - The validation function. Evaluated once to check the input for # validity. This is in addition to the broadcasting shape and type # compatibility checks. Probably the user won't be looking at the data # pointer # - The slice function. Evaluated once per broadcasted slice to actually # perform the computation. 
Probably the user will be looking at just # the _slice data, not the _full data # # These functions have identical prototypes arglist = [ arg for n in slice_args for arg in \ ("const int Ndims_full__" + n + " __attribute__((unused))", "const npy_intp* dims_full__" + n + " __attribute__((unused))", "const npy_intp* strides_full__" + n + " __attribute__((unused))", "const int Ndims_slice__" + n + " __attribute__((unused))", "const npy_intp* dims_slice__" + n + " __attribute__((unused))", "const npy_intp* strides_slice__" + n + " __attribute__((unused))", "npy_intp sizeof_element__" + n + " __attribute__((unused))", "void* {DATA_ARGNAME}__" + n + " __attribute__((unused))")] + \ EXTRA_ARGUMENTS_ARGLIST_DEFINE arglist_string = '\n ' + ',\n '.join(arglist) text += \ _substitute(function_template, FUNCTION_NAME = "__{}__validate".format(name), ARGUMENTS = _substitute(arglist_string, DATA_ARGNAME="data"), FUNCTION_BODY = "return true;" if Ccode_validate is None else Ccode_validate) # The evaluation function for one slice known_typesets = list(Ccode_slice_eval.keys()) # known_types is the same, but tweaked for i_typeset in range(len(known_typesets)): slice_function = "__{}__{}__slice".format(name,i_typeset) text += '\n' text_undef = '' for i_arg in range(len(slice_args)): ctype = ctype_from_dtype(known_types[i_typeset][i_arg]) if ctype is None: continue arg_name = slice_args [i_arg] ndims = slice_args_ndims[i_arg] text_here = \ '#define ctype__{name} {ctype}\n' + \ '#define item__{name}(' + \ ','.join([ "__ivar"+str(i) for i in range(ndims)]) + \ ') (*(ctype__{name}*)(data_slice__{name} ' + \ ''.join(['+ (__ivar' + str(i) + ')*strides_slice__{name}['+str(i)+']' \ for i in range(ndims)]) + \ '))\n' text += _substitute(text_here, name = arg_name, ctype= ctype) text_undef += '#undef item__{name}\n' .replace('{name}', arg_name) text_undef += '#undef ctype__{name}\n'.replace('{name}', arg_name) text += \ _substitute(function_template, FUNCTION_NAME = slice_function, ARGUMENTS = 
_substitute(arglist_string, DATA_ARGNAME="data_slice"), FUNCTION_BODY = Ccode_slice_eval[known_typesets[i_typeset]]) text += text_undef text += \ ' \\\n '.join(ARGUMENTS_LIST) + \ '\n\n' + \ ' \\\n '.join(OUTPUTS_LIST) + \ '\n\n' + \ _substitute(self.function_body, FUNCTION_NAME = name, PROTOTYPE_DIM_DEFS = PROTOTYPE_DIM_DEFS, UNPACK_OUTPUTS = UNPACK_OUTPUTS, EXTRA_ARGUMENTS_SLICE_ARG = EXTRA_ARGUMENTS_SLICE_ARG, EXTRA_ARGUMENTS_ARG_DEFINE = EXTRA_ARGUMENTS_ARG_DEFINE, EXTRA_ARGUMENTS_NAMELIST = EXTRA_ARGUMENTS_NAMELIST, EXTRA_ARGUMENTS_PARSECODES = EXTRA_ARGUMENTS_PARSECODES, EXTRA_ARGUMENTS_ARGLIST_PARSE_PYARG = EXTRA_ARGUMENTS_ARGLIST_PARSE_PYARG, EXTRA_ARGUMENTS_ARGLIST_CALL_C = EXTRA_ARGUMENTS_ARGLIST_CALL_C, TYPESETS = TYPESETS, TYPESET_MATCHES_ARGLIST = TYPESET_MATCHES_ARGLIST, TYPESETS_NAMES = TYPESETS_NAMES, COOKIE_CLEANUP = Ccode_cookie_cleanup) for n in slice_args: text += '#undef _CHECK_CONTIGUOUS__{name}\n'.replace('{name}', n) text += '#undef CHECK_CONTIGUOUS__{name}\n'.replace('{name}', n) text += '#undef CHECK_CONTIGUOUS_AND_SETERROR__{name}\n'.replace('{name}', n) text += '\n' text += '#undef CHECK_CONTIGUOUS_ALL\n' text += '#undef CHECK_CONTIGUOUS_AND_SETERROR_ALL\n' self.functions.append( (name, _quote(docstring, convert_newlines=True), text) ) def write(self, file=sys.stdout): r'''Write out the generated C code DESCRIPTION Once we defined all of the wrapper functions in this module by calling 'function()' for each one, we're ready to write out the generated C source that defines this module. write() writes out the C source to standard output by default. ARGUMENTS - file The python file object to write the output to. Defaults to standard output ''' # Get shellquote from the right place in python2 and python3 try: import pipes shellquote = pipes.quote except: # python3 puts this into a different module import shlex shellquote = shlex.quote print("// THIS IS A GENERATED FILE. 
DO NOT MODIFY WITH CHANGES YOU WANT TO KEEP", file=file) print("// Generated on {} with {}\n\n". \ format(time.strftime("%Y-%m-%d %H:%M:%S"), ' '.join(shellquote(s) for s in sys.argv)), file=file) print('#define FUNCTIONS(_) \\', file=file) print(' \\\n'.join( ' _({}, "{}")'.format(f[0],f[1]) for f in self.functions), file=file) print("\n") print('///////// {{{{{{{{{ ' + _module_header_filename, file=file) print(self.module_header, file=file) print('///////// }}}}}}}}} ' + _module_header_filename, file=file) for f in self.functions: print('///////// {{{{{{{{{ ' + _function_filename, file=file) print('///////// for function ' + f[0], file=file) print(f[2], file=file) print('///////// }}}}}}}}} ' + _function_filename, file=file) print('\n', file=file) print('///////// {{{{{{{{{ ' + _module_footer_filename, file=file) print(self.module_footer, file=file) print('///////// }}}}}}}}} ' + _module_footer_filename, file=file) numpysane-0.35/pywrap-templates/000077500000000000000000000000001407353053200170005ustar00rootroot00000000000000numpysane-0.35/pywrap-templates/pywrap_function_generic.c000066400000000000000000000603571407353053200241020ustar00rootroot00000000000000static PyObject* __pywrap__{FUNCTION_NAME}(PyObject* NPY_UNUSED(self), PyObject* args, PyObject* kwargs) { #define SLICE_ARG(name) \ \ const int Ndims_full__ ## name, \ const npy_intp* dims_full__ ## name, \ const npy_intp* strides_full__ ## name, \ \ const int Ndims_slice__ ## name, \ const npy_intp* dims_slice__ ## name, \ const npy_intp* strides_slice__ ## name, \ \ npy_intp sizeof_element__ ## name, \ void* data_slice__ ## name, // The cookie we compute BEFORE computing any slices. This is available to // the slice-computation function to do whatever they please. I initialize // the cookie to all-zeros. 
If any cleanup is needed, the COOKIE_CLEANUP // code at the end of this function should include an "inited" flag in the // cookie in order to know whether the cookie was inited in the first place, // and whether any cleanup is actually required __{FUNCTION_NAME}__cookie_t _cookie = {}; // I'd like to access the "cookie" here in a way identical to how I access // it inside the functions, so it must be a cookie_t* cookie __{FUNCTION_NAME}__cookie_t* cookie = &_cookie; typedef bool (slice_function_t)(OUTPUTS(SLICE_ARG) ARGUMENTS(SLICE_ARG) {EXTRA_ARGUMENTS_SLICE_ARG}); PyObject* __py__result__ = NULL; PyObject* __py__output__arg = NULL; #define ARG_DEFINE(name) PyArrayObject* __py__ ## name = NULL; ARGUMENTS(ARG_DEFINE); OUTPUTS( ARG_DEFINE); {EXTRA_ARGUMENTS_ARG_DEFINE}; SET_SIGINT(); #define NAMELIST(name) #name , char* keywords[] = { ARGUMENTS(NAMELIST) "out", {EXTRA_ARGUMENTS_NAMELIST} NULL }; #define PARSECODE(name) "O&" #define PARSEARG(name) PyArray_Converter, &__py__ ## name, if(!PyArg_ParseTupleAndKeywords( args, kwargs, ARGUMENTS(PARSECODE) "|O" {EXTRA_ARGUMENTS_PARSECODES}, keywords, ARGUMENTS(PARSEARG) &__py__output__arg, {EXTRA_ARGUMENTS_ARGLIST_PARSE_PYARG} NULL)) goto done; // parse_dims() is a helper function to evaluate a given list of arguments // in respect to a given broadcasting prototype. This function will flag any // errors in the dimensionality of the inputs. If no errors are detected, it // returns // dims_extra,dims_named // where // dims_extra is the outer dimensions of the broadcast // dims_named is the values of the named dimensions // First I initialize dims_extra: the array containing the broadcasted // slices. Each argument calls for some number of extra dimensions, and the // overall array is as large as the biggest one of those {PROTOTYPE_DIM_DEFS}; {UNPACK_OUTPUTS}; // At this point each output array is either NULL or a PyObject with a // reference. In all cases, Py_XDECREF() should be done at the end. 
If we // have multiple outputs, either the output sequence is already filled-in // with valid arrays (if they were passed-in; I just checked in // UNPACK_OUTPUTS) or the output tuple is full of blank spaces, and each // output is NULL (if I just made a new tuple). In the latter case I'll fill // it in later // // The output argument in __py__output__arg is NULL if we have a single // output that's not yet allocated. Otherwise it has a reference also, so it // should be PY_XDECREF() at the end. This __py__output__arg is what we // should return, unless it's NULL or Py_None. In that case we need to // allocate a new array, and return THAT { // I process the types. The output arrays may not have been created yet, // in which case I just let NULL pass, and ignore the type. I'll make // new arrays later, and those will have the right type #define DEFINE_OUTPUT_TYPENUM(name) int selected_typenum__ ## name; OUTPUTS(DEFINE_OUTPUT_TYPENUM); #undef DEFINE_OUTPUT_TYPENUM slice_function_t* slice_function; #define TYPE_MATCHES_ARGLIST(name) int typenum__ ## name, bool type_matches(ARGUMENTS(TYPE_MATCHES_ARGLIST) OUTPUTS( TYPE_MATCHES_ARGLIST) slice_function_t* f) { #define SET_SELECTED_TYPENUM_OUTPUT(name) selected_typenum__ ## name = typenum__ ## name; #define TYPE_MATCHES(name) \ && ( __py__ ## name == NULL || \ (PyObject*)__py__ ## name == Py_None || \ PyArray_DESCR(__py__ ## name)->type_num == typenum__ ## name ) if(true ARGUMENTS(TYPE_MATCHES) OUTPUTS(TYPE_MATCHES)) { /* all arguments match this typeset! */ slice_function = f; OUTPUTS(SET_SELECTED_TYPENUM_OUTPUT); return true; } return false; } #undef SET_SELECTED_TYPENUM_OUTPUT #undef TYPE_MATCHES #undef TYPE_MATCHES_ARGLIST #define TYPESETS(_) \ {TYPESETS} #define TYPESET_MATCHES({TYPESET_MATCHES_ARGLIST}, i) \ else if( type_matches({TYPESET_MATCHES_ARGLIST}, \ __{FUNCTION_NAME}__ ## i ## __slice) ) \ { \ /* matched. type_matches() did all the work. 
*/ \ } if(0) ; TYPESETS(TYPESET_MATCHES) else { #if PY_MAJOR_VERSION == 3 #define INPUT_PERCENT_S(name) "%S," #define INPUT_TYPEOBJ(name) ,(((PyObject*)__py__ ## name != Py_None && __py__ ## name != NULL) ? \ (PyObject*)PyArray_DESCR(__py__ ## name)->typeobj : (PyObject*)Py_None) PyErr_Format(PyExc_RuntimeError, "The set of input and output types must correspond to one of these sets:\n" {TYPESETS_NAMES} "instead I got types (inputs: " ARGUMENTS(INPUT_PERCENT_S) ")" " outputs: (" OUTPUTS(INPUT_PERCENT_S) ")\n" "None in an output is not an error: a new array of the right type will be created" ARGUMENTS(INPUT_TYPEOBJ) OUTPUTS(INPUT_TYPEOBJ) ); #else ////////// python2 doesn't support %S PyErr_Format(PyExc_RuntimeError, "The set of input and output types must correspond to one of these sets:\n" {TYPESETS_NAMES}); #endif goto done; } #undef TYPESETS #undef TYPESET_MATCHES // Now deal with dimensionality // It's possible for my arguments (and the output) to have fewer // dimensions than required by the prototype, and still pass all the // dimensionality checks, assuming implied leading dimensions of length // 1. For instance I could receive a scalar where a ('n',) dimension is // expected, or a ('n',) vector where an ('m','n') array is expected. I // initially handle this with Ndims_extra<0 for those arguments and then // later, I make copies with actual "1" values in place. I do that because: // // 1. I want to support the above-described case where implicit leading // length-1 dimensions are used // // 2. I want to support new named-dimensions in the outputs, pulled from // the in-place arrays // // #2 requires partial processing of the outputs before they're all // guaranteed to exist. So I can't allocate temporary __dims__##name and // __strides__##name arrays on the stack: I don't know how big they are // yet. But I need explicit dimensions in memory to pass to the // validation and slice callbacks. 
So I do it implicitly first, and then // explicitly // the maximum of Ndims_extra_this for all the arguments. Each one COULD // be <0 but Ndims_extra is capped at the bottom at 0 int Ndims_extra = 0; #define DECLARE_DIM_VARS(name) \ const int PROTOTYPE_LEN_ ## name = (int)sizeof(PROTOTYPE_ ## name)/sizeof(PROTOTYPE_ ## name[0]); \ int __ndim__ ## name = -1; \ const npy_intp* __dims__ ## name = NULL; \ const npy_intp* __strides__ ## name = NULL; \ /* May be <0 */ \ int Ndims_extra__ ## name = -1; #define DEFINE_DIM_VARS(name) \ if((PyObject*)__py__ ## name != Py_None && __py__ ## name != NULL) \ { \ __ndim__ ## name = PyArray_NDIM (__py__ ## name); \ __dims__ ## name = PyArray_DIMS (__py__ ## name); \ __strides__ ## name = PyArray_STRIDES(__py__ ## name); \ /* May be <0 */ \ Ndims_extra__ ## name = __ndim__ ## name - PROTOTYPE_LEN_ ## name; \ if(Ndims_extra < Ndims_extra__ ## name) \ Ndims_extra = Ndims_extra__ ## name; \ } ARGUMENTS(DECLARE_DIM_VARS); ARGUMENTS(DEFINE_DIM_VARS); OUTPUTS( DECLARE_DIM_VARS); OUTPUTS( DEFINE_DIM_VARS); // Any outputs that are given are processed here. Outputs that are NOT // given are skipped for now. I'll create them later, and do the // necessary updates and checks later by expanding DEFINE_DIM_VARS later npy_intp dims_extra[Ndims_extra]; for(int i=0; i= 0) \ { \ PyTuple_SET_ITEM(__py__output__arg, \ populate_output_tuple__i, \ (PyObject*)__py__ ## name); \ populate_output_tuple__i++; \ Py_INCREF(__py__ ## name); \ } \ else if(__py__output__arg == NULL) \ { \ /* one output, no output given */ \ __py__output__arg = (PyObject*)__py__ ## name; \ Py_INCREF(__py__output__arg); \ } \ DEFINE_DIM_VARS(name); \ } OUTPUTS(CREATE_MISSING_OUTPUT); // I'm done messing around with the dimensions. Everything passed, and // all the arrays have been created. Some arrays MAY have some implicit // length-1 dimensions. I can't communicate this to the validation and // slice functions. 
So I explicitly make copies of the dimension and // stride arrays, making any implicit length-1 dimensions explicit. The // callbacks then see all the dimension data in memory. // // Most of the time we won't have any implicit dimensions, so these // mounted shapes would then be copies of the normal ones #define MAKE_MOUNTED_COPIES(name) \ int __ndim__mounted__ ## name = __ndim__ ## name; \ if( __ndim__ ## name < PROTOTYPE_LEN_ ## name ) \ /* Too few input dimensions. Add dummy dimension of length 1 */ \ __ndim__mounted__ ## name = PROTOTYPE_LEN_ ## name; \ npy_intp __dims__mounted__ ## name[__ndim__mounted__ ## name]; \ npy_intp __strides__mounted__ ## name[__ndim__mounted__ ## name]; \ { \ int i_dim = -1; \ for(; i_dim >= -__ndim__ ## name; i_dim--) \ { \ /* copies of the original shapes */ \ __dims__mounted__ ## name[i_dim + __ndim__mounted__ ## name] = __dims__ ## name[i_dim + __ndim__ ## name]; \ __strides__mounted__## name[i_dim + __ndim__mounted__ ## name] = __strides__ ## name[i_dim + __ndim__ ## name]; \ } \ for(; i_dim >= -__ndim__mounted__ ## name; i_dim--) \ { \ /* extra dummy dimensions, as needed */ \ __dims__mounted__ ## name[i_dim + __ndim__mounted__ ## name] = 1; \ __strides__mounted__ ## name[i_dim + __ndim__mounted__ ## name] = 0; \ } \ } \ /* Now guaranteed >= 0 because of the padding */ \ int Ndims_extra__mounted__ ## name = __ndim__mounted__ ## name - PROTOTYPE_LEN_ ## name; \ \ /* Ndims_extra and dims_extra[] are already right */ ARGUMENTS(MAKE_MOUNTED_COPIES); OUTPUTS( MAKE_MOUNTED_COPIES); // Each output variable is now an allocated array, and each one has a // reference. 
The argument __py__output__arg ALSO has a reference #define ARGLIST_CALL_USER_CALLBACK(name) \ __ndim__mounted__ ## name , \ __dims__mounted__ ## name, \ __strides__mounted__ ## name, \ __ndim__mounted__ ## name - Ndims_extra__mounted__ ## name, \ &__dims__mounted__ ## name[ Ndims_extra__mounted__ ## name ], \ &__strides__mounted__ ## name[ Ndims_extra__mounted__ ## name ], \ PyArray_ITEMSIZE(__py__ ## name), \ (void*)data_argument__ ## name, #define DEFINE_DATA_ARGUMENT(name) char* data_argument__ ## name; #define INIT_DATA_ARGUMENT(name) data_argument__ ## name = PyArray_DATA(__py__ ## name); ARGUMENTS(DEFINE_DATA_ARGUMENT); OUTPUTS( DEFINE_DATA_ARGUMENT); ARGUMENTS(INIT_DATA_ARGUMENT); OUTPUTS( INIT_DATA_ARGUMENT); if( ! __{FUNCTION_NAME}__validate(OUTPUTS( ARGLIST_CALL_USER_CALLBACK) ARGUMENTS(ARGLIST_CALL_USER_CALLBACK) {EXTRA_ARGUMENTS_ARGLIST_CALL_C}) ) { if(PyErr_Occurred() == NULL) PyErr_SetString(PyExc_RuntimeError, "User-provided validation failed!"); goto done; } // if the extra dimensions are degenerate, just return the empty array // we have for(int i=0; i=0 && __dims__a[idim] != 1) stride_extra_elements_a[idim_extra] = __strides__a[idim] / sizeof(double); else stride_extra_elements_a[idim_extra] = 0; idim = idim_extra + Ndims_extra_b - Ndims_extra; if(idim>=0 && __dims__b[idim] != 1) stride_extra_elements_b[idim_extra] = __strides__b[idim] / sizeof(double); else stride_extra_elements_b[idim_extra] = 0; } #endif // I checked all the dimensions and aligned everything. I have my // to-broadcast dimension counts. // Iterate through all the broadcasting output, and gather the results int idims_extra[Ndims_extra]; for(int i=0; i=0; i--) { if(++idims[i] < Ndims[i]) return true; idims[i] = 0; } return false; } do { // This loop is awkward. I don't update the slice data pointer // incrementally with each slice, but advance each dimension for // each slice. 
There should be a better way ARGUMENTS(INIT_DATA_ARGUMENT); OUTPUTS( INIT_DATA_ARGUMENT); #undef DEFINE_DATA_ARGUMENT #undef INIT_DATA_ARGUMENT for( int i_dim=-1; i_dim >= -Ndims_extra; i_dim--) { #define ADVANCE_SLICE(name) \ if(i_dim + Ndims_extra__mounted__ ## name >= 0 && \ __dims__mounted__ ## name[i_dim + Ndims_extra__mounted__ ## name] != 1) \ data_argument__ ## name += idims_extra[i_dim + Ndims_extra]*__strides__ ## name[i_dim + Ndims_extra__mounted__ ## name]; ARGUMENTS(ADVANCE_SLICE); OUTPUTS( ADVANCE_SLICE); } if( ! slice_function( OUTPUTS( ARGLIST_CALL_USER_CALLBACK) ARGUMENTS(ARGLIST_CALL_USER_CALLBACK) {EXTRA_ARGUMENTS_ARGLIST_CALL_C}) ) { if(PyErr_Occurred() == NULL) PyErr_Format(PyExc_RuntimeError, "__{FUNCTION_NAME}__slice failed!"); goto done; } } while(next(idims_extra, dims_extra, Ndims_extra)); __py__result__ = (PyObject*)__py__output__arg; } done: // I free the arguments (I'm done with them) and the outputs (I'm done with // each individual one; the thing I'm returning has its own reference) #define FREE_PYARRAY(name) Py_XDECREF(__py__ ## name); ARGUMENTS(FREE_PYARRAY); OUTPUTS( FREE_PYARRAY); if(__py__result__ == NULL) { // An error occurred. 
I'm not returning an output, so release that too Py_XDECREF(__py__output__arg); } // If we allocated any resource into the cookie earlier, we can clean it up // now {COOKIE_CLEANUP} RESET_SIGINT(); return __py__result__; } #undef ARG_DEFINE #undef NAMELIST #undef PARSECODE #undef PARSEARG #undef DECLARE_DIM_VARS #undef DEFINE_DIM_VARS #undef PARSE_DIMS #undef SLICE_ARG #undef INPUT_PERCENT_S #undef INPUT_TYPEOBJ #undef ARGLIST_CALL_USER_CALLBACK #undef ADVANCE_SLICE #undef FREE_PYARRAY #undef CHECK_DIMS_NAMED_KNOWN #undef CREATE_MISSING_OUTPUT #undef MAKE_MOUNTED_COPIES #undef ARGUMENTS #undef OUTPUTS numpysane-0.35/pywrap-templates/pywrap_module_footer_generic.c000066400000000000000000000013721407353053200251100ustar00rootroot00000000000000#define PYMETHODDEF_ENTRY(name,docstring) \ { #name, \ (PyCFunction)__pywrap__ ## name, \ METH_VARARGS | METH_KEYWORDS, \ docstring }, static PyMethodDef methods[] = { FUNCTIONS(PYMETHODDEF_ENTRY) {} }; #if PY_MAJOR_VERSION == 2 PyMODINIT_FUNC init{MODULE_NAME}(void) { Py_InitModule3("{MODULE_NAME}", methods, "{MODULE_DOCSTRING}"); import_array(); } #else static struct PyModuleDef module_def = { PyModuleDef_HEAD_INIT, "{MODULE_NAME}", "{MODULE_DOCSTRING}", -1, methods }; PyMODINIT_FUNC PyInit_{MODULE_NAME}(void) { PyObject* module = PyModule_Create(&module_def); import_array(); return module; } #endif numpysane-0.35/pywrap-templates/pywrap_module_header.c000066400000000000000000000155331407353053200233520ustar00rootroot00000000000000#define NPY_NO_DEPRECATED_API NPY_API_VERSION #include #include #include #include #include // Python is silly. There's some nuance about signal handling where it sets a // SIGINT (ctrl-c) handler to just set a flag, and the python layer then reads // this flag and does the thing. Here I'm running C code, so SIGINT would set a // flag, but not quit, so I can't interrupt the solver. 
Thus I reset the SIGINT // handler to the default, and put it back to the python-specific version when // I'm done #define SET_SIGINT() struct sigaction sigaction_old; \ do { \ if( 0 != sigaction(SIGINT, \ &(struct sigaction){ .sa_handler = SIG_DFL }, \ &sigaction_old) ) \ { \ PyErr_SetString(PyExc_RuntimeError, "sigaction() failed"); \ goto done; \ } \ } while(0) #define RESET_SIGINT() do { \ if( 0 != sigaction(SIGINT, \ &sigaction_old, NULL )) \ PyErr_SetString(PyExc_RuntimeError, "sigaction-restore failed"); \ } while(0) static bool parse_dim_for_one_arg(// input and output npy_intp* dims_named, npy_intp* dims_extra, // input int Ndims_extra, const char* arg_name, int Ndims_extra_var, const npy_intp* dims_want, int Ndims_want, const npy_intp* dims_var, int Ndims_var, bool is_output) { // MAKE SURE THE PROTOTYPE DIMENSIONS MATCH (the trailing dimensions) // // Loop through the dimensions. Set the dimensionality of any new named // argument to whatever the current argument has. Any already-known // argument must match for( int i_dim=-1; i_dim >= -Ndims_want; i_dim--) { int i_dim_want = i_dim + Ndims_want; int dim_want = dims_want[i_dim_want]; int i_dim_var = i_dim + Ndims_var; // if we didn't get enough dimensions, use dim=1 int dim_var = i_dim_var >= 0 ? dims_var[i_dim_var] : 1; if(dim_want < 0) { // This is a named dimension. These can have any value, but // ALL dimensions of the same name must thave the SAME value // EVERYWHERE if(dims_named[-dim_want-1] < 0) dims_named[-dim_want-1] = dim_var; dim_want = dims_named[-dim_want-1]; } // The prototype dimension (named or otherwise) now has a numeric // value. 
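The named-dimension bookkeeping described above can be sketched in pure Python (a simplified sketch, assuming the dimensions are already right-aligned; `match_dims` and `dims_named` here are illustrative names, not part of the C code):

```python
# Simplified sketch of the named-dimension logic in parse_dim_for_one_arg().
# Negative entries in dims_want are named dimensions: -1 is the first name,
# -2 the second, and so on. The first argument seen binds the name; every
# later argument must match the bound value exactly.
def match_dims(dims_want, dims_var, dims_named):
    for dim_want, dim_var in zip(dims_want, dims_var):
        if dim_want < 0:                       # named dimension
            if dims_named[-dim_want - 1] < 0:  # not bound yet: bind it now
                dims_named[-dim_want - 1] = dim_var
            dim_want = dims_named[-dim_want - 1]
        if dim_want != dim_var:
            return False
    return True

dims_named = [-1]                                # one name: 'n', unbound
assert match_dims((-1, -1), (3, 3), dims_named)  # binds n = 3
assert dims_named[0] == 3
assert not match_dims((-1,), (4,), dims_named)   # n is already 3; 4 mismatches
```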
Make sure it matches what I have if(dim_want != dim_var) { if(dims_want[i_dim_want] < 0) PyErr_Format(PyExc_RuntimeError, "Argument '%s': prototype says dimension %d (named dimension %d) has length %d, but got %d", arg_name, i_dim, (int)dims_want[i_dim_want], dim_want, dim_var); else PyErr_Format(PyExc_RuntimeError, "Argument '%s': prototype says dimension %d has length %d, but got %d", arg_name, i_dim, dim_want, dim_var); return false; } } // I now know that this argument matches the prototype. I look at the // extra dimensions to broadcast, and make sure they match with the // dimensions I saw previously // MAKE SURE THE BROADCASTED DIMENSIONS MATCH (the leading dimensions) // // This argument has Ndims_extra_var dimensions above the prototype (may be // <0 if there're implicit leading length-1 dimensions at the start). The // current dimensions to broadcast must match // // Also handle special broadcasting logic for outputs. The extra dimensions // on the output must match the extra dimensions from the inputs EXACTLY. // Some things could be reasonably supported, but it's all not very useful, // and error-prone if(is_output && Ndims_extra_var != Ndims_extra) { PyErr_Format(PyExc_RuntimeError, "Outputs must match the broadcasted dimensions EXACTLY. '%s' has %d extra, broadcasted dimensions while the inputs have %d", arg_name, Ndims_extra_var, Ndims_extra); return false; } for( int i_dim=-1; i_dim >= -Ndims_extra_var; i_dim--) { int i_dim_var = i_dim - Ndims_want + Ndims_var; // if we didn't get enough dimensions, use dim=1 int dim_var = i_dim_var >= 0 ? dims_var[i_dim_var] : 1; if (dim_var != 1) { int i_dim_extra = i_dim + Ndims_extra; if(i_dim_extra < 0) { PyErr_Format(PyExc_RuntimeError, "Argument '%s' dimension %d (broadcasted dimension %d) i_dim_extra<0: %d. This shouldn't happen. There's a bug in the implicit-leading-dimension logic. 
Please report", arg_name, i_dim-Ndims_want, i_dim, i_dim_extra); return false; } if(is_output && dims_extra[i_dim_extra] != dim_var) { PyErr_Format(PyExc_RuntimeError, "Outputs must match the broadcasted dimensions EXACTLY. '%s' dimension %d (broadcasted dimension %d) has length %d, while the inputs have %d", arg_name, i_dim-Ndims_want, i_dim, dim_var, dims_extra[i_dim_extra]); return false; } if( dims_extra[i_dim_extra] == 1) dims_extra[i_dim_extra] = dim_var; else if(dims_extra[i_dim_extra] != dim_var) { PyErr_Format(PyExc_RuntimeError, "Argument '%s' dimension %d (broadcasted dimension %d) mismatch. Previously saw length %d, but here have length %d", arg_name, i_dim-Ndims_want, i_dim, (int)dims_extra[i_dim_extra], dim_var); return false; } } } return true; } numpysane-0.35/setup.py000077500000000000000000000021761407353053200152050ustar00rootroot00000000000000#!/usr/bin/python # using distutils not setuptools because setuptools puts dist_files in the root # of the host prefix, not the target prefix. 
Life is too short to fight this # nonsense from distutils.core import setup import re import glob version = None with open("numpysane.py", "r") as f: for l in f: m = re.match("__version__ *= *'(.*?)' *$", l) if m: version = m.group(1) break if version is None: raise Exception("Couldn't find version in 'numpysane.py'") pywrap_templates = glob.glob('pywrap-templates/*.c') setup(name = 'numpysane', version = version, author = 'Dima Kogan', author_email = 'dima@secretsauce.net', url = 'http://github.com/dkogan/numpysane', description = 'more-reasonable core functionality for numpy', long_description = """numpysane is a collection of core routines to provide basic numpy functionality in a more reasonable way""", license = 'LGPL', py_modules = ['numpysane', 'numpysane_pywrap'], data_files = [ ('share/python-numpysane/pywrap-templates', pywrap_templates)]) numpysane-0.35/test/000077500000000000000000000000001407353053200144415ustar00rootroot00000000000000numpysane-0.35/test/genpywrap.py000077500000000000000000000244501407353053200170370ustar00rootroot00000000000000#!/usr/bin/python3 r'''generate broadcast-aware python wrapping to testlib The test suite runs this script to python-wrap the testlib C library, then the test suite builds this python extension module, and then the test suite validates this module's behavior ''' import sys import os dir_path = os.path.dirname(os.path.realpath(__file__)) sys.path[:0] = dir_path + '/..', import numpy as np import numpysane as nps import numpysane_pywrap as npsp docstring_module = r"""Some functions to test the python wrapping multiline docstring to stress-test this thing line line line """ m = npsp.module( name = "testlib", docstring = docstring_module, header = ''' #include #include "testlib.h" ''') m.function( "identity3", r"""Generates a 3x3 identity matrix multi-line docstring to make sure it works """, args_input = (), prototype_input = (), prototype_output = (3,3), Ccode_slice_eval = \ {np.float64: r''' for(int i=0; i<3; 
i++) for(int j=0; j<3; j++) item__output(i,j) = (i==j) ? 1.0 : 0.0; return true; '''}) m.function( "identity", '''Generates an NxN identity matrix. Output matrices must be passed-in to define N''', args_input = (), prototype_input = (), prototype_output = ('N', 'N'), Ccode_slice_eval = \ {np.float64: r''' int N = dims_slice__output[0]; for(int i=0; i 0.0 && CHECK_CONTIGUOUS_AND_SETERROR_ALL() ) ) return false; cookie->scale = *scale * (scale_string ? atof(scale_string) : 1.0); cookie->ptr = malloc(1000); return cookie->ptr != NULL; ''', Ccode_slice_eval = \ {np.float64: r''' int N = dims_slice__a[0]; item__output0() = innerouter((double*)data_slice__output1, (double*)data_slice__a, (double*)data_slice__b, cookie->scale, N); return true; '''}, Ccode_cookie_cleanup = 'if(cookie->ptr != NULL) free(cookie->ptr);' ) m.function( "sorted_indices", "Return the sorted element indices", args_input = ('x',), prototype_input = (('n',),), prototype_output = ('n',), Ccode_slice_eval = \ {(np.float32, np.int32): r''' sorted_indices_float((int*)data_slice__output, (float*)data_slice__x, dims_slice__x[0]); return true; ''', (np.float64, np.int32): r''' sorted_indices_double((int*)data_slice__output, (double*)data_slice__x, dims_slice__x[0]); return true; '''}, Ccode_validate = 'return CHECK_CONTIGUOUS_AND_SETERROR__x();' ) # Tests. Try to wrap functions using illegal output prototypes. 
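As a pure-numpy cross-check of what sorted_indices computes per slice (np.argsort is the numpy analogue; the int32 cast mirrors the wrapper's declared output type):

```python
import numpy as np

x = np.array((1., 5., 3., 2.5, 3.5, 2.9))
i = np.argsort(x, kind='stable').astype(np.int32)
assert list(i) == [0, 3, 5, 2, 4, 1]  # indices of x in ascending order
```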
The wrapper code # should barf try: m.function( "outer_broken", "Outer-product pywrapped with npsp", args_input = ('a', 'b'), prototype_input = (('n',), ('n',)), prototype_output = ('n', -1), Ccode_slice_eval = {np.float64: 'return true;'}) except: pass # known error else: raise Exception("Expected error didn't happen") try: m.function( "outer_broken", "Outer-product pywrapped with npsp", args_input = ('a', 'b'), prototype_input = (('n',), (-1,)), prototype_output = ('n', 'n'), Ccode_slice_eval = {np.float64: 'return true;'}) except: pass # known error else: raise Exception("Expected error didn't happen") try: m.function( "outer_broken", "Outer-product pywrapped with npsp", args_input = ('a', 'b'), prototype_input = (('n',), ('n',-1)), prototype_output = (), Ccode_slice_eval = {np.float64: 'return true;'}) except: pass # known error else: raise Exception("Expected error didn't happen: input dims must >=0") try: m.function( "outer_broken", "Outer-product pywrapped with npsp", args_input = ('a', 'b'), prototype_input = (('n',), ('n',)), prototype_output = (-1,), Ccode_slice_eval = {np.float64: 'return true;'}) except: pass # known error else: raise Exception("Expected error didn't happen: output dims must >=0") try: m.function( "outer_broken", "Outer-product pywrapped with npsp", args_input = ('a', 'b'), prototype_input = (('n',), ('n',)), prototype_output = ('m', ()), Ccode_slice_eval = {np.float64: 'return true;'}) except: pass # known error else: raise Exception("Expected error didn't happen: output dims must be integers or strings") try: m.function( "outer_broken", "Outer-product pywrapped with npsp", args_input = ('a', 'b'), prototype_input = (('n',), ('n',)), prototype_output = 'n', Ccode_slice_eval = {np.float64: 'return true;'}) except: pass # known error else: raise Exception("Expected error didn't happen: output dims must be a tuple") try: m.function( "outer_broken", "Outer-product pywrapped with npsp", args_input = ('a', 'b'), prototype_input = (('n',), 
'n'), prototype_output = 'n', Ccode_slice_eval = {np.float64: 'return true;'}) except: pass # known error else: raise Exception("Expected error didn't happen: output dims must be a tuple") try: m.function( "sorted_indices_broken", "Return the sorted element indices", args_input = ('x',), prototype_input = (('n',),), prototype_output = ('n',), Ccode_slice_eval = { np.float64: 'return true;' }) except: raise Exception("Valid usage of Ccode_slice_eval keys failed") try: m.function( "sorted_indices_broken2", "Return the sorted element indices", args_input = ('x',), prototype_input = (('n',),), prototype_output = ('n',), Ccode_slice_eval = { np.float64: 'return true;', np.int32: 'return true;' }) except: raise Exception("Valid usage of Ccode_slice_eval keys failed") try: m.function( "sorted_indices_broken3", "Return the sorted element indices", args_input = ('x',), prototype_input = (('n',),), prototype_output = ('n',), Ccode_slice_eval = { (np.float64, np.int32): 'return true;', np.int32: 'return true;' }) except: raise Exception("Valid usage of Ccode_slice_eval keys failed") try: m.function( "sorted_indices_broken4", "Return the sorted element indices", args_input = ('x',), prototype_input = (('n',),), prototype_output = ('n',), Ccode_slice_eval = { (np.float64, np.int32, np.int32): 'return true;', np.int32: 'return true;' }) except: pass # known error else: raise Exception("Expected invalid usage of Ccode_slice_eval keys didn't fail!") m.write() numpysane-0.35/test/test-c-broadcasting.py000077500000000000000000001033521407353053200206570ustar00rootroot00000000000000#!/usr/bin/python3 r'''Test the broadcasting in C Uses the "testlib" guinea pig C library ''' import sys import os dir_path = os.path.dirname(os.path.realpath(__file__)) sys.path[:0] = dir_path + '/..', import numpy as np import numpysane as nps # Local test harness. 
The python standard ones all suck from testutils import * # The extension module we're testing import testlib def check(matching_functions, A, B): r'''Compare results of pairs of matching functions matching_functions is a list of pairs of functions that are supposed to produce identical results (testlib and numpysane implementations of inner and outer products). A and B are lists of arguments that we try out. These support broadcasting, so either one is allowed to be a single array, which is then used for all the checks. I check both dynamically-created and inlined "out" arrays ''' N = 1 if type(A) is tuple and len(A) > N: N = len(A) if type(B) is tuple and len(B) > N: N = len(B) if type(A) is not tuple: A = (A,) * N if type(B) is not tuple: B = (B,) * N for what,f0,f1 in matching_functions: for i in range(N): out0 = f0(A[i], B[i]) out1 = f1(A[i], B[i]) confirm_equal( out0, out1, msg = what + ' matches. Dynamically-allocated output' ) outshape = out1.shape out0 = np.zeros(outshape, dtype=np.array(A[i]).dtype) out1 = np.ones (outshape, dtype=np.array(A[i]).dtype) f0(A[i], B[i], out=out0) f1(A[i], B[i], out=out1) confirm_equal( out0, out1, msg = what + ' matches. Pre-allocated output' ) # pairs of functions that should produce identical results matching_functions = ( ("inner", testlib.inner, nps.inner), ("outer", testlib.outer, nps.outer) ) # Basic 1D arrays a0 = np.arange(5, dtype=float) b = a0+3 # a needs to broadcast; contiguous and strided a1 = np.arange(10, dtype=float).reshape(2,5) a2 = nps.transpose(np.arange(10, dtype=float).reshape(5,2)) # Try it! check(matching_functions, (a0,a1,a2), b) # Try it again, but use the floating-point version check( (("inner", nps.inner, testlib.inner),), tuple([a.astype(int) for a in (a0,a1,a2)]), b.astype(int)) confirm_raises( lambda: check( (("inner", nps.inner, testlib.inner),), (a0,a1,a2), b.astype(int)), msg = "types must match" ) # Too few input dimensions (passing a scalar where a vector is expected). This # should be ok. 
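A pure-numpy sketch of the scalar-promotion rule being exercised here (assuming numpy only; the wrapped C functions apply the same promotion internally):

```python
import numpy as np

# A 0-dimensional input may stand in for a shape-(1,) vector slice
a = np.array(6.)    # scalar
b = np.array([5.])  # shape (1,)
assert float(np.inner(np.atleast_1d(a), b)) == 30.0
```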
It can be viewed as a length-1 vector check( (("inner", nps.inner, testlib.inner),), 6., (5., np.array(5, dtype=float), np.array((5,), dtype=float), ),) # Too few output dimensions. No. This is accepted only for inputs out = np.zeros((), dtype=float) confirm_raises(lambda: testlib.inner( nps.atleast_dims(np.array(6.,dtype=float), -5), nps.atleast_dims(np.array(5.,dtype=float), -2), out=out)) # Broadcasting. Should be ok. No barf. confirm_does_not_raise(lambda: testlib.inner(np.arange(10, dtype=float).reshape( 2,5), np.arange(15, dtype=float).reshape(3,1,5)), msg='Aligned dimensions') confirm_raises( lambda: testlib.inner(np.arange(10, dtype=float).reshape(2,5), np.arange(15, dtype=float).reshape(3,5)) ) confirm_raises( lambda: testlib.inner(np.arange(5), np.arange(6)) ) confirm_does_not_raise( lambda: testlib.outer(a0,b, out=np.zeros((5,5), dtype=float)), msg = "Basic in-place broadcasting") confirm_raises(lambda: testlib.outer(a0,b, out=np.zeros((5,5), dtype=int)), msg = "Output type must match") confirm_raises(lambda: testlib.outer(a0.astype(int),b.astype(int), out=np.zeros((5,5), dtype=float)), msg = "Output type must match") confirm_does_not_raise( lambda: testlib.outer(a0.astype(float),b.astype(float), out=np.zeros((5,5), dtype=float)), msg = "Output type must match") confirm_does_not_raise( lambda: testlib.inner(a0.astype(int),b.astype(int), out=np.zeros((), dtype=int)), msg = "Output type must match") confirm_raises( lambda: testlib.outer(a0,b, out=np.zeros((3,3), dtype=float)), msg = "Wrong dimensions on out" ) confirm_raises( lambda: testlib.outer(a0,b, out=np.zeros((4,5), dtype=float)), msg = "Wrong dimensions on out" ) confirm_raises( lambda: testlib.outer(a0,b, out=np.zeros((5,), dtype=float)), msg = "Wrong dimensions on out" ) confirm_raises( lambda: testlib.outer(a0,b, out=np.zeros((), dtype=float)), msg = "Wrong dimensions on out" ) confirm_raises( lambda: testlib.outer(a0,b, out=np.zeros((5,5,5), dtype=float)), msg = "Wrong dimensions on out" ) from 
functools import reduce def arr(*shape, **kwargs): dtype = kwargs.get('dtype',float) r'''Return an arange() array of the given shape.''' if len(shape) == 0: return np.array(3, dtype=dtype) product = reduce( lambda x,y: x*y, shape) return np.arange(product, dtype=dtype).reshape(*shape) def test_identity3(): r'''Testing identity3()''' ref = np.array(((1,0,0), (0,1,0), (0,0,1)),dtype=float) ref_int = np.array(((1,0,0), (0,1,0), (0,0,1)),dtype=int) out = ref*0 out_int = ref_int*0 assertResult_inoutplace( ref, testlib.identity3 ) confirm_does_not_raise( lambda: testlib.identity3(out = out)) confirm_raises( lambda: testlib.identity3(out = out_int)) out_discontiguous = np.zeros((4,5,6), dtype=float)[:3,:3,0] confirm(not out_discontiguous.flags['C_CONTIGUOUS']) testlib.identity3(out = out_discontiguous) confirm_equal(ref, out_discontiguous) def test_identity(): r'''Testing identity() This tests much of the named-dimensions-in-output-only logic ''' # This i ref = np.eye(2, dtype=float) ref_int = np.eye(2, dtype=int) out = np.zeros((2,2), dtype=float) out_int = np.zeros((2,2), dtype=int) out32 = np.zeros((3,2), dtype=float) out23 = np.zeros((3,2), dtype=float) confirm_raises(lambda: testlib.identity(), msg='output-only named dimensions MUST be given in the in-place array') confirm_raises(lambda: testlib.identity(out=out_int), msg='types must match') confirm_equal(ref, testlib.identity(out=out), msg='basic output-only named dimensions work') confirm_raises(lambda: testlib.identity(out=out23), msg='output-only named dimensions must still be self-consistent') confirm_raises(lambda: testlib.identity(out=out32), msg='output-only named dimensions must still be self-consistent') out_discontiguous = np.zeros((4,5,6), dtype=float)[:3,:3,0] confirm(not out_discontiguous.flags['C_CONTIGUOUS']) testlib.identity(out = out_discontiguous) confirm_equal(np.eye(3, dtype=float), out_discontiguous) def test_inner(): r'''Testing the broadcasted inner product''' ref = np.array([[[ 30, 255, 730], 
[ 180, 780, 1630]], [[ 180, 780, 1630], [1455, 2430, 3655]], [[ 330, 1305, 2530], [2730, 4080, 5680]], [[ 480, 1830, 3430], [4005, 5730, 7705.0]]]) assertResult_inoutplace( ref, testlib.inner, arr(2,3,5), arr(4,1,3,5) ) output = np.empty((4,2,3), dtype=int) confirm_raises( lambda: testlib.inner( arr( 2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=output ), "inner(out=out, dtype=dtype) have out=dtype==dtype" ) # make sure non-contiguous output works properly output = np.empty((4,2,3), dtype=float) confirm(output.flags['C_CONTIGUOUS']) output = nps.reorder( np.empty((2,3,4), dtype=float), 2,0,1 ) confirm(not output.flags['C_CONTIGUOUS']) confirm_equal( testlib.inner( arr( 2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=output ), ref, msg = 'Noncontiguous output' ) confirm(not output.flags['C_CONTIGUOUS']) confirm_equal( output, ref, msg = 'Noncontiguous output' ) def test_outer(): r'''Testing the broadcasted outer product''' # comes from PDL. numpy has a reversed axis ordering convention from # PDL, so I transpose the array before comparing ref = nps.transpose( np.array([[[[[0,0,0,0,0],[0,1,2,3,4],[0,2,4,6,8],[0,3,6,9,12],[0,4,8,12,16]], [[25,30,35,40,45],[30,36,42,48,54],[35,42,49,56,63],[40,48,56,64,72],[45,54,63,72,81]], [[100,110,120,130,140],[110,121,132,143,154],[120,132,144,156,168],[130,143,156,169,182],[140,154,168,182,196]]], [[[0,0,0,0,0],[15,16,17,18,19],[30,32,34,36,38],[45,48,51,54,57],[60,64,68,72,76]], [[100,105,110,115,120],[120,126,132,138,144],[140,147,154,161,168],[160,168,176,184,192],[180,189,198,207,216]], [[250,260,270,280,290],[275,286,297,308,319],[300,312,324,336,348],[325,338,351,364,377],[350,364,378,392,406]]]], [[[[0,15,30,45,60],[0,16,32,48,64],[0,17,34,51,68],[0,18,36,54,72],[0,19,38,57,76]], [[100,120,140,160,180],[105,126,147,168,189],[110,132,154,176,198],[115,138,161,184,207],[120,144,168,192,216]], [[250,275,300,325,350],[260,286,312,338,364],[270,297,324,351,378],[280,308,336,364,392],[290,319,348,377,406]]], 
[[[225,240,255,270,285],[240,256,272,288,304],[255,272,289,306,323],[270,288,306,324,342],[285,304,323,342,361]], [[400,420,440,460,480],[420,441,462,483,504],[440,462,484,506,528],[460,483,506,529,552],[480,504,528,552,576]], [[625,650,675,700,725],[650,676,702,728,754],[675,702,729,756,783],[700,728,756,784,812],[725,754,783,812,841]]]], [[[[0,30,60,90,120],[0,31,62,93,124],[0,32,64,96,128],[0,33,66,99,132],[0,34,68,102,136]], [[175,210,245,280,315],[180,216,252,288,324],[185,222,259,296,333],[190,228,266,304,342],[195,234,273,312,351]], [[400,440,480,520,560],[410,451,492,533,574],[420,462,504,546,588],[430,473,516,559,602],[440,484,528,572,616]]], [[[450,480,510,540,570],[465,496,527,558,589],[480,512,544,576,608],[495,528,561,594,627],[510,544,578,612,646]], [[700,735,770,805,840],[720,756,792,828,864],[740,777,814,851,888],[760,798,836,874,912],[780,819,858,897,936]], [[1000,1040,1080,1120,1160],[1025,1066,1107,1148,1189],[1050,1092,1134,1176,1218],[1075,1118,1161,1204,1247],[1100,1144,1188,1232,1276]]]], [[[[0,45,90,135,180],[0,46,92,138,184],[0,47,94,141,188],[0,48,96,144,192],[0,49,98,147,196]], [[250,300,350,400,450],[255,306,357,408,459],[260,312,364,416,468],[265,318,371,424,477],[270,324,378,432,486]], [[550,605,660,715,770],[560,616,672,728,784],[570,627,684,741,798],[580,638,696,754,812],[590,649,708,767,826]]], [[[675,720,765,810,855],[690,736,782,828,874],[705,752,799,846,893],[720,768,816,864,912],[735,784,833,882,931]], [[1000,1050,1100,1150,1200],[1020,1071,1122,1173,1224],[1040,1092,1144,1196,1248],[1060,1113,1166,1219,1272],[1080,1134,1188,1242,1296]], [[1375,1430,1485,1540,1595],[1400,1456,1512,1568,1624],[1425,1482,1539,1596,1653],[1450,1508,1566,1624,1682],[1475,1534,1593,1652,1711]]]]])) assertResult_inoutplace( ref, testlib.outer, arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float) ) # make sure non-contiguous output (in both the broadcasting AND within each # slice) works properly output = np.empty((4,2,3,5,5), dtype=float) 
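nps.reorder() produces a non-contiguous view for these checks; the same effect with plain numpy (np.transpose standing in for nps.reorder):

```python
import numpy as np

out = np.transpose(np.empty((2, 3, 4, 5, 5)), (2, 0, 1, 4, 3))
assert out.shape == (4, 2, 3, 5, 5)
assert not out.flags['C_CONTIGUOUS']  # a transposed view is not C-contiguous
```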
confirm(output.flags['C_CONTIGUOUS']) output = nps.reorder( np.empty((2,3,4,5,5), dtype=float), 2,0,1, 4,3) confirm(not output.flags['C_CONTIGUOUS']) confirm_equal( testlib.outer( arr( 2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=output ), ref, msg = 'Noncontiguous output (broadcasting and within each slice)' ) confirm(not output.flags['C_CONTIGUOUS']) confirm_equal( output, ref, msg = 'Noncontiguous output (broadcasting and within each slice)' ) def test_innerouter(): r'''Testing the broadcasted inner product''' ref_inner = np.array([[[ 30, 255, 730], [ 180, 780, 1630]], [[ 180, 780, 1630], [1455, 2430, 3655]], [[ 330, 1305, 2530], [2730, 4080, 5680]], [[ 480, 1830, 3430], [4005, 5730, 7705.0]]]) # comes from PDL. numpy has a reversed axis ordering convention from # PDL, so I transpose the array before comparing ref_outer = nps.transpose( np.array([[[[[0,0,0,0,0],[0,1,2,3,4],[0,2,4,6,8],[0,3,6,9,12],[0,4,8,12,16]], [[25,30,35,40,45],[30,36,42,48,54],[35,42,49,56,63],[40,48,56,64,72],[45,54,63,72,81]], [[100,110,120,130,140],[110,121,132,143,154],[120,132,144,156,168],[130,143,156,169,182],[140,154,168,182,196]]], [[[0,0,0,0,0],[15,16,17,18,19],[30,32,34,36,38],[45,48,51,54,57],[60,64,68,72,76]], [[100,105,110,115,120],[120,126,132,138,144],[140,147,154,161,168],[160,168,176,184,192],[180,189,198,207,216]], [[250,260,270,280,290],[275,286,297,308,319],[300,312,324,336,348],[325,338,351,364,377],[350,364,378,392,406]]]], [[[[0,15,30,45,60],[0,16,32,48,64],[0,17,34,51,68],[0,18,36,54,72],[0,19,38,57,76]], [[100,120,140,160,180],[105,126,147,168,189],[110,132,154,176,198],[115,138,161,184,207],[120,144,168,192,216]], [[250,275,300,325,350],[260,286,312,338,364],[270,297,324,351,378],[280,308,336,364,392],[290,319,348,377,406]]], [[[225,240,255,270,285],[240,256,272,288,304],[255,272,289,306,323],[270,288,306,324,342],[285,304,323,342,361]], [[400,420,440,460,480],[420,441,462,483,504],[440,462,484,506,528],[460,483,506,529,552],[480,504,528,552,576]], 
[[625,650,675,700,725],[650,676,702,728,754],[675,702,729,756,783],[700,728,756,784,812],[725,754,783,812,841]]]], [[[[0,30,60,90,120],[0,31,62,93,124],[0,32,64,96,128],[0,33,66,99,132],[0,34,68,102,136]], [[175,210,245,280,315],[180,216,252,288,324],[185,222,259,296,333],[190,228,266,304,342],[195,234,273,312,351]], [[400,440,480,520,560],[410,451,492,533,574],[420,462,504,546,588],[430,473,516,559,602],[440,484,528,572,616]]], [[[450,480,510,540,570],[465,496,527,558,589],[480,512,544,576,608],[495,528,561,594,627],[510,544,578,612,646]], [[700,735,770,805,840],[720,756,792,828,864],[740,777,814,851,888],[760,798,836,874,912],[780,819,858,897,936]], [[1000,1040,1080,1120,1160],[1025,1066,1107,1148,1189],[1050,1092,1134,1176,1218],[1075,1118,1161,1204,1247],[1100,1144,1188,1232,1276]]]], [[[[0,45,90,135,180],[0,46,92,138,184],[0,47,94,141,188],[0,48,96,144,192],[0,49,98,147,196]], [[250,300,350,400,450],[255,306,357,408,459],[260,312,364,416,468],[265,318,371,424,477],[270,324,378,432,486]], [[550,605,660,715,770],[560,616,672,728,784],[570,627,684,741,798],[580,638,696,754,812],[590,649,708,767,826]]], [[[675,720,765,810,855],[690,736,782,828,874],[705,752,799,846,893],[720,768,816,864,912],[735,784,833,882,931]], [[1000,1050,1100,1150,1200],[1020,1071,1122,1173,1224],[1040,1092,1144,1196,1248],[1060,1113,1166,1219,1272],[1080,1134,1188,1242,1296]], [[1375,1430,1485,1540,1595],[1400,1456,1512,1568,1624],[1425,1482,1539,1596,1653],[1450,1508,1566,1624,1682],[1475,1534,1593,1652,1711]]]]])) # not in-place try: i,o = testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float)) except: confirm(False, msg="broadcasted innerouter succeeded") else: confirm_equal(i.shape, ref_inner.shape, msg="broadcasted innerouter produced correct inner.shape") confirm_equal(i, ref_inner, msg="broadcasted innerouter produced correct inner") confirm_equal(o.shape, ref_outer.shape, msg="broadcasted innerouter produced correct outer.shape") confirm_equal(o, ref_outer, 
msg="broadcasted innerouter produced correct outer") # in-place try: i = np.empty(ref_inner.shape, dtype=float) o = np.empty(ref_outer.shape, dtype=float) testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=(i,o)) except: confirm(False, msg="broadcasted in-place innerouter succeeded") else: confirm(True, msg="broadcasted in-place innerouter succeeded") confirm_equal(i.shape, ref_inner.shape, msg="broadcasted in-place innerouter produced correct inner.shape") confirm_equal(i, ref_inner, msg="broadcasted in-place innerouter produced correct inner") confirm_equal(o.shape, ref_outer.shape, msg="broadcasted in-place innerouter produced correct outer.shape") confirm_equal(o, ref_outer, msg="broadcasted in-place innerouter produced correct outer") # in-place with float scaling try: i = np.empty(ref_inner.shape, dtype=float) o = np.empty(ref_outer.shape, dtype=float) testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=(i,o), scale=3.5) except: confirm(False, msg="broadcasted in-place innerouter succeeded") else: confirm(True, msg="broadcasted in-place innerouter succeeded") confirm_equal(i.shape, ref_inner.shape, msg="broadcasted in-place innerouter with scaling produced correct inner.shape") confirm_equal(i, ref_inner * 3.5, msg="broadcasted in-place innerouter with scaling produced correct inner") confirm_equal(o.shape, ref_outer.shape, msg="broadcasted in-place innerouter with scaling produced correct outer.shape") confirm_equal(o, ref_outer * 3.5, msg="broadcasted in-place innerouter with scaling produced correct outer") # in-place with float scaling and string scaling try: i = np.empty(ref_inner.shape, dtype=float) o = np.empty(ref_outer.shape, dtype=float) testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=(i,o), scale=3.5, scale_string="10.0") except: confirm(False, msg="broadcasted in-place innerouter succeeded") else: confirm(True, msg="broadcasted in-place innerouter succeeded") 
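The out= convention being tested parallels numpy's own: a preallocated array (one per output) receives each result in place. A numpy-only sketch of the same pattern:

```python
import numpy as np

a = np.arange(5.)
inner = np.empty(())      # scalar output, preallocated
outer = np.empty((5, 5))  # matrix output, preallocated
np.einsum('i,i->', a, a, out=inner)
np.einsum('i,j->ij', a, a, out=outer)
assert float(inner) == 30.0               # 0+1+4+9+16
assert np.array_equal(outer, np.outer(a, a))
```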
confirm_equal(i.shape, ref_inner.shape, msg="broadcasted in-place innerouter with float and string scaling produced correct inner.shape") confirm_equal(i, ref_inner * 35., msg="broadcasted in-place innerouter with float and string scaling produced correct inner") confirm_equal(o.shape, ref_outer.shape, msg="broadcasted in-place innerouter with float and string scaling produced correct outer.shape") confirm_equal(o, ref_outer * 35., msg="broadcasted in-place innerouter with float and string scaling produced correct outer") # in-place, with some extra dummy dimensions in the output. Not allowed i = np.empty((1,) + ref_inner.shape, dtype=float) o = np.empty(ref_outer.shape, dtype=float) confirm_raises( lambda: testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=(i,o)), msg="Extra broadcasted dimensions in the output not allowed") # in-place, with some extra dummy dimensions in the output. Not allowed i = np.empty(ref_inner.shape, dtype=float) o = np.empty((1,) + ref_outer.shape, dtype=float) confirm_raises( lambda: testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=(i,o)), msg="Extra broadcasted dimensions in the output not allowed") # now some bogus shapes and types that should fail i = np.empty(ref_inner.shape, dtype=float) o = np.empty(ref_outer.shape, dtype=float) confirm_does_not_raise( lambda: testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=(i,o)), msg = "basic broadcasted innerouter works") confirm_raises(lambda: testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=(i,i)), msg = "in-place broadcasting output dimensions match") confirm_raises(lambda: testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=(o,o)), msg = "in-place broadcasting output dimensions match") confirm_raises(lambda: testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=(o,i)), msg = "in-place broadcasting output dimensions match") confirm_raises(lambda: 
testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=(i,o,i)), msg = "in-place broadcasting output dimensions match") confirm_raises(lambda: testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=(i,i,o)), msg = "in-place broadcasting output dimensions match") confirm_raises(lambda: testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=int), out=(i,o)), msg = "in-place broadcasting output dimensions match") confirm_raises(lambda: testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=o), msg = "in-place broadcasting output dimensions match") confirm_raises(lambda: testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=i), msg = "in-place broadcasting output dimensions match") confirm_raises(lambda: testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=(i,None)), msg = "in-place broadcasting output dimensions match") confirm_raises(lambda: testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=(None,i)), msg = "in-place broadcasting output dimensions match") iint = np.empty(ref_inner.shape, dtype=int) i1 = np.empty((1,) + ref_inner.shape, dtype=float) i2 = np.empty((2,) + ref_inner.shape, dtype=float) confirm_raises(lambda: testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=(iint,o)), msg = "in-place broadcasting output types match") confirm_raises( lambda: testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=(i1,o)), msg = "broadcasted innerouter: extra output dims are forbidden") confirm_raises(lambda: testlib.innerouter(arr( 2,3,5, dtype=float), arr(4,1,3,5, dtype=float), out=(i2,o)), msg = "in-place broadcasting output dimensions match") confirm_does_not_raise(lambda: testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float), scale=3.5), msg = 'Validation looks at the cookie') confirm_raises(lambda: testlib.innerouter(arr(2,3,5, dtype=float), arr(4,1,3,5, dtype=float), 
scale=-3.5), msg = 'Validation looks at the cookie') def test_sorted_indices(): x64 = np.array((1., 5., 3, 2.5, 3.5, 2.9), dtype=float) x32 = np.array((1., 5., 3, 2.5, 3.5, 2.9), dtype=np.float32) iref = np.array((0, 3, 5, 2, 4, 1), dtype=int) confirm_raises(lambda: testlib.sorted_indices(np.arange(5, dtype=int))) confirm_does_not_raise(lambda: testlib.sorted_indices(np.arange(5, dtype=np.float32))) confirm_does_not_raise(lambda: testlib.sorted_indices(np.arange(5, dtype=np.float32), out=np.arange(5, dtype=np.int32))) confirm_raises(lambda: testlib.sorted_indices(np.arange(5, dtype=np.float32), out=np.arange(5, dtype=int))) confirm_raises(lambda: testlib.sorted_indices(np.arange(5, dtype=np.float32), out=np.arange(5, dtype=float))) assertResult_inoutplace( iref, testlib.sorted_indices, x64, out_inplace_dtype=np.int32) assertResult_inoutplace( iref, testlib.sorted_indices, x32, out_inplace_dtype=np.int32) def test_broadcasting(): assertValueShape( np.array(5), (), testlib.inner, arr(3), arr(3)) assertValueShape( np.array((5,14)), (2,), testlib.inner, arr(2,3), arr(3)) assertValueShape( np.array((5,14)), (2,), testlib.inner, arr(3), arr(2,3)) assertValueShape( np.array(((5,14),)), (1,2,), testlib.inner, arr(1,2,3), arr(3)) assertValueShape( np.array(((5,),(14,))), (2,1,), testlib.inner, arr(2,1,3), arr(3)) assertValueShape( np.array((5,14)), (2,), testlib.inner, arr(2,3), arr(1,3)) assertValueShape( np.array((5,14)), (2,), testlib.inner, arr(1,3), arr(2,3)) assertValueShape( np.array(((5,14),)), (1,2,), testlib.inner, arr(1,2,3), arr(1,3)) assertValueShape( np.array(((5,),(14,))), (2,1,), testlib.inner, arr(2,1,3), arr(1,3)) assertValueShape( np.array(((5,14),(14,50))), (2,2,), testlib.inner, arr(2,1,3), arr(2,3)) assertValueShape( np.array(((5,14),(14,50))), (2,2,), testlib.inner, arr(2,1,3), arr(1,2,3)) confirm_raises( lambda: testlib.inner(arr(3)), msg='right number of args' ) confirm_raises( lambda: testlib.inner(arr(3),arr(5)), msg='matching args') 
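The expected shapes in these checks follow numpy's broadcasting rules applied to the extra (non-prototype) dimensions; a pure-numpy cross-check (np.broadcast_shapes needs numpy >= 1.20):

```python
import numpy as np

# extra dims (2,1) and (2,) broadcast to (2,2)
assert np.broadcast_shapes((2, 1), (2,)) == (2, 2)

# inner product over the last axis, broadcasting the rest, as einsum
a = np.arange(6.).reshape(2, 1, 3)
b = np.arange(6.).reshape(2, 3)
r = np.einsum('...i,...i->...', a, b)
assert r.shape == (2, 2)
assert r[0, 0] == 5.0 and r[1, 1] == 50.0  # matches the ((5,14),(14,50)) case
```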
    confirm_raises( lambda: testlib.inner(arr(2,3),arr(4,3)),     msg='matching args')
    confirm_raises( lambda: testlib.inner(arr(3,3,3),arr(2,1,3)), msg='matching args')
    confirm_raises( lambda: testlib.inner(arr(1,2,4),arr(2,1,3)), msg='matching args')

    # make sure the output COUNTS are checked (if I expect 2 outputs, but get
    # only 1, that's an error)
    confirm( testlib.innerouter(arr(5), arr(  5)) is not None, msg='output count check' )
    confirm( testlib.innerouter(arr(5), arr(2,5)) is not None, msg='output count check' )
    confirm( testlib.innerouter(arr(5), arr(  5)) is not None, msg='output dimensionality check with given out' )

    # Basic out_kwarg tests. More thorough ones later, in
    # test_broadcasting_into_output()
    a5   = arr(      5, dtype=float)
    a25  = arr(2,    5, dtype=float)
    a125 = arr(1, 2, 5, dtype=float)

    o    = np.zeros((),        dtype=float)
    o2   = np.zeros((2,),      dtype=float)
    o5   = np.zeros((5,),      dtype=float)
    o12  = np.zeros((1,2),     dtype=float)
    o22  = np.zeros((2,2),     dtype=float)
    o55  = np.zeros((5,5),     dtype=float)
    o25  = np.zeros((2,5),     dtype=float)
    o255 = np.zeros((2,5,5),   dtype=float)
    o1255= np.zeros((1,2,5,5), dtype=float)

    # no broadcasting
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a5, out=o), \
                    msg='output dimensionality check with given out' )
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a5, out=o2), \
                    msg='output dimensionality check with given out' )
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a5, out=(o,)), \
                    msg='output dimensionality check with given out' )
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a5, out=(o55,)), \
                    msg='output dimensionality check with given out' )
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a5, out=(o55,o)), \
                    msg='output dimensionality check with given out' )
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a5, out=(o,o2)), \
                    msg='output dimensionality check with given out' )
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a5, out=(o,o5)), \
                    msg='output dimensionality check with given out' )
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a5, out=(o2,o55)), \
                    msg='output dimensionality check with given out' )
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a5, out=(o,o55,o)), \
                    msg='output dimensionality check with given out' )
    confirm( testlib.innerouter(a5, a5, out=(o,o55)) is not None,
             msg='output dimensionality check with given out' )
    confirm_equal(o,   a5.dot(a5),      msg='in-place broadcasting computed the right value')
    confirm_equal(o55, np.outer(a5,a5), msg='in-place broadcasting computed the right value')

    # two broadcasted slices
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a25, out=o),
                    msg='output dimensionality check with given out' )
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a25, out=o2),
                    msg='output dimensionality check with given out' )
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a25, out=(o,)),
                    msg='output dimensionality check with given out' )
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a25, out=(o55,)),
                    msg='output dimensionality check with given out' )
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a25, out=(o55,o)),
                    msg='output dimensionality check with given out' )
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a25, out=(o,o2)),
                    msg='output dimensionality check with given out' )
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a25, out=(o,o5)),
                    msg='output dimensionality check with given out' )
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a25, out=(o2,o55)),
                    msg='output dimensionality check with given out' )
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a25, out=(o,o55,o)),
                    msg='output dimensionality check with given out' )
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a25, out=(o,o55)),
                    msg='output dimensionality check with given out' )
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a25, out=(o,o255)),
                    msg='output dimensionality check with given out' )
    confirm( testlib.innerouter(a5, a25, out=(o2,o255)) is not None,
             msg='output dimensionality check with given out' )
    confirm_equal(o2,   nps.inner(a5,a25), msg='in-place broadcasting computed the right value')
    confirm_equal(o255, nps.outer(a5,a25), msg='in-place broadcasting computed the right value')

    # Non-contiguous data should work with inner and outer, but not innerouter
    # (that's what the underlying C library does/does not support)
    a2                              = arr(2, dtype=float)
    a25_noncontiguous               = arr(5, 2, dtype=float).T
    o255_noncontiguous              = nps.transpose(np.zeros((2,5,5), dtype=float))
    o255_noncontiguous_in_broadcast = np.zeros((2,2,5,5), dtype=float)[:,0,:,:]

    confirm_does_not_raise(lambda: testlib.inner     (a25_noncontiguous, a5),
                           msg='Validation: noncontiguous in the function slice')
    confirm_does_not_raise(lambda: testlib.outer     (a25_noncontiguous, a5),
                           msg='Validation: noncontiguous in the function slice')
    confirm_does_not_raise(lambda: testlib.outer     (a25_noncontiguous, a5, out=o255_noncontiguous),
                           msg='Validation: noncontiguous in the function slice')
    confirm_raises        (lambda: testlib.innerouter(a25_noncontiguous, a5),
                           msg='Validation: noncontiguous in the function slice')
    confirm_does_not_raise(lambda: testlib.innerouter(a25, a5, out=(a2, o255)),
                           msg='Validation: noncontiguous in the function slice')
    confirm_raises        (lambda: testlib.innerouter(a25, a5, out=(a2, o255_noncontiguous)),
                           msg='Validation: noncontiguous in the function slice')
    confirm_does_not_raise(lambda: testlib.innerouter(a25, a5, out=(a2, o255_noncontiguous_in_broadcast)),
                           msg='Validation: noncontiguous arrays that are noncontiguous ONLY in the broadcasted dimensions (i.e. each slice IS contiguous)')

    # Extra slices in the output not allowed
    confirm_does_not_raise( lambda: \
                            testlib.innerouter(a5, a25, out=(o2,o255)),
                            msg='output dimensionality check with given out' )
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a125, out=(o2,o255)),
                    msg='output dimensionality check with given out' )
    confirm_does_not_raise( lambda: \
                            testlib.innerouter(a5, a125, out=(o12,o1255)),
                            msg='output dimensionality check with given out' )
    confirm_raises( lambda: \
                    testlib.innerouter(a5, a125, out=(o12,o2255)),
                    msg='output dimensionality check with given out' )

test_identity3()
test_identity()
test_inner()
test_outer()
test_innerouter()
test_broadcasting()
test_sorted_indices()

finish()

numpysane-0.35/test/test-numpysane.py
#!/usr/bin/python2

import sys
import os

dir_path = os.path.dirname(os.path.realpath(__file__))
sys.path[:0] = dir_path + '/..',

import numpy as np
import numpysane as nps
from functools import reduce

# Local test harness.
# The python standard ones all suck
from testutils import *

def arr(*shape, **kwargs):
    dtype = kwargs.get('dtype',float)
    r'''Return an arange() array of the given shape.'''
    if len(shape) == 0:
        return np.array(3, dtype=dtype)
    product = reduce( lambda x,y: x*y, shape)
    return np.arange(product, dtype=dtype).reshape(*shape)

def test_broadcasting():
    # first check invalid broadcasting defines
    def define_f_broken1():
        @nps.broadcast_define( (('n',), ('n',())), ('m') )
        def f_broken(a, b):
            return a.dot(b)
    confirm_raises( define_f_broken1, msg="input dims must be integers or strings" )

    def define_f_broken2():
        @nps.broadcast_define( (('n',), ('n',-1)), () )
        def f_broken(a, b):
            return a.dot(b)
    confirm_raises( define_f_broken2, msg="input dims must >=0" )

    def define_f_broken3():
        @nps.broadcast_define( (('n',), ('n',)), (-1,) )
        def f_broken(a, b):
            return a.dot(b)
    confirm_raises( define_f_broken3, msg="output dims must >=0" )

    def define_f_broken4():
        @nps.broadcast_define( (('n',), ('n',)), ('m', ()) )
        def f_broken(a, b):
            return a.dot(b)
    confirm_raises( define_f_broken4, msg="output dims must be integers or strings" )

    def define_f_broken5():
        @nps.broadcast_define( (('n',), ('n',)), ('m') )
        def f_broken(a, b):
            return a.dot(b)
    confirm_raises( define_f_broken5, msg="output dims must all be known" )

    def define_f_broken6():
        @nps.broadcast_define( (('n',), ('n',)), 'n' )
        def f_broken(a, b):
            return a.dot(b)
    confirm_raises( define_f_broken6, msg="output dims must be a tuple" )

    def define_f_broken7():
        @nps.broadcast_define( (('n',), ('n',)), ('n',), ('n',) )
        def f_broken(a, b):
            return a.dot(b)
    confirm_raises( define_f_broken7, msg="multiple outputs must be specified as a tuple of tuples" )

    def define_f_broken8():
        @nps.broadcast_define( (('n',), ('n',)), (('n',), 'n') )
        def f_broken(a, b):
            return a.dot(b)
    confirm_raises( define_f_broken8, msg="output dims must be a tuple" )

    def define_f_good9():
        @nps.broadcast_define( (('n',), ('n',)), (('n',), ('n',)) )
        def f_broken(a, b):
            return a.dot(b)
        return True
    confirm( define_f_good9, msg="Multiple outputs can be defined" )

    r'''Checking broadcasting rules.'''
    @nps.broadcast_define( (('n',), ('n',)) )
    def f1(a, b):
        r'''Basic inner product.'''
        return a.dot(b)

    assertValueShape( np.array(5),                (),     f1, arr(3),     arr(3))
    assertValueShape( np.array((5,14)),           (2,),   f1, arr(2,3),   arr(3))
    assertValueShape( np.array((5,14)),           (2,),   f1, arr(3),     arr(2,3))
    assertValueShape( np.array(((5,14),)),        (1,2,), f1, arr(1,2,3), arr(3))
    assertValueShape( np.array(((5,),(14,))),     (2,1,), f1, arr(2,1,3), arr(3))
    assertValueShape( np.array((5,14)),           (2,),   f1, arr(2,3),   arr(1,3))
    assertValueShape( np.array((5,14)),           (2,),   f1, arr(1,3),   arr(2,3))
    assertValueShape( np.array(((5,14),)),        (1,2,), f1, arr(1,2,3), arr(1,3))
    assertValueShape( np.array(((5,),(14,))),     (2,1,), f1, arr(2,1,3), arr(1,3))
    assertValueShape( np.array(((5,14),(14,50))), (2,2,), f1, arr(2,1,3), arr(2,3))
    assertValueShape( np.array(((5,14),(14,50))), (2,2,), f1, arr(2,1,3), arr(1,2,3))

    confirm_raises( lambda: f1(arr(3)),                msg='right number of args' )
    confirm_raises( lambda: f1(arr(3),arr(5)),         msg='matching args')
    confirm_raises( lambda: f1(arr(2,3),arr(4,3)),     msg='matching args')
    confirm_raises( lambda: f1(arr(3,3,3),arr(2,1,3)), msg='matching args')
    confirm_raises( lambda: f1(arr(1,2,4),arr(2,1,3)), msg='matching args')

    # fancier function, has some preset dimensions
    @nps.broadcast_define( ((3,), ('n',3), ('n',), ('m',)) )
    def f2(a,b,c,d):
        return d

    n=4
    m=6
    d = np.arange(m)

    assertValueShape( d,              (m,),   f2, arr(   3), arr(   n,3), arr( n), arr(   m))
    assertValueShape( np.array((d,)), (1,m),  f2, arr(1, 3), arr(1, n,3), arr( n), arr(1, m))
    assertValueShape( np.array((d,)), (1,m,), f2, arr(1, 3), arr(1, n,3), arr( n), arr(   m))
    assertValueShape( np.array((d,d+m,d+2*m,d+3*m,d+4*m)), (5,m),
                      f2, arr(5, 3), arr(5, n,3), arr( n), arr(5, m))
    assertValueShape( np.array(((d,d+m,d+2*m,d+3*m,d+4*m),)), (1,5,m),
                      f2, arr(1,5, 3), arr( 5, n,3), arr( n), arr( 5, m))
    assertValueShape(
        np.array(((d,d+m,d+2*m,d+3*m,d+4*m),
                  (d,d+m,d+2*m,d+3*m,d+4*m))), (2,5,m),
        f2, arr(1,5, 3), arr(2,5, n,3), arr( n), arr( 5, m))
    assertValueShape(
        np.array(((d,d+m,d+2*m,d+3*m,d+4*m),
                  (d,d+m,d+2*m,d+3*m,d+4*m))), (2,5,m),
        f2, arr(1,5, 3), arr(2,1, n,3), arr( n), arr( 5, m))
    assertValueShape(
        np.array((((d,d,d,d,d),
                   (d,d,d,d,d)),)), (1,2,5,m),
        f2, arr(1,1,5, 3), arr(1,2,1, n,3), arr(1, n), arr(1, 1, m))

    confirm_raises( lambda: f2( arr(5, 3), arr(5, n,3), arr(  m), arr(5, m)), msg='matching args')
    confirm_raises( lambda: f2( arr(5, 2), arr(5, n,3), arr(  n), arr(5, m)), msg='matching args')
    confirm_raises( lambda: f2( arr(5, 2), arr(5, n,2), arr(  n), arr(5, m)), msg='matching args')
    confirm_raises( lambda: f2( arr(1, 3), arr(1, n,3), arr(5*n), arr(1, m)), msg='matching args')

    # Make sure extra args and the kwargs are passed through
    @nps.broadcast_define( ((3,), ('n',3), ('n',), ('m',)) )
    def f3(a,b,c,d, e,f, *args, **kwargs):
        def val_or_0(x): return x if x else 0
        return np.array( (a[0],
                          val_or_0(e),
                          val_or_0(f),
                          val_or_0(args[0]),
                          val_or_0( kwargs.get('xxx'))) )

    assertValueShape( np.array( ((0, 1, 2, 3, 6),
                                 (3, 1, 2, 3, 6)) ), (2,5),
                      f3,
                      arr(2, 3), arr(1, n,3), arr( n), arr( m),
                      1, 2, 3, 4., dummy=5, xxx=6)

    # Make sure scalars (0-dimensional array) can broadcast
    @nps.broadcast_define( (('n',), ('n','m'), (2,), ()) )
    def f4(a,b,c,d):
        return d
    @nps.broadcast_define( (('n',), ('n','m'), (2,), ()) )
    def f5(a,b,c,d):
        return nps.glue( c, d, axis=-1 )

    assertValueShape( np.array((5,5)), (2,),
                      f4,
                      arr(   3), arr(1, 3,4), arr(2, 2), np.array(5))
    assertValueShape( np.array((5,5)), (2,),
                      f4,
                      arr(   3), arr(1, 3,4), arr(2, 2), 5)
    assertValueShape( np.array(((0,1,5),(2,3,5))), (2,3),
                      f5,
                      arr(   3), arr(1, 3,4), arr(2, 2), np.array(5))
    assertValueShape( np.array(((0,1,5),(2,3,5))), (2,3),
                      f5,
                      arr(   3), arr(1, 3,4), arr(2, 2), 5)
    confirm_raises( lambda: f5( arr(   3), arr(1, 3,4), arr(2, 2), arr(5)) )

    # Test the generator
    prototype = (('n',), ('n','m'), (2,), ())
    i=0
    args = (arr( 3), arr(1,
                            3,4), arr(2, 2), np.array(5))
    for s in nps.broadcast_generate(prototype, args):
        confirm_equal( arr(3),       s[0] )
        confirm_equal( arr(3,4),     s[1] )
        confirm_equal( arr(2) + 2*i, s[2] )
        confirm_equal( np.array(5),  s[3] )
        confirm_equal( s[3].shape, ())
        i = i+1
    confirm_equal( nps.broadcast_extra_dims(prototype,args),
                   (2,))

    i=0
    args = (arr( 3), arr(1, 3,4), arr(2, 2), 5)
    for s in nps.broadcast_generate(prototype, args):
        confirm_equal( arr(3),       s[0] )
        confirm_equal( arr(3,4),     s[1] )
        confirm_equal( arr(2) + 2*i, s[2] )
        confirm_equal( np.array(5),  s[3] )
        confirm_equal( s[3].shape, ())
        i = i+1
    confirm_equal( nps.broadcast_extra_dims(prototype,args),
                   (2,))

    i=0
    args = (arr( 3), arr(1, 3,4), arr(2, 2), arr(2))
    for s in nps.broadcast_generate(prototype, args):
        confirm_equal( arr(3),       s[0] )
        confirm_equal( arr(3,4),     s[1] )
        confirm_equal( arr(2) + 2*i, s[2] )
        confirm_equal( np.array(i),  s[3] )
        confirm_equal( s[3].shape, ())
        i = i+1
    confirm_equal( nps.broadcast_extra_dims(prototype,args),
                   (2,))

    confirm_equal( nps.broadcast_extra_dims((('n',), ('n',)),
                                            (arr(2,3), arr(5,1,3))),
                   (5,2))

    # Make sure we add dummy length-1 dimensions
    assertValueShape( None, (3,),      nps.matmult,  arr(4),   arr(4,3) )
    assertValueShape( None, (3,),      nps.matmult2, arr(4),   arr(4,3) )
    assertValueShape( None, (1,3,),    nps.matmult,  arr(1,4), arr(4,3) )
    assertValueShape( None, (1,3,),    nps.matmult2, arr(1,4), arr(4,3) )
    assertValueShape( None, (10,3,),   nps.matmult,  arr(4),   arr(10,4,3) )
    assertValueShape( None, (10,3,),   nps.matmult2, arr(4),   arr(10,4,3) )
    assertValueShape( None, (10,1,3,), nps.matmult,  arr(1,4), arr(10,4,3) )
    assertValueShape( None, (10,1,3,), nps.matmult2, arr(1,4), arr(10,4,3) )

    # scalar output shouldn't barf
    @nps.broadcast_define( ((),), )
    def f6(x):
        return 6
    @nps.broadcast_define( ((),), ())
    def f7(x):
        return 7
    assertValueShape( 6,               (),   f6, 5)
    assertValueShape( 6*np.ones((5,)), (5,), f6, np.arange(5))
    assertValueShape( 7,               (),   f7, 5)
    assertValueShape( 7*np.ones((5,)), (5,), f7, np.arange(5))

    # make sure the
    # output dimensionality is checked
    @nps.broadcast_define( (('n',), ('n',)), ('n',) )
    def f8(a, b):
        return a.dot(b)
    confirm_raises( lambda: f8(arr(5), arr(  5)), msg='output dimensionality check' )
    confirm_raises( lambda: f8(arr(5), arr(2,5)), msg='output dimensionality check' )

    # make sure the output COUNTS are checked (if I expect 2 outputs, but get
    # only 1, that's an error)
    @nps.broadcast_define( (('n',), ('n',)) )
    def f9(a, b):
        return a.dot(b),nps.outer(a,b)
    confirm_raises( lambda: f9(arr(5), arr(  5)), msg='output count check' )
    confirm_raises( lambda: f9(arr(5), arr(2,5)), msg='output count check' )

    @nps.broadcast_define( (('n',), ('n',)), ('n',) )
    def f10(a, b):
        return a.dot(b),nps.outer(a,b)
    confirm_raises( lambda: f10(arr(5), arr(  5)), msg='output count check' )
    confirm_raises( lambda: f10(arr(5), arr(2,5)), msg='output count check' )

    @nps.broadcast_define( (('n',), ('n',)), ('n', 'n') )
    def f11(a, b):
        return a.dot(b),nps.outer(a,b)
    confirm_raises( lambda: f11(arr(5), arr(  5)), msg='output count check' )
    confirm_raises( lambda: f11(arr(5), arr(2,5)), msg='output count check' )

    @nps.broadcast_define( (('n',), ('n',)), (('n', 'n'),) )
    def f11(a, b):
        return a.dot(b),nps.outer(a,b)
    confirm_raises( lambda: f11(arr(5), arr(  5)), msg='output count check' )
    confirm_raises( lambda: f11(arr(5), arr(2,5)), msg='output count check' )

    @nps.broadcast_define( (('n',), ('n',)), (('n', 'n'),('n',)) )
    def f12(a, b):
        return a.dot(b),nps.outer(a,b)
    confirm_raises( lambda: f12(arr(5), arr(  5)), msg='output count check' )
    confirm_raises( lambda: f12(arr(5), arr(2,5)), msg='output count check' )

    @nps.broadcast_define( (('n',), ('n',)), (('n',),('n', 'n')) )
    def f13(a, b):
        return a.dot(b),nps.outer(a,b)
    confirm_raises( lambda: f13(arr(5), arr(  5)), msg='output dimensionality check' )
    confirm_raises( lambda: f13(arr(5), arr(2,5)), msg='output dimensionality check' )

    @nps.broadcast_define( (('n',), ('n',)), ((),('n', 'n',)) )
    def f13(a, b):
        return a.dot(b),nps.outer(a,b)
    confirm(
             f13(arr(5), arr(  5)) is not None, msg='output count check' )
    confirm( f13(arr(5), arr(2,5)) is not None, msg='output count check' )

    # check output dimensionality with an 'out' kwarg
    @nps.broadcast_define( (('n',), ('n',)), ('n', 'n'), out_kwarg = 'out')
    def f14_oneoutput(a, b, out):
        if out is None:
            return nps.outer(a,b)
        nps.outer(a,b,out=out)

    @nps.broadcast_define( (('n',), ('n',)), ((),('n', 'n')), out_kwarg = 'out')
    def f14(a, b, out):
        if out is None:
            return a.dot(b),nps.outer(a,b)
        if not isinstance(out,tuple) or len(out) != 2:
            raise Exception("'out' must be a tuple")
        nps.inner(a,b,out=out[0])
        nps.outer(a,b,out=out[1])

    confirm( f14(arr(5), arr(  5)) is not None, msg='output dimensionality check with out_kwarg' )

    # Basic out_kwarg tests. More thorough ones later, in
    # test_broadcasting_into_output()
    a5   = arr(   5, dtype=float)
    a25  = arr(2, 5, dtype=float)
    o    = np.zeros((),      dtype=float)
    o2   = np.zeros((2,),    dtype=float)
    o5   = np.zeros((5,),    dtype=float)
    o55  = np.zeros((5,5),   dtype=float)
    o25  = np.zeros((2,5),   dtype=float)
    o255 = np.zeros((2,5,5), dtype=float)

    # no broadcasting
    confirm_raises( lambda: f14(a5, a5, out=o),         msg='output dimensionality check with out_kwarg' )
    confirm_raises( lambda: f14(a5, a5, out=o2),        msg='output dimensionality check with out_kwarg' )
    confirm_raises( lambda: f14(a5, a5, out=(o,)),      msg='output dimensionality check with out_kwarg' )
    confirm_raises( lambda: f14(a5, a5, out=(o55,)),    msg='output dimensionality check with out_kwarg' )
    confirm_raises( lambda: f14(a5, a5, out=(o55,o)),   msg='output dimensionality check with out_kwarg' )
    confirm_raises( lambda: f14(a5, a5, out=(o,o2)),    msg='output dimensionality check with out_kwarg' )
    confirm_raises( lambda: f14(a5, a5, out=(o,o5)),    msg='output dimensionality check with out_kwarg' )
    confirm_raises( lambda: f14(a5, a5, out=(o2,o55)),  msg='output dimensionality check with out_kwarg' )
    confirm_raises( lambda: f14(a5, a5, out=(o,o55,o)), msg='output dimensionality check with out_kwarg' )
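    # An illustrative aside, not part of the original test suite: the pattern
    # f14() above follows with its 'out' argument, written in plain numpy.
    # When out is None the function allocates its results; otherwise it writes
    # into the caller-provided arrays and returns those same arrays. The
    # helper name is hypothetical.
    def plain_innerouter(a, b, out=None):
        if out is None:
            return a.dot(b), np.outer(a,b)
        out[0][...] = a.dot(b)         # fill the 0-dimensional inner-product output
        np.outer(a, b, out=out[1])     # fill the outer-product output in place
        return out
    _oi  = np.zeros((),    dtype=float)
    _oo  = np.zeros((5,5), dtype=float)
    _ret = plain_innerouter(np.arange(5, dtype=float), np.arange(5, dtype=float),
                            out=(_oi,_oo))
    confirm( _ret[0] is _oi and _ret[1] is _oo,
             msg='in-place pattern returns the caller-given arrays' )
    confirm_equal( _oi, 30.0, msg='in-place pattern computed the inner product' )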
    confirm_does_not_raise( lambda: f14_oneoutput(a5, a5, out=o55),
                            msg='output dimensionality check with out_kwarg' )
    confirm_equal(o55, np.outer(a5,a5), msg='in-place broadcasting computed the right value')
    confirm_equal( o55,
                   f14_oneoutput(a5, a5),
                   msg='in-place broadcasting computed the right value')
    confirm_does_not_raise( lambda: f14(a5, a5, out=(o,o55)),
                            msg='output dimensionality check with out_kwarg' )
    confirm_equal(o,   a5.dot(a5),      msg='in-place broadcasting computed the right value')
    confirm_equal(o55, np.outer(a5,a5), msg='in-place broadcasting computed the right value')

    # two broadcasted slices
    confirm_raises( lambda: f14(a5, a25, out=o),         msg='output dimensionality check with out_kwarg' )
    confirm_raises( lambda: f14(a5, a25, out=o2),        msg='output dimensionality check with out_kwarg' )
    confirm_raises( lambda: f14(a5, a25, out=(o,)),      msg='output dimensionality check with out_kwarg' )
    confirm_raises( lambda: f14(a5, a25, out=(o55,)),    msg='output dimensionality check with out_kwarg' )
    confirm_raises( lambda: f14(a5, a25, out=(o55,o)),   msg='output dimensionality check with out_kwarg' )
    confirm_raises( lambda: f14(a5, a25, out=(o,o2)),    msg='output dimensionality check with out_kwarg' )
    confirm_raises( lambda: f14(a5, a25, out=(o,o5)),    msg='output dimensionality check with out_kwarg' )
    confirm_raises( lambda: f14(a5, a25, out=(o2,o55)),  msg='output dimensionality check with out_kwarg' )
    confirm_raises( lambda: f14(a5, a25, out=(o,o55,o)), msg='output dimensionality check with out_kwarg' )
    confirm_raises( lambda: f14(a5, a25, out=(o,o55)),   msg='output dimensionality check with out_kwarg' )
    confirm_raises( lambda: f14(a5, a25, out=(o2,o55)),  msg='output dimensionality check with out_kwarg' )
    confirm_raises( lambda: f14(a5, a25, out=(o,o255)),  msg='output dimensionality check with out_kwarg' )
    confirm_does_not_raise( lambda: f14(a5, a25, out=(o2,o255)),
                            msg='output dimensionality check with out_kwarg' )
    confirm_equal(o2,   nps.inner(a5,a25), msg='in-place broadcasting computed the right value')
    confirm_equal(o255, nps.outer(a5,a25), msg='in-place broadcasting computed the right value')

    (o2,o255) = f14(a5, a25)
    confirm_equal(o2,   nps.inner(a5,a25), msg='in-place broadcasting computed the right value')
    confirm_equal(o255, nps.outer(a5,a25), msg='in-place broadcasting computed the right value')

def test_broadcasting_into_output():
    r'''Checking broadcasting with the output array defined.'''

    # I think about all 2^5 = 32 combinations:
    #
    # broadcast_define(): yes/no prototype_output, out_kwarg, single/multiple outputs
    # broadcasted call:   yes/no dtype, output

    prototype_input = (('n',), ('n',))
    in1, in2 = arr(3), arr(2,4,3)

    out_inner_ref = np.array([[ 5, 14, 23, 32],
                              [41, 50, 59, 68]])
    out_outer_ref = np.array([[[[ 0.,  0.,  0.], [ 0.,  1.,  2.], [ 0.,  2.,  4.]],
                               [[ 0.,  0.,  0.], [ 3.,  4.,  5.], [ 6.,  8., 10.]],
                               [[ 0.,  0.,  0.], [ 6.,  7.,  8.], [12., 14., 16.]],
                               [[ 0.,  0.,  0.], [ 9., 10., 11.], [18., 20., 22.]]],
                              [[[ 0.,  0.,  0.], [12., 13., 14.], [24., 26., 28.]],
                               [[ 0.,  0.,  0.], [15., 16., 17.], [30., 32., 34.]],
                               [[ 0.,  0.,  0.], [18., 19., 20.], [36., 38., 40.]],
                               [[ 0.,  0.,  0.], [21., 22., 23.], [42., 44., 46.]]]])

    def f_inner(a, b, out=None, dtype=None):
        r'''Basic inner product.'''
        if out is None:
            if dtype is None:
                return a.dot(b)
            else:
                return a.dot(b).astype(dtype)
        if f_inner.do_dtype_check:
            if dtype is not None:
                confirm_equal( out.dtype, dtype )
        if f_inner.do_base_check:
            if f_inner.base is not None:
                confirm_is(out.base, f_inner.base)
                f_inner.base_check_count = f_inner.base_check_count+1
            else:
                f_inner.base_check_count = 0
            f_inner.base = out.base
        if f_inner.do_dim_check:
            if out.shape != ():
                raise nps.NumpysaneError("mismatched lists: {} and {}". \
                                         format(out.shape, ()))
        out.setfield(a.dot(b), out.dtype)
        return out

    def f_inner_outer(a, b, out=None, dtype=None):
        r'''Basic inner AND outer product.'''
        if out is None:
            if dtype is None:
                return a.dot(b),np.outer(a,b)
            else:
                return a.dot(b).astype(dtype),np.outer(a,b).astype(dtype)
        if f_inner_outer.do_dtype_check:
            if dtype is not None:
                confirm_equal( out[0].dtype, dtype )
                confirm_equal( out[1].dtype, dtype )
        if f_inner_outer.do_base_check:
            if f_inner_outer.base is not None:
                confirm_is(out[0].base, f_inner_outer.base[0])
                confirm_is(out[1].base, f_inner_outer.base[1])
                f_inner_outer.base_check_count = f_inner_outer.base_check_count+1
            else:
                f_inner_outer.base_check_count = 0
            f_inner_outer.base = [o.base for o in out]
        if f_inner_outer.do_dim_check:
            if len(out) != 2:
                raise nps.NumpysaneError("mismatched Noutput")
            if out[0].shape != ():
                raise nps.NumpysaneError("mismatched dimensions in output 0")
            if out[1].shape != (a.shape[0],b.shape[0]):
                raise nps.NumpysaneError("mismatched dimensions in output 1")
        out[0].setfield(a.dot(b),      out[0].dtype)
        out[1].setfield(np.outer(a,b), out[1].dtype)
        return out

    for f, out_ref, prototype_output, prototype_output_bad in \
        ((f_inner,       out_inner_ref,                 (),         (1,)),
         (f_inner_outer, (out_inner_ref, out_outer_ref), ((),(3,3)), ((),(3,3),(1,))), ):

        def confirm_call_out_values(f, *args, **kwargs):
            try:
                out = f(*args, **kwargs)
                if not isinstance(out_ref, tuple):
                    confirm_equal(out,       out_ref,       "Output matches")
                    confirm_equal(out.shape, out_ref.shape, "Output shape matches")
                else:
                    for i in range(len(out_ref)):
                        confirm_equal(out[i],       out_ref[i],       "Output matches")
                        confirm_equal(out[i].shape, out_ref[i].shape, "Output shape matches")
            except:
                confirm(False, msg='broadcasted function call')

        multiple_outputs = False
        try:
            if isinstance(prototype_output[0], tuple):
                multiple_outputs = True
        except:
            pass

        def confirm_call_out_values(f, *args, **kwargs):
            try:
                out = f(*args, **kwargs)
                if not isinstance(out_ref, tuple):
                    confirm_equal(out,       out_ref,       "Output matches")
                    confirm_equal(out.shape, out_ref.shape, "Output shape matches")
                else:
                    for i in range(len(out_ref)):
                        confirm_equal(out[i],       out_ref[i],       "Output matches")
                        confirm_equal(out[i].shape, out_ref[i].shape, "Output shape matches")
            except:
                confirm(False, msg='broadcasted function call')

        # First we look at the case where broadcast_define() has no out_kwarg.
        # Then the output cannot be specified at all. If prototype_output
        # exists, then it is either used to create the output array, or to
        # validate the dimensions of output slices obtained from elsewhere. The
        # dtype is simply passed through to the inner function, is free to use
        # it, to not use it, or to crash in response (the f() function above
        # will take it; created arrays will be of that type; passed-in arrays
        # will create an error for a wrong type)
        f1 = nps.broadcast_define(prototype_input)                                        (f)
        f2 = nps.broadcast_define(prototype_input, prototype_output=prototype_output)     (f)
        f3 = nps.broadcast_define(prototype_input, prototype_output=prototype_output_bad) (f)

        f.do_base_check  = False
        f.do_dtype_check = False
        f.do_dim_check   = True

        if not multiple_outputs:
            confirm_call_out_values(f1, in1, in2)
            confirm_call_out_values(f1, in1, in2, dtype=float)
            confirm_call_out_values(f1, in1, in2, dtype=int)
        else:
            confirm_raises(lambda: f1(in1, in2))
            confirm_raises(lambda: f1(in1, in2, dtype=float))
            confirm_raises(lambda: f1(in1, in2, dtype=int))

        confirm_call_out_values(f2, in1, in2)
        confirm_raises ( lambda: f3(in1, in2) )

        # OK then. Let's now pass in an out_kwarg. Here we do not yet
        # pre-allocate an output. Thus if we don't pass in a prototype_output
        # either, the first slice will dictate the output shape, and we'll have
        # 7 inner calls into an output array (6 base comparisons).
        # If we DO pass
        # in a prototype_output, then we will allocate immediately, and we'll
        # see 8 inner calls into an output array (7 base comparisons)
        f1 = nps.broadcast_define(prototype_input, out_kwarg="out")                                        (f)
        f2 = nps.broadcast_define(prototype_input, out_kwarg="out", prototype_output=prototype_output)     (f)
        f3 = nps.broadcast_define(prototype_input, out_kwarg="out", prototype_output=prototype_output_bad) (f)

        f.do_base_check  = True
        f.do_dtype_check = True
        f.do_dim_check   = True

        if not multiple_outputs:
            f.base = None
            confirm_call_out_values(f1, in1, in2)
            confirm_equal( 6, f.base_check_count )
            f.base = None
            confirm_call_out_values(f1, in1, in2, dtype=float)
            confirm_equal( 6, f.base_check_count )
            f.base = None
            confirm_call_out_values(f1, in1, in2, dtype=int)
            confirm_equal( 6, f.base_check_count )

            f.base = None
            confirm_call_out_values(f2, in1, in2)
            confirm_equal( 7, f.base_check_count )
            f.base = None
            confirm_call_out_values(f2, in1, in2, dtype=float)
            confirm_equal( 7, f.base_check_count )
            f.base = None
            confirm_call_out_values(f2, in1, in2, dtype=int)
            confirm_equal( 7, f.base_check_count )

        # Here the inner function will get an improperly-sized array to fill in.
        # broadcast_define() itself won't see any issues with this, but the
        # inner function is free to detect the error
        f.do_dim_check = False
        f.base = None
        confirm_does_not_raise( lambda: f3(in1, in2),
                                msg='broadcasted function call')
        f.do_dim_check = True
        f.base = None
        confirm_raises( lambda: f3(in1, in2) )

        # Now pre-allocate the full output array ourselves. Any prototype_output
        # we pass in is used for validation. Any dtype passed in does nothing,
        # but assertValueShape() will flag discrepancies.
        # We use the same
        # f1,f2,f3 as above
        f.do_base_check  = True
        f.do_dtype_check = False
        f.do_dim_check   = True

        out_ref_mounted = out_ref if multiple_outputs else (out_ref,)

        # correct shape, varying dtypes
        out0 = tuple( np.empty( o.shape,        dtype=float ) for o in out_ref_mounted)
        out1 = tuple( np.empty( o.shape,        dtype=int   ) for o in out_ref_mounted)
        # shape has too many dimensions
        out2 = tuple( np.empty( o.shape + (1,), dtype=int   ) for o in out_ref_mounted)
        out3 = tuple( np.empty( o.shape + (2,), dtype=int   ) for o in out_ref_mounted)
        out4 = tuple( np.empty( (1,) + o.shape, dtype=int   ) for o in out_ref_mounted)
        out5 = tuple( np.empty( (2,) + o.shape, dtype=int   ) for o in out_ref_mounted)
        # shape has the correct number of dimensions, but they aren't right
        out6 = tuple( np.empty( (1,) + o.shape[1:], dtype=int ) for o in out_ref_mounted)
        out7 = tuple( np.empty( o.shape[:1] + (1,), dtype=int ) for o in out_ref_mounted)

        if not multiple_outputs:
            out0 = out0[0]
            out1 = out1[0]
            out2 = out2[0]
            out3 = out3[0]
            out4 = out4[0]
            out5 = out5[0]
            out6 = out6[0]
            out7 = out7[0]

        # f1 and f2 should work exactly the same, since prototype_output is just
        # a validating parameter
        if not multiple_outputs:
            f.base = None
            assertValueShape( out_ref, out_ref.shape, f1, in1, in2, out=out0)
            confirm_equal( 7, f.base_check_count )
            f.base = None
            assertValueShape( out_ref, out_ref.shape, f1, in1, in2, out=out0, dtype=float)
            confirm_equal( 7, f.base_check_count )
            f.base = None
            assertValueShape( out_ref, out_ref.shape, f1, in1, in2, out=out1)
            confirm_equal( 7, f.base_check_count )
            f.base = None
            assertValueShape( out_ref, out_ref.shape, f1, in1, in2, out=out1, dtype=int)
            confirm_equal( 7, f.base_check_count )

        f.base = None
        confirm_call_out_values( f2, in1, in2, out=out0)
        confirm_equal( 7, f.base_check_count )
        f.base = None
        confirm_call_out_values( f2, in1, in2, out=out0, dtype=float)
        confirm_equal( 7, f.base_check_count )
        f.base = None
        confirm_call_out_values( f2, in1, in2, out=out1)
        confirm_equal( 7,
                       f.base_check_count )
        f.base = None
        confirm_call_out_values( f2, in1, in2, out=out1, dtype=int)
        confirm_equal( 7, f.base_check_count )

        # any improperly-sized output matrices WILL be flagged if
        # prototype_output is given, and will likely be flagged if it isn't
        # also, although there are cases where this wouldn't happen. I simply
        # expect all of these to fail
        for out_misshaped in out2,out3,out4,out5,out6,out7:
            f.do_dim_check = False
            f.base = None
            confirm_raises( lambda: f2(in1, in2, out=out_misshaped) )
            f.do_dim_check = True
            f.base = None
            confirm_raises( lambda: f1(in1, in2, out=out_misshaped) )

def test_concatenation():
    r'''Checking the various concatenation functions.'''

    confirm_raises( lambda: nps.glue( arr(2,3), arr(2,3), axis=0), msg='axes are negative' )
    confirm_raises( lambda: nps.glue( arr(2,3), arr(2,3), axis=1), msg='axes are negative' )

    # basic glueing
    assertValueShape( None, (2,6),     nps.glue, arr(2,3), arr(2,3), axis=-1 )
    assertValueShape( None, (4,3),     nps.glue, arr(2,3), arr(2,3), axis=-2 )
    assertValueShape( None, (2,2,3),   nps.glue, arr(2,3), arr(2,3), axis=-3 )
    assertValueShape( None, (2,1,2,3), nps.glue, arr(2,3), arr(2,3), axis=-4 )
    confirm_raises ( lambda: nps.glue( arr(2,3), arr(2,3)) )
    assertValueShape( None, (2,2,3),   nps.cat,  arr(2,3), arr(2,3) )

    # extra length-1 dims added as needed, data not duplicated as needed
    confirm_raises( lambda: nps.glue( arr(3), arr(2,3), axis=-1) )
    assertValueShape( None, (3,3), nps.glue, arr(3), arr(2,3), axis=-2 )
    confirm_raises( lambda: nps.glue( arr(3), arr(2,3), axis=-3) )
    confirm_raises( lambda: nps.glue( arr(3), arr(2,3), axis=-4) )
    confirm_raises( lambda: nps.glue( arr(3), arr(2,3)) )
    confirm_raises( lambda: nps.cat(  arr(3), arr(2,3)) )

    confirm_raises( lambda: nps.glue( arr(2,3), arr(3), axis=-1) )
    assertValueShape( None, (3,3), nps.glue, arr(2,3), arr(3), axis=-2 )
    confirm_raises( lambda: nps.glue( arr(2,3), arr(3), axis=-3) )
    confirm_raises( lambda: nps.glue( arr(2,3), arr(3), axis=-4) )
    confirm_raises( lambda:
                    nps.cat( arr(2,3), arr(3)) )

    confirm_raises( lambda: nps.glue( arr(1,3), arr(2,3), axis=-1) )
    assertValueShape( None, (3,3), nps.glue, arr(1,3), arr(2,3), axis=-2 )
    confirm_raises( lambda: nps.glue( arr(1,3), arr(2,3), axis=-3) )
    confirm_raises( lambda: nps.glue( arr(1,3), arr(2,3), axis=-4) )
    confirm_raises( lambda: nps.cat(  arr(1,3), arr(2,3)) )

    confirm_raises( lambda: nps.glue( arr(2,3), arr(1,3), axis=-1) )
    assertValueShape( None, (3,3), nps.glue, arr(2,3), arr(1,3), axis=-2 )
    confirm_raises( lambda: nps.glue( arr(2,3), arr(1,3), axis=-3) )
    confirm_raises( lambda: nps.glue( arr(2,3), arr(1,3), axis=-4) )
    confirm_raises( lambda: nps.cat(  arr(2,3), arr(1,3)) )

    confirm_raises( lambda: nps.glue( arr(1,3), arr(2,3), axis=-1) )
    assertValueShape( None, (3,3), nps.glue, arr(1,3), arr(2,3), axis=-2 )
    confirm_raises( lambda: nps.glue( arr(1,3), arr(2,3), axis=-3) )
    confirm_raises( lambda: nps.glue( arr(1,3), arr(2,3), axis=-4) )
    confirm_raises( lambda: nps.cat(  arr(1,3), arr(2,3)) )

    assertValueShape( None, (3,), nps.glue, arr(3), np.array(()), axis=-1 )
    assertValueShape( None, (4,), nps.glue, arr(3), np.array(5),  axis=-1 )

    # zero-length arrays do the right thing
    confirm_raises( lambda: nps.glue( arr(0,3), arr(2,3), axis=-1) )
    assertValueShape( None, (2,3), nps.glue, arr(0,3), arr(2,3), axis=-2 )
    confirm_raises( lambda: nps.glue( arr(0,3), arr(2,3), axis=-3) )

    assertValueShape( None, (2,3), nps.glue, arr(2,0), arr(2,3), axis=-1 )
    confirm_raises( lambda: nps.glue( arr(2,0), arr(2,3), axis=-2) )
    confirm_raises( lambda: nps.glue( arr(2,0), arr(2,3), axis=-3) )

    assertValueShape( None, (2,3), nps.glue, arr(2,0), arr(2,3), axis=-1 )
    confirm_raises( lambda: nps.glue( arr(2,0), arr(2,3), axis=-2) )
    confirm_raises( lambda: nps.glue( arr(2,0), arr(2,3), axis=-3) )

    confirm_raises( lambda: nps.glue( arr(2,3), arr(0,3), axis=-1) )
    assertValueShape( None, (2,3), nps.glue, arr(2,3), arr(0,3), axis=-2 )
    confirm_raises( lambda: nps.glue( arr(2,3), arr(0,3), axis=-3) )

    assertValueShape(
None, (0,5), nps.glue, arr(0,2), arr(0,3), axis=-1 ) confirm_raises( lambda: nps.glue( arr(0,2), arr(0,3), axis=-2) ) confirm_raises( lambda: nps.glue( arr(0,2), arr(0,3), axis=-3) ) confirm_raises( lambda: nps.glue( arr(2,0), arr(0,3), axis=-1) ) confirm_raises( lambda: nps.glue( arr(2,0), arr(0,3), axis=-2) ) confirm_raises( lambda: nps.glue( arr(2,0), arr(0,3), axis=-3) ) confirm_raises( lambda: nps.glue( arr(0,2), arr(3,0), axis=-1) ) confirm_raises( lambda: nps.glue( arr(0,2), arr(3,0), axis=-2) ) confirm_raises( lambda: nps.glue( arr(0,2), arr(3,0), axis=-3) ) confirm_raises( lambda: nps.glue( arr(2,0), arr(3,0), axis=-1) ) assertValueShape( None, (5,0), nps.glue, arr(2,0), arr(3,0), axis=-2 ) confirm_raises( lambda: nps.glue( arr(2,0), arr(3,0), axis=-3) ) assertValueShape( None, (2,0), nps.glue, arr(1,0), arr(1,0), axis=-2 ) assertValueShape( None, (2,0), nps.glue, arr(0,), arr(1,0), axis=-2 ) assertValueShape( None, (2,0), nps.glue, arr(1,0), arr(0,), axis=-2 ) assertValueShape( None, (0,), nps.glue, arr(0,), arr(0,), axis=-1 ) assertValueShape( None, (2,0), nps.glue, arr(0,), arr(0,), axis=-2 ) assertValueShape( None, (2,1,1,0), nps.glue, arr(0,), arr(0,), axis=-4 ) assertValueShape( None, (0,), nps.glue, arr(0,), arr(0,), axis=-1 ) assertValueShape( None, (2,), nps.glue, arr(2,), arr(0,), axis=-1 ) assertValueShape( None, (2,), nps.glue, arr(0,), arr(2,), axis=-1 ) assertValueShape( None, (1,2,), nps.glue, arr(0,), arr(2,), axis=-2 ) # same as before, but np.array(()) instead of np.arange(0) assertValueShape( None, (0,), nps.glue, np.array(()), np.array(()), axis=-1 ) assertValueShape( None, (2,), nps.glue, arr(2,), np.array(()), axis=-1 ) assertValueShape( None, (2,), nps.glue, np.array(()),arr(2,), axis=-1 ) assertValueShape( None, (1,2), nps.glue, np.array(()),arr(2,), axis=-2 ) assertValueShape( None, (5,7), nps.glue, np.array(()),arr(5,7), axis=-2 ) assertValueShape( None, (5,7), nps.glue, arr(5,7), np.array(()),axis=-2 ) assertValueShape( None, 
(0,6), nps.glue, arr(0,3), arr(0,3), axis=-1 ) assertValueShape( None, (0,3), nps.glue, arr(0,3), arr(0,3), axis=-2 ) assertValueShape( None, (2,0,3), nps.glue, arr(0,3), arr(0,3), axis=-3 ) confirm_raises( lambda: nps.glue( arr(3,0), arr(0,3), axis=-1) ) confirm_raises( lambda: nps.glue( arr(3,0), arr(0,3), axis=-2) ) confirm_raises( lambda: nps.glue( arr(3,0), arr(0,3), axis=-3) ) confirm_raises( lambda: nps.glue( arr(0,3), arr(3,0), axis=-1) ) confirm_raises( lambda: nps.glue( arr(0,3), arr(3,0), axis=-2) ) confirm_raises( lambda: nps.glue( arr(0,3), arr(3,0), axis=-3) ) assertValueShape( None, (3,0), nps.glue, arr(3,0), arr(3,0), axis=-1 ) assertValueShape( None, (6,0), nps.glue, arr(3,0), arr(3,0), axis=-2 ) assertValueShape( None, (2,3,0), nps.glue, arr(3,0), arr(3,0), axis=-3 ) # legacy behavior allows one to omit the 'axis' kwarg nps.glue.legacy_version = '0.9' assertValueShape( None, (2,2,3), nps.glue, arr(2,3), arr(2,3) ) delattr(nps.glue, 'legacy_version') def test_dimension_manipulation(): r'''Checking the various functions that manipulate dimensions.''' assertValueShape( None, (24,), nps.clump, arr(2,3,4), n=5 ) assertValueShape( None, (24,), nps.clump, arr(2,3,4), n=4 ) assertValueShape( None, (24,), nps.clump, arr(2,3,4), n=3 ) assertValueShape( None, (6,4), nps.clump, arr(2,3,4), n=2 ) assertValueShape( None, (2,3,4), nps.clump, arr(2,3,4), n=1 ) assertValueShape( None, (2,3,4), nps.clump, arr(2,3,4), n=0 ) assertValueShape( None, (2,3,4), nps.clump, arr(2,3,4), n=1 ) assertValueShape( None, (2,12), nps.clump, arr(2,3,4), n=-2 ) assertValueShape( None, (24,), nps.clump, arr(2,3,4), n=-3 ) assertValueShape( None, (24,), nps.clump, arr(2,3,4), n=-4 ) assertValueShape( None, (24,), nps.clump, arr(2,3,4), n=-5 ) # legacy behavior: n>0 required, and always clumps the trailing dimensions nps.clump.legacy_version = '0.9' confirm_raises ( lambda: nps.clump( arr(2,3,4), n=-1) ) assertValueShape( None, (2,3,4), nps.clump, arr(2,3,4), n=0 ) assertValueShape( 
None, (2,3,4), nps.clump, arr(2,3,4), n=1 ) assertValueShape( None, (2,12), nps.clump, arr(2,3,4), n=2 ) assertValueShape( None, (24,), nps.clump, arr(2,3,4), n=3 ) assertValueShape( None, (24,), nps.clump, arr(2,3,4), n=4 ) delattr(nps.clump, 'legacy_version') assertValueShape( None, (2,3,4), nps.atleast_dims, arr(2,3,4), -1, 1 ) assertValueShape( None, (2,3,4), nps.atleast_dims, arr(2,3,4), -2, 1 ) assertValueShape( None, (2,3,4), nps.atleast_dims, arr(2,3,4), -3, 1 ) assertValueShape( None, (1,2,3,4), nps.atleast_dims, arr(2,3,4), -4, 1 ) assertValueShape( None, (2,3,4), nps.atleast_dims, arr(2,3,4), -2, 0 ) assertValueShape( None, (2,3,4), nps.atleast_dims, arr(2,3,4), -2, 1 ) assertValueShape( None, (2,3,4), nps.atleast_dims, arr(2,3,4), -2, 2 ) confirm_raises ( lambda: nps.atleast_dims( arr(2,3,4), -2, 3) ) assertValueShape( None, (3,), nps.atleast_dims, arr(3), 0 ) confirm_raises ( lambda: nps.atleast_dims( arr(3), 1) ) assertValueShape( None, (3,), nps.atleast_dims, arr(3), -1 ) assertValueShape( None, (1,3,), nps.atleast_dims, arr(3), -2 ) confirm_raises ( lambda: nps.atleast_dims( arr(), 0) ) confirm_raises ( lambda: nps.atleast_dims( arr(), 1) ) assertValueShape( None, (1,), nps.atleast_dims, arr(), -1 ) assertValueShape( None, (1,1), nps.atleast_dims, arr(), -2 ) l = (-4,1) confirm_raises ( lambda: nps.atleast_dims( arr(2,3,4), l) ) l = [-4,1] confirm_raises ( lambda: nps.atleast_dims( arr(2,3,4), l, -1) ) assertValueShape( None, (1,2,3,4), nps.atleast_dims, arr(2,3,4), l ) confirm_equal ( l, [-4, 2]) assertValueShape( None, (3,4,2), nps.mv, arr(2,3,4), -3, -1 ) assertValueShape( None, (3,2,4), nps.mv, arr(2,3,4), -3, 1 ) assertValueShape( None, (2,1,1,3,4), nps.mv, arr(2,3,4), -3, -5 ) assertValueShape( None, (2,1,1,3,4), nps.mv, arr(2,3,4), 0, -5 ) assertValueShape( None, (4,3,2), nps.xchg, arr(2,3,4), -3, -1 ) assertValueShape( None, (3,2,4), nps.xchg, arr(2,3,4), -3, 1 ) assertValueShape( None, (2,1,1,3,4), nps.xchg, arr(2,3,4), -3, -5 ) 
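The clump/mv shape checks above can be reproduced with plain numpy. This is a hedged sketch: the reshape/moveaxis equivalents are my approximation, and they ignore the edge cases that nps.clump and nps.mv also handle (such as moving an axis past the array's current dimensionality, which pads with length-1 dimensions).

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

# nps.clump(a, n=2) groups the leading 2 dimensions into one: (2,3,4) -> (6,4)
clumped = a.reshape(-1, *a.shape[2:])
assert clumped.shape == (6, 4)

# nps.clump(a, n=-2) groups the trailing 2 dimensions: (2,3,4) -> (2,12)
clumped_trailing = a.reshape(*a.shape[:-2], -1)
assert clumped_trailing.shape == (2, 12)

# nps.mv(a, -3, -1) moves axis -3 to the end, like np.moveaxis: (2,3,4) -> (3,4,2)
moved = np.moveaxis(a, -3, -1)
assert moved.shape == (3, 4, 2)
```

Note that the legacy (pre-0.10) clump convention tested above is different: it requires n>0 and always clumps the trailing n dimensions.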
assertValueShape( None, (2,1,1,3,4), nps.xchg, arr(2,3,4), 0, -5 ) assertValueShape( None, (2,4,3), nps.transpose, arr(2,3,4) ) assertValueShape( None, (4,3), nps.transpose, arr(3,4) ) assertValueShape( None, (4,1), nps.transpose, arr(4) ) assertValueShape( None, (1,2,3,4), nps.dummy, arr(2,3,4), 0 ) assertValueShape( None, (2,1,3,4), nps.dummy, arr(2,3,4), 1 ) assertValueShape( None, (2,3,4,1), nps.dummy, arr(2,3,4), -1 ) assertValueShape( None, (2,3,1,4), nps.dummy, arr(2,3,4), -2 ) assertValueShape( None, (2,3,1,1,4), nps.dummy, arr(2,3,4), -2, -2 ) assertValueShape( None, (2,1,3,4), nps.dummy, arr(2,3,4), -3 ) assertValueShape( None, (1,2,3,4), nps.dummy, arr(2,3,4), -4 ) assertValueShape( None, (1,1,2,3,4), nps.dummy, arr(2,3,4), -5 ) assertValueShape( None, (2,3,1,4), nps.dummy, arr(2,3,4), 2 ) confirm_raises ( lambda: nps.dummy( arr(2,3,4), 3) ) assertValueShape( None, (2,4,3), nps.reorder, arr(2,3,4), 0, -1, 1 ) assertValueShape( None, (3,4,2), nps.reorder, arr(2,3,4), -2, -1, 0 ) assertValueShape( None, (1,3,1,4,2), nps.reorder, arr(2,3,4), -4, -2, -5, -1, 0 ) confirm_raises ( lambda: nps.reorder( arr(2,3,4), -4, -2, -5, -1, 0, 5), msg='reorder barfs on out-of-bounds dimensions' ) def test_inner(): r'''Testing the broadcasted inner product''' assertResult_inoutplace( np.array([[[ 30, 255, 730], [ 180, 780, 1630]], [[ 180, 780, 1630], [1455, 2430, 3655]], [[ 330, 1305, 2530], [2730, 4080, 5680]], [[ 480, 1830, 3430], [4005, 5730, 7705.0]]]), nps.inner, arr(2,3,5), arr(4,1,3,5), out_inplace_dtype=float ) assertResult_inoutplace( np.array([[[ 30, 255, 730], [ 180, 780, 1630]], [[ 180, 780, 1630], [1455, 2430, 3655]], [[ 330, 1305, 2530], [2730, 4080, 5680]], [[ 480, 1830, 3430], [4005, 5730, 7705.0]]]), nps.inner, arr(2,3,5), arr(4,1,3,5), dtype=float, out_inplace_dtype=float ) output = np.empty((4,2,3), dtype=float) confirm_raises( lambda: nps.inner( arr(2,3,5), arr(4,1,3,5), dtype=int, out=output ), "inner(out=out, dtype=dtype) requires out.dtype == dtype" ) 
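The (4,2,3)-shaped inner-product reference above can be reproduced with plain numpy. This sketch assumes arr(...) builds np.arange data reshaped to the given shape (an assumption inferred from the values in the reference matrix, not guaranteed by this file):

```python
import numpy as np

def arr(*shape):
    # hypothetical stand-in for this test suite's arr() helper:
    # sequential integers reshaped to the requested shape
    return np.arange(np.prod(shape)).reshape(*shape)

a = arr(2, 3, 5)
b = arr(4, 1, 3, 5)

# contract the last axis, broadcasting everything else, as nps.inner does
out = np.einsum('...i,...i->...', a, b)

assert out.shape == (4, 2, 3)   # (2,3) broadcast against (4,1,3)
assert out[0, 0, 0] == 30       # [0,1,2,3,4] . [0,1,2,3,4]
assert out[0, 0, 1] == 255
```

The out= raise-check above then makes sense: the result shape is (4,2,3), so an output array of that shape but the wrong dtype must be rejected.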
assertResult_inoutplace( np.array((24+148j)), nps.dot, np.array(( 1 + 2j, 3 + 4j, 5 + 6j)), np.array(( 1 + 2j, 3 + 4j, 5 + 6j)) + 5, out_inplace_dtype=complex) assertResult_inoutplace( np.array((136-60j)), nps.vdot, np.array(( 1 + 2j, 3 + 4j, 5 + 6j)), np.array(( 1 + 2j, 3 + 4j, 5 + 6j)) + 5, out_inplace_dtype=complex) # complex values AND non-trivial dimensions a = arr( 2,3,5).astype(complex) b = arr(4,1,3,5).astype(complex) a += a*a * 1j b -= b * 1j dot_ref = np.array([[[ 130.0 +70.0j, 2180.0 +1670.0j, 9730.0 +8270.0j], [ 3430.0 +3070.0j, 18230.0 +16670.0j, 46030.0 +42770.0j]], [[ 730.0 +370.0j, 6530.0 +4970.0j, 21580.0 +18320.0j], [ 26530.0 +23620.0j, 56330.0 +51470.0j, 102880.0 +95570.0j]], [[ 1330.0 +670.0j, 10880.0 +8270.0j, 33430.0 +28370.0j], [ 49630.0 +44170.0j, 94430.0 +86270.0j, 159730.0 +148370.0j]], [[ 1930.0 +970.0j, 15230.0 +11570.0j, 45280.0 +38420.0j], [ 72730.0 +64720.0j, 132530.0 +121070.0j, 216580.0 +201170.0j]]]) vdot_ref = np.array([[[ -70.0 -130.0j, -1670.0 -2180.0j, -8270.0 -9730.0j], [ -3070.0 -3430.0j, -16670.0 -18230.0j, -42770.0 -46030.0j]], [[ -370.0 -730.0j, -4970.0 -6530.0j, -18320.0 -21580.0j], [ -23620.0 -26530.0j, -51470.0 -56330.0j, -95570.0 -102880.0j]], [[ -670.0 -1330.0j, -8270.0 -10880.0j, -28370.0 -33430.0j], [ -44170.0 -49630.0j, -86270.0 -94430.0j, -148370.0 -159730.0j]], [[ -970.0 -1930.0j, -11570.0 -15230.0j, -38420.0 -45280.0j], [ -64720.0 -72730.0j, -121070.0 -132530.0j, -201170.0 -216580.0j]]]) assertResult_inoutplace( dot_ref, nps.dot, a, b, out_inplace_dtype=complex) assertResult_inoutplace( vdot_ref, nps.vdot, a, b, out_inplace_dtype=complex) def test_mag(): r'''Testing the broadcasted norm2, magnitude''' # input is a 1D array of integers, no output dtype specified. Output # should be a floating-point scalar assertResult_inoutplace( np.sqrt(nps.norm2(np.arange(5))), nps.mag, arr(5, dtype=int) ) # input is a 1D array of floats, no output dtype specified. 
Output # should be a floating-point scalar assertResult_inoutplace( np.sqrt(nps.norm2(np.arange(5))), nps.mag, arr(5, dtype=float) ) # input is a 1D array of integers, no output dtype specified. Output # should be a floating-point scalar assertResult_inoutplace( np.sqrt(nps.norm2(np.arange(5))), nps.mag, arr(5, dtype=int), dtype = np.float32, out_inplace_dtype = np.float32) # input is a 1D array of integers, output dtype=int. Output should be an # integer scalar output = np.empty((), dtype=int) nps.mag( np.arange(5, dtype=int), out = output ) confirm_equal(int(np.sqrt(nps.norm2(np.arange(5)))), output) # input is a 1D array of integers, output dtype=float. Output should be an # float scalar output = np.empty((), dtype=float) nps.mag( np.arange(5, dtype=int), out = output ) confirm_equal(np.sqrt(nps.norm2(np.arange(5))), output) # input is a 2D array of integers, no output dtype specified. Output # should be a floating-point 1D vector assertResult_inoutplace( np.sqrt(np.array(( nps.norm2(np.arange(5)), nps.norm2(np.arange(5,10))))), nps.mag, arr(2,5, dtype=int) ) # input is a 2D array of floats, no output dtype specified. Output # should be a floating-point 1D vector assertResult_inoutplace( np.sqrt(np.array(( nps.norm2(np.arange(5)), nps.norm2(np.arange(5,10))))), nps.mag, arr(2,5, dtype=float) ) # input is a 2D array of integers, output dtype=int. Output should be an # array of integers output = np.empty((2,), dtype=int) nps.mag( arr(2,5, dtype=int), out = output ) confirm_equal(np.sqrt(np.array(( nps.norm2(np.arange(5)), nps.norm2(np.arange(5,10))))).astype(int), output) # input is a 2D array of integers, output dtype=float. Output should be an # array of floats output = np.empty((2,), dtype=float) nps.mag( arr(2,5, dtype=int), out = output ) confirm_equal(np.sqrt(np.array(( nps.norm2(np.arange(5)), nps.norm2(np.arange(5,10))))), output) # Make sure overflows can be handled by specifying a dtype r = np.array((-206, 10), dtype=np.int16) confirm_equal(206.*206. 
+ 10.*10., nps.norm2(r, dtype=float), msg = "norm2 can handle overflows with a dtype") confirm_equal(np.sqrt(206.*206. + 10.*10.), nps.mag(r, dtype=float), msg = "mag can handle overflows with a dtype") def test_outer(): r'''Testing the broadcasted outer product''' # comes from PDL. numpy has a reversed axis ordering convention from # PDL, so I transpose the array before comparing ref = nps.transpose( np.array([[[[[0,0,0,0,0],[0,1,2,3,4],[0,2,4,6,8],[0,3,6,9,12],[0,4,8,12,16]], [[25,30,35,40,45],[30,36,42,48,54],[35,42,49,56,63],[40,48,56,64,72],[45,54,63,72,81]], [[100,110,120,130,140],[110,121,132,143,154],[120,132,144,156,168],[130,143,156,169,182],[140,154,168,182,196]]], [[[0,0,0,0,0],[15,16,17,18,19],[30,32,34,36,38],[45,48,51,54,57],[60,64,68,72,76]], [[100,105,110,115,120],[120,126,132,138,144],[140,147,154,161,168],[160,168,176,184,192],[180,189,198,207,216]], [[250,260,270,280,290],[275,286,297,308,319],[300,312,324,336,348],[325,338,351,364,377],[350,364,378,392,406]]]], [[[[0,15,30,45,60],[0,16,32,48,64],[0,17,34,51,68],[0,18,36,54,72],[0,19,38,57,76]], [[100,120,140,160,180],[105,126,147,168,189],[110,132,154,176,198],[115,138,161,184,207],[120,144,168,192,216]], [[250,275,300,325,350],[260,286,312,338,364],[270,297,324,351,378],[280,308,336,364,392],[290,319,348,377,406]]], [[[225,240,255,270,285],[240,256,272,288,304],[255,272,289,306,323],[270,288,306,324,342],[285,304,323,342,361]], [[400,420,440,460,480],[420,441,462,483,504],[440,462,484,506,528],[460,483,506,529,552],[480,504,528,552,576]], [[625,650,675,700,725],[650,676,702,728,754],[675,702,729,756,783],[700,728,756,784,812],[725,754,783,812,841]]]], [[[[0,30,60,90,120],[0,31,62,93,124],[0,32,64,96,128],[0,33,66,99,132],[0,34,68,102,136]], [[175,210,245,280,315],[180,216,252,288,324],[185,222,259,296,333],[190,228,266,304,342],[195,234,273,312,351]], [[400,440,480,520,560],[410,451,492,533,574],[420,462,504,546,588],[430,473,516,559,602],[440,484,528,572,616]]], 
[[[450,480,510,540,570],[465,496,527,558,589],[480,512,544,576,608],[495,528,561,594,627],[510,544,578,612,646]], [[700,735,770,805,840],[720,756,792,828,864],[740,777,814,851,888],[760,798,836,874,912],[780,819,858,897,936]], [[1000,1040,1080,1120,1160],[1025,1066,1107,1148,1189],[1050,1092,1134,1176,1218],[1075,1118,1161,1204,1247],[1100,1144,1188,1232,1276]]]], [[[[0,45,90,135,180],[0,46,92,138,184],[0,47,94,141,188],[0,48,96,144,192],[0,49,98,147,196]], [[250,300,350,400,450],[255,306,357,408,459],[260,312,364,416,468],[265,318,371,424,477],[270,324,378,432,486]], [[550,605,660,715,770],[560,616,672,728,784],[570,627,684,741,798],[580,638,696,754,812],[590,649,708,767,826]]], [[[675,720,765,810,855],[690,736,782,828,874],[705,752,799,846,893],[720,768,816,864,912],[735,784,833,882,931]], [[1000,1050,1100,1150,1200],[1020,1071,1122,1173,1224],[1040,1092,1144,1196,1248],[1060,1113,1166,1219,1272],[1080,1134,1188,1242,1296]], [[1375,1430,1485,1540,1595],[1400,1456,1512,1568,1624],[1425,1482,1539,1596,1653],[1450,1508,1566,1624,1682],[1475,1534,1593,1652,1711]]]]])) assertResult_inoutplace( ref, nps.outer, arr(2,3,5), arr(4,1,3,5), out_inplace_dtype=float ) # unequal dimensions. 
a = arr(1,3,1,4) b = arr( 3,7,3) ref = nps.matmult( nps.dummy(a, -1), nps.dummy(b, -2)) assertResult_inoutplace( ref, nps.outer, a, b, out_inplace_dtype=float ) def test_matmult(): r'''Testing the broadcasted matrix multiplication''' assertValueShape( None, (4,2,3,5), nps.matmult, arr(2,3,7), arr(4,1,7,5) ) ref = np.array([[[[ 42, 48, 54], [ 114, 136, 158]], [[ 114, 120, 126], [ 378, 400, 422]]], [[[ 186, 224, 262], [ 258, 312, 366]], [[ 642, 680, 718], [ 906, 960, 1014]]]]) assertResult_inoutplace( ref, nps.matmult2, arr(2,1,2,4), arr(2,4,3), out_inplace_dtype=float ) ref2 = np.array([[[[ 156.], [ 452.]], [[ 372.], [ 1244.]]], [[[ 748.], [ 1044.]], [[ 2116.], [ 2988.]]]]) assertResult_inoutplace(ref2, nps.matmult2, arr(2,1,2,4), nps.matmult2(arr(2,4,3), arr(3,1))) assertResult_inoutplace(ref2, nps.matmult, arr(2,1,2,4), arr(2,4,3), arr(3,1)) # checking the null-dimensionality logic A = arr(2,3) assertResult_inoutplace( nps.inner(nps.transpose(A), np.arange(2)), nps.matmult, np.arange(2), A ) A = arr(3) assertResult_inoutplace( A*2, nps.matmult, np.array([2]), A ) A = arr(3) assertResult_inoutplace( A*2, nps.matmult, np.array(2), A ) test_broadcasting() test_broadcasting_into_output() test_concatenation() test_dimension_manipulation() test_inner() test_mag() test_outer() test_matmult() finish() numpysane-0.35/test/testlib.c000066400000000000000000000077141407353053200162640ustar00rootroot00000000000000#define _GNU_SOURCE #include #include // Test C library being wrapped by numpysane_pywrap. 
This library can compute // inner and outer products // Inner product supports arbitrary strides, and 3 data types #define DEFINE_INNER_T(T) \ T inner_ ## T(const T* a, \ const T* b, \ int stride_a, \ int stride_b, \ int n) \ { \ T s = 0.0; \ for(int i=0; i #define DEFINE_SORTED_INDICES_T(T) \ static int compar_indices_ ## T(const void* _i0, const void* _i1, \ void* _x) \ { \ const int i0 = *(const int*)_i0; \ const int i1 = *(const int*)_i1; \ const T* x = (const T*)_x; \ if( x[i0] < x[i1] ) return -1; \ if( x[i0] > x[i1] ) return 1; \ return 0; \ } \ /* Assumes that indices_order[] has room for at least N values */ \ void sorted_indices_ ## T(/* output */ \ int* indices_order, \ \ /* input */ \ const T* x, int N) \ { \ for(int i=0; i // Inner product supports arbitrary strides, and 3 data types #define DECLARE_INNER_T(T) \ T inner_ ## T(const T* a, \ const T* b, \ int stride_a, \ int stride_b, \ int n); DECLARE_INNER_T(int32_t) DECLARE_INNER_T(int64_t) DECLARE_INNER_T(double) // Outer product supports arbitrary strides, and only the "double" data type void outer(double* out, int stride_out_incol, int stride_out_inrow, const double* a, const double* b, int stride_a, int stride_b, int n); // inner and outer product together. Only contiguous data is supported. "double" // only. non-broadcasted "scale" argument scales the output double innerouter(double* out, const double* a, const double* b, double scale, int n); // Assumes that indices_order[] has room for at least N values void sorted_indices_float(// output int* indices_order, // input const float* x, int N); void sorted_indices_double(// output int* indices_order, // input const double* x, int N); numpysane-0.35/test/testutils.py000066400000000000000000000173121407353053200170570ustar00rootroot00000000000000r'''A simple test harness These should be trivial, but all the standard ones in python suck. This one sucks far less. 
''' import sys import numpy as np import os import re from inspect import currentframe Nchecks = 0 NchecksFailed = 0 # no line breaks. Useful for test reporting. Yes, this sets global state, but # we're running a test harness. This is fine np.set_printoptions(linewidth=1e10, suppress=True) def test_location(): r'''Reports string describing current location in the test Skips over the backtrace entries that are in the test harness itself ''' filename_this = os.path.split( __file__ )[1] if filename_this.endswith(".pyc"): filename_this = filename_this[:-1] frame = currentframe().f_back.f_back while frame: if frame.f_back is None or \ not frame.f_code.co_filename.endswith(filename_this): break frame = frame.f_back testfile = os.path.split(frame.f_code.co_filename)[1] try: return "{}:{} {}()".format(testfile, frame.f_lineno, frame.f_code.co_name) except: return '' def print_red(x): """print the message in red""" sys.stdout.write("\x1b[31m" + test_location() + ": " + x + "\x1b[0m\n") def print_green(x): """Print the message in green""" sys.stdout.write("\x1b[32m" + test_location() + ": " + x + "\x1b[0m\n") def confirm_equal(x, xref, msg='', eps=1e-6): r'''If x is equal to xref, report test success. msg identifies this check. eps sets the RMS equality tolerance. The x,xref arguments can be given as many different types. This function tries to do the right thing. ''' global Nchecks global NchecksFailed Nchecks = Nchecks + 1 # strip all trailing whitespace in each line, in case these are strings if isinstance(x, str): x = re.sub('[ \t]+(\n|$)', '\\1', x) if isinstance(xref, str): xref = re.sub('[ \t]+(\n|$)', '\\1', xref) # convert data to numpy if possible try: xref = np.array(xref) except: pass try: x = np.array(x) except: pass try: # flatten array if possible x = x.ravel() xref = xref.ravel() except: pass try: N = x.shape[0] except: N = 1 try: Nref = xref.shape[0] except: Nref = 1 if N != Nref: print_red(("FAILED{}: mismatched array sizes: N = {} but Nref = {}. 
Arrays: \n" + "x = {}\n" + "xref = {}"). format((': ' + msg) if msg else '', N, Nref, x, xref)) NchecksFailed = NchecksFailed + 1 return False if N != 0: try: # If I can subtract, get the error that way diff = x - xref def norm2sq(x): """Return the squared 2-norm""" return np.inner(x, x) rms = np.sqrt(norm2sq(diff) / N) if not np.all(np.isfinite(rms)): print_red("FAILED{}: Some comparison results are NaN or Inf. " "rms error = {}. x = {}, xref = {}".format( (': ' + msg) if msg else '', rms, x, xref)) NchecksFailed = NchecksFailed + 1 return False if rms > eps: print_red("FAILED{}: rms error = {}.\nx,xref,err =\n{}".format( (': ' + msg) if msg else '', rms, np.vstack((x, xref, diff)).transpose())) NchecksFailed = NchecksFailed + 1 return False except: # Can't subtract. Do == instead if not np.array_equal(x, xref): print_red("FAILED{}: x =\n'{}', xref =\n'{}'".format( (': ' + msg) if msg else '', x, xref)) NchecksFailed = NchecksFailed + 1 return False print_green("OK{}".format((': ' + msg) if msg else '')) return True def confirm(x, msg=''): r'''If x is true, report test success. msg identifies this check''' global Nchecks global NchecksFailed Nchecks = Nchecks + 1 if not x: print_red("FAILED{}".format((': ' + msg) if msg else '')) NchecksFailed = NchecksFailed + 1 return False print_green("OK{}".format((': ' + msg) if msg else '')) return True def confirm_is(x, xref, msg=''): r'''If x is xref, report test success. msg identifies this check ''' global Nchecks global NchecksFailed Nchecks = Nchecks + 1 if x is xref: print_green("OK{}".format((': ' + msg) if msg else '')) return True print_red("FAILED{}".format((': ' + msg) if msg else '')) NchecksFailed = NchecksFailed + 1 return False def confirm_raises(f, msg=''): r'''If f() raises an exception, report test success. 
msg identifies this check''' global Nchecks global NchecksFailed Nchecks = Nchecks + 1 try: f() print_red("FAILED{}".format((': ' + msg) if msg else '')) NchecksFailed = NchecksFailed + 1 return False except: print_green("OK{}".format((': ' + msg) if msg else '')) return True def confirm_does_not_raise(f, msg=''): r'''If f() raises an exception, report test failure. msg identifies this check''' global Nchecks global NchecksFailed Nchecks = Nchecks + 1 try: f() print_green("OK{}".format((': ' + msg) if msg else '')) return True except: print_red("FAILED{}".format((': ' + msg) if msg else '')) NchecksFailed = NchecksFailed + 1 return False def finish(): r'''Finalize the executed tests. Prints the test summary. Exits successfully iff all the tests passed. ''' if not Nchecks and not NchecksFailed: print_red("No tests defined") sys.exit(0) if NchecksFailed: print_red("Some tests failed: {} out of {}".format(NchecksFailed, Nchecks)) sys.exit(1) print_green("All tests passed: {} total".format(Nchecks)) sys.exit(0) # numpysane-specific tests. Keep these in this file to make sure test-harness # line numbers are not reported def assertValueShape(value_ref, s, f, *args, **kwargs): r'''Makes sure a given call produces a given value and shape. It is redundant to specify both, but it makes it clear I'm asking for what I think I'm asking. The value check can be skipped by passing None. 
''' try: res = f(*args, **kwargs) except Exception as e: print_red("FAILED: Exception \"{}\" calling \"{}\"".format(e,f)) global NchecksFailed NchecksFailed += 1 return if 'out' in kwargs: confirm(res is kwargs['out'], msg='returning same matrix as the given "out"') if s is not None: try: shape = res.shape except: shape = () confirm_equal(shape, s, msg='shape matches') if value_ref is not None: confirm_equal(value_ref, res, msg='value matches') if 'dtype' in kwargs: confirm_equal(np.dtype(res.dtype), np.dtype(kwargs['dtype']), msg='matching dtype') def assertResult_inoutplace( ref, func, *args, **kwargs ): r'''makes sure func(a,b) == ref. Tests both a pre-allocated array and a slice-at-a-time allocate/copy mode Only one test-specific kwarg is known: 'out_inplace_dtype'. The rest are passed down to the test function ''' out_inplace_dtype = kwargs.get('out_inplace_dtype', None) try: del kwargs['out_inplace_dtype'] except: pass assertValueShape( ref, ref.shape, func, *args, **kwargs ) output = np.empty(ref.shape, dtype=out_inplace_dtype) assertValueShape( ref, ref.shape, func, *args, out=output, **kwargs) confirm_equal(ref, output)
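assertResult_inoutplace() above exercises a function in both its allocate-and-return mode and its write-into-out mode. A minimal sketch of a function honoring that out= contract (double_it is a hypothetical example, not part of numpysane):

```python
import numpy as np

def double_it(x, out=None):
    # hypothetical function following the out= contract the harness tests:
    # allocate when out is None, otherwise write in place and return out
    if out is None:
        out = np.empty_like(x)
    out[...] = x * 2
    return out

x   = np.arange(5)
ref = x * 2

# mode 1: let the function allocate
assert np.array_equal(double_it(x), ref)

# mode 2: pre-allocated output; the same object must come back
out = np.empty(x.shape, dtype=x.dtype)
res = double_it(x, out=out)
assert res is out and np.array_equal(out, ref)
```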