././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1653337717.8848438 fast-histogram-0.11/0000755000175100001720000000000014242767166014002 5ustar00runnerdocker././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1653337717.8848438 fast-histogram-0.11/.github/0000755000175100001720000000000014242767166015342 5ustar00runnerdocker././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1653337717.8848438 fast-histogram-0.11/.github/workflows/0000755000175100001720000000000014242767166017377 5ustar00runnerdocker././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337689.0 fast-histogram-0.11/.github/workflows/main.yml0000644000175100001720000000333214242767131021037 0ustar00runnerdockername: CI on: push: pull_request: jobs: tests: uses: OpenAstronomy/github-actions-workflows/.github/workflows/tox.yml@main with: coverage: codecov runs-on: | linux: ubuntu-22.04 envs: | - linux: py37-test-numpy116 - linux: py37-test-numpy117 - linux: py38-test-numpy118 - linux: py39-test - linux: py310-test - macos: py36-test-numpy113 - macos: py36-test-numpy114 - macos: py36-test-numpy115 # The following two jobs are disabled due to an issue that seems # specific to Python 3.7 and not worth investigating. # - macos: py37-test-numpy116 # - macos: py37-test-numpy117 - macos: py38-test-numpy118 - macos: py39-test - macos: py310-test - windows: py36-test-numpy113 - windows: py36-test-numpy114 - windows: py36-test-numpy115 - windows: py37-test-numpy116 - windows: py37-test-numpy117 - windows: py38-test-numpy118 - windows: py39-test - windows: py310-test publish: uses: OpenAstronomy/github-actions-workflows/.github/workflows/publish.yml@main with: test_extras: test test_command: pytest --pyargs fast_histogram -m "not hypothesis" sdist-runs-on: ubuntu-22.04 targets: | - cp*-manylinux_i686 - cp*-manylinux_x86_64 - cp*-manylinux_aarch64 # - cp*-musllinux_i686 - cp*-musllinux_x86_64 # - cp*-musllinux_aarch64 - pp*-manylinux_i686 - pp*-manylinux_x86_64 # - pp*-manylinux_aarch64 - cp*-macosx_x86_64 - cp*-macosx_arm64 - windows secrets: pypi_token: ${{ secrets.PYPI_TOKEN }} ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337689.0 fast-histogram-0.11/.gitignore0000644000175100001720000000142714242767131015766 0ustar00runnerdocker# Compiled files *.py[cod] *.a *.o *.so *.pyd __pycache__ # Ignore .c files by default to avoid including generated code. If you want to # add a non-generated .c extension, use `git add -f filename.c`. *.c # Other generated files MANIFEST astropy/version.py astropy/cython_version.py astropy/wcs/include/wcsconfig.h astropy/_erfa/core.py astropy/_erfa/core.pyx # Sphinx _build _generated docs/api docs/generated docs/visualization/ngc6976.jpeg docs/visualization/ngc6976-default.jpeg # Packages/installer info *.egg *.egg-info dist build eggs .eggs parts bin var sdist develop-eggs .installed.cfg distribute-*.tar.gz # Other .cache .tox .*.swp .*.swo *~ .project .pydevproject .settings .coverage cover htmlcov # Mac OSX .DS_Store # PyCharm .idea .tox .tmp fast_histogram/version.py ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337689.0 fast-histogram-0.11/CHANGES.rst0000644000175100001720000000335014242767131015575 0ustar00runnerdocker0.11 (2022-05-23) ----------------- - Use Python limited API to produce forward-compatible wheels. 
[#56] 0.10 (2021-09-06) ----------------- - Add function for histograms in arbitrarily high dimensions. [#54, #55] 0.9 (2020-05-24) ---------------- - Fixed a bug that caused incorrect results in the weighted 1-d histogram and the weighted and unweighted 2-d histogram functions if using arrays with different layouts in memory. [#52] 0.8 (2020-01-07) ---------------- - Fixed compatibility of test suite with latest version of the hypothesis package. [#40] 0.7 (2019-01-09) ---------------- - Fix definition of numpy as a build-time dependency. [#36] 0.6 (2019-01-07) ---------------- - Define numpy as a build-time dependency in pyproject.toml. [#33] - Release the GIL during calculations in C code. [#31] 0.5 (2018-09-26) ---------------- - Fix bug that caused histograms of n-dimensional arrays to not be computed correctly. [#21] - Avoid memory copies for non-native endian 64-bit float arrays. [#18] - Avoid memory copies for any numerical Numpy type and non-contiguous arrays. [#23] - Raise a better error if arrays are passed to the ``bins`` argument. [#24] 0.4 (2018-02-12) ---------------- - Make sure that Numpy is not required to run setup.py. [#15] - Fix installation on platforms with an ASCII locale. [#15] 0.3 (2017-10-28) ---------------- - Use long instead of int for x/y sizes and indices - Implement support for weights= option 0.2.1 (2017-07-18) ------------------ - Fixed rst syntax in README 0.2 (2017-07-18) ---------------- - Fixed segmentation fault under certain conditions. - Ensure that arrays are C-contiguous before passing them to the C code. 0.1 (2017-07-18) ---------------- - Initial version ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337689.0 fast-histogram-0.11/LICENSE0000644000175100001720000000242314242767131015000 0ustar00runnerdockerCopyright (c) 2017, Thomas P. Robitaille All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
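The 0.10 entry in the changelog above adds histograms in arbitrarily high dimensions via ``histogramdd``. The following is a minimal sketch of calling it, matching the call pattern described in its docstring later in this archive; the array sizes, bin count, and ranges are arbitrary illustration values.

.. code:: python

    import numpy as np

    from fast_histogram import histogramdd

    # Three coordinate arrays for the same 1000 points (sizes are illustrative).
    x = np.random.random(1_000)
    y = np.random.random(1_000)
    z = np.random.random(1_000)

    # 3D histogram over the unit cube, with 10 equally spaced bins per dimension.
    counts = histogramdd((x, y, z), bins=10, range=[(0, 1), (0, 1), (0, 1)])

    print(counts.shape)  # (10, 10, 10)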
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337689.0 fast-histogram-0.11/MANIFEST.in0000644000175100001720000000011614242767131015526 0ustar00runnerdockerinclude LICENSE include README.rst include CHANGES.rst include pyproject.toml ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1653337717.8848438 fast-histogram-0.11/PKG-INFO0000644000175100001720000001453414242767166015106 0ustar00runnerdockerMetadata-Version: 2.1 Name: fast-histogram Version: 0.11 Summary: Fast simple 1D and 2D histograms Home-page: https://github.com/astrofrog/fast-histogram Author: Thomas Robitaille Author-email: thomas.robitaille@gmail.com License: BSD Requires-Python: >=3.6 Provides-Extra: test License-File: LICENSE |CI Status| |asv| About ----- Sometimes you just want to compute simple 1D or 2D histograms with regular bins. Fast. No nonsense. `Numpy's `__ histogram functions are versatile, and can handle for example non-regular binning, but this versatility comes at the expense of performance. The **fast-histogram** mini-package aims to provide simple and fast histogram functions for regular bins that don't compromise on performance. It doesn't do anything complicated - it just implements a simple histogram algorithm in C and keeps it simple. The aim is to have functions that are fast but also robust and reliable. The result is a 1D histogram function here that is **7-15x faster** than ``numpy.histogram``, and a 2D histogram function that is **20-25x faster** than ``numpy.histogram2d``. To install:: pip install fast-histogram or if you use conda you can instead do:: conda install -c conda-forge fast-histogram The ``fast_histogram`` module then provides two functions: ``histogram1d`` and ``histogram2d``: .. code:: python from fast_histogram import histogram1d, histogram2d Example ------- Here's an example of binning 10 million points into a regular 2D histogram: .. code:: python In [1]: import numpy as np In [2]: x = np.random.random(10_000_000) In [3]: y = np.random.random(10_000_000) In [4]: %timeit _ = np.histogram2d(x, y, range=[[-1, 2], [-2, 4]], bins=30) 935 ms ± 58.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) In [5]: from fast_histogram import histogram2d In [6]: %timeit _ = histogram2d(x, y, range=[[-1, 2], [-2, 4]], bins=30) 40.2 ms ± 624 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) (note that ``10_000_000`` is possible in Python 3.6 syntax, use ``10000000`` instead in previous versions) The version here is over 20 times faster! The following plot shows the speedup as a function of array size for the bin parameters shown above: .. figure:: https://github.com/astrofrog/fast-histogram/raw/master/speedup_compared.png :alt: Comparison of performance between Numpy and fast-histogram as well as results for the 1D case, also with 30 bins. The speedup for the 2D case is consistently between 20-25x, and for the 1D case goes from 15x for small arrays to around 7x for large arrays. Q&A --- Why don't the histogram functions return the edges? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Computing and returning the edges may seem trivial but it can slow things down by a factor of a few when computing histograms of 10^5 or fewer elements, so not returning the edges is a deliberate decision related to performance. You can easily compute the edges yourself if needed though, using ``numpy.linspace``. Doesn't package X already do this, but better? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This may very well be the case! 
If this duplicates another package, or if it is possible to use Numpy in a smarter way to get the same performance gains, please open an issue and I'll consider deprecating this package :) One package that does include fast histogram functions (including in n-dimensions) and can compute other statistics is `vaex `_, so take a look there if you need more advanced functionality! Are the 2D histograms not transposed compared to what they should be? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ There is technically no 'right' and 'wrong' orientation - here we adopt the convention which gives results consistent with Numpy, so: .. code:: python numpy.histogram2d(x, y, range=[[xmin, xmax], [ymin, ymax]], bins=[nx, ny]) should give the same result as: .. code:: python fast_histogram.histogram2d(x, y, range=[[xmin, xmax], [ymin, ymax]], bins=[nx, ny]) Why not contribute this to Numpy directly? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ As mentioned above, the Numpy functions are much more versatile, so they could not be replaced by the ones here. One option would be to check in Numpy's functions for cases that are simple and dispatch to functions such as the ones here, or add dedicated functions for regular binning. I hope we can get this in Numpy in some form or another eventually, but for now, the aim is to have this available to packages that need to support a range of Numpy versions. Why not use Cython? ~~~~~~~~~~~~~~~~~~~ I originally implemented this in Cython, but found that I could get a 50% performance improvement by going straight to a C extension. What about using Numba? ~~~~~~~~~~~~~~~~~~~~~~~ I specifically want to keep this package as easy as possible to install, and while `Numba `__ is a great package, it is not trivial to install outside of Anaconda. Could this be parallelized? ~~~~~~~~~~~~~~~~~~~~~~~~~~~ This may benefit from parallelization under certain circumstances. The easiest solution might be to use OpenMP, but this won't work on all platforms, so it would need to be made optional. Couldn't you make it faster by using the GPU? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Almost certainly, though the aim here is to have an easily installable and portable package, and introducing GPUs is going to affect both of these. Why make a package specifically for this? This is a tiny amount of functionality ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Packages that need this could simply bundle their own C extension or Cython code to do this, but the main motivation for releasing this as a mini-package is to avoid making pure-Python packages into packages that require compilation just because of the need to compute fast histograms. Can I contribute? ~~~~~~~~~~~~~~~~~ Yes please! This is not meant to be a finished package, and I welcome pull request to improve things. .. |CI Status| image:: https://github.com/astrofrog/fast-histogram/actions/workflows/main.yml/badge.svg :target: https://github.com/astrofrog/fast-histogram/actions/workflows/main.yml .. |asv| image:: https://img.shields.io/badge/benchmarked%20by-asv-brightgreen.svg :target: https://astrofrog.github.io/fast-histogram ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337689.0 fast-histogram-0.11/README.rst0000644000175100001720000001405614242767131015467 0ustar00runnerdocker|CI Status| |asv| About ----- Sometimes you just want to compute simple 1D or 2D histograms with regular bins. Fast. No nonsense. 
`Numpy's `__ histogram functions are versatile, and can handle for example non-regular binning, but this versatility comes at the expense of performance. The **fast-histogram** mini-package aims to provide simple and fast histogram functions for regular bins that don't compromise on performance. It doesn't do anything complicated - it just implements a simple histogram algorithm in C and keeps it simple. The aim is to have functions that are fast but also robust and reliable. The result is a 1D histogram function here that is **7-15x faster** than ``numpy.histogram``, and a 2D histogram function that is **20-25x faster** than ``numpy.histogram2d``. To install:: pip install fast-histogram or if you use conda you can instead do:: conda install -c conda-forge fast-histogram The ``fast_histogram`` module then provides two functions: ``histogram1d`` and ``histogram2d``: .. code:: python from fast_histogram import histogram1d, histogram2d Example ------- Here's an example of binning 10 million points into a regular 2D histogram: .. code:: python In [1]: import numpy as np In [2]: x = np.random.random(10_000_000) In [3]: y = np.random.random(10_000_000) In [4]: %timeit _ = np.histogram2d(x, y, range=[[-1, 2], [-2, 4]], bins=30) 935 ms ± 58.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) In [5]: from fast_histogram import histogram2d In [6]: %timeit _ = histogram2d(x, y, range=[[-1, 2], [-2, 4]], bins=30) 40.2 ms ± 624 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) (note that ``10_000_000`` is possible in Python 3.6 syntax, use ``10000000`` instead in previous versions) The version here is over 20 times faster! The following plot shows the speedup as a function of array size for the bin parameters shown above: .. figure:: https://github.com/astrofrog/fast-histogram/raw/master/speedup_compared.png :alt: Comparison of performance between Numpy and fast-histogram as well as results for the 1D case, also with 30 bins. The speedup for the 2D case is consistently between 20-25x, and for the 1D case goes from 15x for small arrays to around 7x for large arrays. Q&A --- Why don't the histogram functions return the edges? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Computing and returning the edges may seem trivial but it can slow things down by a factor of a few when computing histograms of 10^5 or fewer elements, so not returning the edges is a deliberate decision related to performance. You can easily compute the edges yourself if needed though, using ``numpy.linspace``. Doesn't package X already do this, but better? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This may very well be the case! If this duplicates another package, or if it is possible to use Numpy in a smarter way to get the same performance gains, please open an issue and I'll consider deprecating this package :) One package that does include fast histogram functions (including in n-dimensions) and can compute other statistics is `vaex `_, so take a look there if you need more advanced functionality! Are the 2D histograms not transposed compared to what they should be? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ There is technically no 'right' and 'wrong' orientation - here we adopt the convention which gives results consistent with Numpy, so: .. code:: python numpy.histogram2d(x, y, range=[[xmin, xmax], [ymin, ymax]], bins=[nx, ny]) should give the same result as: .. 
code:: python fast_histogram.histogram2d(x, y, range=[[xmin, xmax], [ymin, ymax]], bins=[nx, ny]) Why not contribute this to Numpy directly? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ As mentioned above, the Numpy functions are much more versatile, so they could not be replaced by the ones here. One option would be to check in Numpy's functions for cases that are simple and dispatch to functions such as the ones here, or add dedicated functions for regular binning. I hope we can get this in Numpy in some form or another eventually, but for now, the aim is to have this available to packages that need to support a range of Numpy versions. Why not use Cython? ~~~~~~~~~~~~~~~~~~~ I originally implemented this in Cython, but found that I could get a 50% performance improvement by going straight to a C extension. What about using Numba? ~~~~~~~~~~~~~~~~~~~~~~~ I specifically want to keep this package as easy as possible to install, and while `Numba `__ is a great package, it is not trivial to install outside of Anaconda. Could this be parallelized? ~~~~~~~~~~~~~~~~~~~~~~~~~~~ This may benefit from parallelization under certain circumstances. The easiest solution might be to use OpenMP, but this won't work on all platforms, so it would need to be made optional. Couldn't you make it faster by using the GPU? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Almost certainly, though the aim here is to have an easily installable and portable package, and introducing GPUs is going to affect both of these. Why make a package specifically for this? This is a tiny amount of functionality ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Packages that need this could simply bundle their own C extension or Cython code to do this, but the main motivation for releasing this as a mini-package is to avoid making pure-Python packages into packages that require compilation just because of the need to compute fast histograms. Can I contribute? ~~~~~~~~~~~~~~~~~ Yes please! This is not meant to be a finished package, and I welcome pull request to improve things. .. |CI Status| image:: https://github.com/astrofrog/fast-histogram/actions/workflows/main.yml/badge.svg :target: https://github.com/astrofrog/fast-histogram/actions/workflows/main.yml .. |asv| image:: https://img.shields.io/badge/benchmarked%20by-asv-brightgreen.svg :target: https://astrofrog.github.io/fast-histogram ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1653337717.8848438 fast-histogram-0.11/comparison/0000755000175100001720000000000014242767166016154 5ustar00runnerdocker././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337689.0 fast-histogram-0.11/comparison/README.rst0000644000175100001720000000036414242767131017636 0ustar00runnerdockerAbout ----- The scripts in this directory are used to make the speedup_compared.png plot at the root of this repository. For more complete benchmarks, you can also visit our `asv benchmarks `_ page. 
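The Q&A above points out that the histogram functions deliberately do not return the bin edges, and that they can easily be recomputed with ``numpy.linspace``. Here is a minimal sketch of doing so for the 1D case; the bin count and range are arbitrary illustration values.

.. code:: python

    import numpy as np

    from fast_histogram import histogram1d

    x = np.random.random(100_000)

    nbins, xrange = 30, (-1, 2)
    counts = histogram1d(x, bins=nbins, range=xrange)

    # The edges are not returned, but for regular bins they are simply a
    # linearly spaced grid of nbins + 1 points over the requested range.
    edges = np.linspace(xrange[0], xrange[1], nbins + 1)

    assert counts.shape == (nbins,) and edges.shape == (nbins + 1,)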
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337689.0 fast-histogram-0.11/comparison/benchmark.py0000644000175100001720000000501114242767131020445 0ustar00runnerdocker# Script to compare the speedup provided by fast-histogram import numpy as np from timeit import timeit, repeat SETUP_1D = """ import numpy as np from numpy import histogram as np_histogram1d from fast_histogram import histogram1d x = np.random.random({size}) """ NUMPY_1D_STMT = "np_histogram1d(x, range=[-1, 2], bins=30)" FAST_1D_STMT = "histogram1d(x, range=[-1, 2], bins=30)" SETUP_2D = """ import numpy as np from numpy import histogram2d as np_histogram2d from fast_histogram import histogram2d x = np.random.random({size}) y = np.random.random({size}) """ NUMPY_2D_STMT = "np_histogram2d(x, y, range=[[-1, 2], [-2, 4]], bins=30)" FAST_2D_STMT = "histogram2d(x, y, range=[[-1, 2], [-2, 4]], bins=30)" # How long each benchmark should aim to take TARGET_TIME = 1.0 def time_stats(stmt=None, setup=None): # Call once to check how long it takes time_single = timeit(stmt=stmt, setup=setup, number=1) # Find out how many times we can call it. We always call it at least three # times for accuracy number = max(3, int(TARGET_TIME / time_single)) print(' -> estimated time to complete test: {0:.1f}s'.format(time_single * 10 * number)) times = repeat(stmt=stmt, setup=setup, repeat=10, number=number) return np.min(times) / number, np.mean(times) / number, np.median(times) / number FMT_HEADER = '# {:7s}' + ' {:10s}' * 12 + '\n' FMT = '{:9d}' + ' {:10.7e}' * 12 + '\n' with open('benchmark_times.txt', 'w') as f: f.write(FMT_HEADER.format('size', 'np_1d_min', 'np_1d_mean', 'np_1d_median', 'fa_1d_min', 'fa_1d_mean', 'fa_1d_median', 'np_2d_min', 'np_2d_mean', 'np_2d_median', 'fa_2d_min', 'fa_2d_mean', 'fa_2d_median')) for log10_size in range(0, 9): size = int(10 ** log10_size) print('Running benchmarks for size={0}'.format(size)) np_1d_min, np_1d_mean, np_1d_median = time_stats(stmt=NUMPY_1D_STMT, setup=SETUP_1D.format(size=size)) fa_1d_min, fa_1d_mean, fa_1d_median = time_stats(stmt=FAST_1D_STMT, setup=SETUP_1D.format(size=size)) np_2d_min, np_2d_mean, np_2d_median = time_stats(stmt=NUMPY_2D_STMT, setup=SETUP_2D.format(size=size)) fa_2d_min, fa_2d_mean, fa_2d_median = time_stats(stmt=FAST_2D_STMT, setup=SETUP_2D.format(size=size)) f.write(FMT.format(size, np_1d_min, np_1d_mean, np_1d_median, fa_1d_min, fa_1d_mean, fa_1d_median, np_2d_min, np_2d_mean, np_2d_median, fa_2d_min, fa_2d_mean, fa_2d_median)) f.flush() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337689.0 fast-histogram-0.11/comparison/plot.py0000644000175100001720000000137114242767131017476 0ustar00runnerdocker# Script to make the comparison plot for the benchmark import numpy as np import matplotlib.pyplot as plt (size, np_1d_min, np_1d_mean, np_1d_median, fa_1d_min, fa_1d_mean, fa_1d_median, np_2d_min, np_2d_mean, np_2d_median, fa_2d_min, fa_2d_mean, fa_2d_median) = np.loadtxt('benchmark_times.txt', unpack=True) fig = plt.figure() ax = fig.add_subplot(1, 1, 1) ax.plot(size, np_1d_min / fa_1d_min, color=(34 / 255, 122 / 255, 181 / 255), label='1D') ax.plot(size, np_2d_min / fa_2d_min, color=(255 / 255, 133 / 255, 25 / 255), label='2D') ax.set_xscale('log') ax.set_xlim(0.3, 3e8) ax.set_ylim(1, 35) ax.grid() ax.set_xlabel('Array size') ax.set_ylabel('Speedup (fast-histogram / numpy)') ax.legend() fig.savefig('speedup_compared.png', bbox_inches='tight') 
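The Q&A above also states that ``fast_histogram.histogram2d`` adopts the same orientation convention as ``numpy.histogram2d``. The sketch below checks that the two agree for one set of inputs; the array sizes and binning are arbitrary, and counts for points landing exactly on an interior bin edge could in principle differ due to floating-point rounding.

.. code:: python

    import numpy as np

    from fast_histogram import histogram2d

    x = np.random.random(10_000)
    y = np.random.random(10_000)
    limits = [[-1, 2], [-2, 4]]

    counts_np, _, _ = np.histogram2d(x, y, range=limits, bins=30)
    counts_fast = histogram2d(x, y, range=limits, bins=30)

    # Same shape and (for these inputs) the same counts, without transposition.
    assert counts_fast.shape == counts_np.shape == (30, 30)
    assert np.allclose(counts_np, counts_fast)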
././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1653337717.8848438 fast-histogram-0.11/fast_histogram/0000755000175100001720000000000014242767166017014 5ustar00runnerdocker././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337689.0 fast-histogram-0.11/fast_histogram/__init__.py0000644000175100001720000000010514242767131021111 0ustar00runnerdockerfrom .histogram import * from .version import version as __version__ ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337689.0 fast-histogram-0.11/fast_histogram/_histogram_core.c0000644000175100001720000011210614242767131022315 0ustar00runnerdocker#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION #define Py_LIMITED_API 0x030600f0 #include #include #include /* Define docstrings */ static char module_docstring[] = "Fast histogram functioins"; static char _histogram1d_docstring[] = "Compute a 1D histogram"; static char _histogram2d_docstring[] = "Compute a 2D histogram"; static char _histogramdd_docstring[] = "Compute a histogram with arbitrary dimensionality"; static char _histogram1d_weighted_docstring[] = "Compute a weighted 1D histogram"; static char _histogram2d_weighted_docstring[] = "Compute a weighted 2D histogram"; static char _histogramdd_weighted_docstring[] = "Compute a weighted histogram with arbitrary dimensionality"; /* Declare the C functions here. */ static PyObject *_histogram1d(PyObject *self, PyObject *args); static PyObject *_histogram2d(PyObject *self, PyObject *args); static PyObject *_histogramdd(PyObject *self, PyObject *args); static PyObject *_histogram1d_weighted(PyObject *self, PyObject *args); static PyObject *_histogram2d_weighted(PyObject *self, PyObject *args); static PyObject *_histogramdd_weighted(PyObject *self, PyObject *args); /* Define the methods that will be available on the module. */ static PyMethodDef module_methods[] = { {"_histogram1d", _histogram1d, METH_VARARGS, _histogram1d_docstring}, {"_histogram2d", _histogram2d, METH_VARARGS, _histogram2d_docstring}, {"_histogramdd", _histogramdd, METH_VARARGS, _histogramdd_docstring}, {"_histogram1d_weighted", _histogram1d_weighted, METH_VARARGS, _histogram1d_weighted_docstring}, {"_histogram2d_weighted", _histogram2d_weighted, METH_VARARGS, _histogram2d_weighted_docstring}, {"_histogramdd_weighted", _histogramdd_weighted, METH_VARARGS, _histogramdd_weighted_docstring}, {NULL, NULL, 0, NULL} }; /* This is the function that is called on import. 
*/ #if PY_MAJOR_VERSION >= 3 #define MOD_ERROR_VAL NULL #define MOD_SUCCESS_VAL(val) val #define MOD_INIT(name) PyMODINIT_FUNC PyInit_##name(void) #define MOD_DEF(ob, name, doc, methods) \ static struct PyModuleDef moduledef = { \ PyModuleDef_HEAD_INIT, name, doc, -1, methods, }; \ ob = PyModule_Create(&moduledef); #else #define MOD_ERROR_VAL #define MOD_SUCCESS_VAL(val) #define MOD_INIT(name) void init##name(void) #define MOD_DEF(ob, name, doc, methods) \ ob = Py_InitModule3(name, methods, doc); #endif MOD_INIT(_histogram_core) { PyObject *m; MOD_DEF(m, "_histogram_core", module_docstring, module_methods); if (m == NULL) return MOD_ERROR_VAL; import_array(); return MOD_SUCCESS_VAL(m); } static PyObject *_histogram1d(PyObject *self, PyObject *args) { long n; int ix, nx; double xmin, xmax, tx, fnx, normx; PyObject *x_obj, *count_obj; PyArrayObject *x_array, *count_array; npy_intp dims[1]; double *count; NpyIter *iter; NpyIter_IterNextFunc *iternext; char **dataptr; npy_intp *strideptr, *innersizeptr; PyArray_Descr *dtype; /* Parse the input tuple */ if (!PyArg_ParseTuple(args, "Oidd", &x_obj, &nx, &xmin, &xmax)) { PyErr_SetString(PyExc_TypeError, "Error parsing input"); return NULL; } /* Interpret the input objects as `numpy` arrays. */ x_array = (PyArrayObject *)PyArray_FROM_O(x_obj); /* If that didn't work, throw an `Exception`. */ if (x_array == NULL) { PyErr_SetString(PyExc_TypeError, "Couldn't parse the input arrays."); Py_XDECREF(x_array); return NULL; } /* How many data points are there? */ n = (long)PyArray_DIM(x_array, 0); /* Build the output array */ dims[0] = nx; count_obj = PyArray_SimpleNew(1, dims, NPY_DOUBLE); if (count_obj == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't build output array"); Py_DECREF(x_array); Py_XDECREF(count_obj); return NULL; } count_array = (PyArrayObject *)count_obj; PyArray_FILLWBYTE(count_array, 0); if (n == 0) { Py_DECREF(x_array); return count_obj; } dtype = PyArray_DescrFromType(NPY_DOUBLE); iter = NpyIter_New(x_array, NPY_ITER_READONLY | NPY_ITER_EXTERNAL_LOOP | NPY_ITER_BUFFERED, NPY_KEEPORDER, NPY_SAFE_CASTING, dtype); if (iter == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't set up iterator"); Py_DECREF(x_array); Py_DECREF(count_obj); Py_DECREF(count_array); return NULL; } /* * The iternext function gets stored in a local variable * so it can be called repeatedly in an efficient manner. 
*/ iternext = NpyIter_GetIterNext(iter, NULL); if (iternext == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't set up iterator"); NpyIter_Deallocate(iter); Py_DECREF(x_array); Py_DECREF(count_obj); Py_DECREF(count_array); return NULL; } /* The location of the data pointer which the iterator may update */ dataptr = NpyIter_GetDataPtrArray(iter); /* The location of the stride which the iterator may update */ strideptr = NpyIter_GetInnerStrideArray(iter); /* The location of the inner loop size which the iterator may update */ innersizeptr = NpyIter_GetInnerLoopSizePtr(iter); /* Pre-compute variables for efficiency in the histogram calculation */ fnx = nx; normx = fnx / (xmax - xmin); /* Get C array for output array */ count = (double *)PyArray_DATA(count_array); Py_BEGIN_ALLOW_THREADS do { /* Get the inner loop data/stride/count values */ npy_intp stride = *strideptr; npy_intp size = *innersizeptr; /* This is a typical inner loop for NPY_ITER_EXTERNAL_LOOP */ while (size--) { tx = *(double *)dataptr[0]; if (tx >= xmin && tx < xmax) { ix = (tx - xmin) * normx; count[ix] += 1.; } dataptr[0] += stride; } } while (iternext(iter)); Py_END_ALLOW_THREADS NpyIter_Deallocate(iter); /* Clean up. */ Py_DECREF(x_array); return count_obj; } static PyObject *_histogram2d(PyObject *self, PyObject *args) { long n; int ix, iy, nx, ny; double xmin, xmax, tx, fnx, normx, ymin, ymax, ty, fny, normy; PyObject *x_obj, *y_obj, *count_obj; PyArrayObject *x_array, *y_array, *count_array, *arrays[2]; npy_intp dims[2]; double *count; NpyIter *iter; NpyIter_IterNextFunc *iternext; char **dataptr; npy_intp *strideptr, *innersizeptr; PyArray_Descr *dtypes[] = {PyArray_DescrFromType(NPY_DOUBLE), PyArray_DescrFromType(NPY_DOUBLE)}; npy_uint32 op_flags[] = {NPY_ITER_READONLY, NPY_ITER_READONLY}; /* Parse the input tuple */ if (!PyArg_ParseTuple(args, "OOiddidd", &x_obj, &y_obj, &nx, &xmin, &xmax, &ny, &ymin, &ymax)) { PyErr_SetString(PyExc_TypeError, "Error parsing input"); return NULL; } /* Interpret the input objects as `numpy` arrays. */ x_array = (PyArrayObject *)PyArray_FROM_O(x_obj); y_array = (PyArrayObject *)PyArray_FROM_O(y_obj); /* If that didn't work, throw an `Exception`. */ if (x_array == NULL || y_array == NULL) { PyErr_SetString(PyExc_TypeError, "Couldn't parse the input arrays."); Py_XDECREF(x_array); Py_XDECREF(y_array); return NULL; } /* How many data points are there? */ n = (long)PyArray_DIM(x_array, 0); /* Check the dimensions. 
*/ if (n != (long)PyArray_DIM(y_array, 0)) { PyErr_SetString(PyExc_RuntimeError, "Dimension mismatch between x and y"); Py_DECREF(x_array); Py_DECREF(y_array); return NULL; } /* Build the output array */ dims[0] = nx; dims[1] = ny; count_obj = PyArray_SimpleNew(2, dims, NPY_DOUBLE); if (count_obj == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't build output array"); Py_DECREF(x_array); Py_DECREF(y_array); Py_XDECREF(count_obj); return NULL; } count_array = (PyArrayObject *)count_obj; PyArray_FILLWBYTE(count_array, 0); if (n == 0) { Py_DECREF(x_array); Py_DECREF(y_array); return count_obj; } arrays[0] = x_array; arrays[1] = y_array; iter = NpyIter_AdvancedNew(2, arrays, NPY_ITER_EXTERNAL_LOOP | NPY_ITER_BUFFERED, NPY_KEEPORDER, NPY_SAFE_CASTING, op_flags, dtypes, -1, NULL, NULL, 0); if (iter == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't set up iterator"); Py_DECREF(x_array); Py_DECREF(y_array); Py_DECREF(count_obj); Py_DECREF(count_array); return NULL; } /* * The iternext function gets stored in a local variable * so it can be called repeatedly in an efficient manner. */ iternext = NpyIter_GetIterNext(iter, NULL); if (iternext == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't set up iterator"); NpyIter_Deallocate(iter); Py_DECREF(x_array); Py_DECREF(y_array); Py_DECREF(count_obj); Py_DECREF(count_array); return NULL; } /* The location of the data pointer which the iterator may update */ dataptr = NpyIter_GetDataPtrArray(iter); /* The location of the stride which the iterator may update */ strideptr = NpyIter_GetInnerStrideArray(iter); /* The location of the inner loop size which the iterator may update */ innersizeptr = NpyIter_GetInnerLoopSizePtr(iter); /* Pre-compute variables for efficiency in the histogram calculation */ fnx = nx; fny = ny; normx = fnx / (xmax - xmin); normy = fny / (ymax - ymin); /* Get C array for output array */ count = (double *)PyArray_DATA(count_array); Py_BEGIN_ALLOW_THREADS do { /* Get the inner loop data/stride/count values */ npy_intp stride0 = strideptr[0]; npy_intp stride1 = strideptr[1]; npy_intp size = *innersizeptr; /* This is a typical inner loop for NPY_ITER_EXTERNAL_LOOP */ while (size--) { tx = *(double *)dataptr[0]; ty = *(double *)dataptr[1]; if (tx >= xmin && tx < xmax && ty >= ymin && ty < ymax) { ix = (tx - xmin) * normx; iy = (ty - ymin) * normy; count[iy + ny * ix] += 1.; } dataptr[0] += stride0; dataptr[1] += stride1; } } while (iternext(iter)); Py_END_ALLOW_THREADS NpyIter_Deallocate(iter); /* Clean up. */ Py_DECREF(x_array); Py_DECREF(y_array); return count_obj; } static PyObject *_histogramdd(PyObject *self, PyObject *args) { long n; int ndim, sample_parsing_success; PyObject *sample_obj, *range_obj, *bins_obj, *count_obj; PyArrayObject **arrays, *range, *bins, *count_array; npy_intp *dims; double *count, *range_c, *fndim, *norms; double tx; int bin_idx, local_bin_idx, in_range, *stride; // using xmin and xmax for all dimensions double xmin, xmax; NpyIter *iter; NpyIter_IterNextFunc *iternext; char **dataptr; npy_intp *strideptr, *innersizeptr; PyArray_Descr *dtype; PyArray_Descr **dtypes; npy_uint32 *op_flags; /* Parse the input tuple */ if (!PyArg_ParseTuple(args, "OOO", &sample_obj, &bins_obj, &range_obj)) { PyErr_SetString(PyExc_TypeError, "Error parsing input"); return NULL; } ndim = (int)PyTuple_Size(sample_obj); /* Interpret the input objects as `numpy` arrays. 
*/ arrays = (PyArrayObject **)malloc(sizeof(PyArrayObject *) * ndim); sample_parsing_success = 1; for (int i = 0; i < ndim; i++){ arrays[i] = (PyArrayObject *)PyArray_FROM_O(PyTuple_GetItem(sample_obj, i)); if (arrays[i] == NULL){ sample_parsing_success = 0; } } dtype = PyArray_DescrFromType(NPY_DOUBLE); range = (PyArrayObject *)PyArray_FromAny(range_obj, dtype, 2, 2, NPY_ARRAY_IN_ARRAY, NULL); dtype = PyArray_DescrFromType(NPY_INTP); bins = (PyArrayObject *)PyArray_FromAny(bins_obj, dtype, 1, 1, NPY_ARRAY_IN_ARRAY, NULL); /* If that didn't work, throw an `Exception`. */ if (range == NULL || bins == NULL || !sample_parsing_success) { PyErr_SetString(PyExc_TypeError, "Couldn't parse at least one of the input arrays." " `range` must be passed as a 2D ndarray of type `np.double`," " `bins` must be passed as a 1D ndarray of type `np.intp`."); for (int i = 0; i < ndim; i++){ Py_XDECREF(arrays[i]); } Py_XDECREF(range); Py_XDECREF(bins); free(arrays); return NULL; } /* How many data points are there? */ n = (long)PyArray_DIM(arrays[0], 0); if (ndim > 1){ for (int i = 0; i < ndim; i++){ if (!((long)PyArray_DIM(arrays[i], 0) == n)){ PyErr_SetString(PyExc_RuntimeError, "Lengths of sample arrays do not match."); for (int j = 0; j < ndim; j++){ Py_XDECREF(arrays[i]); } Py_XDECREF(range); Py_XDECREF(bins); free(arrays); return NULL; } } } /* copy the content of `bins` into `dims` */ dtype = PyArray_DescrFromType(NPY_INTP); iter = NpyIter_New(bins, NPY_ITER_READONLY, NPY_CORDER, NPY_SAFE_CASTING, dtype); if (iter == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't set up iterator over binning."); for (int i = 0; i < ndim; i++){ Py_XDECREF(arrays[i]); } Py_XDECREF(range); Py_XDECREF(bins); free(arrays); return NULL; } iternext = NpyIter_GetIterNext(iter, NULL); if (iternext == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't set up iteration function over binning."); for (int i = 0; i < ndim; i++){ Py_XDECREF(arrays[i]); } NpyIter_Deallocate(iter); Py_XDECREF(range); Py_XDECREF(bins); free(arrays); return NULL; } dataptr = NpyIter_GetDataPtrArray(iter); dims = (npy_intp *)malloc(sizeof(npy_intp) * ndim); int i = 0; do{ dims[i] = *(npy_intp *)dataptr[0]; i++; } while (iternext(iter)); NpyIter_Deallocate(iter); /* build the output array */ count_obj = PyArray_SimpleNew(ndim, dims, NPY_DOUBLE); if (count_obj == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't build output array"); for (int i = 0; i < ndim; i++){ Py_XDECREF(arrays[i]); } Py_XDECREF(range); Py_XDECREF(bins); Py_XDECREF(count_obj); free(arrays); free(dims); return NULL; } count_array = (PyArrayObject *)count_obj; PyArray_FILLWBYTE(count_array, 0); if (n == 0) { for (int i = 0; i < ndim; i++){ Py_XDECREF(arrays[i]); } Py_XDECREF(range); Py_XDECREF(bins); free(arrays); free(dims); return count_obj; } /* copy the content of the numpy array `ranges` into a simple C array */ // This just makes is easier to access the values later in the loop. range_c = (double *)malloc(sizeof(double) * ndim * 2); dtype = PyArray_DescrFromType(NPY_DOUBLE); iter = NpyIter_New(range, NPY_ITER_READONLY, NPY_CORDER, NPY_SAFE_CASTING, dtype); if (iter == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't set up iterator over range. 
This needs to be passed as type `numpy.double`."); for (int i = 0; i < ndim; i++){ Py_XDECREF(arrays[i]); } Py_XDECREF(range); Py_XDECREF(bins); Py_DECREF(count_obj); free(arrays); free(dims); free(range_c); return NULL; } iternext = NpyIter_GetIterNext(iter, NULL); if (iternext == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't set up iteration function over range. This needs to be passed as type `numpy.double`."); NpyIter_Deallocate(iter); for (int i = 0; i < ndim; i++){ Py_XDECREF(arrays[i]); } Py_XDECREF(range); Py_XDECREF(bins); Py_DECREF(count_obj); free(arrays); free(dims); free(range_c); return NULL; } dataptr = NpyIter_GetDataPtrArray(iter); i = 0; do{ range_c[i] = *(double *)dataptr[0]; i++; } while (iternext(iter)); NpyIter_Deallocate(iter); /* now we pre-compute the bin normalizations for all dimensions */ fndim = (double *)malloc(sizeof(double) * ndim); norms = (double *)malloc(sizeof(double) * ndim); for (int j = 0; j < ndim; j++){ fndim[j] = (double)dims[j]; norms[j] = fndim[j] / (range_c[j * 2 + 1] - range_c[j * 2]); } dtypes = (PyArray_Descr **)malloc(sizeof(PyArray_Descr *) * ndim); op_flags = (npy_uint32 *)malloc(sizeof(npy_uint32) * ndim); for (int i = 0; i < ndim; i++){ dtypes[i] = PyArray_DescrFromType(NPY_DOUBLE); op_flags[i] = NPY_ITER_READONLY; } iter = NpyIter_AdvancedNew(ndim, arrays, NPY_ITER_EXTERNAL_LOOP | NPY_ITER_BUFFERED, NPY_KEEPORDER, NPY_SAFE_CASTING, op_flags, dtypes, -1, NULL, NULL, 0); if (iter == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't set up iterator"); for (int i = 0; i < ndim; i++){ Py_XDECREF(arrays[i]); } Py_DECREF(count_obj); free(arrays); free(dims); free(range_c); free(fndim); free(norms); free(dtypes); free(op_flags); return NULL; } /* * The iternext function gets stored in a local variable * so it can be called repeatedly in an efficient manner. */ iternext = NpyIter_GetIterNext(iter, NULL); if (iternext == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't set up iterator"); NpyIter_Deallocate(iter); for (int i = 0; i < ndim; i++){ Py_XDECREF(arrays[i]); } Py_DECREF(count_obj); free(arrays); free(dims); free(range_c); free(fndim); free(norms); free(dtypes); free(op_flags); return NULL; } /* The location of the data pointer which the iterator may update */ dataptr = NpyIter_GetDataPtrArray(iter); /* The location of the stride which the iterator may update */ strideptr = NpyIter_GetInnerStrideArray(iter); /* The location of the inner loop size which the iterator may update */ innersizeptr = NpyIter_GetInnerLoopSizePtr(iter); /* Get C array for output array */ count = (double *)PyArray_DATA(count_array); /* Pre-compute index stride */ // We comput the strides for the bin index for each dimension. The desired // behavior is this: // 1D: bin_idx = ix // --> stride = {1} // 2D: bin_idx = ny * ix + iy // --> stride = {ny, 1} // 3D: bin_idx = nz * ny * ix + nz * iy + iz // --> stride = {nz * ny, nz, 1} // ... and so on for higher dimensions. // Notice how the order of multiplication requires that we step through the // dimensions backwards. 
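// Worked example (values chosen purely for illustration): with ndim = 3 and
// bins = {4, 3, 2}, the loop below yields stride = {3 * 2, 2, 1} = {6, 2, 1},
// so a sample falling in bin (ix=2, iy=1, iz=0) is accumulated at flat index
// 2 * 6 + 1 * 2 + 0 * 1 = 14 of the C-ordered `count` array.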
stride = (int *)malloc(sizeof(int) * ndim); for (int i = 0; i < ndim; i++){ stride[i] = 1; } if (ndim > 1){ for (int i = ndim - 1; i > 0; i--){ stride[i - 1] = stride[i] * (int)dims[i]; } } Py_BEGIN_ALLOW_THREADS do { /* Get the inner loop data/stride/count values */ npy_intp size = *innersizeptr; /* This is a typical inner loop for NPY_ITER_EXTERNAL_LOOP */ while (size--) { bin_idx = 0; in_range = 1; for (int i = 0; i < ndim; i++){ xmin = range_c[i * 2]; xmax = range_c[i * 2 + 1]; tx = *(double *)dataptr[i]; dataptr[i] += strideptr[i]; if (tx < xmin || tx >= xmax){ in_range = 0; } else { local_bin_idx = (tx - xmin) * norms[i]; bin_idx += stride[i] * local_bin_idx; } } if (in_range){ count[bin_idx] += 1; } } } while (iternext(iter)); Py_END_ALLOW_THREADS NpyIter_Deallocate(iter); for (int i = 0; i < ndim; i++){ Py_XDECREF(arrays[i]); } Py_XDECREF(range); Py_XDECREF(bins); free(arrays); free(dims); free(range_c); free(fndim); free(norms); free(dtypes); free(op_flags); free(stride); return count_obj; } static PyObject *_histogram1d_weighted(PyObject *self, PyObject *args) { long n; int ix, nx; double xmin, xmax, tx, tw, fnx, normx; PyObject *x_obj, *w_obj, *count_obj; PyArrayObject *x_array, *w_array, *count_array, *arrays[2]; npy_intp dims[1]; double *count; NpyIter *iter; NpyIter_IterNextFunc *iternext; char **dataptr; npy_intp *strideptr, *innersizeptr; PyArray_Descr *dtypes[] = {PyArray_DescrFromType(NPY_DOUBLE), PyArray_DescrFromType(NPY_DOUBLE)}; npy_uint32 op_flags[] = {NPY_ITER_READONLY, NPY_ITER_READONLY}; /* Parse the input tuple */ if (!PyArg_ParseTuple(args, "OOidd", &x_obj, &w_obj, &nx, &xmin, &xmax)) { PyErr_SetString(PyExc_TypeError, "Error parsing input"); return NULL; } /* Interpret the input objects as `numpy` arrays. */ x_array = (PyArrayObject *)PyArray_FROM_O(x_obj); w_array = (PyArrayObject *)PyArray_FROM_O(w_obj); /* If that didn't work, throw an `Exception`. */ if (x_array == NULL || w_array == NULL) { PyErr_SetString(PyExc_TypeError, "Couldn't parse the input arrays."); Py_XDECREF(x_array); Py_XDECREF(w_array); return NULL; } /* How many data points are there? */ n = (long)PyArray_DIM(x_array, 0); /* Check the dimensions. */ if (n != (long)PyArray_DIM(w_array, 0)) { PyErr_SetString(PyExc_RuntimeError, "Dimension mismatch between x and w"); Py_DECREF(x_array); Py_DECREF(w_array); return NULL; } /* Build the output array */ dims[0] = nx; count_obj = PyArray_SimpleNew(1, dims, NPY_DOUBLE); if (count_obj == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't build output array"); Py_DECREF(x_array); Py_DECREF(w_array); Py_XDECREF(count_obj); return NULL; } count_array = (PyArrayObject *)count_obj; PyArray_FILLWBYTE(count_array, 0); if (n == 0) { Py_DECREF(x_array); Py_DECREF(w_array); return count_obj; } arrays[0] = x_array; arrays[1] = w_array; iter = NpyIter_AdvancedNew(2, arrays, NPY_ITER_EXTERNAL_LOOP | NPY_ITER_BUFFERED, NPY_KEEPORDER, NPY_SAFE_CASTING, op_flags, dtypes, -1, NULL, NULL, 0); if (iter == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't set up iterator"); Py_DECREF(x_array); Py_DECREF(w_array); Py_DECREF(count_obj); Py_DECREF(count_array); return NULL; } /* * The iternext function gets stored in a local variable * so it can be called repeatedly in an efficient manner. 
*/ iternext = NpyIter_GetIterNext(iter, NULL); if (iternext == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't set up iterator"); NpyIter_Deallocate(iter); Py_DECREF(x_array); Py_DECREF(w_array); Py_DECREF(count_obj); Py_DECREF(count_array); return NULL; } /* The location of the data pointer which the iterator may update */ dataptr = NpyIter_GetDataPtrArray(iter); /* The location of the stride which the iterator may update */ strideptr = NpyIter_GetInnerStrideArray(iter); /* The location of the inner loop size which the iterator may update */ innersizeptr = NpyIter_GetInnerLoopSizePtr(iter); /* Pre-compute variables for efficiency in the histogram calculation */ fnx = nx; normx = fnx / (xmax - xmin); /* Get C array for output array */ count = (double *)PyArray_DATA(count_array); Py_BEGIN_ALLOW_THREADS do { /* Get the inner loop data/stride/count values */ npy_intp stride0 = strideptr[0]; npy_intp stride1 = strideptr[1]; npy_intp size = *innersizeptr; /* This is a typical inner loop for NPY_ITER_EXTERNAL_LOOP */ while (size--) { tx = *(double *)dataptr[0]; tw = *(double *)dataptr[1]; if (tx >= xmin && tx < xmax) { ix = (tx - xmin) * normx; count[ix] += tw; } dataptr[0] += stride0; dataptr[1] += stride1; } } while (iternext(iter)); Py_END_ALLOW_THREADS NpyIter_Deallocate(iter); /* Clean up. */ Py_DECREF(x_array); Py_DECREF(w_array); return count_obj; } static PyObject *_histogram2d_weighted(PyObject *self, PyObject *args) { long n; int ix, iy, nx, ny; double xmin, xmax, tx, fnx, normx, ymin, ymax, ty, fny, normy, tw; PyObject *x_obj, *y_obj, *w_obj, *count_obj; PyArrayObject *x_array, *y_array, *w_array, *count_array, *arrays[3]; npy_intp dims[2]; double *count; NpyIter *iter; NpyIter_IterNextFunc *iternext; char **dataptr; npy_intp *strideptr, *innersizeptr; PyArray_Descr *dtypes[] = {PyArray_DescrFromType(NPY_DOUBLE), PyArray_DescrFromType(NPY_DOUBLE), PyArray_DescrFromType(NPY_DOUBLE)}; npy_uint32 op_flags[] = {NPY_ITER_READONLY, NPY_ITER_READONLY, NPY_ITER_READONLY}; /* Parse the input tuple */ if (!PyArg_ParseTuple(args, "OOOiddidd", &x_obj, &y_obj, &w_obj, &nx, &xmin, &xmax, &ny, &ymin, &ymax)) { PyErr_SetString(PyExc_TypeError, "Error parsing input"); return NULL; } /* Interpret the input objects as `numpy` arrays. */ x_array = (PyArrayObject *)PyArray_FROM_O(x_obj); y_array = (PyArrayObject *)PyArray_FROM_O(y_obj); w_array = (PyArrayObject *)PyArray_FROM_O(w_obj); /* If that didn't work, throw an `Exception`. */ if (x_array == NULL || y_array == NULL || w_array == NULL) { PyErr_SetString(PyExc_TypeError, "Couldn't parse the input arrays."); Py_XDECREF(x_array); Py_XDECREF(y_array); Py_XDECREF(w_array); return NULL; } /* How many data points are there? */ n = (long)PyArray_DIM(x_array, 0); /* Check the dimensions. 
*/ if (n != (long)PyArray_DIM(y_array, 0) || n != (long)PyArray_DIM(w_array, 0)) { PyErr_SetString(PyExc_RuntimeError, "Dimension mismatch between x, y, and w"); Py_DECREF(x_array); Py_DECREF(y_array); Py_DECREF(w_array); return NULL; } /* Build the output array */ dims[0] = nx; dims[1] = ny; count_obj = PyArray_SimpleNew(2, dims, NPY_DOUBLE); if (count_obj == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't build output array"); Py_DECREF(x_array); Py_DECREF(y_array); Py_DECREF(w_array); Py_XDECREF(count_obj); return NULL; } count_array = (PyArrayObject *)count_obj; PyArray_FILLWBYTE(count_array, 0); if (n == 0) { Py_DECREF(x_array); Py_DECREF(y_array); Py_DECREF(w_array); return count_obj; } arrays[0] = x_array; arrays[1] = y_array; arrays[2] = w_array; iter = NpyIter_AdvancedNew(3, arrays, NPY_ITER_EXTERNAL_LOOP | NPY_ITER_BUFFERED, NPY_KEEPORDER, NPY_SAFE_CASTING, op_flags, dtypes, -1, NULL, NULL, 0); if (iter == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't set up iterator"); Py_DECREF(x_array); Py_DECREF(y_array); Py_DECREF(w_array); Py_DECREF(count_obj); Py_DECREF(count_array); return NULL; } /* * The iternext function gets stored in a local variable * so it can be called repeatedly in an efficient manner. */ iternext = NpyIter_GetIterNext(iter, NULL); if (iternext == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't set up iterator"); NpyIter_Deallocate(iter); Py_DECREF(x_array); Py_DECREF(y_array); Py_DECREF(w_array); Py_DECREF(count_obj); Py_DECREF(count_array); return NULL; } /* The location of the data pointer which the iterator may update */ dataptr = NpyIter_GetDataPtrArray(iter); /* The location of the stride which the iterator may update */ strideptr = NpyIter_GetInnerStrideArray(iter); /* The location of the inner loop size which the iterator may update */ innersizeptr = NpyIter_GetInnerLoopSizePtr(iter); /* Pre-compute variables for efficiency in the histogram calculation */ fnx = nx; fny = ny; normx = fnx / (xmax - xmin); normy = fny / (ymax - ymin); /* Get C array for output array */ count = (double *)PyArray_DATA(count_array); Py_BEGIN_ALLOW_THREADS do { /* Get the inner loop data/stride/count values */ npy_intp stride0 = strideptr[0]; npy_intp stride1 = strideptr[1]; npy_intp stride2 = strideptr[2]; npy_intp size = *innersizeptr; /* This is a typical inner loop for NPY_ITER_EXTERNAL_LOOP */ while (size--) { tx = *(double *)dataptr[0]; ty = *(double *)dataptr[1]; tw = *(double *)dataptr[2]; if (tx >= xmin && tx < xmax && ty >= ymin && ty < ymax) { ix = (tx - xmin) * normx; iy = (ty - ymin) * normy; count[iy + ny * ix] += tw; } dataptr[0] += stride0; dataptr[1] += stride1; dataptr[2] += stride2; } } while (iternext(iter)); Py_END_ALLOW_THREADS NpyIter_Deallocate(iter); /* Clean up. 
*/ Py_DECREF(x_array); Py_DECREF(y_array); Py_DECREF(w_array); return count_obj; } static PyObject *_histogramdd_weighted(PyObject *self, PyObject *args) { long n; int ndim, sample_parsing_success; PyObject *sample_obj, *range_obj, *bins_obj, *count_obj, *weights_obj; PyArrayObject **arrays, *range, *bins, *count_array; npy_intp *dims; double *count, *range_c, *fndim, *norms; double tx, tw; int bin_idx, local_bin_idx, in_range, *stride; // using xmin and xmax for all dimensions double xmin, xmax; NpyIter *iter; NpyIter_IterNextFunc *iternext; char **dataptr; npy_intp *strideptr, *innersizeptr; PyArray_Descr *dtype; PyArray_Descr **dtypes; npy_uint32 *op_flags; /* Parse the input tuple */ if (!PyArg_ParseTuple(args, "OOOO", &sample_obj, &bins_obj, &range_obj, &weights_obj)) { PyErr_SetString(PyExc_TypeError, "Error parsing input"); return NULL; } ndim = (int)PyTuple_Size(sample_obj); /* Interpret the input objects as `numpy` arrays. */ arrays = (PyArrayObject **)malloc(sizeof(PyArrayObject *) * (ndim + 1)); sample_parsing_success = 1; for (int i = 0; i < ndim; i++){ arrays[i] = (PyArrayObject *)PyArray_FROM_O(PyTuple_GetItem(sample_obj, i)); if (arrays[i] == NULL){ sample_parsing_success = 0; } } /* the last index is always the weights array */ arrays[ndim] = (PyArrayObject *)PyArray_FROM_O(weights_obj); if (arrays[ndim] == NULL){ sample_parsing_success = 0; } dtype = PyArray_DescrFromType(NPY_DOUBLE); range = (PyArrayObject *)PyArray_FromAny(range_obj, dtype, 2, 2, NPY_ARRAY_IN_ARRAY, NULL); dtype = PyArray_DescrFromType(NPY_INTP); bins = (PyArrayObject *)PyArray_FromAny(bins_obj, dtype, 1, 1, NPY_ARRAY_IN_ARRAY, NULL); /* If that didn't work, throw an `Exception`. */ if (range == NULL || bins == NULL || !sample_parsing_success) { PyErr_SetString(PyExc_TypeError, "Couldn't parse at least one of the input arrays." " `range` must be passed as a 2D ndarray of type `np.double`," " `bins` must be passed as a 1D ndarray of type `np.intp`."); for (int i = 0; i < ndim + 1; i++){ Py_XDECREF(arrays[i]); } Py_XDECREF(range); Py_XDECREF(bins); free(arrays); return NULL; } /* How many data points are there? 
*/ n = (long)PyArray_DIM(arrays[0], 0); for (int i = 0; i < ndim + 1; i++){ if (!((long)PyArray_DIM(arrays[i], 0) == n)){ PyErr_SetString(PyExc_RuntimeError, "Lengths of sample and/or weight arrays do not match."); for (int j = 0; j < ndim + 1; j++){ Py_XDECREF(arrays[j]); } Py_XDECREF(range); Py_XDECREF(bins); free(arrays); return NULL; } } /* copy the content of `bins` into `dims` */ dtype = PyArray_DescrFromType(NPY_INTP); iter = NpyIter_New(bins, NPY_ITER_READONLY, NPY_CORDER, NPY_SAFE_CASTING, dtype); if (iter == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't set up iterator over binning."); for (int i = 0; i < ndim + 1; i++){ Py_XDECREF(arrays[i]); } Py_XDECREF(range); Py_XDECREF(bins); free(arrays); return NULL; } iternext = NpyIter_GetIterNext(iter, NULL); if (iternext == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't set up iteration function over binning."); for (int i = 0; i < ndim + 1; i++){ Py_XDECREF(arrays[i]); } NpyIter_Deallocate(iter); Py_XDECREF(range); Py_XDECREF(bins); free(arrays); return NULL; } dataptr = NpyIter_GetDataPtrArray(iter); dims = (npy_intp *)malloc(sizeof(npy_intp) * ndim); int i = 0; do{ dims[i] = *(npy_intp *)dataptr[0]; i++; } while (iternext(iter)); NpyIter_Deallocate(iter); /* build the output array */ count_obj = PyArray_SimpleNew(ndim, dims, NPY_DOUBLE); if (count_obj == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't build output array"); for (int i = 0; i < ndim + 1; i++){ Py_XDECREF(arrays[i]); } Py_XDECREF(range); Py_XDECREF(bins); Py_XDECREF(count_obj); free(arrays); free(dims); return NULL; } count_array = (PyArrayObject *)count_obj; PyArray_FILLWBYTE(count_array, 0); if (n == 0) { for (int i = 0; i < ndim + 1; i++){ Py_XDECREF(arrays[i]); } Py_XDECREF(range); Py_XDECREF(bins); free(arrays); free(dims); return count_obj; } /* copy the content of the numpy array `ranges` into a simple C array */ // This just makes is easier to access the values later in the loop. range_c = (double *)malloc(sizeof(double) * ndim * 2); dtype = PyArray_DescrFromType(NPY_DOUBLE); iter = NpyIter_New(range, NPY_ITER_READONLY, NPY_CORDER, NPY_SAFE_CASTING, dtype); if (iter == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't set up iterator over range. This needs to be passed as type `numpy.double`."); for (int i = 0; i < ndim + 1; i++){ Py_XDECREF(arrays[i]); } Py_XDECREF(range); Py_XDECREF(bins); Py_DECREF(count_obj); free(arrays); free(dims); free(range_c); return NULL; } iternext = NpyIter_GetIterNext(iter, NULL); if (iternext == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't set up iteration function over range. 
This needs to be passed as type `numpy.double`."); NpyIter_Deallocate(iter); for (int i = 0; i < ndim + 1; i++){ Py_XDECREF(arrays[i]); } Py_XDECREF(range); Py_XDECREF(bins); Py_DECREF(count_obj); free(arrays); free(dims); free(range_c); return NULL; } dataptr = NpyIter_GetDataPtrArray(iter); i = 0; do{ range_c[i] = *(double *)dataptr[0]; i++; } while (iternext(iter)); NpyIter_Deallocate(iter); /* now we pre-compute the bin normalizations for all dimensions */ fndim = (double *)malloc(sizeof(double) * ndim); norms = (double *)malloc(sizeof(double) * ndim); for (int j = 0; j < ndim; j++){ fndim[j] = (double)dims[j]; norms[j] = fndim[j] / (range_c[j * 2 + 1] - range_c[j * 2]); } dtypes = (PyArray_Descr **)malloc(sizeof(PyArray_Descr *) * (ndim + 1)); op_flags = (npy_uint32 *)malloc(sizeof(npy_uint32) * (ndim + 1)); for (int i = 0; i < ndim + 1; i++){ dtypes[i] = PyArray_DescrFromType(NPY_DOUBLE); op_flags[i] = NPY_ITER_READONLY; } iter = NpyIter_AdvancedNew(ndim + 1, arrays, NPY_ITER_EXTERNAL_LOOP | NPY_ITER_BUFFERED, NPY_KEEPORDER, NPY_SAFE_CASTING, op_flags, dtypes, -1, NULL, NULL, 0); if (iter == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't set up iterator"); for (int i = 0; i < ndim + 1; i++){ Py_XDECREF(arrays[i]); } Py_DECREF(count_obj); free(arrays); free(dims); free(range_c); free(fndim); free(norms); free(dtypes); free(op_flags); return NULL; } /* * The iternext function gets stored in a local variable * so it can be called repeatedly in an efficient manner. */ iternext = NpyIter_GetIterNext(iter, NULL); if (iternext == NULL) { PyErr_SetString(PyExc_RuntimeError, "Couldn't set up iterator"); NpyIter_Deallocate(iter); for (int i = 0; i < ndim + 1; i++){ Py_XDECREF(arrays[i]); } Py_DECREF(count_obj); free(arrays); free(dims); free(range_c); free(fndim); free(norms); free(dtypes); free(op_flags); return NULL; } /* The location of the data pointer which the iterator may update */ dataptr = NpyIter_GetDataPtrArray(iter); /* The location of the stride which the iterator may update */ strideptr = NpyIter_GetInnerStrideArray(iter); /* The location of the inner loop size which the iterator may update */ innersizeptr = NpyIter_GetInnerLoopSizePtr(iter); /* Get C array for output array */ count = (double *)PyArray_DATA(count_array); /* Pre-compute index stride */ // We comput the strides for the bin index for each dimension. The desired // behavior is this: // 1D: bin_idx = ix // --> stride = {1} // 2D: bin_idx = ny * ix + iy // --> stride = {ny, 1} // 3D: bin_idx = nz * ny * ix + nz * iy + iz // --> stride = {nz * ny, nz, 1} // ... and so on for higher dimensions. // Notice how the order of multiplication requires that we step through the // dimensions backwards. 
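// The bin-index arithmetic here mirrors the unweighted _histogramdd above; the
// only difference is in the inner loop further down, which reads the sample's
// weight tw from the extra iterator operand and adds it to count[bin_idx]
// instead of incrementing the bin by 1.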
stride = (int *)malloc(sizeof(int) * ndim); for (int i = 0; i < ndim; i++){ stride[i] = 1; } if (ndim > 1){ for (int i = ndim - 1; i > 0; i--){ stride[i - 1] = stride[i] * (int)dims[i]; } } Py_BEGIN_ALLOW_THREADS do { /* Get the inner loop data/stride/count values */ npy_intp size = *innersizeptr; /* This is a typical inner loop for NPY_ITER_EXTERNAL_LOOP */ while (size--) { bin_idx = 0; in_range = 1; for (int i = 0; i < ndim; i++){ xmin = range_c[i * 2]; xmax = range_c[i * 2 + 1]; tx = *(double *)dataptr[i]; dataptr[i] += strideptr[i]; if (tx < xmin || tx >= xmax){ in_range = 0; } else { local_bin_idx = (tx - xmin) * norms[i]; bin_idx += stride[i] * local_bin_idx; } } tw = *(double *)dataptr[ndim]; dataptr[ndim] += strideptr[ndim]; if (in_range){ count[bin_idx] += tw; } } } while (iternext(iter)); Py_END_ALLOW_THREADS NpyIter_Deallocate(iter); for (int i = 0; i < ndim + 1; i++){ Py_XDECREF(arrays[i]); } Py_XDECREF(range); Py_XDECREF(bins); free(arrays); free(dims); free(range_c); free(fndim); free(norms); free(dtypes); free(op_flags); free(stride); return count_obj; } ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337689.0 fast-histogram-0.11/fast_histogram/histogram.py0000644000175100001720000001340514242767131021356 0ustar00runnerdockerfrom __future__ import division import numbers import numpy as np from ._histogram_core import (_histogram1d, _histogram2d, _histogramdd, _histogram1d_weighted, _histogram2d_weighted, _histogramdd_weighted) __all__ = ['histogram1d', 'histogram2d', 'histogramdd'] def histogram1d(x, bins, range, weights=None): """ Compute a 1D histogram assuming equally spaced bins. Parameters ---------- x : `~numpy.ndarray` The position of the points to bin in the 1D histogram bins : int The number of bins range : iterable The range as a tuple of (xmin, xmax) weights : `~numpy.ndarray` The weights of the points in the 1D histogram Returns ------- array : `~numpy.ndarray` The 1D histogram array """ nx = bins if not np.isscalar(bins): raise TypeError('bins should be an integer') xmin, xmax = range if not np.isfinite(xmin): raise ValueError("xmin should be finite") if not np.isfinite(xmax): raise ValueError("xmax should be finite") if xmax <= xmin: raise ValueError("xmax should be greater than xmin") if nx <= 0: raise ValueError("nx should be strictly positive") if weights is None: return _histogram1d(x, nx, xmin, xmax) else: return _histogram1d_weighted(x, weights, nx, xmin, xmax) def histogram2d(x, y, bins, range, weights=None): """ Compute a 2D histogram assuming equally spaced bins. Parameters ---------- x, y : `~numpy.ndarray` The position of the points to bin in the 2D histogram bins : int or iterable The number of bins in each dimension. If given as an integer, the same number of bins is used for each dimension. range : iterable The range to use in each dimension, as an iterable of value pairs, i.e.
[(xmin, xmax), (ymin, ymax)] weights : `~numpy.ndarray` The weights of the points in the 2D histogram Returns ------- array : `~numpy.ndarray` The 2D histogram array """ if isinstance(bins, numbers.Integral): nx = ny = bins else: nx, ny = bins if not np.isscalar(nx) or not np.isscalar(ny): raise TypeError('bins should be an iterable of two integers') (xmin, xmax), (ymin, ymax) = range if not np.isfinite(xmin): raise ValueError("xmin should be finite") if not np.isfinite(xmax): raise ValueError("xmax should be finite") if not np.isfinite(ymin): raise ValueError("ymin should be finite") if not np.isfinite(ymax): raise ValueError("ymax should be finite") if xmax <= xmin: raise ValueError("xmax should be greater than xmin") if ymax <= ymin: raise ValueError("ymax should be greater than ymin") if nx <= 0: raise ValueError("nx should be strictly positive") if ny <= 0: raise ValueError("ny should be strictly positive") if weights is None: return _histogram2d(x, y, nx, xmin, xmax, ny, ymin, ymax) else: return _histogram2d_weighted(x, y, weights, nx, xmin, xmax, ny, ymin, ymax) def histogramdd(sample, bins, range, weights=None): """ Compute a histogram of N samples in D dimensions. Parameters ---------- sample : (N, D) `~numpy.ndarray`, or (D, N) array_like The data to be histogrammed. * When an array_like, each element is the list of values for a single coordinate - such as ``histogramdd((X, Y, Z), bins, range)``. * When a `~numpy.ndarray`, each row is a coordinate in a D-dimensional space - such as ``histogramdd(np.array([p1, p2, p3]), bins, range)``. * In the special case of D = 1, it is allowed to pass an array or array_like with length N. The second form is converted internally into the first form, thus the first form is preferred. bins : int or iterable The number of bins in each dimension. If given as an integer, the same number of bins is used for every dimension. range : iterable The range to use in each dimension, as an iterable of D value pairs, i.e. [(xmin, xmax), (ymin, ymax)] weights : `~numpy.ndarray` The weights of the points in `sample`.
Returns ------- array : `~numpy.ndarray` The ND histogram array """ if isinstance(sample, np.ndarray): _sample = tuple(np.atleast_2d(sample.T)) else: # handle special case in 1D if isinstance(sample[0], numbers.Real): _sample = (sample,) else: _sample = tuple(sample) ndim = len(_sample) n = len(_sample[0]) if isinstance(bins, numbers.Integral): _bins = bins * np.ones(ndim, dtype=np.intp) else: _bins = np.array(bins, dtype=np.intp) if len(_bins) != ndim: raise ValueError("number of bin counts does not match number of dimensions") if np.any(_bins <= 0): raise ValueError("all bin numbers should be strictly positive") _range = np.zeros((ndim, 2), dtype=np.double) if not len(range) == ndim: raise ValueError("number of ranges does not equal number of dimensions") for i, r in enumerate(range): if not len(r) == 2: raise ValueError("should pass a minimum and maximum value for each dimension") if r[0] >= r[1]: raise ValueError("each range should be strictly increasing") _range[i][0] = r[0] _range[i][1] = r[1] if weights is None: return _histogramdd(_sample, _bins, _range) else: return _histogramdd_weighted(_sample, _bins, _range, weights) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1653337717.8848438 fast-histogram-0.11/fast_histogram/tests/0000755000175100001720000000000014242767166020156 5ustar00runnerdocker././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337689.0 fast-histogram-0.11/fast_histogram/tests/__init__.py0000644000175100001720000000000014242767131022245 0ustar00runnerdocker././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337689.0 fast-histogram-0.11/fast_histogram/tests/test_histogram.py0000644000175100001720000003444614242767131023567 0ustar00runnerdockerimport numpy as np import pytest from hypothesis import given, settings, assume, example from hypothesis import strategies as st from hypothesis import HealthCheck from hypothesis.extra.numpy import arrays from ..histogram import histogram1d, histogram2d, histogramdd # NOTE: for now we don't test the full range of floating-point values in the # tests below, because Numpy's behavior isn't always deterministic in some # of the extreme regimes. We should add manual (non-hypothesis and not # comparing to Numpy) test cases. @given(values=arrays(dtype='f4', 'f8', '= xmin) if weights: assume(np.allclose(np.sum(w[inside]), np.sum(reference))) else: n_inside = np.sum(inside) assume(n_inside == np.sum(reference)) fast = histogram1d(x, bins=nx, weights=w, range=(xmin, xmax)) # Numpy returns results for 32-bit results as a 32-bit histogram, but only # for 1D arrays. Since this is a summation variable it makes sense to # return 64-bit, so rather than changing the behavior of histogram1d, we # cast to 32-bit float here. 
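    # The tolerances used below roughly track the available precision: ~1e-7
    # (of order float32 machine epsilon) when the input values are single
    # precision, and a much stricter 1e-14 for double precision.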
if x.dtype.kind == 'f' and x.dtype.itemsize == 4: rtol = 1e-7 else: rtol = 1e-14 np.testing.assert_allclose(fast, reference, rtol=rtol) fastdd = histogramdd((x,), bins=nx, weights=w, range=[(xmin, xmax)]) np.testing.assert_array_equal(fast, fastdd) @given(values=arrays(dtype='f4', 'f8', '= xmin) & (y < ymax) & (y >= ymin) if weights: assume(np.allclose(np.sum(w[inside]), np.sum(reference))) else: n_inside = np.sum(inside) assume(n_inside == np.sum(reference)) fast = histogram2d(x, y, bins=(nx, ny), weights=w, range=((xmin, xmax), (ymin, ymax))) if x.dtype.kind == 'f' and x.dtype.itemsize == 4: rtol = 1e-7 else: rtol = 1e-14 np.testing.assert_allclose(fast, reference, rtol=rtol) fastdd = histogramdd((x, y), bins=(nx, ny), weights=w, range=((xmin, xmax), (ymin, ymax))) np.testing.assert_array_equal(fast, fastdd) @given(values=arrays(dtype='f4', 'f8', ' hist_size: break _bins.append(bins[i]) accum_size *= bins[i] ndim = len(_bins) values = values.astype(dtype) ranges = ranges.astype(dtype) ranges = ranges[:ndim] # Ranges are symmetric because otherwise the probability of samples falling inside # is just too small and we would just be testing a bunch of empty histograms. ranges = np.vstack((-ranges, ranges)).T size = len(values) // (ndim + 1) if weights: w = values[:size] else: w = None sample = tuple(values[size*(i+1):size*(i+2)] for i in range(ndim)) # for simplicity using the same range in all dimensions try: reference = np.histogramdd(sample, bins=_bins, weights=w, range=ranges)[0] except Exception: # If Numpy fails, we skip the comparison since this isn't our fault return # First, check the Numpy result because it sometimes doesn't make sense. See # bug report https://github.com/numpy/numpy/issues/9435. # FIXME: for now use < since that's what our algorithm does inside = (sample[0] < ranges[0][1]) & (sample[0] >= ranges[0][0]) if ndim > 1: for i in range(ndim - 1): inside = inside & (sample[i+1] < ranges[i+1][1]) & (sample[i+1] >= ranges[i+1][0]) if weights: assume(np.allclose(np.sum(w[inside]), np.sum(reference))) else: n_inside = np.sum(inside) assume(n_inside == np.sum(reference)) fast = histogramdd(sample, bins=_bins, weights=w, range=ranges) if sample[0].dtype.kind == 'f' and sample[0].dtype.itemsize == 4: rtol = 1e-7 else: rtol = 1e-14 np.testing.assert_allclose(fast, reference, rtol=rtol) def test_nd_arrays(): x = np.random.random(1000) result_1d = histogram1d(x, bins=10, range=(0, 1)) result_3d = histogram1d(x.reshape((10, 10, 10)), bins=10, range=(0, 1)) result_3d_dd = histogramdd((x.reshape((10, 10, 10)),), bins=10, range=((0, 1), )) np.testing.assert_equal(result_1d, result_3d) np.testing.assert_equal(result_1d, result_3d_dd) y = np.random.random(1000) result_1d = histogram2d(x, y, bins=(10, 10), range=[(0, 1), (0, 1)]) result_3d = histogram2d(x.reshape((10, 10, 10)), y.reshape((10, 10, 10)), bins=(10, 10), range=[(0, 1), (0, 1)]) result_3d_dd = histogramdd((x.reshape((10, 10, 10)), y.reshape((10, 10, 10))), bins=(10, 10), range=[(0, 1), (0, 1)]) np.testing.assert_equal(result_1d, result_3d) np.testing.assert_equal(result_1d, result_3d_dd) def test_list(): # Make sure that lists can be passed in x_list = [1.4, 2.1, 4.2] x_arr = np.array(x_list) result_list = histogram1d(x_list, bins=10, range=(0, 10)) result_arr = histogram1d(x_arr, bins=10, range=(0, 10)) np.testing.assert_equal(result_list, result_arr) result_list_dd = histogramdd(x_list, bins=10, range=((0, 10),)) result_arr_dd = histogramdd(x_arr, bins=10, range=((0, 10),)) np.testing.assert_equal(result_list_dd, 
result_arr_dd) def test_histogramdd_interface(): # make sure the interface of histogramdd works as numpy.histogramdd x_list = [1.4, 2.1, 4.2, 8.7, 5.1] x_arr = np.array(x_list) y_list = [6.6, 3.2, 2.9, 3.9, 0.1] y_arr = np.array(y_list) # test 1D (needs special handling in case the sample is a list) sample = x_arr result_np, _ = np.histogramdd(sample, bins=10, range=((0, 10),)) result_fh = histogramdd(sample, bins=10, range=((0, 10),)) np.testing.assert_equal(result_np, result_fh) sample = x_list result_np, _ = np.histogramdd(sample, bins=10, range=((0, 10),)) result_fh = histogramdd(sample, bins=10, range=((0, 10),)) np.testing.assert_equal(result_np, result_fh) # test (D, N) array_like sample = (x_arr, y_arr) result_np, _ = np.histogramdd(sample, bins=10, range=((0, 10), (0, 10))) result_fh = histogramdd(sample, bins=10, range=((0, 10), (0, 10))) np.testing.assert_equal(result_np, result_fh) sample = [x_arr, y_arr] result_np, _ = np.histogramdd(sample, bins=10, range=((0, 10), (0, 10))) result_fh = histogramdd(sample, bins=10, range=((0, 10), (0, 10))) np.testing.assert_equal(result_np, result_fh) sample = (x_list, y_list) result_np, _ = np.histogramdd(sample, bins=10, range=((0, 10), (0, 10))) result_fh = histogramdd(sample, bins=10, range=((0, 10), (0, 10))) np.testing.assert_equal(result_np, result_fh) sample = [x_list, y_list] result_np, _ = np.histogramdd(sample, bins=10, range=((0, 10), (0, 10))) result_fh = histogramdd(sample, bins=10, range=((0, 10), (0, 10))) np.testing.assert_equal(result_np, result_fh) # test (N, D) array sample = np.vstack([x_arr, y_arr]).T result_np, _ = np.histogramdd(sample, bins=10, range=((0, 10), (0, 10))) result_fh = histogramdd(sample, bins=10, range=((0, 10), (0, 10))) np.testing.assert_equal(result_np, result_fh) sample = np.vstack([x_list, y_list]).T result_np, _ = np.histogramdd(sample, bins=10, range=((0, 10), (0, 10))) result_fh = histogramdd(sample, bins=10, range=((0, 10), (0, 10))) np.testing.assert_equal(result_np, result_fh) def test_non_contiguous(): x = np.random.random((10, 10, 10))[::2, ::3, :] y = np.random.random((10, 10, 10))[::2, ::3, :] z = np.random.random((10, 10, 10))[::2, ::3, :] w = np.random.random((10, 10, 10))[::2, ::3, :] assert not x.flags.c_contiguous assert not x.flags.f_contiguous result_1 = histogram1d(x, bins=10, range=(0, 1)) result_2 = histogram1d(x.copy(), bins=10, range=(0, 1)) np.testing.assert_equal(result_1, result_2) result_1 = histogram1d(x, bins=10, range=(0, 1), weights=w) result_2 = histogram1d(x.copy(), bins=10, range=(0, 1), weights=w) np.testing.assert_equal(result_1, result_2) result_1 = histogram2d(x, y, bins=(10, 10), range=[(0, 1), (0, 1)]) result_2 = histogram2d(x.copy(), y.copy(), bins=(10, 10), range=[(0, 1), (0, 1)]) np.testing.assert_equal(result_1, result_2) result_1 = histogram2d(x, y, bins=(10, 10), range=[(0, 1), (0, 1)], weights=w) result_2 = histogram2d(x.copy(), y.copy(), bins=(10, 10), range=[(0, 1), (0, 1)], weights=w) np.testing.assert_equal(result_1, result_2) result_1 = histogramdd((x, y, z), bins=(10, 10, 10), range=[(0, 1), (0, 1), (0, 1)]) result_2 = histogramdd((x.copy(), y.copy(), z.copy()), bins=(10, 10, 10), range=[(0, 1), (0, 1), (0, 1)]) np.testing.assert_equal(result_1, result_2) result_1 = histogramdd((x, y, z), bins=(10, 10, 10), range=[(0, 1), (0, 1), (0, 1)], weights=w) result_2 = histogramdd((x.copy(), y.copy(), z.copy()), bins=(10, 10, 10), range=[(0, 1), (0, 1), (0, 1)], weights=w) np.testing.assert_equal(result_1, result_2) def test_array_bins(): edges = 
np.array([0, 1, 2, 3, 4]) with pytest.raises(TypeError) as exc: histogram1d([1, 2, 3], bins=edges, range=(0, 10)) assert exc.value.args[0] == 'bins should be an integer' with pytest.raises(TypeError) as exc: histogram2d([1, 2, 3], [1, 2 ,3], bins=[edges, edges], range=[(0, 10), (0, 10)]) assert exc.value.args[0] == 'bins should be an iterable of two integers' def test_mixed_strides(): # Make sure all functions work properly when passed arrays with mismatched # strides. x = np.random.random((30, 20, 40, 50))[:, 10, :, 20] y = np.random.random((30, 40, 50))[:, :, 10] z = np.random.random((30, 10, 5, 80, 90))[:, 5, 2, ::2, 22] assert x.shape == y.shape and x.shape == z.shape assert x.strides != y.strides and y.strides != z.strides and z.strides != x.strides result_1 = histogram1d(x, bins=10, range=(0, 1)) result_2, _ = np.histogram(x, bins=10, range=(0, 1)) np.testing.assert_equal(result_1, result_2) result_3 = histogram1d(x, weights=y, bins=10, range=(0, 1)) result_4, _ = np.histogram(x, weights=y, bins=10, range=(0, 1)) np.testing.assert_equal(result_3, result_4) result_5 = histogram2d(x, y, bins=(10, 10), range=[(0, 1), (0, 1)]) result_6, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=(10, 10), range=[(0, 1), (0, 1)]) np.testing.assert_equal(result_5, result_6) result_7 = histogram2d(x, y, weights=z, bins=(10, 10), range=[(0, 1), (0, 1)]) result_8, _, _ = np.histogram2d(x.ravel(), y.ravel(), weights=z.ravel(), bins=(10, 10), range=[(0, 1), (0, 1)]) np.testing.assert_equal(result_7, result_8) result_9 = histogramdd((x, y), bins=(10, 10), range=[(0, 1), (0, 1)]) result_10, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=(10, 10), range=[(0, 1), (0, 1)]) np.testing.assert_equal(result_9, result_10) result_11 = histogramdd((x, y), weights=z, bins=(10, 10), range=[(0, 1), (0, 1)]) result_12, _, _ = np.histogram2d(x.ravel(), y.ravel(), weights=z.ravel(), bins=(10, 10), range=[(0, 1), (0, 1)]) np.testing.assert_equal(result_11, result_12) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337717.0 fast-histogram-0.11/fast_histogram/version.py0000644000175100001720000000021314242767165021046 0ustar00runnerdocker# coding: utf-8 # file generated by setuptools_scm # don't change, don't track in version control version = '0.11' version_tuple = (0, 11) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1653337717.8848438 fast-histogram-0.11/fast_histogram.egg-info/0000755000175100001720000000000014242767166020506 5ustar00runnerdocker././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337717.0 fast-histogram-0.11/fast_histogram.egg-info/PKG-INFO0000644000175100001720000001453414242767165021611 0ustar00runnerdockerMetadata-Version: 2.1 Name: fast-histogram Version: 0.11 Summary: Fast simple 1D and 2D histograms Home-page: https://github.com/astrofrog/fast-histogram Author: Thomas Robitaille Author-email: thomas.robitaille@gmail.com License: BSD Requires-Python: >=3.6 Provides-Extra: test License-File: LICENSE |CI Status| |asv| About ----- Sometimes you just want to compute simple 1D or 2D histograms with regular bins. Fast. No nonsense. `Numpy's `__ histogram functions are versatile, and can handle for example non-regular binning, but this versatility comes at the expense of performance. The **fast-histogram** mini-package aims to provide simple and fast histogram functions for regular bins that don't compromise on performance. 
It doesn't do anything complicated - it just implements a simple histogram algorithm in C and keeps it simple. The aim is to have functions that are fast but also robust and reliable. The result is a 1D histogram function here that is **7-15x faster** than ``numpy.histogram``, and a 2D histogram function that is **20-25x faster** than ``numpy.histogram2d``. To install:: pip install fast-histogram or if you use conda you can instead do:: conda install -c conda-forge fast-histogram The ``fast_histogram`` module then provides two functions: ``histogram1d`` and ``histogram2d``: .. code:: python from fast_histogram import histogram1d, histogram2d Example ------- Here's an example of binning 10 million points into a regular 2D histogram: .. code:: python In [1]: import numpy as np In [2]: x = np.random.random(10_000_000) In [3]: y = np.random.random(10_000_000) In [4]: %timeit _ = np.histogram2d(x, y, range=[[-1, 2], [-2, 4]], bins=30) 935 ms ± 58.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) In [5]: from fast_histogram import histogram2d In [6]: %timeit _ = histogram2d(x, y, range=[[-1, 2], [-2, 4]], bins=30) 40.2 ms ± 624 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) (note that ``10_000_000`` is possible in Python 3.6 syntax, use ``10000000`` instead in previous versions) The version here is over 20 times faster! The following plot shows the speedup as a function of array size for the bin parameters shown above: .. figure:: https://github.com/astrofrog/fast-histogram/raw/master/speedup_compared.png :alt: Comparison of performance between Numpy and fast-histogram as well as results for the 1D case, also with 30 bins. The speedup for the 2D case is consistently between 20-25x, and for the 1D case goes from 15x for small arrays to around 7x for large arrays. Q&A --- Why don't the histogram functions return the edges? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Computing and returning the edges may seem trivial but it can slow things down by a factor of a few when computing histograms of 10^5 or fewer elements, so not returning the edges is a deliberate decision related to performance. You can easily compute the edges yourself if needed though, using ``numpy.linspace``. Doesn't package X already do this, but better? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This may very well be the case! If this duplicates another package, or if it is possible to use Numpy in a smarter way to get the same performance gains, please open an issue and I'll consider deprecating this package :) One package that does include fast histogram functions (including in n-dimensions) and can compute other statistics is `vaex `_, so take a look there if you need more advanced functionality! Are the 2D histograms not transposed compared to what they should be? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ There is technically no 'right' and 'wrong' orientation - here we adopt the convention which gives results consistent with Numpy, so: .. code:: python numpy.histogram2d(x, y, range=[[xmin, xmax], [ymin, ymax]], bins=[nx, ny]) should give the same result as: .. code:: python fast_histogram.histogram2d(x, y, range=[[xmin, xmax], [ymin, ymax]], bins=[nx, ny]) Why not contribute this to Numpy directly? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ As mentioned above, the Numpy functions are much more versatile, so they could not be replaced by the ones here. 
One option would be to check in Numpy's functions for cases that are simple and dispatch to functions such as the ones here, or add dedicated functions for regular binning. I hope we can get this in Numpy in some form or another eventually, but for now, the aim is to have this available to packages that need to support a range of Numpy versions. Why not use Cython? ~~~~~~~~~~~~~~~~~~~ I originally implemented this in Cython, but found that I could get a 50% performance improvement by going straight to a C extension. What about using Numba? ~~~~~~~~~~~~~~~~~~~~~~~ I specifically want to keep this package as easy as possible to install, and while `Numba `__ is a great package, it is not trivial to install outside of Anaconda. Could this be parallelized? ~~~~~~~~~~~~~~~~~~~~~~~~~~~ This may benefit from parallelization under certain circumstances. The easiest solution might be to use OpenMP, but this won't work on all platforms, so it would need to be made optional. Couldn't you make it faster by using the GPU? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Almost certainly, though the aim here is to have an easily installable and portable package, and introducing GPUs is going to affect both of these. Why make a package specifically for this? This is a tiny amount of functionality ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Packages that need this could simply bundle their own C extension or Cython code to do this, but the main motivation for releasing this as a mini-package is to avoid making pure-Python packages into packages that require compilation just because of the need to compute fast histograms. Can I contribute? ~~~~~~~~~~~~~~~~~ Yes please! This is not meant to be a finished package, and I welcome pull request to improve things. .. |CI Status| image:: https://github.com/astrofrog/fast-histogram/actions/workflows/main.yml/badge.svg :target: https://github.com/astrofrog/fast-histogram/actions/workflows/main.yml .. 
|asv| image:: https://img.shields.io/badge/benchmarked%20by-asv-brightgreen.svg :target: https://astrofrog.github.io/fast-histogram ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337717.0 fast-histogram-0.11/fast_histogram.egg-info/SOURCES.txt0000644000175100001720000000115414242767165022372 0ustar00runnerdocker.gitignore CHANGES.rst LICENSE MANIFEST.in README.rst pyproject.toml setup.cfg setup.py speedup_compared.png tox.ini .github/workflows/main.yml comparison/README.rst comparison/benchmark.py comparison/plot.py fast_histogram/__init__.py fast_histogram/_histogram_core.c fast_histogram/histogram.py fast_histogram/version.py fast_histogram.egg-info/PKG-INFO fast_histogram.egg-info/SOURCES.txt fast_histogram.egg-info/dependency_links.txt fast_histogram.egg-info/not-zip-safe fast_histogram.egg-info/requires.txt fast_histogram.egg-info/top_level.txt fast_histogram/tests/__init__.py fast_histogram/tests/test_histogram.py././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337717.0 fast-histogram-0.11/fast_histogram.egg-info/dependency_links.txt0000644000175100001720000000000114242767165024553 0ustar00runnerdocker ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337715.0 fast-histogram-0.11/fast_histogram.egg-info/not-zip-safe0000644000175100001720000000000114242767163022731 0ustar00runnerdocker ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337717.0 fast-histogram-0.11/fast_histogram.egg-info/requires.txt0000644000175100001720000000006214242767165023103 0ustar00runnerdockernumpy [test] pytest pytest-cov hypothesis[numpy] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337717.0 fast-histogram-0.11/fast_histogram.egg-info/top_level.txt0000644000175100001720000000001714242767165023235 0ustar00runnerdockerfast_histogram ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337689.0 fast-histogram-0.11/pyproject.toml0000644000175100001720000000076414242767131016715 0ustar00runnerdocker[build-system] requires = ["setuptools", "setuptools_scm", "wheel", "oldest-supported-numpy"] build-backend = 'setuptools.build_meta' [tool.cibuildwheel.linux] archs = ["auto", "aarch64"] [tool.cibuildwheel.macos] skip = "pp*" archs = ["x86_64", "arm64"] [tool.cibuildwheel.windows] skip = "pp*" [tool.cibuildwheel.linux.environment] CC = "gcc" [[tool.cibuildwheel.overrides]] select = "*-musllinux*" before-all = "apk add clang" environment = { CC="clang" } ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1653337717.8848438 fast-histogram-0.11/setup.cfg0000644000175100001720000000075714242767166015634 0ustar00runnerdocker[metadata] name = fast-histogram url = https://github.com/astrofrog/fast-histogram author = Thomas Robitaille author_email = thomas.robitaille@gmail.com license = BSD description = Fast simple 1D and 2D histograms long_description = file: README.rst [options] zip_safe = False packages = find: install_requires = numpy python_requires = >=3.6 [options.extras_require] test = pytest pytest-cov hypothesis[numpy] [bdist_wheel] py_limited_api = cp36 [egg_info] tag_build = tag_date = 0 ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337689.0 fast-histogram-0.11/setup.py0000644000175100001720000000073614242767131015512 0ustar00runnerdocker#!/usr/bin/env python import os import sys import numpy from setuptools import setup from 
setuptools.extension import Extension setup(use_scm_version={'write_to': os.path.join('fast_histogram', 'version.py')}, ext_modules=[Extension("fast_histogram._histogram_core", [os.path.join('fast_histogram', '_histogram_core.c')], py_limited_api=True, include_dirs=[numpy.get_include()])]) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337689.0 fast-histogram-0.11/speedup_compared.png0000644000175100001720000006276614242767131020030 0ustar00runnerdocker[binary PNG image data omitted: speedup_compared.png, the "Comparison of performance between Numpy and fast-histogram" speedup plot referenced in README.rst (a matplotlib 2.2.2 figure)]
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1653337689.0 fast-histogram-0.11/tox.ini0000644000175100001720000000157514242767131015315 0ustar00runnerdocker[tox] envlist = py{27, 35, 36,37,38}-test style requires = setuptools >= 30.3.0 pip >= 19.3.1 isolated_build = true [testenv] changedir = test: .tmp/{envname} build_docs: docs description = test: run tests with pytest build_docs: invoke sphinx-build to build the HTML docs all: run tests with all optional dependencies dev: run tests with numpy and astropy dev versions deps = numpy110: numpy==1.10.* numpy111: numpy==1.11.* numpy112: numpy==1.12.* numpy113: numpy==1.13.* numpy114: numpy==1.14.* numpy115: numpy==1.15.* numpy116: numpy==1.16.* numpy117: numpy==1.17.* numpy118: numpy==1.18.* extras = test commands = pip freeze pytest --pyargs fast_histogram {posargs} [testenv:style] skip_install = true description = invoke style checks on package code deps = flake8 commands = flake8 fast_histogram