././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1732027362.9740286 fscacher-0.4.3/0000755000175100001770000000000014717121743012710 5ustar00runnerdocker././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027358.0 fscacher-0.4.3/CHANGELOG.md0000644000175100001770000002054214717121736014526 0ustar00runnerdocker# 0.4.3 (Tue Nov 19 2024) #### 🐛 Bug Fix - Address lint warnings and drop Python 3.8 [#99](https://github.com/con/fscacher/pull/99) ([@yarikoptic](https://github.com/yarikoptic)) #### Authors: 1 - Yaroslav Halchenko ([@yarikoptic](https://github.com/yarikoptic)) --- # 0.4.2 (Tue Nov 19 2024) #### 🧪 Tests - Log when file timestamp unexpectedly in the future + more information in the failing test_memoize_path_dir [#98](https://github.com/con/fscacher/pull/98) ([@yarikoptic](https://github.com/yarikoptic)) - Stop testing against PyPy 3.8; retry failed `test_memoize_path_dir` on Windows [#95](https://github.com/con/fscacher/pull/95) ([@jwodder](https://github.com/jwodder)) #### Authors: 2 - John T.
Wodder II ([@jwodder](https://github.com/jwodder)) - Yaroslav Halchenko ([@yarikoptic](https://github.com/yarikoptic)) --- # 0.4.1 (Tue Jun 04 2024) #### 🐛 Bug Fix - Stop using/testing EOLed 3.6 and 3.7, use 3.9 for linting (3.8 EOLs soon) [#91](https://github.com/con/fscacher/pull/91) ([@yarikoptic](https://github.com/yarikoptic) [@jwodder](https://github.com/jwodder)) - ASV dropped --strict option in 0.6.0 release [#83](https://github.com/con/fscacher/pull/83) ([@yarikoptic](https://github.com/yarikoptic)) #### 🏠 Internal - Add a few folders I found locally into git ignore [#92](https://github.com/con/fscacher/pull/92) ([@yarikoptic](https://github.com/yarikoptic)) - [gh-actions](deps): Bump codecov/codecov-action from 3 to 4 [#89](https://github.com/con/fscacher/pull/89) ([@dependabot[bot]](https://github.com/dependabot[bot]) [@jwodder](https://github.com/jwodder)) - [gh-actions](deps): Bump actions/setup-python from 4 to 5 [#88](https://github.com/con/fscacher/pull/88) ([@dependabot[bot]](https://github.com/dependabot[bot])) - [gh-actions](deps): Bump actions/checkout from 3 to 4 [#82](https://github.com/con/fscacher/pull/82) ([@dependabot[bot]](https://github.com/dependabot[bot])) #### 🧪 Tests - Test against Python 3.12 and PyPy 3.10 [#84](https://github.com/con/fscacher/pull/84) ([@jwodder](https://github.com/jwodder)) - Use Python 3.8 to test against dev version of joblib [#86](https://github.com/con/fscacher/pull/86) ([@jwodder](https://github.com/jwodder)) #### 🔩 Dependency Updates - Replace appdirs with platformdirs [#85](https://github.com/con/fscacher/pull/85) ([@jwodder](https://github.com/jwodder)) #### Authors: 3 - [@dependabot[bot]](https://github.com/dependabot[bot]) - John T.
Wodder II ([@jwodder](https://github.com/jwodder)) - Yaroslav Halchenko ([@yarikoptic](https://github.com/yarikoptic)) --- # 0.4.0 (Wed Aug 16 2023) #### 🚀 Enhancement - Add `exclude_kwargs` to memoization decorators [#38](https://github.com/con/fscacher/pull/38) ([@yarikoptic](https://github.com/yarikoptic) [@jwodder](https://github.com/jwodder)) #### Authors: 2 - John T. Wodder II ([@jwodder](https://github.com/jwodder)) - Yaroslav Halchenko ([@yarikoptic](https://github.com/yarikoptic)) --- # 0.3.0 (Mon Feb 20 2023) #### 🚀 Enhancement - Ignore cache for non-path-like arguments [#79](https://github.com/con/fscacher/pull/79) ([@jwodder](https://github.com/jwodder)) #### 🐛 Bug Fix - Drop support for Python 3.6 [#80](https://github.com/con/fscacher/pull/80) ([@jwodder](https://github.com/jwodder) [@yarikoptic](https://github.com/yarikoptic)) #### 🏠 Internal - Update GitHub Actions action versions [#77](https://github.com/con/fscacher/pull/77) ([@jwodder](https://github.com/jwodder)) #### 🧪 Tests - Test against more recent versions of PyPy [#81](https://github.com/con/fscacher/pull/81) ([@jwodder](https://github.com/jwodder)) - Test against Python 3.11 [#78](https://github.com/con/fscacher/pull/78) ([@jwodder](https://github.com/jwodder)) - Clean out vfat mount between benchmarks [#76](https://github.com/con/fscacher/pull/76) ([@jwodder](https://github.com/jwodder)) #### Authors: 2 - John T.
Wodder II ([@jwodder](https://github.com/jwodder)) - Yaroslav Halchenko ([@yarikoptic](https://github.com/yarikoptic)) --- # 0.2.0 (Tue Feb 22 2022) #### 🚀 Enhancement - Support specifying a custom path for the cache; tokens becomes kwonly [#73](https://github.com/con/fscacher/pull/73) ([@jwodder](https://github.com/jwodder)) - make joblib ignore "path" , pass resolved as part of the fingerprinting kwargs arg [#63](https://github.com/con/fscacher/pull/63) ([@yarikoptic](https://github.com/yarikoptic) [@jwodder](https://github.com/jwodder)) #### 🏎 Performance - Cache directory fingerprint as a XORed hash of file fingerprints [#71](https://github.com/con/fscacher/pull/71) ([@jwodder](https://github.com/jwodder)) - Don't fingerprint paths when caching is ignored [#72](https://github.com/con/fscacher/pull/72) ([@jwodder](https://github.com/jwodder)) #### 🏠 Internal - Improve linting configuration [#64](https://github.com/con/fscacher/pull/64) ([@jwodder](https://github.com/jwodder)) - Make versioneer.py use setuptools instead of distutils [#54](https://github.com/con/fscacher/pull/54) ([@jwodder](https://github.com/jwodder)) - Update codecov action to v2 [#53](https://github.com/con/fscacher/pull/53) ([@jwodder](https://github.com/jwodder)) #### 🧪 Tests - Make benchmarks measure cache misses and hits separately [#74](https://github.com/con/fscacher/pull/74) ([@jwodder](https://github.com/jwodder)) - Update Python version used to test development joblib to 3.7 [#65](https://github.com/con/fscacher/pull/65) ([@jwodder](https://github.com/jwodder)) - Capture all logs during tests [#56](https://github.com/con/fscacher/pull/56) ([@jwodder](https://github.com/jwodder)) #### Authors: 2 - John T.
Wodder II ([@jwodder](https://github.com/jwodder)) - Yaroslav Halchenko ([@yarikoptic](https://github.com/yarikoptic)) --- # 0.1.6 (Thu Oct 07 2021) #### 🐛 Bug Fix - Revert "Limit joblib version to pre-1.1.0" [#52](https://github.com/con/fscacher/pull/52) ([@jwodder](https://github.com/jwodder)) #### 🧪 Tests - Test against Python 3.10 [#49](https://github.com/con/fscacher/pull/49) ([@jwodder](https://github.com/jwodder)) - Change pypy3 to pypy-3.7 on GitHub Actions [#50](https://github.com/con/fscacher/pull/50) ([@jwodder](https://github.com/jwodder)) #### Authors: 1 - John T. Wodder II ([@jwodder](https://github.com/jwodder)) --- # 0.1.5 (Thu Oct 07 2021) #### 🐛 Bug Fix - Limit joblib version to pre-1.1.0 [#48](https://github.com/con/fscacher/pull/48) ([@jwodder](https://github.com/jwodder)) - Test against and update for dev version of joblib [#42](https://github.com/con/fscacher/pull/42) ([@jwodder](https://github.com/jwodder)) #### 🏠 Internal - Resimplify release workflow [#35](https://github.com/con/fscacher/pull/35) ([@jwodder](https://github.com/jwodder)) - Remove debug step [#34](https://github.com/con/fscacher/pull/34) ([@jwodder](https://github.com/jwodder)) #### 🧪 Tests - Test handling of moving symlinks around in git-annex [#47](https://github.com/con/fscacher/pull/47) ([@jwodder](https://github.com/jwodder)) #### Authors: 1 - John T. Wodder II ([@jwodder](https://github.com/jwodder)) --- # 0.1.4 (Mon Feb 22 2021) #### 🐛 Bug Fix - Fix versioneer+auto integration (or else) [#33](https://github.com/con/fscacher/pull/33) ([@jwodder](https://github.com/jwodder)) #### Authors: 1 - John T. Wodder II ([@jwodder](https://github.com/jwodder)) --- # 0.1.3 (Mon Feb 22 2021) #### 🐛 Bug Fix - Try to debug versioneer failure [#32](https://github.com/con/fscacher/pull/32) ([@jwodder](https://github.com/jwodder)) #### Authors: 1 - John T.
Wodder II ([@jwodder](https://github.com/jwodder)) --- # 0.1.2 (Mon Feb 22 2021) #### 🐛 Bug Fix - Get auto and versioneer to play nice together [#31](https://github.com/con/fscacher/pull/31) ([@jwodder](https://github.com/jwodder)) #### Authors: 1 - John T. Wodder II ([@jwodder](https://github.com/jwodder)) --- # 0.1.1 (Mon Feb 22 2021) #### 🐛 Bug Fix - Get tests to pass on Windows and macOS [#29](https://github.com/con/fscacher/pull/29) ([@jwodder](https://github.com/jwodder)) #### ⚠️ Pushed to `master` - Start CHANGELOG ([@jwodder](https://github.com/jwodder)) #### 🏠 Internal - Set up auto [#27](https://github.com/con/fscacher/pull/27) ([@jwodder](https://github.com/jwodder)) #### 🧪 Tests - Get asv to run pypy3 correctly [#30](https://github.com/con/fscacher/pull/30) ([@jwodder](https://github.com/jwodder)) #### Authors: 1 - John T. Wodder II ([@jwodder](https://github.com/jwodder)) --- # v0.1.0 (2021-02-18) Initial release ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027346.0 fscacher-0.4.3/LICENSE0000644000175100001770000000210214717121722013705 0ustar00runnerdockerMIT License Copyright (c) 2020-2021 Center for Open Neuroscience Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027346.0 fscacher-0.4.3/MANIFEST.in0000644000175100001770000000026114717121722014442 0ustar00runnerdockerinclude CHANGELOG.* CONTRIBUTORS.* LICENSE tox.ini include versioneer.py include asv.conf.json graft benchmarks graft docs prune docs/_build graft test global-exclude *.py[cod] ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1732027362.9740286 fscacher-0.4.3/PKG-INFO0000644000175100001770000001235214717121743014010 0ustar00runnerdockerMetadata-Version: 2.1 Name: fscacher Version: 0.4.3 Summary: Caching results of operations on heavy file trees Home-page: https://github.com/con/fscacher Author: Center for Open Neuroscience Author-email: debian@onerussian.com Maintainer: John T. 
Wodder II Maintainer-email: fscacher@varonathe.org License: MIT Project-URL: Source Code, https://github.com/con/fscacher Project-URL: Bug Tracker, https://github.com/con/fscacher/issues Keywords: caching,file cache Classifier: Development Status :: 4 - Beta Classifier: Programming Language :: Python :: 3 :: Only Classifier: Programming Language :: Python :: 3 Classifier: Programming Language :: Python :: 3.9 Classifier: Programming Language :: Python :: 3.10 Classifier: Programming Language :: Python :: 3.11 Classifier: Programming Language :: Python :: 3.12 Classifier: Programming Language :: Python :: Implementation :: CPython Classifier: Programming Language :: Python :: Implementation :: PyPy Classifier: License :: OSI Approved :: MIT License Classifier: Intended Audience :: Developers Classifier: Topic :: System :: Filesystems Requires-Python: >=3.9 Description-Content-Type: text/x-rst License-File: LICENSE Requires-Dist: joblib~=1.1 Requires-Dist: platformdirs Provides-Extra: benchmarks Requires-Dist: asv[virtualenv]<0.6.2,~=0.6.0; extra == "benchmarks" Provides-Extra: devel Requires-Dist: asv[virtualenv]<0.6.2,~=0.6.0; extra == "devel" Requires-Dist: pre-commit; extra == "devel" Provides-Extra: all Requires-Dist: asv[virtualenv]<0.6.2,~=0.6.0; extra == "all" Requires-Dist: pre-commit; extra == "all" .. image:: https://github.com/con/fscacher/workflows/Test/badge.svg?branch=master :target: https://github.com/con/fscacher/actions?workflow=Test :alt: CI Status .. image:: https://codecov.io/gh/con/fscacher/branch/master/graph/badge.svg :target: https://codecov.io/gh/con/fscacher .. image:: https://img.shields.io/pypi/pyversions/fscacher.svg :target: https://pypi.org/project/fscacher/ .. 
image:: https://img.shields.io/github/license/con/fscacher.svg :target: https://opensource.org/licenses/MIT :alt: MIT License `GitHub `_ | `PyPI `_ | `Issues `_ | `Changelog `_ ``fscacher`` provides a cache & decorator for memoizing functions whose outputs depend upon the contents of a file argument. If you have a function ``foo()`` that takes a file path as its first argument, and if the behavior of ``foo()`` is pure in the *contents* of the path and the values of its other arguments, ``fscacher`` can help cache that function, like so: .. code:: python from fscacher import PersistentCache cache = PersistentCache("insert_name_for_cache_here") @cache.memoize_path def foo(path, ...): ... Now the outputs of ``foo()`` will be cached for each set of input arguments and for a "fingerprint" (timestamps & size) of each ``path``. If ``foo()`` is called twice with the same set of arguments, the result from the first call will be reused for the second, unless the file pointed to by ``path`` changes, in which case the function will be run again. If ``foo()`` is called with a non-`path-like object `_ as the value of ``path``, the cache is ignored. ``memoize_path()`` optionally takes an ``exclude_kwargs`` argument, which must be a sequence of names of arguments of the decorated function that will be ignored for caching purposes. Caches are stored on-disk and thus persist between Python runs. To clear a given ``PersistentCache`` and erase its data store, call the ``clear()`` method. By default, caches are stored in the user-wide cache directory, under an fscacher-specific folder, with each one identified by the name passed to the constructor (which defaults to "cache" if not specified). To specify a different location, use the ``path`` argument to the constructor instead of passing a name: .. 
code:: python cache = PersistentCache(path="/my/custom/location") If your code runs in an environment where different sets of libraries or the like could be used in different runs, and these make a difference to the output of your function, you can make the caching take them into account by passing a list of library version strings or other identifiers for the current run as the ``tokens`` argument to the ``PersistentCache`` constructor. Finally, ``PersistentCache``'s constructor also optionally takes an ``envvar`` argument giving the name of an environment variable. If that environment variable is set to "``clear``" when the cache is constructed, the cache's ``clear()`` method will be called at the end of initialization. If the environment variable is set to "``ignore``" instead, then caching will be disabled, and the cache's ``memoize_path`` method will be a no-op. If the given environment variable is not set, or if ``envvar`` is not specified, then ``PersistentCache`` will query the ``FSCACHER_CACHE`` environment variable instead. Installation ============ ``fscacher`` requires Python 3.9 or higher. Just use `pip `_ for Python 3 (You have pip, right?) to install it and its dependencies:: python3 -m pip install fscacher ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027346.0 fscacher-0.4.3/README.rst0000644000175100001770000000730114717121722014375 0ustar00runnerdocker.. image:: https://github.com/con/fscacher/workflows/Test/badge.svg?branch=master :target: https://github.com/con/fscacher/actions?workflow=Test :alt: CI Status .. image:: https://codecov.io/gh/con/fscacher/branch/master/graph/badge.svg :target: https://codecov.io/gh/con/fscacher .. image:: https://img.shields.io/pypi/pyversions/fscacher.svg :target: https://pypi.org/project/fscacher/ ..
image:: https://img.shields.io/github/license/con/fscacher.svg :target: https://opensource.org/licenses/MIT :alt: MIT License `GitHub `_ | `PyPI `_ | `Issues `_ | `Changelog `_ ``fscacher`` provides a cache & decorator for memoizing functions whose outputs depend upon the contents of a file argument. If you have a function ``foo()`` that takes a file path as its first argument, and if the behavior of ``foo()`` is pure in the *contents* of the path and the values of its other arguments, ``fscacher`` can help cache that function, like so: .. code:: python from fscacher import PersistentCache cache = PersistentCache("insert_name_for_cache_here") @cache.memoize_path def foo(path, ...): ... Now the outputs of ``foo()`` will be cached for each set of input arguments and for a "fingerprint" (timestamps & size) of each ``path``. If ``foo()`` is called twice with the same set of arguments, the result from the first call will be reused for the second, unless the file pointed to by ``path`` changes, in which case the function will be run again. If ``foo()`` is called with a non-`path-like object `_ as the value of ``path``, the cache is ignored. ``memoize_path()`` optionally takes an ``exclude_kwargs`` argument, which must be a sequence of names of arguments of the decorated function that will be ignored for caching purposes. Caches are stored on-disk and thus persist between Python runs. To clear a given ``PersistentCache`` and erase its data store, call the ``clear()`` method. By default, caches are stored in the user-wide cache directory, under an fscacher-specific folder, with each one identified by the name passed to the constructor (which defaults to "cache" if not specified). To specify a different location, use the ``path`` argument to the constructor instead of passing a name: .. 
code:: python cache = PersistentCache(path="/my/custom/location") If your code runs in an environment where different sets of libraries or the like could be used in different runs, and these make a difference to the output of your function, you can make the caching take them into account by passing a list of library version strings or other identifiers for the current run as the ``tokens`` argument to the ``PersistentCache`` constructor. Finally, ``PersistentCache``'s constructor also optionally takes an ``envvar`` argument giving the name of an environment variable. If that environment variable is set to "``clear``" when the cache is constructed, the cache's ``clear()`` method will be called at the end of initialization. If the environment variable is set to "``ignore``" instead, then caching will be disabled, and the cache's ``memoize_path`` method will be a no-op. If the given environment variable is not set, or if ``envvar`` is not specified, then ``PersistentCache`` will query the ``FSCACHER_CACHE`` environment variable instead. Installation ============ ``fscacher`` requires Python 3.9 or higher. Just use `pip `_ for Python 3 (You have pip, right?) to install it and its dependencies:: python3 -m pip install fscacher ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027346.0 fscacher-0.4.3/asv.conf.json0000644000175100001770000001505714717121722015325 0ustar00runnerdocker{ // The version of the config file format. Do not change, unless // you know what you are doing. "version": 1, // The name of the project being benchmarked "project": "fscacher", // The project's homepage "project_url": "https://github.com/con/fscacher", // The URL or local path of the source code repository for the // project being benchmarked "repo": ".", // The Python project's subdirectory in your repo. If missing or // the empty string, the project is assumed to be located at the root // of the repository.
"repo_subdir": "", // Customizable commands for building, installing, and // uninstalling the project. See asv.conf.json documentation. // // "install_command": ["in-dir={env_dir} python -mpip install {wheel_file}"], // "uninstall_command": ["return-code=any python -mpip uninstall -y {project}"], // "build_command": [ // "python setup.py build", // "PIP_NO_BUILD_ISOLATION=false python -mpip wheel --no-deps --no-index -w {build_cache_dir} {build_dir}" // ], // List of branches to benchmark. If not provided, defaults to "master" // (for git) or "default" (for mercurial). // "branches": ["master"], // for git // "branches": ["default"], // for mercurial // The DVCS being used. If not set, it will be automatically // determined from "repo" by looking at the protocol in the URL // (if remote), or by looking for special directories, such as // ".git" (if local). "dvcs": "git", // The tool to use to create environments. May be "conda", // "virtualenv" or other value depending on the plugins in use. // If missing or the empty string, the tool will be automatically // determined by looking for tools on the PATH environment // variable. "environment_type": "virtualenv", // timeout in seconds for installing any dependencies in environment // defaults to 10 min //"install_timeout": 600, // the base URL to show a commit for the project. "show_commit_url": "https://github.com/con/fscacher/commit/", // The Pythons you'd like to test against. If not provided, defaults // to the current version of Python used to run `asv`. // "pythons": ["3.6", "3.7", "3.8", "3.9"], // The list of conda channel names to be searched for benchmark // dependency packages in the specified order // "conda_channels": ["conda-forge", "defaults"], // The matrix of dependencies to test. Each key is the name of a // package (in PyPI) and the values are version numbers. An empty // list or empty string indicates to just test against the default // (latest) version. 
null indicates that the package is to not be // installed. If the package to be tested is only available from // PyPi, and the 'environment_type' is conda, then you can preface // the package name by 'pip+', and the package will be installed via // pip (with all the conda available packages installed first, // followed by the pip installed packages). // "matrix": { "morecontext": [""] }, // Combinations of libraries/python versions can be excluded/included // from the set to test. Each entry is a dictionary containing additional // key-value pairs to include/exclude. // // An exclude entry excludes entries where all values match. The // values are regexps that should match the whole string. // // An include entry adds an environment. Only the packages listed // are installed. The 'python' key is required. The exclude rules // do not apply to includes. // // In addition to package names, the following keys are available: // // - python // Python version, as in the *pythons* variable above. // - environment_type // Environment type, as above. // - sys_platform // Platform, as in sys.platform. Possible values for the common // cases: 'linux2', 'win32', 'cygwin', 'darwin'. // // "exclude": [ // {"python": "3.2", "sys_platform": "win32"}, // skip py3.2 on windows // {"environment_type": "conda", "six": null}, // don't run without six on conda // ], // // "include": [ // // additional env for python2.7 // {"python": "2.7", "numpy": "1.8"}, // // additional env if run on windows+conda // {"platform": "win32", "environment_type": "conda", "python": "2.7", "libpython": ""}, // ], // The directory (relative to the current directory) that benchmarks are // stored in. If not provided, defaults to "benchmarks" "benchmark_dir": "benchmarks", // The directory (relative to the current directory) to cache the Python // environments in. 
If not provided, defaults to "env" "env_dir": ".asv/env", // The directory (relative to the current directory) that raw benchmark // results are stored in. If not provided, defaults to "results". "results_dir": ".asv/results", // The directory (relative to the current directory) that the html tree // should be written to. If not provided, defaults to "html". "html_dir": ".asv/html", // The number of characters to retain in the commit hashes. // "hash_length": 8, // `asv` will cache results of the recent builds in each // environment, making them faster to install next time. This is // the number of builds to keep, per environment. // "build_cache_size": 2, // The commits after which the regression search in `asv publish` // should start looking for regressions. Dictionary whose keys are // regexps matching to benchmark names, and values corresponding to // the commit (exclusive) after which to start looking for // regressions. The default is to start from the first commit // with results. If the commit is `null`, regression detection is // skipped for the matching benchmark. // // "regressions_first_commits": { // "some_benchmark": "352cdf", // Consider regressions only after this commit // "another_benchmark": null, // Skip regression detection altogether // }, // The thresholds for relative change in results, after which `asv // publish` starts reporting regressions. Dictionary of the same // form as in ``regressions_first_commits``, with values // indicating the thresholds. If multiple entries match, the // maximum is taken. If no entry matches, the default is 5%. 
// // "regressions_thresholds": { // "some_benchmark": 0.01, // Threshold of 1% // "another_benchmark": 0.5, // Threshold of 50% // }, } ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1732027362.9700286 fscacher-0.4.3/benchmarks/0000755000175100001770000000000014717121743015025 5ustar00runnerdocker././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027346.0 fscacher-0.4.3/benchmarks/__init__.py0000644000175100001770000000000014717121722017121 0ustar00runnerdocker././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027346.0 fscacher-0.4.3/benchmarks/cache.py0000644000175100001770000000707614717121722016451 0ustar00runnerdockerfrom abc import ABC, abstractmethod from hashlib import sha256 import os from pathlib import Path import random from time import sleep, time from uuid import uuid4 from morecontext import envset from fscacher import PersistentCache class BaseCacheBenchmark(ABC): param_names = ["mode"] params = [["populate", "hit", "ignore"]] @abstractmethod def init_path(self, *args): # Must return the path created ... @staticmethod @abstractmethod def init_func(cache): # Must return the function ... 
def init_cache(self, ignore: bool = False): with envset("FSCACHER_CACHE", "ignore" if ignore else ""): self.cache = PersistentCache(path=str(uuid4())) self.func = self.init_func(self.cache) def setup(self, mode, *args): self.path = self.init_path(mode, *args) if mode == "hit": self.init_cache() self.func(self.path) elif mode == "ignore": self.init_cache(ignore=True) def time_cache(self, mode, *_args): if mode == "populate": self.init_cache() self.func(self.path) class TimeFile(BaseCacheBenchmark): FILE_SIZE = 1024 def init_path(self, *_args): with open("foo.dat", "wb") as fp: fp.write(bytes(random.choices(range(256), k=self.FILE_SIZE))) return "foo.dat" @staticmethod def init_func(cache): @cache.memoize_path def hashfile(path): # "emulate" slow invocation so significant raise in benchmark # consumed time would mean that we invoked it instead # of using cached value sleep(0.01) with open(path, "rb") as fp: return sha256(fp.read()).hexdigest() return hashfile class BaseDirectoryBenchmark(BaseCacheBenchmark): param_names = BaseCacheBenchmark.param_names + ["tmpdir"] params = BaseCacheBenchmark.params + [ os.environ.get("FSCACHER_BENCH_TMPDIRS", ".").split(":") ] @staticmethod @abstractmethod def get_layout(): ... 
def init_path(self, _mode, tmpdir): dirpath = Path(tmpdir, str(uuid4())) dirpath.mkdir(parents=True) base_time = time() dirs = [dirpath] layout = self.get_layout() for i, width in enumerate(layout): if i < len(layout) - 1: dirs2 = [] for d in dirs: for x in range(width): d2 = d / f"d{x}" d2.mkdir() dirs2.append(d2) dirs = dirs2 else: for j, d in enumerate(dirs): for x in range(width): f = d / f"f{x}.dat" f.write_bytes(b"\0" * random.randint(1, 1024)) t = base_time - x - j * width os.utime(f, (t, t)) return dirpath @staticmethod def init_func(cache): @cache.memoize_path def dirsize(path): total_size = 0 with os.scandir(path) as entries: for e in entries: if e.is_dir(): total_size += dirsize(e.path) else: total_size += e.stat().st_size return total_size return dirsize class TimeFlatDirectory(BaseDirectoryBenchmark): @staticmethod def get_layout(): return (100,) class TimeDeepDirectory(BaseDirectoryBenchmark): @staticmethod def get_layout(): return (3, 3, 3, 3) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027346.0 fscacher-0.4.3/pyproject.toml0000644000175100001770000000016614717121722015624 0ustar00runnerdocker[build-system] requires = [ "setuptools >= 46.4.0", "wheel ~= 0.32" ] build-backend = "setuptools.build_meta" ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1732027362.9740286 fscacher-0.4.3/setup.cfg0000644000175100001770000000300714717121743014531 0ustar00runnerdocker[metadata] name = fscacher description = Caching results of operations on heavy file trees long_description = file:README.rst long_description_content_type = text/x-rst author = Center for Open Neuroscience author_email = debian@onerussian.com maintainer = John T. 
Wodder II maintainer_email = fscacher@varonathe.org license = MIT license_files = LICENSE url = https://github.com/con/fscacher keywords = caching file cache classifiers = Development Status :: 4 - Beta Programming Language :: Python :: 3 :: Only Programming Language :: Python :: 3 Programming Language :: Python :: 3.9 Programming Language :: Python :: 3.10 Programming Language :: Python :: 3.11 Programming Language :: Python :: 3.12 Programming Language :: Python :: Implementation :: CPython Programming Language :: Python :: Implementation :: PyPy License :: OSI Approved :: MIT License Intended Audience :: Developers Topic :: System :: Filesystems project_urls = Source Code = https://github.com/con/fscacher Bug Tracker = https://github.com/con/fscacher/issues [options] packages = find: package_dir = =src python_requires = >=3.9 install_requires = joblib ~= 1.1 platformdirs [options.extras_require] benchmarks = asv[virtualenv] ~= 0.6.0, < 0.6.2 devel = %(benchmarks)s pre-commit all = %(devel)s [options.packages.find] where = src [versioneer] VCS = git style = pep440 versionfile_source = src/fscacher/_version.py versionfile_build = fscacher/_version.py tag_prefix = parentdir_prefix = [egg_info] tag_build = tag_date = 0 ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027346.0 fscacher-0.4.3/setup.py0000644000175100001770000000060414717121722014417 0ustar00runnerdockerimport os.path import sys from setuptools import setup # This is needed for versioneer to be importable when building with PEP 517. # See and links # therein for more information. 
sys.path.append(os.path.dirname(__file__)) import versioneer setup( version=versioneer.get_version(), cmdclass=versioneer.get_cmdclass(), ) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1732027362.9700286 fscacher-0.4.3/src/0000755000175100001770000000000014717121743013477 5ustar00runnerdocker././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1732027362.9740286 fscacher-0.4.3/src/fscacher/0000755000175100001770000000000014717121743015255 5ustar00runnerdocker././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027346.0 fscacher-0.4.3/src/fscacher/__init__.py0000644000175100001770000000063714717121722017371 0ustar00runnerdocker""" Caching results of operations on heavy file trees Visit for more information. """ from ._version import get_versions from .cache import PersistentCache __version__ = get_versions()["version"] __author__ = "Center for Open Neuroscience" __author_email__ = "debian@onerussian.com" __license__ = "MIT" __url__ = "https://github.com/con/fscacher" __all__ = ["PersistentCache"] ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1732027362.9740286 fscacher-0.4.3/src/fscacher/_version.py0000644000175100001770000000076114717121743017457 0ustar00runnerdocker # This file was generated by 'versioneer.py' (0.19) from # revision-control system data, or from the parent directory name of an # unpacked source archive. Distribution tarballs contain a pre-generated copy # of this file. 
import json version_json = ''' { "date": "2024-11-19T14:42:38+0000", "dirty": false, "error": null, "full-revisionid": "27b05a8b263f44f746fff9e097e54aef0b6932af", "version": "0.4.3" } ''' # END VERSION_JSON def get_versions(): return json.loads(version_json) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027346.0 fscacher-0.4.3/src/fscacher/cache.py0000644000175100001770000002304014717121722016666 0ustar00runnerdockerfrom collections import deque, namedtuple from functools import partial, wraps from hashlib import md5 from inspect import Parameter, signature import logging import os import os.path as op import shutil import sys import time import joblib from platformdirs import PlatformDirs lgr = logging.getLogger(__name__) class PersistentCache(object): """Persistent cache providing @memoize and @memoize_path decorators""" _min_dtime = 0.01 # min difference between now and mtime to consider # for caching _cache_var_values = (None, "", "clear", "ignore") def __init__(self, name=None, *, path=None, tokens=None, envvar=None): """ Parameters ---------- name: str, optional Basename for the directory in which to store the cache. Mutually exclusive with `path`. path: str or pathlib.Path, optional Directory path at which to store the cache. If not specified, the cache is stored in `USER_CACHE/fscacher/NAME`, where NAME is the value of the `name` parameter (default: "cache"). Mutually exclusive with `name`. tokens: list of objects, optional To add to the fingerprint of @memoize_path (regular @memoize ATM does not use it). Could be e.g. 
versions of relevant/used python modules (pynwb, etc) envvar: str, optional Name of the environment variable to query for cache settings; if not set, `FSCACHER_CACHE` is used """ if path is None: dirs = PlatformDirs("fscacher") path = op.join(dirs.user_cache_dir, (name or "cache")) elif name is not None: raise ValueError("'name' and 'path' are mutually exclusive") self._memory = joblib.Memory(path, verbose=0) cntrl_value = None if envvar is not None: cntrl_var = envvar cntrl_value = os.environ.get(cntrl_var) if cntrl_value is None: cntrl_var = "FSCACHER_CACHE" cntrl_value = os.environ.get(cntrl_var) if cntrl_value not in self._cache_var_values: lgr.warning( f"{cntrl_var}={cntrl_value} is not understood and thus ignored." f" Known values are {self._cache_var_values}" ) if cntrl_value == "clear": self.clear() self._ignore_cache = cntrl_value == "ignore" self._tokens = tokens def clear(self): try: self._memory.clear(warn=False) except Exception as exc: lgr.debug("joblib failed to clear its cache: %s", exc) # and completely destroy the directory try: if op.exists(self._memory.location): shutil.rmtree(self._memory.location) except Exception as exc: lgr.warning(f"Failed to clear out the cache directory: {exc}") def memoize(self, f=None, *, exclude_kwargs=None): if f is None: return partial(self.memoize, exclude_kwargs=exclude_kwargs) if self._ignore_cache: return f return self._memory.cache(f, ignore=exclude_kwargs) def memoize_path(self, f=None, *, exclude_kwargs=None): if f is None: return partial(self.memoize_path, exclude_kwargs=exclude_kwargs) if self._ignore_cache: return f # we need to actually decorate a function fingerprint_kwarg = "_cache_fingerprint" @wraps(f) # important, so .memoize correctly caches different `f` def fingerprinted(path, *args, **kwargs): _ = kwargs.pop(fingerprint_kwarg) # discard lgr.debug("Running original %s on %r", f, path) return f(path, *args, **kwargs) # We need to add the fingerprint_kwarg to fingerprinted's signature so # that 
joblib doesn't complain: sig = signature(fingerprinted) fp_kwarg_param = Parameter( fingerprint_kwarg, Parameter.KEYWORD_ONLY, default=None ) sig2 = sig.replace( parameters=tuple(sig.parameters.values()) + (fp_kwarg_param,) ) fingerprinted.__signature__ = sig2 path_arg = next(iter(sig.parameters.keys())) # we need to ignore 'path' since we would like to dereference if symlink # but then expect joblib's caching work on both original and dereferenced # So we will add dereferenced path into fingerprint_kwarg fingerprinted = self.memoize( fingerprinted, exclude_kwargs=[path_arg] + (list(exclude_kwargs) if exclude_kwargs is not None else []), ) @wraps(f) def fingerprinter(*args, **kwargs): # we need to dereference symlinks and use that path in the function # call signature bound = sig.bind(*args, **kwargs) bound.apply_defaults() path_orig = bound.arguments[path_arg] try: path = op.realpath(path_orig) except TypeError: lgr.debug( "Calling %s directly since argument is not a path-like object", f ) return f(*args, **kwargs) if path != path_orig: lgr.log(5, "Dereferenced %r into %r", path_orig, path) if op.isdir(path): fprint = self._get_dir_fingerprint(path) else: fprint = self._get_file_fingerprint(path) if fprint is None: lgr.debug("Calling %s directly since no fingerprint for %r", f, path) # just call the function -- we have no fingerprint, # probably does not exist or permissions are wrong ret = f(*args, **kwargs) # We should still pass through if file was modified just now, # since that could mask out quick modifications. # Target use cases will not be like that. 
elif fprint.modified_in_window(self._min_dtime): lgr.debug("Calling %s directly since too short for %r", f, path) ret = f(*args, **kwargs) else: lgr.debug("Calling memoized version of %s for %s", f, path) # If there is a fingerprint -- inject it into the signature kwargs_ = kwargs.copy() kwargs_[fingerprint_kwarg] = ( (path,) + fprint.to_tuple() + (tuple(self._tokens) if self._tokens else ()) ) ret = fingerprinted(*args, **kwargs_) lgr.log(1, "Returning value %r", ret) return ret # and we memoize actually that function return fingerprinter @staticmethod def _get_file_fingerprint(path): """Simplistic generic file fingerprinting based on ctime, mtime, and size""" try: # we can't take everything, since atime can change, etc. # So let's take some s = os.stat(path, follow_symlinks=True) fprint = FileFingerprint.from_stat(s) lgr.log(5, "Fingerprint for %s: %s", path, fprint) return fprint except Exception as exc: lgr.debug(f"Cannot fingerprint {path}: {exc}") @staticmethod def _get_dir_fingerprint(path): fprint = DirFingerprint() dirqueue = deque([path]) try: while dirqueue: d = dirqueue.popleft() with os.scandir(d) as entries: for e in entries: if e.is_dir(follow_symlinks=True): dirqueue.append(e.path) else: s = e.stat(follow_symlinks=True) fprint.add_file(e.path, FileFingerprint.from_stat(s)) except Exception as exc: lgr.debug(f"Cannot fingerprint {path}: {exc}") return None else: return fprint class FileFingerprint(namedtuple("FileFingerprint", "mtime_ns ctime_ns size inode")): @classmethod def from_stat(cls, s): return cls(s.st_mtime_ns, s.st_ctime_ns, s.st_size, s.st_ino) def modified_in_window(self, min_dtime): return abs(elapsed_since(self.mtime_ns * 1e-9)) < min_dtime def to_tuple(self): return tuple(self) class DirFingerprint: def __init__(self): self.last_modified = None self.hash = None def add_file(self, path, fprint: FileFingerprint): fprint_hash = md5( ascii((str(path), fprint.to_tuple())).encode("us-ascii") ).digest() if self.hash is None: self.hash = 
fprint_hash self.last_modified = fprint.mtime_ns else: self.hash = xor_bytes(self.hash, fprint_hash) if self.last_modified < fprint.mtime_ns: self.last_modified = fprint.mtime_ns def modified_in_window(self, min_dtime): if self.last_modified is None: return False else: return abs(elapsed_since(self.last_modified * 1e-9)) < min_dtime def to_tuple(self): if self.hash is None: return (None,) else: return (self.hash.hex(),) def xor_bytes(b1: bytes, b2: bytes) -> bytes: length = max(len(b1), len(b2)) i1 = int.from_bytes(b1, sys.byteorder) i2 = int.from_bytes(b2, sys.byteorder) return (i1 ^ i2).to_bytes(length, sys.byteorder) def elapsed_since(t: float) -> float: t_now = time.time() dt = t_now - t if dt < 0: lgr.debug( "Time is in the future: %f; now: %f; dt=%g", t, t_now, dt ) return dt ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1732027362.9740286 fscacher-0.4.3/src/fscacher/tests/0000755000175100001770000000000014717121743016417 5ustar00runnerdocker././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027346.0 fscacher-0.4.3/src/fscacher/tests/__init__.py0000644000175100001770000000000014717121722020513 0ustar00runnerdocker././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027346.0 fscacher-0.4.3/src/fscacher/tests/test_cache.py0000644000175100001770000004002514717121722021071 0ustar00runnerdockerfrom dataclasses import dataclass import logging import os import os.path as op from pathlib import Path import platform import shutil import subprocess import sys import time import pytest from .. 
import PersistentCache from ..cache import DirFingerprint, FileFingerprint platform_system = platform.system().lower() on_windows = platform_system == "windows" on_pypy = platform.python_implementation().lower() == "pypy" lgr = logging.getLogger(__name__) @pytest.fixture(autouse=True) def capture_all_logs(caplog): caplog.set_level(1, logger="fscacher") @pytest.fixture(scope="function") def cache(tmp_path_factory): return PersistentCache(path=tmp_path_factory.mktemp("cache")) @pytest.fixture(scope="function") def cache_tokens(tmp_path_factory): return PersistentCache(path=tmp_path_factory.mktemp("cache"), tokens=["0.0.1", 1]) def test_memoize(cache): # Simplest testing to start with, not relying on persisting across # independent processes _comp = [] @cache.memoize def f1(flag=False): if flag: raise ValueError("Got flag") if _comp: raise RuntimeError("Must not be recomputed") _comp.append(1) return 1 assert f1() == 1 assert f1() == 1 # Now with some args _comp = [] @cache.memoize def f2(*args): if args in _comp: raise RuntimeError("Must not be recomputed") _comp.append(args) return sum(args) assert f2(1) == 1 assert f2(1) == 1 assert f2(1, 2) == 3 assert f2(1, 2) == 3 assert _comp == [(1,), (1, 2)] def test_memoize_multiple(cache): # Make sure that with the same cache can cover multiple functions @cache.memoize def f1(): return 1 @cache.memoize def f2(): return 2 @cache.memoize def f3(): # nesting call into f2 return f2() + 1 for _ in range(3): assert f1() == 1 assert f2() == 2 assert f3() == 3 def test_memoize_path(cache, tmp_path): calls = [] @cache.memoize_path def memoread(path, arg, kwarg=None): calls.append([path, arg, kwarg]) with open(path) as f: return f.read() def check_new_memoread(arg, content, expect_new=False): ncalls = len(calls) assert memoread(path, arg) == content assert len(calls) == ncalls + 1 assert memoread(path, arg) == content assert len(calls) == ncalls + 1 + int(expect_new) fname = "file.dat" path = str(tmp_path / fname) with 
pytest.raises(IOError): memoread(path, 0) # and again with pytest.raises(IOError): memoread(path, 0) assert len(calls) == 2 with open(path, "w") as f: f.write("content") t0 = time.time() try: # unless this computer is too slow -- there should be less than # cache._min_dtime between our creating the file and testing, # so we would force a direct read: check_new_memoread(0, "content", True) except AssertionError: # pragma: no cover # if computer is indeed slow (happens on shared CIs) we might fail # because distance is too short if time.time() - t0 < cache._min_dtime: raise # if we were quick but still failed -- legit assert calls[-1] == [path, 0, None] # but if we sleep - should memoize time.sleep(cache._min_dtime * 1.1) check_new_memoread(1, "content") # and if we modify the file -- a new read time.sleep(cache._min_dtime * 1.1) with open(path, "w") as f: f.write("Content") ncalls = len(calls) assert memoread(path, 1) == "Content" assert len(calls) == ncalls + 1 time.sleep(cache._min_dtime * 1.1) check_new_memoread(0, "Content") # Check that symlinks should be dereferenced if not on_windows or ( sys.version_info[:2] >= (3, 8) and not (on_windows and on_pypy) ): # realpath doesn't work right on Windows on pre-3.8 Python, and PyPy on # Windows doesn't support symlinks at all, so skip the test then. symlink1 = str(tmp_path / (fname + ".link1")) try: os.symlink(fname, symlink1) except OSError: pass if op.islink(symlink1): # hopefully would just skip Windows if not supported ncalls = len(calls) assert memoread(symlink1, 0) == "Content" assert len(calls) == ncalls # no new call # and if we "clear", would it still work? 
cache.clear() check_new_memoread(1, "Content") @pytest.mark.flaky(reruns=5, condition=on_windows) def test_memoize_path_dir(cache, tmp_path): calls = [] @cache.memoize_path def memoread(path, arg, kwarg=None): calls.append([path, arg, kwarg]) total_size = 0 with os.scandir(path) as entries: for e in entries: if e.is_file(): total_size += e.stat().st_size return total_size def check_new_memoread(arg, content, expect_new=False): ncalls = len(calls) assert memoread(path, arg) == content assert len(calls) == ncalls + 1 assert memoread(path, arg) == content assert len(calls) == ncalls + 1 + int(expect_new) fname = "foo" path = tmp_path / fname with pytest.raises(IOError): memoread(path, 0) # and again with pytest.raises(IOError): memoread(path, 0) assert len(calls) == 2 path.mkdir() (path / "a.txt").write_text("Alpha") (path / "b.txt").write_text("Beta") t0 = time.time() try: # unless this computer is too slow -- there should be less than # cache._min_dtime between our creating the file and testing, # so we would force a direct read: check_new_memoread(0, 9, True) except AssertionError: # pragma: no cover # if computer is indeed slow (happens on shared CIs) we might fail # because distance is too short t_now = time.time() if t_now - t0 < cache._min_dtime: # Log more information to troubleshoot lgr.error(f"Failing test with t0={t0}, t_now={t_now}, " f"dt={t_now - t0}, min_dtime={cache._min_dtime}") for p in ("a.txt", "b.txt"): lgr.error(f" {p}: {op.getmtime(path / p)}") raise # if we were quick but still failed -- legit assert calls[-1] == [path, 0, None] # but if we sleep - should memoize time.sleep(cache._min_dtime * 1.1) check_new_memoread(1, 9) # and if we modify the file -- a new read time.sleep(cache._min_dtime * 1.1) (path / "c.txt").write_text("Gamma") ncalls = len(calls) assert memoread(path, 1) == 14 assert len(calls) == ncalls + 1 time.sleep(cache._min_dtime * 1.1) check_new_memoread(0, 14) # Check that symlinks should be dereferenced if not on_windows or ( 
sys.version_info[:2] >= (3, 8) and not (on_windows and on_pypy) ): # realpath doesn't work right on Windows on pre-3.8 Python, and PyPy on # Windows doesn't support symlinks at all, so skip the test then. symlink1 = str(tmp_path / (fname + ".link1")) try: os.symlink(fname, symlink1) except OSError: pass if op.islink(symlink1): # hopefully would just skip Windows if not supported ncalls = len(calls) assert memoread(symlink1, 0) == 14 assert len(calls) == ncalls # no new call # and if we "clear", would it still work? cache.clear() check_new_memoread(1, 14) def test_memoize_path_persist(tmp_path): from subprocess import PIPE, run script = tmp_path / "script.py" cachedir = tmp_path / "cache" script.write_text( "from os.path import basename\n" "from fscacher import PersistentCache\n" f"cache = PersistentCache(path={str(cachedir)!r})\n" "\n" "@cache.memoize_path\n" "def func(path):\n" " print('Running %s.' % basename(path), end='')\n" " return 'DONE'\n" "\n" f"print(func({str(script)!r}))\n" ) outputs = [ run([sys.executable, str(script)], stdout=PIPE, stderr=PIPE) for i in range(3) ] print("Full outputs: %s" % repr(outputs)) if b"File name too long" in outputs[0].stderr: # must be running during conda build which blows up paths with # _placehold_ers pytest.skip("seems to be running on conda and hitting the limits") assert outputs[0].stdout.strip().decode() == "Running script.py.DONE" for o in outputs[1:]: assert o.stdout.strip().decode() == "DONE" def test_memoize_path_tokens(tmp_path, cache, cache_tokens): calls = [] @cache.memoize_path def memoread(path, arg, kwarg=None): calls.append(["cache", path, arg, kwarg]) with open(path) as f: return f.read() @cache_tokens.memoize_path def memoread_tokens(path, arg, kwarg=None): calls.append(["cache_tokens", path, arg, kwarg]) with open(path) as f: return f.read() def check_new_memoread(call, arg, content, expect_first=True, expect_new=False): ncalls = len(calls) assert call(path, arg) == content assert len(calls) == ncalls + 
int(expect_first) assert call(path, arg) == content assert len(calls) == ncalls + int(expect_first) + int(expect_new) path = str(tmp_path / "file.dat") with open(path, "w") as f: f.write("content") time.sleep(cache._min_dtime * 1.1) # They both are independent, so both will cause a new readout check_new_memoread(memoread, 0, "content") check_new_memoread(memoread_tokens, 0, "content") @pytest.mark.parametrize( "fscacher_value,mycache_value,cleared,ignored", [ ("clear", None, True, False), ("q", "clear", True, False), ("ignore", "clear", True, False), ("clear", "", False, False), ("clear", "q", False, False), ("ignore", None, False, True), ("q", "ignore", False, True), ("clear", "ignore", False, True), ("ignore", "", False, False), ("ignore", "q", False, False), ], ) def test_cache_control_envvar( mocker, monkeypatch, fscacher_value, mycache_value, cleared, ignored, tmp_path ): if fscacher_value is not None: monkeypatch.setenv("FSCACHER_CACHE", fscacher_value) else: monkeypatch.delenv("FSCACHER_CACHE", raising=False) if mycache_value is not None: monkeypatch.setenv("MYCACHE_CONTROL", mycache_value) else: monkeypatch.delenv("MYCACHE_CONTROL", raising=False) clear_spy = mocker.spy(PersistentCache, "clear") c = PersistentCache(path=tmp_path, envvar="MYCACHE_CONTROL") assert clear_spy.called is cleared assert c._ignore_cache is ignored @pytest.mark.skipif(shutil.which("git-annex") is None, reason="git annex required") def test_follow_moved_symlink(cache, tmp_path): calls = [] @cache.memoize_path def memoread(path): calls.append([path]) with open(path) as f: return f.read() def git(*args): subprocess.run(["git", *args], cwd=tmp_path, check=True) content = "This is test text.\n" git("init") git("annex", "init") (tmp_path / "file.txt").write_text(content) git("annex", "add", "file.txt") git("commit", "-m", "Create file") assert op.islink(tmp_path / "file.txt") assert memoread(tmp_path / "file.txt") == content assert len(calls) == 1 assert memoread(tmp_path / "file.txt") == 
content assert len(calls) == 1 git("mv", "file.txt", "text.txt") git("commit", "-m", "Rename file") assert memoread(tmp_path / "text.txt") == content assert len(calls) == 1 assert memoread(tmp_path / "text.txt") == content assert len(calls) == 1 (tmp_path / "subdir").mkdir() git("mv", "text.txt", op.join("subdir", "text.txt")) git("commit", "-m", "Move file") assert memoread(tmp_path / "subdir" / "text.txt") == content assert len(calls) == 1 assert memoread(tmp_path / "subdir" / "text.txt") == content assert len(calls) == 1 def test_memoize_path_nonpath_arg(cache, tmp_path): calls = [] @cache.memoize_path def memoread(filepath, arg, kwarg=None): calls.append([filepath, arg, kwarg]) with open(filepath) as f: return f.read() path = str(tmp_path / "file.dat") with open(path, "w") as f: f.write("content") time.sleep(cache._min_dtime * 1.1) ncalls = len(calls) assert memoread(path, 1) == "content" assert len(calls) == ncalls + 1 assert memoread(arg=1, filepath=path) == "content" assert len(calls) == ncalls + 1 def test_dir_fingerprint_order_irrelevant(tmp_path): start = time.time() file1 = tmp_path / "apple.txt" file1.write_text("Apple\n") os.utime(file1, (start - 1, start - 1)) file2 = tmp_path / "banana.txt" file2.write_text("This is test text.\n") os.utime(file2, (start - 2, start - 2)) file3 = tmp_path / "coconut.txt" file3.write_text("Lorem ipsum dolor sit amet, consectetur adipisicing elit\n") os.utime(file3, (start - 3, start - 3)) df_tuples = [] for file_list in [ [file1, file2, file3], [file3, file2, file1], [file2, file1, file3], ]: dprint = DirFingerprint() for f in file_list: fprint = FileFingerprint.from_stat(os.stat(f)) dprint.add_file(f, fprint) df_tuples.append(dprint.to_tuple()) for i in range(1, len(df_tuples)): assert df_tuples[0] == df_tuples[i] def test_memoize_non_pathlike_arg(cache, tmp_path): calls = [] @cache.memoize_path def strify(x): calls.append(x) return str(x) path = tmp_path / "foo" path.touch() time.sleep(cache._min_dtime * 1.1) assert 
strify(path) == str(path) assert calls == [path] assert strify(42) == "42" assert calls == [path, 42] assert strify(path) == str(path) assert calls == [path, 42] assert strify(42) == "42" assert calls == [path, 42, 42] @dataclass class PathWrapper: path: Path def __fspath__(self) -> str: return str(self.path) def __str__(self) -> str: return str(self.path) def test_memoize_pathlike_arg(cache, tmp_path): calls = [] @cache.memoize_path def strify(x): calls.append(x) return str(x) path = tmp_path / "foo" path.touch() foo = PathWrapper(path) path2 = tmp_path / "bar" path2.touch() bar = PathWrapper(path2) time.sleep(cache._min_dtime * 1.1) assert strify(path) == str(path) assert calls == [path] assert strify(foo) == str(path) assert calls == [path] assert strify(bar) == str(tmp_path / "bar") assert calls == [path, bar] assert strify(path) == str(path) assert calls == [path, bar] assert strify(foo) == str(path) assert calls == [path, bar] assert strify(bar) == str(tmp_path / "bar") assert calls == [path, bar] def test_memoize_path_exclude_kwargs(cache, tmp_path): calls = [] @cache.memoize_path(exclude_kwargs=["extra"]) def memoread_extra(path, arg, kwarg=None, extra=None): calls.append((path, arg, kwarg, extra)) with open(path) as f: return f.read() path = tmp_path / "file.dat" path.write_text("content") time.sleep(cache._min_dtime * 1.1) assert memoread_extra(path, 1, extra="foo") == "content" assert calls == [(path, 1, None, "foo")] assert memoread_extra(path, 1, extra="bar") == "content" assert calls == [(path, 1, None, "foo")] assert memoread_extra(path, 1, kwarg="quux", extra="bar") == "content" assert calls == [(path, 1, None, "foo"), (path, 1, "quux", "bar")] path.write_text("different") time.sleep(cache._min_dtime * 1.1) assert memoread_extra(path, 1, extra="foo") == "different" assert calls == [ (path, 1, None, "foo"), (path, 1, "quux", "bar"), (path, 1, None, "foo"), ] assert memoread_extra(path, 1, extra="bar") == "different" assert calls == [ (path, 1, None, 
"foo"), (path, 1, "quux", "bar"), (path, 1, None, "foo"), ] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027346.0 fscacher-0.4.3/src/fscacher/tests/test_util.py0000644000175100001770000000063514717121722021006 0ustar00runnerdockerimport pytest from ..cache import xor_bytes @pytest.mark.parametrize( "b1,b2,r", [ (b"\x12", b"\x34", b"\x26"), (b"\0\x12", b"\0\x34", b"\0\x26"), (b"\x12\0", b"\x34\0", b"\x26\0"), (b"\x12\xAB", b"\x34", b"\x26\xAB"), (b"\x12\xAB", b"\x34\xCD", b"\x26\x66"), ], ) def test_xor_bytes(b1: bytes, b2: bytes, r: bytes) -> None: assert xor_bytes(b1, b2) == r ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1732027362.9740286 fscacher-0.4.3/src/fscacher.egg-info/0000755000175100001770000000000014717121743016747 5ustar00runnerdocker././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027362.0 fscacher-0.4.3/src/fscacher.egg-info/PKG-INFO0000644000175100001770000001235214717121742020046 0ustar00runnerdockerMetadata-Version: 2.1 Name: fscacher Version: 0.4.3 Summary: Caching results of operations on heavy file trees Home-page: https://github.com/con/fscacher Author: Center for Open Neuroscience Author-email: debian@onerussian.com Maintainer: John T. 
Wodder II Maintainer-email: fscacher@varonathe.org License: MIT Project-URL: Source Code, https://github.com/con/fscacher Project-URL: Bug Tracker, https://github.com/con/fscacher/issues Keywords: caching,file cache Classifier: Development Status :: 4 - Beta Classifier: Programming Language :: Python :: 3 :: Only Classifier: Programming Language :: Python :: 3 Classifier: Programming Language :: Python :: 3.9 Classifier: Programming Language :: Python :: 3.10 Classifier: Programming Language :: Python :: 3.11 Classifier: Programming Language :: Python :: 3.12 Classifier: Programming Language :: Python :: Implementation :: CPython Classifier: Programming Language :: Python :: Implementation :: PyPy Classifier: License :: OSI Approved :: MIT License Classifier: Intended Audience :: Developers Classifier: Topic :: System :: Filesystems Requires-Python: >=3.9 Description-Content-Type: text/x-rst License-File: LICENSE Requires-Dist: joblib~=1.1 Requires-Dist: platformdirs Provides-Extra: benchmarks Requires-Dist: asv[virtualenv]<0.6.2,~=0.6.0; extra == "benchmarks" Provides-Extra: devel Requires-Dist: asv[virtualenv]<0.6.2,~=0.6.0; extra == "devel" Requires-Dist: pre-commit; extra == "devel" Provides-Extra: all Requires-Dist: asv[virtualenv]<0.6.2,~=0.6.0; extra == "all" Requires-Dist: pre-commit; extra == "all" .. image:: https://github.com/con/fscacher/workflows/Test/badge.svg?branch=master :target: https://github.com/con/fscacher/actions?workflow=Test :alt: CI Status .. image:: https://codecov.io/gh/con/fscacher/branch/master/graph/badge.svg :target: https://codecov.io/gh/con/fscacher .. image:: https://img.shields.io/pypi/pyversions/fscacher.svg :target: https://pypi.org/project/fscacher/ .. 
image:: https://img.shields.io/github/license/con/fscacher.svg :target: https://opensource.org/licenses/MIT :alt: MIT License `GitHub `_ | `PyPI `_ | `Issues `_ | `Changelog `_ ``fscacher`` provides a cache & decorator for memoizing functions whose outputs depend upon the contents of a file argument. If you have a function ``foo()`` that takes a file path as its first argument, and if the behavior of ``foo()`` is pure in the *contents* of the path and the values of its other arguments, ``fscacher`` can help cache that function, like so: .. code:: python from fscacher import PersistentCache cache = PersistentCache("insert_name_for_cache_here") @cache.memoize_path def foo(path, ...): ... Now the outputs of ``foo()`` will be cached for each set of input arguments and for a "fingerprint" (timestamps & size) of each ``path``. If ``foo()`` is called twice with the same set of arguments, the result from the first call will be reused for the second, unless the file pointed to by ``path`` changes, in which case the function will be run again. If ``foo()`` is called with a non-`path-like object `_ as the value of ``path``, the cache is ignored. ``memoize_path()`` optionally takes an ``exclude_kwargs`` argument, which must be a sequence of names of arguments of the decorated function that will be ignored for caching purposes. Caches are stored on-disk and thus persist between Python runs. To clear a given ``PersistentCache`` and erase its data store, call the ``clear()`` method. By default, caches are stored in the user-wide cache directory, under an fscacher-specific folder, with each one identified by the name passed to the constructor (which defaults to "cache" if not specified). To specify a different location, use the ``path`` argument to the constructor instead of passing a name: .. 
code:: python

    cache = PersistentCache(path="/my/custom/location")

If your code runs in an environment where different sets of libraries or the
like could be used in different runs, and these make a difference to the output
of your function, you can make the caching take them into account by passing a
list of library version strings or other identifiers for the current run as the
``tokens`` argument to the ``PersistentCache`` constructor.

Finally, ``PersistentCache``'s constructor also optionally takes an ``envvar``
argument giving the name of an environment variable.  If that environment
variable is set to "``clear``" when the cache is constructed, the cache's
``clear()`` method will be called during initialization.  If the environment
variable is set to "``ignore``" instead, then caching will be disabled, and the
cache's ``memoize`` and ``memoize_path`` methods will be no-ops that return the
decorated function unchanged.  If the given environment variable is not set, or
if ``envvar`` is not specified, then ``PersistentCache`` will query the
``FSCACHER_CACHE`` environment variable instead.


Installation
============
``fscacher`` requires Python 3.9 or higher.  Just use `pip `_ for Python 3 (You have pip, right?)
to install it and its dependencies:: python3 -m pip install fscacher ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027362.0 fscacher-0.4.3/src/fscacher.egg-info/SOURCES.txt0000644000175100001770000000076714717121742020644 0ustar00runnerdockerCHANGELOG.md LICENSE MANIFEST.in README.rst asv.conf.json pyproject.toml setup.cfg setup.py tox.ini versioneer.py benchmarks/__init__.py benchmarks/cache.py src/fscacher/__init__.py src/fscacher/_version.py src/fscacher/cache.py src/fscacher.egg-info/PKG-INFO src/fscacher.egg-info/SOURCES.txt src/fscacher.egg-info/dependency_links.txt src/fscacher.egg-info/requires.txt src/fscacher.egg-info/top_level.txt src/fscacher/tests/__init__.py src/fscacher/tests/test_cache.py src/fscacher/tests/test_util.py././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027362.0 fscacher-0.4.3/src/fscacher.egg-info/dependency_links.txt0000644000175100001770000000000114717121742023014 0ustar00runnerdocker ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027362.0 fscacher-0.4.3/src/fscacher.egg-info/requires.txt0000644000175100001770000000024714717121742021351 0ustar00runnerdockerjoblib~=1.1 platformdirs [all] asv[virtualenv]<0.6.2,~=0.6.0 pre-commit [benchmarks] asv[virtualenv]<0.6.2,~=0.6.0 [devel] asv[virtualenv]<0.6.2,~=0.6.0 pre-commit ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027362.0 fscacher-0.4.3/src/fscacher.egg-info/top_level.txt0000644000175100001770000000001114717121742021470 0ustar00runnerdockerfscacher ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027346.0 fscacher-0.4.3/tox.ini0000644000175100001770000000266714717121722014233 0ustar00runnerdocker[tox] envlist = lint,py37,py38,py39,py310,py311,py312,pypy3 skip_missing_interpreters = True isolated_build = True minversion = 3.3.0 [testenv] deps = dev: joblib @ git+https://github.com/joblib/joblib.git 
pytest pytest-cov pytest-mock pytest-rerunfailures commands = pytest --pyargs {posargs} fscacher [testenv:lint] deps = flake8 flake8-bugbear flake8-builtins flake8-unused-arguments commands = flake8 benchmarks src [testenv:benchmark] skip_install = True extras = benchmarks commands = asv run {posargs} HEAD^1..HEAD [pytest] addopts = --cov=fscacher --no-cov-on-fail filterwarnings = error ignore:The distutils package is deprecated:DeprecationWarning:joblib ignore:`formatargspec` is deprecated:DeprecationWarning:joblib norecursedirs = test/data [coverage:run] branch = True parallel = True [coverage:paths] source = src .tox/**/site-packages [coverage:report] precision = 2 show_missing = True omit = src/fscacher/_version.py [flake8] doctests = True exclude = .*/,build/,dist/,test/data,venv/,_version.py hang-closing = False max-line-length = 88 unused-arguments-ignore-stub-functions = True select = A,B,B902,B950,C,E,E242,F,U100,W ignore = B005,E203,E262,E266,E501,W503 [isort] atomic = True force_sort_within_sections = True honor_noqa = True lines_between_sections = 0 profile = black reverse_relative = True sort_relative_in_force_sorted_sections = True src_paths = src ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1732027346.0 fscacher-0.4.3/versioneer.py0000644000175100001770000021077414717121722015453 0ustar00runnerdocker # Version: 0.19 """The Versioneer - like a rocketeer, but for versions. The Versioneer ============== * like a rocketeer, but for versions! * https://github.com/python-versioneer/python-versioneer * Brian Warner * License: Public Domain * Compatible with: Python 3.6, 3.7, 3.8, 3.9 and pypy3 * [![Latest Version][pypi-image]][pypi-url] * [![Build Status][travis-image]][travis-url] This is a tool for managing a recorded version number in distutils-based python projects. The goal is to remove the tedious and error-prone "update the embedded version string" step from your release process. 
Making a new release should be as easy as recording a new tag in your version-control system, and maybe making new tarballs. ## Quick Install * `pip install versioneer` to somewhere in your $PATH * add a `[versioneer]` section to your setup.cfg (see [Install](INSTALL.md)) * run `versioneer install` in your source tree, commit the results * Verify version information with `python setup.py version` ## Version Identifiers Source trees come from a variety of places: * a version-control system checkout (mostly used by developers) * a nightly tarball, produced by build automation * a snapshot tarball, produced by a web-based VCS browser, like github's "tarball from tag" feature * a release tarball, produced by "setup.py sdist", distributed through PyPI Within each source tree, the version identifier (either a string or a number, this tool is format-agnostic) can come from a variety of places: * ask the VCS tool itself, e.g. "git describe" (for checkouts), which knows about recent "tags" and an absolute revision-id * the name of the directory into which the tarball was unpacked * an expanded VCS keyword ($Id$, etc) * a `_version.py` created by some earlier build step For released software, the version identifier is closely related to a VCS tag. Some projects use tag names that include more than just the version string (e.g. "myproject-1.2" instead of just "1.2"), in which case the tool needs to strip the tag prefix to extract the version identifier. For unreleased software (between tags), the version identifier should provide enough information to help developers recreate the same tree, while also giving them an idea of roughly how old the tree is (after version 1.2, before version 1.3). 
Many VCS systems can report a description that captures this, for example `git describe --tags --dirty --always` reports things like "0.7-1-g574ab98-dirty" to indicate that the checkout is one revision past the 0.7 tag, has a unique revision id of "574ab98", and is "dirty" (it has uncommitted changes). The version identifier is used for multiple purposes: * to allow the module to self-identify its version: `myproject.__version__` * to choose a name and prefix for a 'setup.py sdist' tarball ## Theory of Operation Versioneer works by adding a special `_version.py` file into your source tree, where your `__init__.py` can import it. This `_version.py` knows how to dynamically ask the VCS tool for version information at import time. `_version.py` also contains `$Revision$` markers, and the installation process marks `_version.py` to have this marker rewritten with a tag name during the `git archive` command. As a result, generated tarballs will contain enough information to get the proper version. To allow `setup.py` to compute a version too, a `versioneer.py` is added to the top level of your source tree, next to `setup.py` and the `setup.cfg` that configures it. This overrides several distutils/setuptools commands to compute the version when invoked, and changes `setup.py build` and `setup.py sdist` to replace `_version.py` with a small static file that contains just the generated version data. ## Installation See [INSTALL.md](./INSTALL.md) for detailed installation instructions. ## Version-String Flavors Code which uses Versioneer can learn about its version string at runtime by importing `_version` from your main `__init__.py` file and running the `get_versions()` function. From the "outside" (e.g. in `setup.py`), you can import the top-level `versioneer.py` and run `get_versions()`. Both functions return a dictionary with different flavors of version information: * `['version']`: A condensed version string, rendered using the selected style. 
This is the most commonly used value for the project's version string. The default "pep440" style yields strings like `0.11`, `0.11+2.g1076c97`, or `0.11+2.g1076c97.dirty`. See the "Styles" section below for alternative styles. * `['full-revisionid']`: detailed revision identifier. For Git, this is the full SHA1 commit id, e.g. "1076c978a8d3cfc70f408fe5974aa6c092c949ac". * `['date']`: Date and time of the latest `HEAD` commit. For Git, it is the commit date in ISO 8601 format. This will be None if the date is not available. * `['dirty']`: a boolean, True if the tree has uncommitted changes. Note that this is only accurate if run in a VCS checkout, otherwise it is likely to be False or None * `['error']`: if the version string could not be computed, this will be set to a string describing the problem, otherwise it will be None. It may be useful to throw an exception in setup.py if this is set, to avoid e.g. creating tarballs with a version string of "unknown". Some variants are more useful than others. Including `full-revisionid` in a bug report should allow developers to reconstruct the exact code being tested (or indicate the presence of local changes that should be shared with the developers). `version` is suitable for display in an "about" box or a CLI `--version` output: it can be easily compared against release notes and lists of bugs fixed in various releases. The installer adds the following text to your `__init__.py` to place a basic version in `YOURPROJECT.__version__`: from ._version import get_versions __version__ = get_versions()['version'] del get_versions ## Styles The setup.cfg `style=` configuration controls how the VCS information is rendered into a version string. The default style, "pep440", produces a PEP440-compliant string, equal to the un-prefixed tag name for actual releases, and containing an additional "local version" section with more detail for in-between builds. 
For Git, this is TAG[+DISTANCE.gHEX[.dirty]] , using information from `git describe --tags --dirty --always`. For example "0.11+2.g1076c97.dirty" indicates that the tree is like the "1076c97" commit but has uncommitted changes (".dirty"), and that this commit is two revisions ("+2") beyond the "0.11" tag. For released software (exactly equal to a known tag), the identifier will only contain the stripped tag, e.g. "0.11". Other styles are available. See [details.md](details.md) in the Versioneer source tree for descriptions. ## Debugging Versioneer tries to avoid fatal errors: if something goes wrong, it will tend to return a version of "0+unknown". To investigate the problem, run `setup.py version`, which will run the version-lookup code in a verbose mode, and will display the full contents of `get_versions()` (including the `error` string, which may help identify what went wrong). ## Known Limitations Some situations are known to cause problems for Versioneer. This section details the most significant ones. More can be found on the GitHub [issues page](https://github.com/python-versioneer/python-versioneer/issues). ### Subprojects Versioneer has limited support for source trees in which `setup.py` is not in the root directory (e.g. `setup.py` and `.git/` are *not* siblings). There are two common reasons why `setup.py` might not be in the root: * Source trees which contain multiple subprojects, such as [Buildbot](https://github.com/buildbot/buildbot), which contains both "master" and "slave" subprojects, each with their own `setup.py`, `setup.cfg`, and `tox.ini`. Projects like these produce multiple PyPI distributions (and upload multiple independently-installable tarballs). * Source trees whose main purpose is to contain a C library, but which also provide bindings to Python (and perhaps other languages) in subdirectories. Versioneer will look for `.git` in parent directories, and most operations should get the right version string. 
However `pip` and `setuptools` have bugs and implementation details which frequently cause `pip install .` from a subproject directory to fail to find a correct version string (so it usually defaults to `0+unknown`). `pip install --editable .` should work correctly. `setup.py install` might work too. Pip-8.1.1 is known to have this problem, but hopefully it will get fixed in some later version. [Bug #38](https://github.com/python-versioneer/python-versioneer/issues/38) is tracking this issue. The discussion in [PR #61](https://github.com/python-versioneer/python-versioneer/pull/61) describes the issue from the Versioneer side in more detail. [pip PR#3176](https://github.com/pypa/pip/pull/3176) and [pip PR#3615](https://github.com/pypa/pip/pull/3615) contain work to improve pip to let Versioneer work correctly. Versioneer-0.16 and earlier only looked for a `.git` directory next to the `setup.cfg`, so subprojects were completely unsupported with those releases. ### Editable installs with setuptools <= 18.5 `setup.py develop` and `pip install --editable .` allow you to install a project into a virtualenv once, then continue editing the source code (and test) without re-installing after every change. "Entry-point scripts" (`setup(entry_points={"console_scripts": ..})`) are a convenient way to specify executable scripts that should be installed along with the python package. These both work as expected when using modern setuptools. When using setuptools-18.5 or earlier, however, certain operations will cause `pkg_resources.DistributionNotFound` errors when running the entrypoint script, which must be resolved by re-installing the package. This happens when the install happens with one version, then the egg_info data is regenerated while a different version is checked out. Many setup.py commands cause egg_info to be rebuilt (including `sdist`, `wheel`, and installing into a different virtualenv), so this can be surprising. 
[Bug #83](https://github.com/python-versioneer/python-versioneer/issues/83) describes this one, but upgrading to a newer version of setuptools should probably resolve it. ## Updating Versioneer To upgrade your project to a new release of Versioneer, do the following: * install the new Versioneer (`pip install -U versioneer` or equivalent) * edit `setup.cfg`, if necessary, to include any new configuration settings indicated by the release notes. See [UPGRADING](./UPGRADING.md) for details. * re-run `versioneer install` in your source tree, to replace `SRC/_version.py` * commit any changed files ## Future Directions This tool is designed to be easily extended to other version-control systems: all VCS-specific components are in separate directories like src/git/ . The top-level `versioneer.py` script is assembled from these components by running make-versioneer.py . In the future, make-versioneer.py will take a VCS name as an argument, and will construct a version of `versioneer.py` that is specific to the given VCS. It might also take the configuration arguments that are currently provided manually during installation by editing setup.py . Alternatively, it might go the other direction and include code from all supported VCS systems, reducing the number of intermediate scripts. ## Similar projects * [setuptools_scm](https://github.com/pypa/setuptools_scm/) - a non-vendored build-time dependency * [miniver](https://github.com/jbweston/miniver) - a lightweight reimplementation of versioneer ## License To make Versioneer easier to embed, all its code is dedicated to the public domain. The `_version.py` that it creates is also in the public domain. Specifically, both are released under the Creative Commons "Public Domain Dedication" license (CC0-1.0), as described in https://creativecommons.org/publicdomain/zero/1.0/ . 
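
## Example: rendering the pep440 style

The default "pep440" style described under "Styles" builds TAG[+DISTANCE.gHEX[.dirty]] from the closest tag, the commit distance, the short commit id, and the dirty flag. The following is a minimal standalone sketch of that rule, not the code Versioneer ships: the function name `sketch_pep440` and its flat argument list are assumptions made for illustration, whereas the real renderer takes a single `pieces` dictionary.

```python
def sketch_pep440(closest_tag, distance, short_hex, dirty):
    # Hypothetical helper: illustrates TAG[+DISTANCE.gHEX[.dirty]] only.
    if closest_tag:
        rendered = closest_tag
        if distance or dirty:
            # A tag that already has a local part ("+...") is extended with ".".
            rendered += "." if "+" in closest_tag else "+"
            rendered += "%d.g%s" % (distance, short_hex)
            if dirty:
                rendered += ".dirty"
    else:
        # No tag reachable: fall back to the "0+untagged" form.
        rendered = "0+untagged.%d.g%s" % (distance, short_hex)
        if dirty:
            rendered += ".dirty"
    return rendered


print(sketch_pep440("0.11", 0, "1076c97", False))
print(sketch_pep440("0.11", 2, "1076c97", True))
print(sketch_pep440(None, 5, "1076c97", False))
```

A clean checkout sitting exactly on a tag renders as the bare tag, so release builds keep a plain version string while in-between builds carry the extra local-version detail.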
[pypi-image]: https://img.shields.io/pypi/v/versioneer.svg [pypi-url]: https://pypi.python.org/pypi/versioneer/ [travis-image]: https://img.shields.io/travis/com/python-versioneer/python-versioneer.svg [travis-url]: https://travis-ci.com/github/python-versioneer/python-versioneer """ import configparser import errno import json import os import re import subprocess import sys class VersioneerConfig: """Container for Versioneer configuration parameters.""" def get_root(): """Get the project root directory. We require that all commands are run from the project root, i.e. the directory that contains setup.py, setup.cfg, and versioneer.py . """ root = os.path.realpath(os.path.abspath(os.getcwd())) setup_py = os.path.join(root, "setup.py") versioneer_py = os.path.join(root, "versioneer.py") if not (os.path.exists(setup_py) or os.path.exists(versioneer_py)): # allow 'python path/to/setup.py COMMAND' root = os.path.dirname(os.path.realpath(os.path.abspath(sys.argv[0]))) setup_py = os.path.join(root, "setup.py") versioneer_py = os.path.join(root, "versioneer.py") if not (os.path.exists(setup_py) or os.path.exists(versioneer_py)): err = ("Versioneer was unable to run the project root directory. " "Versioneer requires setup.py to be executed from " "its immediate directory (like 'python setup.py COMMAND'), " "or in a way that lets it use sys.argv[0] to find the root " "(like 'python path/to/setup.py COMMAND').") raise VersioneerBadRootError(err) try: # Certain runtime workflows (setup.py install/develop in a setuptools # tree) execute all dependencies in a single python process, so # "versioneer" may be imported multiple times, and python's shared # module-import table will cache the first one. So we can't use # os.path.dirname(__file__), as that will find whichever # versioneer.py was first imported, even in later projects. 
me = os.path.realpath(os.path.abspath(__file__)) me_dir = os.path.normcase(os.path.splitext(me)[0]) vsr_dir = os.path.normcase(os.path.splitext(versioneer_py)[0]) if me_dir != vsr_dir: print("Warning: build in %s is using versioneer.py from %s" % (os.path.dirname(me), versioneer_py)) except NameError: pass return root def get_config_from_root(root): """Read the project setup.cfg file to determine Versioneer config.""" # This might raise EnvironmentError (if setup.cfg is missing), or # configparser.NoSectionError (if it lacks a [versioneer] section), or # configparser.NoOptionError (if it lacks "VCS="). See the docstring at # the top of versioneer.py for instructions on writing your setup.cfg . setup_cfg = os.path.join(root, "setup.cfg") parser = configparser.ConfigParser() with open(setup_cfg, "r") as f: parser.read_file(f) VCS = parser.get("versioneer", "VCS") # mandatory def get(parser, name): if parser.has_option("versioneer", name): return parser.get("versioneer", name) return None cfg = VersioneerConfig() cfg.VCS = VCS cfg.style = get(parser, "style") or "" cfg.versionfile_source = get(parser, "versionfile_source") cfg.versionfile_build = get(parser, "versionfile_build") cfg.tag_prefix = get(parser, "tag_prefix") if cfg.tag_prefix in ("''", '""'): cfg.tag_prefix = "" cfg.parentdir_prefix = get(parser, "parentdir_prefix") cfg.verbose = get(parser, "verbose") return cfg class NotThisMethod(Exception): """Exception raised if a method is not valid for the current scenario.""" # these dictionaries contain VCS-specific tools LONG_VERSION_PY = {} HANDLERS = {} def register_vcs_handler(vcs, method): # decorator """Create decorator to mark a method as the handler of a VCS.""" def decorate(f): """Store f in HANDLERS[vcs][method].""" if vcs not in HANDLERS: HANDLERS[vcs] = {} HANDLERS[vcs][method] = f return f return decorate def run_command(commands, args, cwd=None, verbose=False, hide_stderr=False, env=None): """Call the given command(s).""" assert isinstance(commands, 
list) p = None for c in commands: try: dispcmd = str([c] + args) # remember shell=False, so use git.cmd on windows, not just git p = subprocess.Popen([c] + args, cwd=cwd, env=env, stdout=subprocess.PIPE, stderr=(subprocess.PIPE if hide_stderr else None)) break except EnvironmentError: e = sys.exc_info()[1] if e.errno == errno.ENOENT: continue if verbose: print("unable to run %s" % dispcmd) print(e) return None, None else: if verbose: print("unable to find command, tried %s" % (commands,)) return None, None stdout = p.communicate()[0].strip().decode() if p.returncode != 0: if verbose: print("unable to run %s (error)" % dispcmd) print("stdout was %s" % stdout) return None, p.returncode return stdout, p.returncode LONG_VERSION_PY['git'] = r''' # This file helps to compute a version number in source trees obtained from # git-archive tarball (such as those provided by githubs download-from-tag # feature). Distribution tarballs (built by setup.py sdist) and build # directories (produced by setup.py build) will contain a much shorter file # that just contains the computed version number. # This file is released into the public domain. Generated by # versioneer-0.19 (https://github.com/python-versioneer/python-versioneer) """Git implementation of _version.py.""" import errno import os import re import subprocess import sys def get_keywords(): """Get the keywords needed to look up the version information.""" # these strings will be replaced by git during git-archive. # setup.py/versioneer.py will grep for the variable names, so they must # each be defined on a line of their own. _version.py will just call # get_keywords(). 
git_refnames = "%(DOLLAR)sFormat:%%d%(DOLLAR)s" git_full = "%(DOLLAR)sFormat:%%H%(DOLLAR)s" git_date = "%(DOLLAR)sFormat:%%ci%(DOLLAR)s" keywords = {"refnames": git_refnames, "full": git_full, "date": git_date} return keywords class VersioneerConfig: """Container for Versioneer configuration parameters.""" def get_config(): """Create, populate and return the VersioneerConfig() object.""" # these strings are filled in when 'setup.py versioneer' creates # _version.py cfg = VersioneerConfig() cfg.VCS = "git" cfg.style = "%(STYLE)s" cfg.tag_prefix = "%(TAG_PREFIX)s" cfg.parentdir_prefix = "%(PARENTDIR_PREFIX)s" cfg.versionfile_source = "%(VERSIONFILE_SOURCE)s" cfg.verbose = False return cfg class NotThisMethod(Exception): """Exception raised if a method is not valid for the current scenario.""" LONG_VERSION_PY = {} HANDLERS = {} def register_vcs_handler(vcs, method): # decorator """Create decorator to mark a method as the handler of a VCS.""" def decorate(f): """Store f in HANDLERS[vcs][method].""" if vcs not in HANDLERS: HANDLERS[vcs] = {} HANDLERS[vcs][method] = f return f return decorate def run_command(commands, args, cwd=None, verbose=False, hide_stderr=False, env=None): """Call the given command(s).""" assert isinstance(commands, list) p = None for c in commands: try: dispcmd = str([c] + args) # remember shell=False, so use git.cmd on windows, not just git p = subprocess.Popen([c] + args, cwd=cwd, env=env, stdout=subprocess.PIPE, stderr=(subprocess.PIPE if hide_stderr else None)) break except EnvironmentError: e = sys.exc_info()[1] if e.errno == errno.ENOENT: continue if verbose: print("unable to run %%s" %% dispcmd) print(e) return None, None else: if verbose: print("unable to find command, tried %%s" %% (commands,)) return None, None stdout = p.communicate()[0].strip().decode() if p.returncode != 0: if verbose: print("unable to run %%s (error)" %% dispcmd) print("stdout was %%s" %% stdout) return None, p.returncode return stdout, p.returncode def 
versions_from_parentdir(parentdir_prefix, root, verbose): """Try to determine the version from the parent directory name. Source tarballs conventionally unpack into a directory that includes both the project name and a version string. We will also support searching up two directory levels for an appropriately named parent directory """ rootdirs = [] for i in range(3): dirname = os.path.basename(root) if dirname.startswith(parentdir_prefix): return {"version": dirname[len(parentdir_prefix):], "full-revisionid": None, "dirty": False, "error": None, "date": None} else: rootdirs.append(root) root = os.path.dirname(root) # up a level if verbose: print("Tried directories %%s but none started with prefix %%s" %% (str(rootdirs), parentdir_prefix)) raise NotThisMethod("rootdir doesn't start with parentdir_prefix") @register_vcs_handler("git", "get_keywords") def git_get_keywords(versionfile_abs): """Extract version information from the given file.""" # the code embedded in _version.py can just fetch the value of these # keywords. When used from setup.py, we don't want to import _version.py, # so we do it with a regexp instead. This function is not used from # _version.py. keywords = {} try: f = open(versionfile_abs, "r") for line in f.readlines(): if line.strip().startswith("git_refnames ="): mo = re.search(r'=\s*"(.*)"', line) if mo: keywords["refnames"] = mo.group(1) if line.strip().startswith("git_full ="): mo = re.search(r'=\s*"(.*)"', line) if mo: keywords["full"] = mo.group(1) if line.strip().startswith("git_date ="): mo = re.search(r'=\s*"(.*)"', line) if mo: keywords["date"] = mo.group(1) f.close() except EnvironmentError: pass return keywords @register_vcs_handler("git", "keywords") def git_versions_from_keywords(keywords, tag_prefix, verbose): """Get version information from git keywords.""" if not keywords: raise NotThisMethod("no keywords at all, weird") date = keywords.get("date") if date is not None: # Use only the last line. 
Previous lines may contain GPG signature # information. date = date.splitlines()[-1] # git-2.2.0 added "%%cI", which expands to an ISO-8601 -compliant # datestamp. However we prefer "%%ci" (which expands to an "ISO-8601 # -like" string, which we must then edit to make compliant), because # it's been around since git-1.5.3, and it's too difficult to # discover which version we're using, or to work around using an # older one. date = date.strip().replace(" ", "T", 1).replace(" ", "", 1) refnames = keywords["refnames"].strip() if refnames.startswith("$Format"): if verbose: print("keywords are unexpanded, not using") raise NotThisMethod("unexpanded keywords, not a git-archive tarball") refs = set([r.strip() for r in refnames.strip("()").split(",")]) # starting in git-1.8.3, tags are listed as "tag: foo-1.0" instead of # just "foo-1.0". If we see a "tag: " prefix, prefer those. TAG = "tag: " tags = set([r[len(TAG):] for r in refs if r.startswith(TAG)]) if not tags: # Either we're using git < 1.8.3, or there really are no tags. We use # a heuristic: assume all version tags have a digit. The old git %%d # expansion behaves like git log --decorate=short and strips out the # refs/heads/ and refs/tags/ prefixes that would let us distinguish # between branches and tags. By ignoring refnames without digits, we # filter out many common branch names like "release" and # "stabilization", as well as "HEAD" and "master". tags = set([r for r in refs if re.search(r'\d', r)]) if verbose: print("discarding '%%s', no digits" %% ",".join(refs - tags)) if verbose: print("likely tags: %%s" %% ",".join(sorted(tags))) for ref in sorted(tags): # sorting will prefer e.g. 
"2.0" over "2.0rc1" if ref.startswith(tag_prefix): r = ref[len(tag_prefix):] if verbose: print("picking %%s" %% r) return {"version": r, "full-revisionid": keywords["full"].strip(), "dirty": False, "error": None, "date": date} # no suitable tags, so version is "0+unknown", but full hex is still there if verbose: print("no suitable tags, using unknown + full revision id") return {"version": "0+unknown", "full-revisionid": keywords["full"].strip(), "dirty": False, "error": "no suitable tags", "date": None} @register_vcs_handler("git", "pieces_from_vcs") def git_pieces_from_vcs(tag_prefix, root, verbose, run_command=run_command): """Get version from 'git describe' in the root of the source tree. This only gets called if the git-archive 'subst' keywords were *not* expanded, and _version.py hasn't already been rewritten with a short version string, meaning we're inside a checked out source tree. """ GITS = ["git"] if sys.platform == "win32": GITS = ["git.cmd", "git.exe"] out, rc = run_command(GITS, ["rev-parse", "--git-dir"], cwd=root, hide_stderr=True) if rc != 0: if verbose: print("Directory %%s not under git control" %% root) raise NotThisMethod("'git rev-parse --git-dir' returned error") # if there is a tag matching tag_prefix, this yields TAG-NUM-gHEX[-dirty] # if there isn't one, this yields HEX[-dirty] (no NUM) describe_out, rc = run_command(GITS, ["describe", "--tags", "--dirty", "--always", "--long", "--match", "%%s*" %% tag_prefix], cwd=root) # --long was added in git-1.5.5 if describe_out is None: raise NotThisMethod("'git describe' failed") describe_out = describe_out.strip() full_out, rc = run_command(GITS, ["rev-parse", "HEAD"], cwd=root) if full_out is None: raise NotThisMethod("'git rev-parse' failed") full_out = full_out.strip() pieces = {} pieces["long"] = full_out pieces["short"] = full_out[:7] # maybe improved later pieces["error"] = None # parse describe_out. It will be like TAG-NUM-gHEX[-dirty] or HEX[-dirty] # TAG might have hyphens. 
git_describe = describe_out # look for -dirty suffix dirty = git_describe.endswith("-dirty") pieces["dirty"] = dirty if dirty: git_describe = git_describe[:git_describe.rindex("-dirty")] # now we have TAG-NUM-gHEX or HEX if "-" in git_describe: # TAG-NUM-gHEX mo = re.search(r'^(.+)-(\d+)-g([0-9a-f]+)$', git_describe) if not mo: # unparseable. Maybe git-describe is misbehaving? pieces["error"] = ("unable to parse git-describe output: '%%s'" %% describe_out) return pieces # tag full_tag = mo.group(1) if not full_tag.startswith(tag_prefix): if verbose: fmt = "tag '%%s' doesn't start with prefix '%%s'" print(fmt %% (full_tag, tag_prefix)) pieces["error"] = ("tag '%%s' doesn't start with prefix '%%s'" %% (full_tag, tag_prefix)) return pieces pieces["closest-tag"] = full_tag[len(tag_prefix):] # distance: number of commits since tag pieces["distance"] = int(mo.group(2)) # commit: short hex revision ID pieces["short"] = mo.group(3) else: # HEX: no tags pieces["closest-tag"] = None count_out, rc = run_command(GITS, ["rev-list", "HEAD", "--count"], cwd=root) pieces["distance"] = int(count_out) # total number of commits # commit date: see ISO-8601 comment in git_versions_from_keywords() date = run_command(GITS, ["show", "-s", "--format=%%ci", "HEAD"], cwd=root)[0].strip() # Use only the last line. Previous lines may contain GPG signature # information. date = date.splitlines()[-1] pieces["date"] = date.strip().replace(" ", "T", 1).replace(" ", "", 1) return pieces def plus_or_dot(pieces): """Return a + if we don't already have one, else return a .""" if "+" in pieces.get("closest-tag", ""): return "." return "+" def render_pep440(pieces): """Build up version string, with post-release "local version identifier". Our goal: TAG[+DISTANCE.gHEX[.dirty]] . Note that if you get a tagged build and then dirty it, you'll get TAG+0.gHEX.dirty Exceptions: 1: no tags. git_describe was just HEX. 
0+untagged.DISTANCE.gHEX[.dirty] """ if pieces["closest-tag"]: rendered = pieces["closest-tag"] if pieces["distance"] or pieces["dirty"]: rendered += plus_or_dot(pieces) rendered += "%%d.g%%s" %% (pieces["distance"], pieces["short"]) if pieces["dirty"]: rendered += ".dirty" else: # exception #1 rendered = "0+untagged.%%d.g%%s" %% (pieces["distance"], pieces["short"]) if pieces["dirty"]: rendered += ".dirty" return rendered def render_pep440_pre(pieces): """TAG[.post0.devDISTANCE] -- No -dirty. Exceptions: 1: no tags. 0.post0.devDISTANCE """ if pieces["closest-tag"]: rendered = pieces["closest-tag"] if pieces["distance"]: rendered += ".post0.dev%%d" %% pieces["distance"] else: # exception #1 rendered = "0.post0.dev%%d" %% pieces["distance"] return rendered def render_pep440_post(pieces): """TAG[.postDISTANCE[.dev0]+gHEX] . The ".dev0" means dirty. Note that .dev0 sorts backwards (a dirty tree will appear "older" than the corresponding clean one), but you shouldn't be releasing software with -dirty anyways. Exceptions: 1: no tags. 0.postDISTANCE[.dev0] """ if pieces["closest-tag"]: rendered = pieces["closest-tag"] if pieces["distance"] or pieces["dirty"]: rendered += ".post%%d" %% pieces["distance"] if pieces["dirty"]: rendered += ".dev0" rendered += plus_or_dot(pieces) rendered += "g%%s" %% pieces["short"] else: # exception #1 rendered = "0.post%%d" %% pieces["distance"] if pieces["dirty"]: rendered += ".dev0" rendered += "+g%%s" %% pieces["short"] return rendered def render_pep440_old(pieces): """TAG[.postDISTANCE[.dev0]] . The ".dev0" means dirty. Exceptions: 1: no tags. 0.postDISTANCE[.dev0] """ if pieces["closest-tag"]: rendered = pieces["closest-tag"] if pieces["distance"] or pieces["dirty"]: rendered += ".post%%d" %% pieces["distance"] if pieces["dirty"]: rendered += ".dev0" else: # exception #1 rendered = "0.post%%d" %% pieces["distance"] if pieces["dirty"]: rendered += ".dev0" return rendered def render_git_describe(pieces): """TAG[-DISTANCE-gHEX][-dirty]. 
Like 'git describe --tags --dirty --always'. Exceptions: 1: no tags. HEX[-dirty] (note: no 'g' prefix) """ if pieces["closest-tag"]: rendered = pieces["closest-tag"] if pieces["distance"]: rendered += "-%%d-g%%s" %% (pieces["distance"], pieces["short"]) else: # exception #1 rendered = pieces["short"] if pieces["dirty"]: rendered += "-dirty" return rendered def render_git_describe_long(pieces): """TAG-DISTANCE-gHEX[-dirty]. Like 'git describe --tags --dirty --always --long'. The distance/hash is unconditional. Exceptions: 1: no tags. HEX[-dirty] (note: no 'g' prefix) """ if pieces["closest-tag"]: rendered = pieces["closest-tag"] rendered += "-%%d-g%%s" %% (pieces["distance"], pieces["short"]) else: # exception #1 rendered = pieces["short"] if pieces["dirty"]: rendered += "-dirty" return rendered def render(pieces, style): """Render the given version pieces into the requested style.""" if pieces["error"]: return {"version": "unknown", "full-revisionid": pieces.get("long"), "dirty": None, "error": pieces["error"], "date": None} if not style or style == "default": style = "pep440" # the default if style == "pep440": rendered = render_pep440(pieces) elif style == "pep440-pre": rendered = render_pep440_pre(pieces) elif style == "pep440-post": rendered = render_pep440_post(pieces) elif style == "pep440-old": rendered = render_pep440_old(pieces) elif style == "git-describe": rendered = render_git_describe(pieces) elif style == "git-describe-long": rendered = render_git_describe_long(pieces) else: raise ValueError("unknown style '%%s'" %% style) return {"version": rendered, "full-revisionid": pieces["long"], "dirty": pieces["dirty"], "error": None, "date": pieces.get("date")} def get_versions(): """Get version information or return default if unable to do so.""" # I am in _version.py, which lives at ROOT/VERSIONFILE_SOURCE. If we have # __file__, we can work backwards from there to the root. 
Some # py2exe/bbfreeze/non-CPython implementations don't do __file__, in which # case we can only use expanded keywords. cfg = get_config() verbose = cfg.verbose try: return git_versions_from_keywords(get_keywords(), cfg.tag_prefix, verbose) except NotThisMethod: pass try: root = os.path.realpath(__file__) # versionfile_source is the relative path from the top of the source # tree (where the .git directory might live) to this file. Invert # this to find the root from __file__. for i in cfg.versionfile_source.split('/'): root = os.path.dirname(root) except NameError: return {"version": "0+unknown", "full-revisionid": None, "dirty": None, "error": "unable to find root of source tree", "date": None} try: pieces = git_pieces_from_vcs(cfg.tag_prefix, root, verbose) return render(pieces, cfg.style) except NotThisMethod: pass try: if cfg.parentdir_prefix: return versions_from_parentdir(cfg.parentdir_prefix, root, verbose) except NotThisMethod: pass return {"version": "0+unknown", "full-revisionid": None, "dirty": None, "error": "unable to compute version", "date": None} ''' @register_vcs_handler("git", "get_keywords") def git_get_keywords(versionfile_abs): """Extract version information from the given file.""" # the code embedded in _version.py can just fetch the value of these # keywords. When used from setup.py, we don't want to import _version.py, # so we do it with a regexp instead. This function is not used from # _version.py. 
keywords = {} try: f = open(versionfile_abs, "r") for line in f.readlines(): if line.strip().startswith("git_refnames ="): mo = re.search(r'=\s*"(.*)"', line) if mo: keywords["refnames"] = mo.group(1) if line.strip().startswith("git_full ="): mo = re.search(r'=\s*"(.*)"', line) if mo: keywords["full"] = mo.group(1) if line.strip().startswith("git_date ="): mo = re.search(r'=\s*"(.*)"', line) if mo: keywords["date"] = mo.group(1) f.close() except EnvironmentError: pass return keywords @register_vcs_handler("git", "keywords") def git_versions_from_keywords(keywords, tag_prefix, verbose): """Get version information from git keywords.""" if not keywords: raise NotThisMethod("no keywords at all, weird") date = keywords.get("date") if date is not None: # Use only the last line. Previous lines may contain GPG signature # information. date = date.splitlines()[-1] # git-2.2.0 added "%cI", which expands to an ISO-8601 -compliant # datestamp. However we prefer "%ci" (which expands to an "ISO-8601 # -like" string, which we must then edit to make compliant), because # it's been around since git-1.5.3, and it's too difficult to # discover which version we're using, or to work around using an # older one. date = date.strip().replace(" ", "T", 1).replace(" ", "", 1) refnames = keywords["refnames"].strip() if refnames.startswith("$Format"): if verbose: print("keywords are unexpanded, not using") raise NotThisMethod("unexpanded keywords, not a git-archive tarball") refs = set([r.strip() for r in refnames.strip("()").split(",")]) # starting in git-1.8.3, tags are listed as "tag: foo-1.0" instead of # just "foo-1.0". If we see a "tag: " prefix, prefer those. TAG = "tag: " tags = set([r[len(TAG):] for r in refs if r.startswith(TAG)]) if not tags: # Either we're using git < 1.8.3, or there really are no tags. We use # a heuristic: assume all version tags have a digit. 
        # The old git %d expansion behaves like git log --decorate=short and
        # strips out the refs/heads/ and refs/tags/ prefixes that would let us
        # distinguish between branches and tags. By ignoring refnames without
        # digits, we filter out many common branch names like "release" and
        # "stabilization", as well as "HEAD" and "master".
        tags = set([r for r in refs if re.search(r'\d', r)])
        if verbose:
            print("discarding '%s', no digits" % ",".join(refs - tags))
    if verbose:
        print("likely tags: %s" % ",".join(sorted(tags)))
    for ref in sorted(tags):
        # sorting will prefer e.g. "2.0" over "2.0rc1"
        if ref.startswith(tag_prefix):
            r = ref[len(tag_prefix):]
            if verbose:
                print("picking %s" % r)
            return {"version": r,
                    "full-revisionid": keywords["full"].strip(),
                    "dirty": False, "error": None,
                    "date": date}
    # no suitable tags, so version is "0+unknown", but full hex is still there
    if verbose:
        print("no suitable tags, using unknown + full revision id")
    return {"version": "0+unknown",
            "full-revisionid": keywords["full"].strip(),
            "dirty": False, "error": "no suitable tags", "date": None}


@register_vcs_handler("git", "pieces_from_vcs")
def git_pieces_from_vcs(tag_prefix, root, verbose, run_command=run_command):
    """Get version from 'git describe' in the root of the source tree.

    This only gets called if the git-archive 'subst' keywords were *not*
    expanded, and _version.py hasn't already been rewritten with a short
    version string, meaning we're inside a checked out source tree.
    """
    GITS = ["git"]
    if sys.platform == "win32":
        GITS = ["git.cmd", "git.exe"]

    out, rc = run_command(GITS, ["rev-parse", "--git-dir"], cwd=root,
                          hide_stderr=True)
    if rc != 0:
        if verbose:
            print("Directory %s not under git control" % root)
        raise NotThisMethod("'git rev-parse --git-dir' returned error")

    # if there is a tag matching tag_prefix, this yields TAG-NUM-gHEX[-dirty]
    # if there isn't one, this yields HEX[-dirty] (no NUM)
    describe_out, rc = run_command(GITS, ["describe", "--tags", "--dirty",
                                          "--always", "--long",
                                          "--match", "%s*" % tag_prefix],
                                   cwd=root)
    # --long was added in git-1.5.5
    if describe_out is None:
        raise NotThisMethod("'git describe' failed")
    describe_out = describe_out.strip()
    full_out, rc = run_command(GITS, ["rev-parse", "HEAD"], cwd=root)
    if full_out is None:
        raise NotThisMethod("'git rev-parse' failed")
    full_out = full_out.strip()

    pieces = {}
    pieces["long"] = full_out
    pieces["short"] = full_out[:7]  # maybe improved later
    pieces["error"] = None

    # parse describe_out. It will be like TAG-NUM-gHEX[-dirty] or HEX[-dirty]
    # TAG might have hyphens.
    git_describe = describe_out

    # look for -dirty suffix
    dirty = git_describe.endswith("-dirty")
    pieces["dirty"] = dirty
    if dirty:
        git_describe = git_describe[:git_describe.rindex("-dirty")]

    # now we have TAG-NUM-gHEX or HEX

    if "-" in git_describe:
        # TAG-NUM-gHEX
        mo = re.search(r'^(.+)-(\d+)-g([0-9a-f]+)$', git_describe)
        if not mo:
            # unparseable. Maybe git-describe is misbehaving?
            pieces["error"] = ("unable to parse git-describe output: '%s'"
                               % describe_out)
            return pieces

        # tag
        full_tag = mo.group(1)
        if not full_tag.startswith(tag_prefix):
            if verbose:
                fmt = "tag '%s' doesn't start with prefix '%s'"
                print(fmt % (full_tag, tag_prefix))
            pieces["error"] = ("tag '%s' doesn't start with prefix '%s'"
                               % (full_tag, tag_prefix))
            return pieces
        pieces["closest-tag"] = full_tag[len(tag_prefix):]

        # distance: number of commits since tag
        pieces["distance"] = int(mo.group(2))

        # commit: short hex revision ID
        pieces["short"] = mo.group(3)

    else:
        # HEX: no tags
        pieces["closest-tag"] = None
        count_out, rc = run_command(GITS, ["rev-list", "HEAD", "--count"],
                                    cwd=root)
        pieces["distance"] = int(count_out)  # total number of commits

    # commit date: see ISO-8601 comment in git_versions_from_keywords()
    date = run_command(GITS, ["show", "-s", "--format=%ci", "HEAD"],
                       cwd=root)[0].strip()
    # Use only the last line. Previous lines may contain GPG signature
    # information.
    date = date.splitlines()[-1]
    pieces["date"] = date.strip().replace(" ", "T", 1).replace(" ", "", 1)

    return pieces


def do_vcs_install(manifest_in, versionfile_source, ipy):
    """Git-specific installation logic for Versioneer.

    For Git, this means creating/changing .gitattributes to mark _version.py
    for export-subst keyword substitution.
    """
    GITS = ["git"]
    if sys.platform == "win32":
        GITS = ["git.cmd", "git.exe"]
    files = [manifest_in, versionfile_source]
    if ipy:
        files.append(ipy)
    try:
        me = __file__
        if me.endswith(".pyc") or me.endswith(".pyo"):
            me = os.path.splitext(me)[0] + ".py"
        versioneer_file = os.path.relpath(me)
    except NameError:
        versioneer_file = "versioneer.py"
    files.append(versioneer_file)
    present = False
    try:
        f = open(".gitattributes", "r")
        for line in f.readlines():
            if line.strip().startswith(versionfile_source):
                if "export-subst" in line.strip().split()[1:]:
                    present = True
        f.close()
    except EnvironmentError:
        pass
    if not present:
        f = open(".gitattributes", "a+")
        f.write("%s export-subst\n" % versionfile_source)
        f.close()
        files.append(".gitattributes")
    run_command(GITS, ["add", "--"] + files)


def versions_from_parentdir(parentdir_prefix, root, verbose):
    """Try to determine the version from the parent directory name.

    Source tarballs conventionally unpack into a directory that includes both
    the project name and a version string. We will also support searching up
    two directory levels for an appropriately named parent directory
    """
    rootdirs = []

    for i in range(3):
        dirname = os.path.basename(root)
        if dirname.startswith(parentdir_prefix):
            return {"version": dirname[len(parentdir_prefix):],
                    "full-revisionid": None,
                    "dirty": False, "error": None, "date": None}
        else:
            rootdirs.append(root)
            root = os.path.dirname(root)  # up a level

    if verbose:
        print("Tried directories %s but none started with prefix %s" %
              (str(rootdirs), parentdir_prefix))
    raise NotThisMethod("rootdir doesn't start with parentdir_prefix")


SHORT_VERSION_PY = """
# This file was generated by 'versioneer.py' (0.19) from
# revision-control system data, or from the parent directory name of an
# unpacked source archive. Distribution tarballs contain a pre-generated copy
# of this file.
import json

version_json = '''
%s
'''  # END VERSION_JSON


def get_versions():
    return json.loads(version_json)
"""


def versions_from_file(filename):
    """Try to determine the version from _version.py if present."""
    try:
        with open(filename) as f:
            contents = f.read()
    except EnvironmentError:
        raise NotThisMethod("unable to read _version.py")
    mo = re.search(r"version_json = '''\n(.*)'''  # END VERSION_JSON",
                   contents, re.M | re.S)
    if not mo:
        mo = re.search(r"version_json = '''\r\n(.*)'''  # END VERSION_JSON",
                       contents, re.M | re.S)
    if not mo:
        raise NotThisMethod("no version_json in _version.py")
    return json.loads(mo.group(1))


def write_to_version_file(filename, versions):
    """Write the given version number to the given _version.py file."""
    os.unlink(filename)
    contents = json.dumps(versions, sort_keys=True,
                          indent=1, separators=(",", ": "))
    with open(filename, "w") as f:
        f.write(SHORT_VERSION_PY % contents)

    print("set %s to '%s'" % (filename, versions["version"]))


def plus_or_dot(pieces):
    """Return a + if we don't already have one, else return a ."""
    if "+" in pieces.get("closest-tag", ""):
        return "."
    return "+"


def render_pep440(pieces):
    """Build up version string, with post-release "local version identifier".

    Our goal: TAG[+DISTANCE.gHEX[.dirty]] . Note that if you
    get a tagged build and then dirty it, you'll get TAG+0.gHEX.dirty

    Exceptions:
    1: no tags. git_describe was just HEX. 0+untagged.DISTANCE.gHEX[.dirty]
    """
    if pieces["closest-tag"]:
        rendered = pieces["closest-tag"]
        if pieces["distance"] or pieces["dirty"]:
            rendered += plus_or_dot(pieces)
            rendered += "%d.g%s" % (pieces["distance"], pieces["short"])
            if pieces["dirty"]:
                rendered += ".dirty"
    else:
        # exception #1
        rendered = "0+untagged.%d.g%s" % (pieces["distance"],
                                          pieces["short"])
        if pieces["dirty"]:
            rendered += ".dirty"
    return rendered


def render_pep440_pre(pieces):
    """TAG[.post0.devDISTANCE] -- No -dirty.

    Exceptions:
    1: no tags.
    0.post0.devDISTANCE
    """
    if pieces["closest-tag"]:
        rendered = pieces["closest-tag"]
        if pieces["distance"]:
            rendered += ".post0.dev%d" % pieces["distance"]
    else:
        # exception #1
        rendered = "0.post0.dev%d" % pieces["distance"]
    return rendered


def render_pep440_post(pieces):
    """TAG[.postDISTANCE[.dev0]+gHEX] .

    The ".dev0" means dirty. Note that .dev0 sorts backwards
    (a dirty tree will appear "older" than the corresponding clean one),
    but you shouldn't be releasing software with -dirty anyways.

    Exceptions:
    1: no tags. 0.postDISTANCE[.dev0]
    """
    if pieces["closest-tag"]:
        rendered = pieces["closest-tag"]
        if pieces["distance"] or pieces["dirty"]:
            rendered += ".post%d" % pieces["distance"]
            if pieces["dirty"]:
                rendered += ".dev0"
            rendered += plus_or_dot(pieces)
            rendered += "g%s" % pieces["short"]
    else:
        # exception #1
        rendered = "0.post%d" % pieces["distance"]
        if pieces["dirty"]:
            rendered += ".dev0"
        rendered += "+g%s" % pieces["short"]
    return rendered


def render_pep440_old(pieces):
    """TAG[.postDISTANCE[.dev0]] .

    The ".dev0" means dirty.

    Exceptions:
    1: no tags. 0.postDISTANCE[.dev0]
    """
    if pieces["closest-tag"]:
        rendered = pieces["closest-tag"]
        if pieces["distance"] or pieces["dirty"]:
            rendered += ".post%d" % pieces["distance"]
            if pieces["dirty"]:
                rendered += ".dev0"
    else:
        # exception #1
        rendered = "0.post%d" % pieces["distance"]
        if pieces["dirty"]:
            rendered += ".dev0"
    return rendered


def render_git_describe(pieces):
    """TAG[-DISTANCE-gHEX][-dirty].

    Like 'git describe --tags --dirty --always'.

    Exceptions:
    1: no tags. HEX[-dirty] (note: no 'g' prefix)
    """
    if pieces["closest-tag"]:
        rendered = pieces["closest-tag"]
        if pieces["distance"]:
            rendered += "-%d-g%s" % (pieces["distance"], pieces["short"])
    else:
        # exception #1
        rendered = pieces["short"]
    if pieces["dirty"]:
        rendered += "-dirty"
    return rendered


def render_git_describe_long(pieces):
    """TAG-DISTANCE-gHEX[-dirty].

    Like 'git describe --tags --dirty --always --long'.
    The distance/hash is unconditional.

    Exceptions:
    1: no tags.
    HEX[-dirty] (note: no 'g' prefix)
    """
    if pieces["closest-tag"]:
        rendered = pieces["closest-tag"]
        rendered += "-%d-g%s" % (pieces["distance"], pieces["short"])
    else:
        # exception #1
        rendered = pieces["short"]
    if pieces["dirty"]:
        rendered += "-dirty"
    return rendered


def render(pieces, style):
    """Render the given version pieces into the requested style."""
    if pieces["error"]:
        return {"version": "unknown",
                "full-revisionid": pieces.get("long"),
                "dirty": None,
                "error": pieces["error"],
                "date": None}

    if not style or style == "default":
        style = "pep440"  # the default

    if style == "pep440":
        rendered = render_pep440(pieces)
    elif style == "pep440-pre":
        rendered = render_pep440_pre(pieces)
    elif style == "pep440-post":
        rendered = render_pep440_post(pieces)
    elif style == "pep440-old":
        rendered = render_pep440_old(pieces)
    elif style == "git-describe":
        rendered = render_git_describe(pieces)
    elif style == "git-describe-long":
        rendered = render_git_describe_long(pieces)
    else:
        raise ValueError("unknown style '%s'" % style)

    return {"version": rendered, "full-revisionid": pieces["long"],
            "dirty": pieces["dirty"], "error": None,
            "date": pieces.get("date")}


class VersioneerBadRootError(Exception):
    """The project root directory is unknown or missing key files."""


def get_versions(verbose=False):
    """Get the project version from whatever source is available.

    Returns a dict with the keys 'version', 'full-revisionid', 'dirty',
    'error', and 'date'.
    """
    if "versioneer" in sys.modules:
        # see the discussion in cmdclass.py:get_cmdclass()
        del sys.modules["versioneer"]

    root = get_root()
    cfg = get_config_from_root(root)

    assert cfg.VCS is not None, "please set [versioneer]VCS= in setup.cfg"
    handlers = HANDLERS.get(cfg.VCS)
    assert handlers, "unrecognized VCS '%s'" % cfg.VCS
    verbose = verbose or cfg.verbose
    assert cfg.versionfile_source is not None, \
        "please set versioneer.versionfile_source"
    assert cfg.tag_prefix is not None, "please set versioneer.tag_prefix"

    versionfile_abs = os.path.join(root, cfg.versionfile_source)

    # extract version from first of: _version.py, VCS command (e.g. 'git
    # describe'), parentdir. This is meant to work for developers using a
    # source checkout, for users of a tarball created by 'setup.py sdist',
    # and for users of a tarball/zipball created by 'git archive' or github's
    # download-from-tag feature or the equivalent in other VCSes.

    get_keywords_f = handlers.get("get_keywords")
    from_keywords_f = handlers.get("keywords")
    if get_keywords_f and from_keywords_f:
        try:
            keywords = get_keywords_f(versionfile_abs)
            ver = from_keywords_f(keywords, cfg.tag_prefix, verbose)
            if verbose:
                print("got version from expanded keyword %s" % ver)
            return ver
        except NotThisMethod:
            pass

    try:
        ver = versions_from_file(versionfile_abs)
        if verbose:
            print("got version from file %s %s" % (versionfile_abs, ver))
        return ver
    except NotThisMethod:
        pass

    from_vcs_f = handlers.get("pieces_from_vcs")
    if from_vcs_f:
        try:
            pieces = from_vcs_f(cfg.tag_prefix, root, verbose)
            ver = render(pieces, cfg.style)
            if verbose:
                print("got version from VCS %s" % ver)
            return ver
        except NotThisMethod:
            pass

    try:
        if cfg.parentdir_prefix:
            ver = versions_from_parentdir(cfg.parentdir_prefix, root, verbose)
            if verbose:
                print("got version from parentdir %s" % ver)
            return ver
    except NotThisMethod:
        pass

    if verbose:
        print("unable to compute version")

    return {"version": "0+unknown", "full-revisionid": None,
            "dirty": None, "error": "unable to compute version",
            "date": None}


def get_version():
    """Get the short version string for this project."""
    return get_versions()["version"]


def get_cmdclass(cmdclass=None):
    """Get the custom setuptools/distutils subclasses used by Versioneer.

    If the package uses a different cmdclass (e.g. one from numpy), it
    should be provided as an argument.
    """
    if "versioneer" in sys.modules:
        del sys.modules["versioneer"]
        # this fixes the "python setup.py develop" case (also 'install' and
        # 'easy_install .'), in which subdependencies of the main project are
        # built (using setup.py bdist_egg) in the same python process. Assume
        # a main project A and a dependency B, which use different versions
        # of Versioneer. A's setup.py imports A's Versioneer, leaving it in
        # sys.modules by the time B's setup.py is executed, causing B to run
        # with the wrong versioneer. Setuptools wraps the sub-dep builds in a
        # sandbox that restores sys.modules to its pre-build state, so the
        # parent is protected against the child's "import versioneer". By
        # removing ourselves from sys.modules here, before the child build
        # happens, we protect the child from the parent's versioneer too.
        # Also see https://github.com/python-versioneer/python-versioneer/issues/52

    cmds = {} if cmdclass is None else cmdclass.copy()

    # we add "version" to both distutils and setuptools
    from setuptools import Command

    class cmd_version(Command):
        description = "report generated version string"
        user_options = []
        boolean_options = []

        def initialize_options(self):
            pass

        def finalize_options(self):
            pass

        def run(self):
            vers = get_versions(verbose=True)
            print("Version: %s" % vers["version"])
            print(" full-revisionid: %s" % vers.get("full-revisionid"))
            print(" dirty: %s" % vers.get("dirty"))
            print(" date: %s" % vers.get("date"))
            if vers["error"]:
                print(" error: %s" % vers["error"])
    cmds["version"] = cmd_version

    # we override "build_py" in both distutils and setuptools
    #
    # most invocation pathways end up running build_py:
    #  distutils/build -> build_py
    #  distutils/install -> distutils/build ->..
    #  setuptools/bdist_wheel -> distutils/install ->..
    #  setuptools/bdist_egg -> distutils/install_lib -> build_py
    #  setuptools/install -> bdist_egg ->..
    #  setuptools/develop -> ?
    #  pip install:
    #   copies source tree to a tempdir before running egg_info/etc
    #   if .git isn't copied too, 'git describe' will fail
    #   then does setup.py bdist_wheel, or sometimes setup.py install
    #  setup.py egg_info -> ?
    # we override different "build_py" commands for both environments
    if 'build_py' in cmds:
        _build_py = cmds['build_py']
    elif "setuptools" in sys.modules:
        from setuptools.command.build_py import build_py as _build_py
    else:
        from distutils.command.build_py import build_py as _build_py

    class cmd_build_py(_build_py):
        def run(self):
            root = get_root()
            cfg = get_config_from_root(root)
            versions = get_versions()
            _build_py.run(self)
            # now locate _version.py in the new build/ directory and replace
            # it with an updated value
            if cfg.versionfile_build:
                target_versionfile = os.path.join(self.build_lib,
                                                  cfg.versionfile_build)
                print("UPDATING %s" % target_versionfile)
                write_to_version_file(target_versionfile, versions)
    cmds["build_py"] = cmd_build_py

    if "setuptools" in sys.modules:
        from setuptools.command.build_ext import build_ext as _build_ext
    else:
        from distutils.command.build_ext import build_ext as _build_ext

    class cmd_build_ext(_build_ext):
        def run(self):
            root = get_root()
            cfg = get_config_from_root(root)
            versions = get_versions()
            _build_ext.run(self)
            if self.inplace:
                # build_ext --inplace will only build extensions in
                # build/lib<..> dir with no _version.py to write to.
                # As in place builds will already have a _version.py
                # in the module dir, we do not need to write one.
                return
            # now locate _version.py in the new build/ directory and replace
            # it with an updated value
            target_versionfile = os.path.join(self.build_lib,
                                              cfg.versionfile_source)
            print("UPDATING %s" % target_versionfile)
            write_to_version_file(target_versionfile, versions)
    cmds["build_ext"] = cmd_build_ext

    if "cx_Freeze" in sys.modules:  # cx_freeze enabled?
        from cx_Freeze.dist import build_exe as _build_exe
        # nczeczulin reports that py2exe won't like the pep440-style string
        # as FILEVERSION, but it can be used for PRODUCTVERSION, e.g.
        # setup(console=[{
        #   "version": versioneer.get_version().split("+", 1)[0], # FILEVERSION
        #   "product_version": versioneer.get_version(),
        #   ...
        class cmd_build_exe(_build_exe):
            def run(self):
                root = get_root()
                cfg = get_config_from_root(root)
                versions = get_versions()
                target_versionfile = cfg.versionfile_source
                print("UPDATING %s" % target_versionfile)
                write_to_version_file(target_versionfile, versions)

                _build_exe.run(self)
                os.unlink(target_versionfile)
                with open(cfg.versionfile_source, "w") as f:
                    LONG = LONG_VERSION_PY[cfg.VCS]
                    f.write(LONG %
                            {"DOLLAR": "$",
                             "STYLE": cfg.style,
                             "TAG_PREFIX": cfg.tag_prefix,
                             "PARENTDIR_PREFIX": cfg.parentdir_prefix,
                             "VERSIONFILE_SOURCE": cfg.versionfile_source,
                             })
        cmds["build_exe"] = cmd_build_exe
        del cmds["build_py"]

    if 'py2exe' in sys.modules:  # py2exe enabled?
        from py2exe.distutils_buildexe import py2exe as _py2exe

        class cmd_py2exe(_py2exe):
            def run(self):
                root = get_root()
                cfg = get_config_from_root(root)
                versions = get_versions()
                target_versionfile = cfg.versionfile_source
                print("UPDATING %s" % target_versionfile)
                write_to_version_file(target_versionfile, versions)

                _py2exe.run(self)
                os.unlink(target_versionfile)
                with open(cfg.versionfile_source, "w") as f:
                    LONG = LONG_VERSION_PY[cfg.VCS]
                    f.write(LONG %
                            {"DOLLAR": "$",
                             "STYLE": cfg.style,
                             "TAG_PREFIX": cfg.tag_prefix,
                             "PARENTDIR_PREFIX": cfg.parentdir_prefix,
                             "VERSIONFILE_SOURCE": cfg.versionfile_source,
                             })
        cmds["py2exe"] = cmd_py2exe

    # we override different "sdist" commands for both environments
    if 'sdist' in cmds:
        _sdist = cmds['sdist']
    elif "setuptools" in sys.modules:
        from setuptools.command.sdist import sdist as _sdist
    else:
        from distutils.command.sdist import sdist as _sdist

    class cmd_sdist(_sdist):
        def run(self):
            versions = get_versions()
            self._versioneer_generated_versions = versions
            # unless we update this, the command will keep using the old
            # version
            self.distribution.metadata.version = versions["version"]
            return _sdist.run(self)

        def make_release_tree(self, base_dir, files):
            root = get_root()
            cfg = get_config_from_root(root)
            _sdist.make_release_tree(self, base_dir, files)
            # now locate _version.py in the new
            # base_dir directory (remembering that it may be a hardlink) and
            # replace it with an updated value
            target_versionfile = os.path.join(base_dir, cfg.versionfile_source)
            print("UPDATING %s" % target_versionfile)
            write_to_version_file(target_versionfile,
                                  self._versioneer_generated_versions)
    cmds["sdist"] = cmd_sdist

    return cmds


CONFIG_ERROR = """
setup.cfg is missing the necessary Versioneer configuration. You need
a section like:

 [versioneer]
 VCS = git
 style = pep440
 versionfile_source = src/myproject/_version.py
 versionfile_build = myproject/_version.py
 tag_prefix =
 parentdir_prefix = myproject-

You will also need to edit your setup.py to use the results:

 import versioneer
 setup(version=versioneer.get_version(),
       cmdclass=versioneer.get_cmdclass(), ...)

Please read the docstring in ./versioneer.py for configuration instructions,
edit setup.cfg, and re-run the installer or 'python versioneer.py setup'.
"""

SAMPLE_CONFIG = """
# See the docstring in versioneer.py for instructions. Note that you must
# re-run 'versioneer.py setup' after changing this section, and commit the
# resulting files.
[versioneer]
#VCS = git
#style = pep440
#versionfile_source =
#versionfile_build =
#tag_prefix =
#parentdir_prefix =

"""

INIT_PY_SNIPPET = """
from ._version import get_versions
__version__ = get_versions()['version']
del get_versions
"""


def do_setup():
    """Do main VCS-independent setup function for installing Versioneer."""
    root = get_root()
    try:
        cfg = get_config_from_root(root)
    except (EnvironmentError, configparser.NoSectionError,
            configparser.NoOptionError) as e:
        if isinstance(e, (EnvironmentError, configparser.NoSectionError)):
            print("Adding sample versioneer config to setup.cfg",
                  file=sys.stderr)
            with open(os.path.join(root, "setup.cfg"), "a") as f:
                f.write(SAMPLE_CONFIG)
        print(CONFIG_ERROR, file=sys.stderr)
        return 1

    print(" creating %s" % cfg.versionfile_source)
    with open(cfg.versionfile_source, "w") as f:
        LONG = LONG_VERSION_PY[cfg.VCS]
        f.write(LONG % {"DOLLAR": "$",
                        "STYLE": cfg.style,
                        "TAG_PREFIX": cfg.tag_prefix,
                        "PARENTDIR_PREFIX": cfg.parentdir_prefix,
                        "VERSIONFILE_SOURCE": cfg.versionfile_source,
                        })

    ipy = os.path.join(os.path.dirname(cfg.versionfile_source),
                       "__init__.py")
    if os.path.exists(ipy):
        try:
            with open(ipy, "r") as f:
                old = f.read()
        except EnvironmentError:
            old = ""
        if INIT_PY_SNIPPET not in old:
            print(" appending to %s" % ipy)
            with open(ipy, "a") as f:
                f.write(INIT_PY_SNIPPET)
        else:
            print(" %s unmodified" % ipy)
    else:
        print(" %s doesn't exist, ok" % ipy)
        ipy = None

    # Make sure both the top-level "versioneer.py" and versionfile_source
    # (PKG/_version.py, used by runtime code) are in MANIFEST.in, so
    # they'll be copied into source distributions. Pip won't be able to
    # install the package without this.
    manifest_in = os.path.join(root, "MANIFEST.in")
    simple_includes = set()
    try:
        with open(manifest_in, "r") as f:
            for line in f:
                if line.startswith("include "):
                    for include in line.split()[1:]:
                        simple_includes.add(include)
    except EnvironmentError:
        pass
    # That doesn't cover everything MANIFEST.in can do
    # (http://docs.python.org/2/distutils/sourcedist.html#commands), so
    # it might give some false negatives. Appending redundant 'include'
    # lines is safe, though.
    if "versioneer.py" not in simple_includes:
        print(" appending 'versioneer.py' to MANIFEST.in")
        with open(manifest_in, "a") as f:
            f.write("include versioneer.py\n")
    else:
        print(" 'versioneer.py' already in MANIFEST.in")
    if cfg.versionfile_source not in simple_includes:
        print(" appending versionfile_source ('%s') to MANIFEST.in" %
              cfg.versionfile_source)
        with open(manifest_in, "a") as f:
            f.write("include %s\n" % cfg.versionfile_source)
    else:
        print(" versionfile_source already in MANIFEST.in")

    # Make VCS-specific changes. For git, this means creating/changing
    # .gitattributes to mark _version.py for export-subst keyword
    # substitution.
    do_vcs_install(manifest_in, cfg.versionfile_source, ipy)
    return 0


def scan_setup_py():
    """Validate the contents of setup.py against Versioneer's expectations."""
    found = set()
    setters = False
    errors = 0
    with open("setup.py", "r") as f:
        for line in f.readlines():
            if "import versioneer" in line:
                found.add("import")
            if "versioneer.get_cmdclass()" in line:
                found.add("cmdclass")
            if "versioneer.get_version()" in line:
                found.add("get_version")
            if "versioneer.VCS" in line:
                setters = True
            if "versioneer.versionfile_source" in line:
                setters = True
    if len(found) != 3:
        print("")
        print("Your setup.py appears to be missing some important items")
        print("(but I might be wrong). Please make sure it has something")
        print("roughly like the following:")
        print("")
        print(" import versioneer")
        print(" setup( version=versioneer.get_version(),")
        print(" cmdclass=versioneer.get_cmdclass(), ...)")
        print("")
        errors += 1
    if setters:
        print("You should remove lines like 'versioneer.VCS = ' and")
        print("'versioneer.versionfile_source = ' . This configuration")
        print("now lives in setup.cfg, and should be removed from setup.py")
        print("")
        errors += 1
    return errors


if __name__ == "__main__":
    cmd = sys.argv[1]
    if cmd == "setup":
        errors = do_setup()
        errors += scan_setup_py()
        if errors:
            sys.exit(1)
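

# Illustrative sketch only, not part of Versioneer's API: the two functions
# below condense the 'git describe' parsing done in git_pieces_from_vcs() and
# the default "pep440" rendering done in render_pep440(), without running git.
# The names parse_describe and sketch_pep440, and the commit_count parameter
# (a stand-in for the 'git rev-list HEAD --count' result), are hypothetical
# and exist only for this example.

```python
import re


def parse_describe(describe_out, tag_prefix="", commit_count=0):
    """Split TAG-NUM-gHEX[-dirty] or HEX[-dirty] into a pieces dict."""
    dirty = describe_out.endswith("-dirty")
    if dirty:
        describe_out = describe_out[:describe_out.rindex("-dirty")]
    pieces = {"dirty": dirty}
    if "-" in describe_out:
        # TAG-NUM-gHEX; TAG itself may contain hyphens
        mo = re.search(r"^(.+)-(\d+)-g([0-9a-f]+)$", describe_out)
        if not mo or not mo.group(1).startswith(tag_prefix):
            raise ValueError("cannot parse %r" % describe_out)
        pieces["closest-tag"] = mo.group(1)[len(tag_prefix):]
        pieces["distance"] = int(mo.group(2))
        pieces["short"] = mo.group(3)
    else:
        # bare HEX: no tag is reachable, so distance is the commit count
        pieces["closest-tag"] = None
        pieces["distance"] = commit_count
        pieces["short"] = describe_out
    return pieces


def sketch_pep440(pieces):
    """Render pieces as TAG[+DISTANCE.gHEX[.dirty]]."""
    tag = pieces["closest-tag"]
    if tag:
        rendered = tag
        if pieces["distance"] or pieces["dirty"]:
            # a tag that already has a local part gets "." instead of "+"
            rendered += "." if "+" in tag else "+"
            rendered += "%d.g%s" % (pieces["distance"], pieces["short"])
            if pieces["dirty"]:
                rendered += ".dirty"
    else:
        rendered = "0+untagged.%d.g%s" % (pieces["distance"], pieces["short"])
        if pieces["dirty"]:
            rendered += ".dirty"
    return rendered


if __name__ == "__main__":
    print(sketch_pep440(parse_describe("v1.2.0-3-gabc1234-dirty", "v")))
```

# Under these assumptions, a clean checkout exactly on a tag renders as the
# bare tag, and any distance or dirtiness is pushed into the local version
# segment after "+".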