bayespy-0.6.2/CHANGELOG.rst

Version 0.6.2 (2024-09-02)
++++++++++++++++++++++++++

Fixed
.....

* Update versioneer to support recent Python versions.

Version 0.6.1 (2024-02-28)
++++++++++++++++++++++++++

Fixed
.....

* Add missing truncnorm package to setup.py

Version 0.6.0 (2024-02-28)
++++++++++++++++++++++++++

Added
.....

* Add preliminary support for truncation in Gaussian node.

Version 0.5.28 (2024-02-22)
+++++++++++++++++++++++++++

Fixed
.....

* Fix PyPI publishing

Version 0.5.27 (2024-02-22)
+++++++++++++++++++++++++++

Fixed
.....

* Fix dtype in categorical Markov chain fixed moments calculation.

Version 0.5.26 (2023-05-25)
+++++++++++++++++++++++++++

Fixed
.....

* Fix deprecated ``np.int``.

Version 0.5.25 (2022-12-28)
+++++++++++++++++++++++++++

Fixed
.....

* Fix a few bugs which caused demos to fail.

Version 0.5.24 (2022-09-30)
+++++++++++++++++++++++++++

Fixed
.....

* Fix versioning in PyPI release tarballs.

Version 0.5.23 (2022-09-30)
+++++++++++++++++++++++++++

Added
.....

* Support ``initialize_from_random`` and ``initialize_from_value`` for
  ``CategoricalMarkovChain``.

Fixed
.....

* Fix support for recent SciPy versions.

Version 0.5.22 (2021-03-19)
+++++++++++++++++++++++++++

Fixed
.....

* Fix #122: Add support for arrays of number of trials in a mixture of
  multinomials and binomials.

Version 0.5.21 (2021-03-04)
+++++++++++++++++++++++++++

Fixed
.....

* Use ``time.time`` instead of the deprecated ``time.clock``.

Version 0.5.20 (2020-10-06)
+++++++++++++++++++++++++++

Fixed
.....

* Fix sequence indexing in Categorical moments.
Version 0.5.19 (2019-12-11)
+++++++++++++++++++++++++++

Fixed
.....

* Improve memory usage in ``SumMultiply`` when some input nodes are just
  constants (e.g., NumPy arrays).

Version 0.5.18 (2019-01-07)
+++++++++++++++++++++++++++

Fixed
.....

* Fix mask handling in Gate node.

Version 0.5.17 (2018-04-18)
+++++++++++++++++++++++++++

Changed
.......

* Import ``plot`` module automatically if possible (i.e., if matplotlib
  available)

Version 0.5.16 (2018-04-17)
+++++++++++++++++++++++++++

Fixed
.....

* Fix matplotlib dependency removal.

Version 0.5.15 (2018-04-17)
+++++++++++++++++++++++++++

Changed
.......

* Matplotlib was removed from installation requirements.

Version 0.5.14 (2018-03-09)
+++++++++++++++++++++++++++

Added
.....

* Support ``phi_bias`` for exponential family nodes. This can be used for
  simple regularization.

Version 0.5.13 (2018-03-09)
+++++++++++++++++++++++++++

Changed
.......

* Support "prior" for GammaShape.

Version 0.5.12 (2017-10-19)
+++++++++++++++++++++++++++

Changed
.......

* Skip all image comparison tests for now.

Fixed
.....

* Support (0,0)-shape matrices in Cholesky functions.

Version 0.5.11 (2017-09-26)
+++++++++++++++++++++++++++

Fixed
.....

* Handle scalar moments of the innovation vector properly in Gaussian Markov
  chain.
* Skip some failing image comparison unit tests. Image comparison tests will
  be deprecated at some point.

Version 0.5.10 (2017-09-02)
+++++++++++++++++++++++++++

Fixed
.....

* Fix release

Version 0.5.9 (2017-09-02)
++++++++++++++++++++++++++

Added
.....

* Support tqdm for monitoring the iteration progress (#105).
* Allow VB iteration without maximum number of iteration steps (#104).
* Add ellipse patch creation from covariance or precision (#103).

Version 0.5.8 (2017-05-13)
++++++++++++++++++++++++++

Fixed
.....

* Implement random sampling for Poisson
* Update some old licensing information

Version 0.5.7 (2016-11-15)
++++++++++++++++++++++++++

Fixed
.....
* Fix deterministic mappings in Mixture, which caused NaNs in results

Version 0.5.6 (2016-11-08)
++++++++++++++++++++++++++

Fixed
.....

* Remove significant reshaping overhead in Cholesky computations in linalg
  module
* Fix minor plate multiplier issues

Version 0.5.5 (2016-11-04)
++++++++++++++++++++++++++

Fixed
.....

* Fix critical plate multiplier bug in Take node. The bug caused basically
  all models with Take node to be incorrect.
* Fix ndim handling in GaussianGamma and Wishart
* Support lists and other array-convertible formats in several nodes

Version 0.5.4 (2016-10-27)
++++++++++++++++++++++++++

Added
.....

* Add conversion from Gamma to scalar Wishart
* Implement message from GaussianMarkovChain to its input parent node
* Add generic unit test functions for messages and moments

Changed
.......

* Require NumPy 1.10 or greater

Version 0.5.3 (2016-08-17)
++++++++++++++++++++++++++

Fixed
.....

* Fix package metadata handling
* Fix Travis test errors

Version 0.5.2 (2016-08-17)
++++++++++++++++++++++++++

Added
.....

* Add a node method to obtain the VB lower bound terms that contain the node

Fixed
.....

* Handle empty CLI argument lists in CLI argument parsing
* Fix handling of the two variables (Gaussian and Gamma) in GaussianGamma
  methods
* Fix minor bugs, including CGF in GaussianMarkovChain with inputs

Version 0.5.1 (2016-05-17)
++++++++++++++++++++++++++

Fixed
.....

* Accept lists as number of multinomial trials
* Fix typo in handling concentration regularization shape

Version 0.5.0 (2016-05-04)
++++++++++++++++++++++++++

Added
.....
* Implement the following new nodes:

  - Take
  - MultiMixture
  - ConcatGaussian
  - GaussianWishart
  - GaussianGamma
  - Choose
  - Concentration
  - MaximumLikelihood
  - Function

* Add preliminary support for maximum likelihood estimation (implemented only
  for Wishart moments now)
* Support multiplying Wishart variable by a gamma variable (scale method in
  Wishart class)
* Support GaussianWishart and GaussianGamma in GaussianMarkovChain
* Support 1-p operation (complement) for beta variables
* Implement random sampling for Multinomial node
* Support ndim in many linalg functions and Gaussian-related nodes
* Add conjugate gradient support for Multinomial and Mixture
* Support monitoring of only some nodes when learning
* Add diag() method to Gamma node
* Add some examples as Jupyter notebooks

Changed
.......

* Simplify GaussianARD mean parent handling
* Move documentation to Read the Docs

Fixed
.....

* Fix an axis mapping bug in Mixture (#39)
* Fix NaN issue in Mixture with deterministic mappings (#66)
* Fix Dirichlet node parent validation
* Fix VB iteration when no data given (#67)
* Fix axis label support in Hinton plots (#64)
* Fix recursive node deletion

Version 0.4.1 (2015-11-02)
++++++++++++++++++++++++++

* Define extra dependencies needed to build the documentation

Version 0.4.0 (2015-11-02)
++++++++++++++++++++++++++

* Implement Add node for Gaussian nodes
* Raise error if attempting to install on Python 2
* Return both relative and absolute errors from numerical gradient checking
* Add nose plugin to filter unit test warnings appropriately

Version 0.3.9 (2015-10-16)
++++++++++++++++++++++++++

* Fix Gaussian ARD node sampling

Version 0.3.8 (2015-10-16)
++++++++++++++++++++++++++

* Fix Gaussian node sampling

Version 0.3.7 (2015-09-23)
++++++++++++++++++++++++++

* Enable keyword arguments when plotting via the inference engine
* Add initial support for logging

Version 0.3.6 (2015-08-12)
++++++++++++++++++++++++++

* Add maximum likelihood node for the shape parameter of Gamma
* Fix Hinton diagrams for 1-D and 0-D Gaussians
* Fix autosave interval counter
* Fix bugs in constant nodes

Version 0.3.5 (2015-06-09)
++++++++++++++++++++++++++

* Fix indexing bug in VB optimization (not VB-EM)
* Fix demos

Version 0.3.4 (2015-06-09)
++++++++++++++++++++++++++

* Fix computation of probability density of Dirichlet nodes
* Use unit tests for all code snippets in docstrings and documentation

Version 0.3.3 (2015-06-05)
++++++++++++++++++++++++++

* Change license to the MIT license
* Improve SumMultiply efficiency
* Hinton diagrams for gamma variables
* Possible to load only nodes from HDF5 results

Version 0.3.2 (2015-03-16)
++++++++++++++++++++++++++

* Concatenate node added
* Unit tests for plotting fixed

Version 0.3.1 (2015-03-12)
++++++++++++++++++++++++++

* Gaussian mixture 2D plotting improvements
* Covariance matrix sampling improvements
* Minor documentation fixes

Version 0.3 (2015-03-05)
++++++++++++++++++++++++

* Add gradient-based optimization methods (Riemannian/natural gradient or
  normal)
* Add collapsed inference
* Add the pattern search method
* Add deterministic annealing
* Add stochastic variational inference
* Add optional input signals to Gaussian Markov chains
* Add unit tests for plotting functions (by Hannu Hartikainen)
* Add printing support to nodes
* Drop Python 3.2 support

Version 0.2.3 (2014-12-03)
++++++++++++++++++++++++++

* Fix matplotlib compatibility broken by recent changes in matplotlib
* Add random sampling for Binomial and Bernoulli nodes
* Fix minor bugs, for instance, in plot module

Version 0.2.2 (2014-11-01)
++++++++++++++++++++++++++

* Fix normalization of categorical Markov chain probabilities (fixes HMM demo)
* Fix initialization from parameter values

Version 0.2.1 (2014-09-30)
++++++++++++++++++++++++++

* Add workaround for matplotlib 1.4.0 bug related to interactive mode which
  affected monitoring
* Fix bugs in Hinton diagrams for Gaussian variables

Version 0.2 (2014-08-06)
++++++++++++++++++++++++

* Added all remaining common distributions: Bernoulli, binomial, multinomial,
  Poisson, beta, exponential.
* Added Gaussian arrays (not just scalars or vectors).
* Added Gaussian Markov chains with time-varying or switching dynamics.
* Added discrete Markov chains (enabling hidden Markov models).
* Added joint Gaussian-Wishart and Gaussian-gamma nodes.
* Added deterministic gating node.
* Added deterministic general sum-product node.
* Added parameter expansion for Gaussian arrays and time-varying/switching
  Gaussian Markov chains.
* Added new plotting functions: pdf, Hinton diagram.
* Added monitoring of posterior distributions during iteration.
* Finished documentation and added API.

Version 0.1 (2013-07-25)
++++++++++++++++++++++++

* Added variational message passing inference engine.
* Added the following common distributions: Gaussian vector, gamma, Wishart,
  Dirichlet, categorical.
* Added Gaussian Markov chain.
* Added parameter expansion for Gaussian vectors and Gaussian Markov chain.
* Added stochastic mixture node.
* Added deterministic dot product node.
* Created preliminary version of the documentation.

bayespy-0.6.2/INSTALL.rst

.. Copyright (C) 2011-2012,2014 Jaakko Luttinen

   This file is licensed under the MIT License. See LICENSE for a text of the
   license.

Installation
============

BayesPy is a Python 3 package. It can be installed from PyPI, or the latest
development version can be installed from GitHub. The instructions below
explain how to set up the system by installing required packages, how to
install BayesPy and how to compile this documentation yourself. However, if
these instructions contain errors or some relevant details are missing,
please file a bug report at https://github.com/bayespy/bayespy/issues.
Installing BayesPy
------------------

BayesPy can be installed easily by using Pip if the system has been properly
set up. If you have problems with the following methods, see the following
section for some help on installing the requirements. For instance, a bug in
recent versions of h5py and pip may require you to install some of the
requirements manually.

For users
+++++++++

First, you may want to set up a virtual environment. Using a virtual
environment is optional but recommended. To create and activate a new virtual
environment, run (in the folder in which you want to create the environment):

.. code-block:: console

   virtualenv -p python3 --system-site-packages ENV
   source ENV/bin/activate

The latest release of BayesPy can be installed from PyPI simply as

.. code-block:: console

   pip install bayespy

Alternatively, you can obtain the latest release with conda (from the
conda-forge channel). In that case, there is no need to use a virtual
environment. Instead, you can install in a conda environment, on any
platform, with the single command:

.. code-block:: console

   conda install -c conda-forge bayespy

If you want to install the latest development version of BayesPy, use GitHub
instead:

.. code-block:: console

   pip install git+https://github.com/bayespy/bayespy.git@develop

For developers
++++++++++++++

If you want to install the development version of BayesPy in such a way that
you can easily edit the package, follow these instructions. Get the git
repository:

.. code-block:: console

   git clone https://github.com/bayespy/bayespy.git
   cd bayespy

Create and activate a new virtual environment (optional but recommended):

.. code-block:: console

   virtualenv -p python3 --system-site-packages ENV
   source ENV/bin/activate

Install BayesPy in editable mode:

.. code-block:: console

   pip install -e .

Checking installation
+++++++++++++++++++++

If you have problems installing BayesPy, read the next section for more
details.
It is recommended to run the unit tests in order to check that BayesPy is
working properly. Thus, install Nose and run the unit tests:

.. code-block:: console

   pip install nose
   nosetests bayespy

Installing requirements
-----------------------

BayesPy requires Python 3.3 (or later) and the following packages:

* NumPy (>=1.10.0)
* SciPy (>=0.13.0)
* matplotlib (>=1.2)
* h5py

Ideally, Pip should install the necessary requirements and a manual
installation of these dependencies is not required. However, there are
several reasons why the installation of these dependencies needs to be done
manually in some cases. Thus, this section tries to give some details on how
to set up your system. A proper installation of the dependencies for Python 3
can be a bit tricky and you may refer to http://www.scipy.org/install.html
for more detailed instructions about the SciPy stack. Detailed instructions
on installing a recent SciPy stack on various platforms are out of the scope
of these instructions, but we provide some general guidance here. There are
basically three ways to install the dependencies:

1. Install a Python distribution which includes the packages. For Windows,
   Mac and Linux, there are several Python distributions which include all
   the necessary packages:
   http://www.scipy.org/install.html#scientific-python-distributions. For
   instance, you may try `Anaconda `_ or `Enthought `_.

2. Install the packages using the system package manager. On Linux, the
   packages might be called something like ``python-scipy`` or ``scipy``.
   However, it is possible that these system packages are not recent enough
   for BayesPy.

3. Install the packages using Pip:

   .. code-block:: console

      pip install "distribute>=0.6.28"
      pip install "numpy>=1.10.0" "scipy>=0.13.0" "matplotlib>=1.2" h5py

   This also makes sure you have a recent enough version of Distribute
   (required by Matplotlib).
However, this installation method may require that the system has some
libraries needed for compiling (e.g., a C compiler, Python development files,
BLAS/LAPACK). For instance, on Ubuntu (>= 12.10), you may install the
required system libraries for each package as:

.. code-block:: console

   sudo apt-get build-dep python3-numpy
   sudo apt-get build-dep python3-scipy
   sudo apt-get build-dep python3-matplotlib
   sudo apt-get build-dep python-h5py

Then installation using Pip should work.

Compiling documentation
-----------------------

This documentation can be found at http://bayespy.org/ in HTML and PDF
formats. The documentation source files are also readable as such in
reStructuredText format in the ``doc/source/`` directory.

It is possible to compile the documentation into HTML or PDF yourself. In
order to compile the documentation, Sphinx is required along with a few
extensions for it. Those can be installed as:

.. code-block:: console

   pip install "sphinx>=1.2.3" sphinxcontrib-tikz sphinxcontrib-bayesnet sphinxcontrib-bibtex "numpydoc>=0.5"

Or you can simply install BayesPy with the ``doc`` extra, which will take
care of installing the required dependencies:

.. code-block:: console

   pip install bayespy[doc]

In order to visualize graphical models in HTML, you need to have
``ImageMagick`` or ``Netpbm`` installed.

The documentation can be compiled to HTML and PDF by running the following
commands in the ``doc`` directory:

.. code-block:: console

   make html
   make latexpdf

You can also run doctest to test code snippets in the documentation:

.. code-block:: console

   make doctest

or in the docstrings:

.. code-block:: console

   nosetests --with-doctest --doctest-options="+ELLIPSIS" bayespy

bayespy-0.6.2/LICENSE

The MIT License (MIT)

Copyright (c) 2011-2015 BayesPy developers

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
bayespy-0.6.2/MANIFEST.in

include README*
include INSTALL*
include LICENSE
include CHANGELOG*

# Include header files
recursive-include bayespy *.h

# Add documentation to sdist-generated tarballs
recursive-include doc/source *.rst
recursive-include doc/source *.py
include doc/Makefile
#include doc/make.bat

include versioneer.py
include bayespy/_version.py

bayespy-0.6.2/PKG-INFO

Metadata-Version: 2.1
Name: bayespy
Version: 0.6.2
Summary: Variational Bayesian inference tools for Python
Home-page: http://bayespy.org
Author: Jaakko Luttinen
Author-email: jaakko.luttinen@iki.fi
License: UNKNOWN
Description:

BayesPy - Bayesian Python
=========================

BayesPy provides tools for Bayesian inference with Python. The user
constructs a model as a Bayesian network, observes data and runs posterior
inference. The goal is to provide a tool which is efficient, flexible and
extendable enough for expert use but also accessible for more casual users.

Currently, only variational Bayesian inference for the conjugate-exponential
family (variational message passing) has been implemented. Future work
includes variational approximations for other types of distributions and
possibly other approximate inference methods such as expectation propagation,
Laplace approximations, Markov chain Monte Carlo (MCMC) and other methods.
Contributions are welcome.

Project information
-------------------

Copyright (C) 2011-2017 Jaakko Luttinen and other contributors (see below)

BayesPy including the documentation is licensed under the MIT License.
See LICENSE file for a text of the license or visit
http://opensource.org/licenses/MIT.

.. |chat| image:: https://badges.gitter.im/Join%20Chat.svg
   :target: https://gitter.im/bayespy/bayespy?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge

.. |release| image:: https://badge.fury.io/py/bayespy.svg
   :target: https://pypi.python.org/pypi/bayespy

.. |conda-release| image:: https://anaconda.org/conda-forge/bayespy/badges/installer/conda.svg
   :target: https://anaconda.org/conda-forge/bayespy

============== =============================================
Latest release |release| |conda-release|
Documentation  http://bayespy.org
Repository     https://github.com/bayespy/bayespy.git
Bug reports    https://github.com/bayespy/bayespy/issues
Author         Jaakko Luttinen jaakko.luttinen@iki.fi
Chat           |chat|
Mailing list   bayespy@googlegroups.com
============== =============================================

Continuous integration
++++++++++++++++++++++

.. |travismaster| image:: https://travis-ci.org/bayespy/bayespy.svg?branch=master
   :target: https://travis-ci.org/bayespy/bayespy/
   :align: middle

.. |travisdevelop| image:: https://travis-ci.org/bayespy/bayespy.svg?branch=develop
   :target: https://travis-ci.org/bayespy/bayespy/
   :align: middle

.. |covermaster| image:: https://coveralls.io/repos/bayespy/bayespy/badge.svg?branch=master
   :target: https://coveralls.io/r/bayespy/bayespy?branch=master
   :align: middle

.. |coverdevelop| image:: https://coveralls.io/repos/bayespy/bayespy/badge.svg?branch=develop
   :target: https://coveralls.io/r/bayespy/bayespy?branch=develop
   :align: middle

.. |docsmaster| image:: https://img.shields.io/badge/docs-master-blue.svg?style=flat
   :target: http://www.bayespy.org/en/stable/
   :align: middle

.. |docsdevelop| image:: https://img.shields.io/badge/docs-develop-blue.svg?style=flat
   :target: http://www.bayespy.org/en/latest/
   :align: middle

==================== =============== ============== =============
Branch               Test status     Test coverage  Documentation
==================== =============== ============== =============
**master (stable)**  |travismaster|  |covermaster|  |docsmaster|
**develop (latest)** |travisdevelop| |coverdevelop| |docsdevelop|
==================== =============== ============== =============

Similar projects
----------------

`VIBES `_ (http://vibes.sourceforge.net/) allows variational inference to be
performed automatically on a Bayesian network. It is implemented in Java and
released under the revised BSD license.

`Bayes Blocks `_ (http://research.ics.aalto.fi/bayes/software/) is a
C++/Python implementation of the variational building block framework. The
framework allows easy learning of a wide variety of models using variational
Bayesian learning. It is available as free software under the GNU General
Public License.

`Infer.NET `_ (http://research.microsoft.com/infernet/) is a .NET framework
for machine learning. It provides message-passing algorithms and statistical
routines for performing Bayesian inference. It is partly closed source and
licensed for non-commercial use only.

`PyMC `_ (https://github.com/pymc-devs/pymc) provides MCMC methods in Python.
It is released under the Academic Free License.

`OpenBUGS `_ (http://www.openbugs.info) is a software package for performing
Bayesian inference using Gibbs sampling. It is released under the GNU General
Public License.

`Dimple `_ (http://dimple.probprog.org/) provides Gibbs sampling, belief
propagation and a few other inference algorithms for Matlab and Java. It is
released under the Apache License.

`Stan `_ (http://mc-stan.org/) provides inference using MCMC with an
interface for R and Python. It is released under the New BSD License.
`PBNT - Python Bayesian Network Toolbox `_ (http://pbnt.berlios.de/) is Bayesian network library in Python supporting static networks with discrete variables. There was no information about the license. Contributors ------------ The list of contributors: * Jaakko Luttinen * Hannu Hartikainen * Deebul Nair * Christopher Cramer * Till Hoffmann Each file or the git log can be used for more detailed information. Keywords: variational Bayes,probabilistic programming,Bayesian networks,graphical models,variational message passing Platform: UNKNOWN Classifier: Programming Language :: Python :: 3 :: Only Classifier: Programming Language :: Python :: 3.3 Classifier: Programming Language :: Python :: 3.4 Classifier: Development Status :: 4 - Beta Classifier: Environment :: Console Classifier: Intended Audience :: Developers Classifier: Intended Audience :: Science/Research Classifier: License :: OSI Approved :: MIT License Classifier: Operating System :: OS Independent Classifier: Topic :: Scientific/Engineering Classifier: Topic :: Scientific/Engineering :: Information Analysis Provides-Extra: doc Provides-Extra: dev ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/README.rst0000644000175100001770000001241000000000000015466 0ustar00runnerdocker00000000000000BayesPy - Bayesian Python ========================= BayesPy provides tools for Bayesian inference with Python. The user constructs a model as a Bayesian network, observes data and runs posterior inference. The goal is to provide a tool which is efficient, flexible and extendable enough for expert use but also accessible for more casual users. Currently, only variational Bayesian inference for conjugate-exponential family (variational message passing) has been implemented. 
Future work includes variational approximations for other types of distributions and possibly other approximate inference methods such as expectation propagation, Laplace approximations, Markov chain Monte Carlo (MCMC) and other methods. Contributions are welcome. Project information ------------------- Copyright (C) 2011-2017 Jaakko Luttinen and other contributors (see below) BayesPy including the documentation is licensed under the MIT License. See LICENSE file for a text of the license or visit http://opensource.org/licenses/MIT. .. |chat| image:: https://badges.gitter.im/Join%20Chat.svg :target: https://gitter.im/bayespy/bayespy?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge .. |release| image:: https://badge.fury.io/py/bayespy.svg :target: https://pypi.python.org/pypi/bayespy .. |conda-release| image:: https://anaconda.org/conda-forge/bayespy/badges/installer/conda.svg :target: https://anaconda.org/conda-forge/bayespy ============== ============================================= Latest release |release| |conda-release| Documentation http://bayespy.org Repository https://github.com/bayespy/bayespy.git Bug reports https://github.com/bayespy/bayespy/issues Author Jaakko Luttinen jaakko.luttinen@iki.fi Chat |chat| Mailing list bayespy@googlegroups.com ============== ============================================= Continuous integration ++++++++++++++++++++++ .. |travismaster| image:: https://travis-ci.org/bayespy/bayespy.svg?branch=master :target: https://travis-ci.org/bayespy/bayespy/ :align: middle .. |travisdevelop| image:: https://travis-ci.org/bayespy/bayespy.svg?branch=develop :target: https://travis-ci.org/bayespy/bayespy/ :align: middle .. |covermaster| image:: https://coveralls.io/repos/bayespy/bayespy/badge.svg?branch=master :target: https://coveralls.io/r/bayespy/bayespy?branch=master :align: middle .. 
|coverdevelop| image:: https://coveralls.io/repos/bayespy/bayespy/badge.svg?branch=develop :target: https://coveralls.io/r/bayespy/bayespy?branch=develop :align: middle .. |docsmaster| image:: https://img.shields.io/badge/docs-master-blue.svg?style=flat :target: http://www.bayespy.org/en/stable/ :align: middle .. |docsdevelop| image:: https://img.shields.io/badge/docs-develop-blue.svg?style=flat :target: http://www.bayespy.org/en/latest/ :align: middle ==================== =============== ============== ============= Branch Test status Test coverage Documentation ==================== =============== ============== ============= **master (stable)** |travismaster| |covermaster| |docsmaster| **develop (latest)** |travisdevelop| |coverdevelop| |docsdevelop| ==================== =============== ============== ============= Similar projects ---------------- `VIBES `_ (http://vibes.sourceforge.net/) allows variational inference to be performed automatically on a Bayesian network. It is implemented in Java and released under revised BSD license. `Bayes Blocks `_ (http://research.ics.aalto.fi/bayes/software/) is a C++/Python implementation of the variational building block framework. The framework allows easy learning of a wide variety of models using variational Bayesian learning. It is available as free software under the GNU General Public License. `Infer.NET `_ (http://research.microsoft.com/infernet/) is a .NET framework for machine learning. It provides message-passing algorithms and statistical routines for performing Bayesian inference. It is partly closed source and licensed for non-commercial use only. `PyMC `_ (https://github.com/pymc-devs/pymc) provides MCMC methods in Python. It is released under the Academic Free License. `OpenBUGS `_ (http://www.openbugs.info) is a software package for performing Bayesian inference using Gibbs sampling. It is released under the GNU General Public License. 
`Dimple `_ (http://dimple.probprog.org/) provides Gibbs sampling, belief propagation and a few other inference algorithms for Matlab and Java. It is released under the Apache License. `Stan `_ (http://mc-stan.org/) provides inference using MCMC with an interface for R and Python. It is released under the New BSD License. `PBNT - Python Bayesian Network Toolbox `_ (http://pbnt.berlios.de/) is Bayesian network library in Python supporting static networks with discrete variables. There was no information about the license. Contributors ------------ The list of contributors: * Jaakko Luttinen * Hannu Hartikainen * Deebul Nair * Christopher Cramer * Till Hoffmann Each file or the git log can be used for more detailed information. ././@PaxHeader0000000000000000000000000000003300000000000011451 xustar000000000000000027 mtime=1725273981.401372 bayespy-0.6.2/bayespy/0000755000175100001770000000000000000000000015455 5ustar00runnerdocker00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/__init__.py0000644000175100001770000000105000000000000017562 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2011-2016 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ from . import utils from . import inference from . import nodes try: from . import plot except ImportError: # Matplotlib not available pass from ._meta import __author__, __copyright__, __contact__, __license__ from . 
import _version __version__ = _version.get_versions()['version'] ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/_meta.py0000644000175100001770000000025000000000000017111 0ustar00runnerdocker00000000000000 __author__ = 'Jaakko Luttinen' __contact__ = 'jaakko.luttinen@iki.fi' __copyright__ = '2011-2017, Jaakko Luttinen and contributors' __license__ = 'MIT License' ././@PaxHeader0000000000000000000000000000003300000000000011451 xustar000000000000000027 mtime=1725273981.425372 bayespy-0.6.2/bayespy/_version.py0000644000175100001770000000076100000000000017657 0ustar00runnerdocker00000000000000 # This file was generated by 'versioneer.py' (0.29) from # revision-control system data, or from the parent directory name of an # unpacked source archive. Distribution tarballs contain a pre-generated copy # of this file. import json version_json = ''' { "date": "2024-09-02T13:42:01+0300", "dirty": false, "error": null, "full-revisionid": "37900855d32d1a5a77fed31be28aaa7a833a3fa7", "version": "0.6.2" } ''' # END VERSION_JSON def get_versions(): return json.loads(version_json) ././@PaxHeader0000000000000000000000000000003300000000000011451 xustar000000000000000027 mtime=1725273981.405372 bayespy-0.6.2/bayespy/demos/0000755000175100001770000000000000000000000016564 5ustar00runnerdocker00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/demos/__init__.py0000644000175100001770000000000000000000000020663 0ustar00runnerdocker00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/demos/annealing.py0000644000175100001770000000640400000000000021076 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2015 Jaakko Luttinen # # This file is licensed under 
# the MIT License.
################################################################################

"""
Demonstration of deterministic annealing.

Deterministic annealing aims at avoiding convergence to local optima and
finding the global optimum :cite:`Katahira:2008`.
"""

import numpy as np
import scipy
import matplotlib.pyplot as plt

import bayespy.plot as myplt
from bayespy.utils import misc
from bayespy.utils import random
from bayespy.nodes import GaussianARD, Categorical, Mixture
from bayespy.inference.vmp.vmp import VB
from bayespy.inference.vmp import transformations

import bayespy.plot as bpplt

from bayespy.demos import pca


def run(N=500, seed=42, maxiter=100, plot=True):
    """
    Run deterministic annealing demo for 1-D Gaussian mixture.
    """

    if seed is not None:
        np.random.seed(seed)

    mu = GaussianARD(0, 1, plates=(2,), name='means')
    Z = Categorical([0.3, 0.7], plates=(N,), name='classes')
    Y = Mixture(Z, GaussianARD, mu, 1, name='observations')

    # Generate data
    z = Z.random()
    data = np.empty(N)
    for n in range(N):
        data[n] = [4, -4][z[n]]
    Y.observe(data)

    # Initialize means closer to the inferior local optimum in which the
    # cluster means are swapped
    mu.initialize_from_value([0, 6])

    Q = VB(Y, Z, mu)
    Q.save()

    #
    # Standard VB-EM algorithm
    #
    Q.update(repeat=maxiter)

    mu_vbem = mu.u[0].copy()
    L_vbem = Q.compute_lowerbound()

    #
    # VB-EM with deterministic annealing
    #
    Q.load()
    beta = 0.01
    while beta < 1.0:
        beta = min(beta*1.2, 1.0)
        print("Set annealing to %.2f" % beta)
        Q.set_annealing(beta)
        Q.update(repeat=maxiter, tol=1e-4)

    mu_anneal = mu.u[0].copy()
    L_anneal = Q.compute_lowerbound()

    print("==============================")
    print("RESULTS FOR VB-EM vs ANNEALING")
    print("Fixed component probabilities:", np.array([0.3, 0.7]))
    print("True component means:", np.array([4, -4]))
    print("VB-EM component means:", mu_vbem)
    print("VB-EM lower bound:", L_vbem)
    print("Annealed VB-EM component means:", mu_anneal)
    print("Annealed VB-EM lower bound:", L_anneal)

    return


if __name__ == \
'__main__':

    import sys, getopt, os
    try:
        opts, args = getopt.getopt(sys.argv[1:],
                                   "",
                                   ["n=",
                                    "seed=",
                                    "maxiter="])
    except getopt.GetoptError:
        print('python annealing.py ')
        print('--n=        Number of data points')
        print('--maxiter=  Maximum number of VB iterations')
        print('--seed=     Seed (integer) for the random number generator')
        sys.exit(2)

    kwargs = {}
    for opt, arg in opts:
        if opt == "--maxiter":
            kwargs["maxiter"] = int(arg)
        elif opt == "--seed":
            kwargs["seed"] = int(arg)
        elif opt in ("--n",):
            kwargs["N"] = int(arg)

    run(**kwargs)
    plt.show()


# File: bayespy-0.6.2/bayespy/demos/black_box.py

################################################################################
# Copyright (C) 2015 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
Black-box variational inference
"""

import numpy as np
import scipy
import matplotlib.pyplot as plt

import bayespy.plot as myplt
from bayespy.utils import misc
from bayespy.utils import random
from bayespy.nodes import GaussianARD, LogPDF, Dot
from bayespy.inference.vmp.vmp import VB
from bayespy.inference.vmp import transformations

import bayespy.plot as bpplt

from bayespy.demos import pca


def run(M=10, N=100, D=5, seed=42, maxiter=100, plot=True):
    """
    Run black-box variational inference demo.
""" raise NotImplementedError("Black box variational inference not yet implemented, sorry") if seed is not None: np.random.seed(seed) # Generate data data = np.dot(np.random.randn(M,D), np.random.randn(D,N)) # Construct model C = GaussianARD(0, 1, shape=(2,), plates=(M,1), name='C') X = GaussianARD(0, 1, shape=(2,), plates=(1,N), name='X') F = Dot(C, X) # Some arbitrary log likelihood def logpdf(y, f): """ exp(f) / (1 + exp(f)) = 1/(1+exp(-f)) -log(1+exp(-f)) = -log(exp(0)+exp(-f)) also: 1 - exp(f) / (1 + exp(f)) = (1 + exp(f) - exp(f)) / (1 + exp(f)) = 1 / (1 + exp(f)) = -log(1+exp(f)) = -log(exp(0)+exp(f)) """ return -np.logaddexp(0, -f * np.where(y, -1, +1)) Y = LogPDF(logpdf, F, samples=10, shape=()) #Y = GaussianARD(F, 1) Y.observe(data) Q = VB(Y, C, X) Q.ignore_bound_checks = True delay = 1 forgetting_rate = 0.7 for n in range(maxiter): # Observe a mini-batch #subset = np.random.choice(N, N_batch) #Y.observe(data[subset,:]) # Learn intermediate variables #Q.update(Z) # Set step length step = (n + delay) ** (-forgetting_rate) # Stochastic gradient for the global variables Q.gradient_step(C, X, scale=step) if plot: bpplt.pyplot.plot(np.cumsum(Q.cputime), Q.L, 'r:') bpplt.pyplot.xlabel('CPU time (in seconds)') bpplt.pyplot.ylabel('VB lower bound') return if __name__ == '__main__': import sys, getopt, os try: opts, args = getopt.getopt(sys.argv[1:], "", ["n=", "batch=", "seed=", "maxiter="]) except getopt.GetoptError: print('python stochastic_inference.py ') print('--n= Number of data points') print('--batch= Mini-batch size') print('--maxiter= Maximum number of VB iterations') print('--seed= Seed (integer) for the random number generator') sys.exit(2) kwargs = {} for opt, arg in opts: if opt == "--maxiter": kwargs["maxiter"] = int(arg) elif opt == "--seed": kwargs["seed"] = int(arg) elif opt in ("--n",): kwargs["N"] = int(arg) elif opt in ("--batch",): kwargs["N_batch"] = int(arg) run(**kwargs) plt.show() 
# File: bayespy-0.6.2/bayespy/demos/categorical.py

################################################################################
# Copyright (C) 2011-2013 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

import numpy as np

from bayespy.utils import random
from bayespy import nodes
from bayespy.inference.vmp.vmp import VB


def run(M=30, D=5):

    # Generate data
    y = np.random.randint(D, size=(M,))

    # Construct model
    p = nodes.Dirichlet(1*np.ones(D), name='p')
    z = nodes.Categorical(p, plates=(M,), name='z')

    # Observe the data with randomly missing values
    mask = random.mask(M, p=0.5)
    z.observe(y, mask=mask)

    # Run VB-EM
    Q = VB(p, z)
    Q.update()

    # Show results
    z.show()
    p.show()


if __name__ == '__main__':
    run()


# File: bayespy-0.6.2/bayespy/demos/collapsed_cg.py

################################################################################
# Copyright (C) 2014-2015 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
Demonstrate Riemannian conjugate gradient
"""

import numpy as np

from bayespy.nodes import (GaussianARD,
                           Gamma,
                           SumMultiply)
from bayespy.utils import random
from bayespy.inference.vmp.vmp import VB
import bayespy.plot as bpplt

from bayespy.demos import mog


def pca():

    np.random.seed(41)

    M = 10
    N = 3000
    D = 5

    # Construct the PCA model
    alpha = Gamma(1e-3, 1e-3, plates=(D,), name='alpha')
    W = GaussianARD(0, alpha, plates=(M,1), shape=(D,), name='W')
    X = GaussianARD(0, 1, plates=(1,N), shape=(D,), name='X')
    tau = Gamma(1e-3, 1e-3, name='tau')
    W.initialize_from_random()
    F = SumMultiply('d,d->', W, X)
    Y = GaussianARD(F, tau, name='Y')

    # Observe data
    data = (np.sum(np.random.randn(M,1,D-1) * np.random.randn(1,N,D-1), axis=-1)
            + 1e-1 * np.random.randn(M,N))
    Y.observe(data)

    # Initialize VB engine
    Q = VB(Y, X, W, alpha, tau)

    # Take one update step (so phi is ok)
    Q.update(repeat=1)
    Q.save()

    # Run VB-EM
    Q.update(repeat=200)
    bpplt.pyplot.plot(np.cumsum(Q.cputime), Q.L, 'k-')

    # Restore the state
    Q.load()

    # Run Riemannian conjugate gradient
    #Q.optimize(X, alpha, maxiter=100, collapsed=[W, tau])
    Q.optimize(W, tau, maxiter=100, collapsed=[X, alpha])
    bpplt.pyplot.plot(np.cumsum(Q.cputime), Q.L, 'r:')

    bpplt.pyplot.show()


def mixture_of_gaussians():
    """Collapsed Riemannian conjugate gradient demo

    This is similar although not exactly identical to an experiment in
    (Hensman et al 2012).
""" np.random.seed(41) # Number of samples N = 1000 # Number of clusters in the model (five in the data) K = 10 # Overlap parameter of clusters R = 2 # Construct the model Q = mog.gaussianmix_model(N, K, 2, covariance='diagonal') # Generate data from five Gaussian clusters mu = np.array([[0, 0], [R, R], [-R, R], [R, -R], [-R, -R]]) Z = random.categorical(np.ones(5), size=N) data = np.empty((N, 2)) for n in range(N): data[n,:] = mu[Z[n]] + np.random.randn(2) Q['Y'].observe(data) # Take one update step (so phi is ok) Q.update(repeat=1) Q.save() # Run standard VB-EM Q.update(repeat=1000, tol=0) bpplt.pyplot.plot(np.cumsum(Q.cputime), Q.L, 'k-') # Restore the initial state Q.load() # Run Riemannian conjugate gradient Q.optimize('alpha', 'X', 'Lambda', collapsed=['z'], maxiter=300, tol=0) bpplt.pyplot.plot(np.cumsum(Q.cputime), Q.L, 'r:') bpplt.pyplot.xlabel('CPU time (in seconds)') bpplt.pyplot.ylabel('VB lower bound') bpplt.pyplot.legend(['VB-EM', 'Collapsed Riemannian CG'], loc='lower right') ## bpplt.pyplot.figure() ## bpplt.pyplot.plot(data[:,0], data[:,1], 'rx') ## bpplt.pyplot.title('Data') bpplt.pyplot.show() if __name__ == "__main__": #pca() mixture_of_gaussians() ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/demos/gamma_shape.py0000644000175100001770000000074400000000000021405 0ustar00runnerdocker00000000000000 from bayespy import nodes from bayespy.inference import VB def run(): a = nodes.GammaShape(name='a') b = nodes.Gamma(1e-5, 1e-5, name='b') tau = nodes.Gamma(a, b, plates=(1000,), name='tau') tau.observe(nodes.Gamma(10, 20, plates=(1000,)).random()) Q = VB(tau, a, b) Q.update(repeat=1000) print("True gamma parameters:", 10.0, 20.0) print("Estimated parameters from 1000 samples:", a.u[0], b.u[0]) if __name__ == "__main__": run() ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 
# File: bayespy-0.6.2/bayespy/demos/hmm.py

################################################################################
# Copyright (C) 2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
Demonstrate categorical Markov chain with hidden Markov model (HMM)
"""

import numpy as np
import matplotlib.pyplot as plt

from bayespy.nodes import Gaussian, \
                          CategoricalMarkovChain, \
                          Dirichlet, \
                          Mixture, \
                          Categorical
from bayespy.inference.vmp.vmp import VB
import bayespy.plot as bpplt


def hidden_markov_model(distribution, *args, K=3, N=100):

    # Prior for initial state probabilities
    alpha = Dirichlet(1e-3*np.ones(K),
                      name='alpha')

    # Prior for state transition probabilities
    A = Dirichlet(1e-3*np.ones(K),
                  plates=(K,),
                  name='A')

    # Hidden states (with unknown initial state probabilities and state
    # transition probabilities)
    Z = CategoricalMarkovChain(alpha, A,
                               states=N,
                               name='Z')

    # Emission/observation distribution
    Y = Mixture(Z, distribution, *args, name='Y')

    Q = VB(Y, Z, alpha, A)

    return Q


def mixture_model(distribution, *args, K=3, N=100):

    # Prior for state probabilities
    alpha = Dirichlet(1e-3*np.ones(K),
                      name='alpha')

    # Cluster assignments
    Z = Categorical(alpha,
                    plates=(N,),
                    name='Z')

    # Observation distribution
    Y = Mixture(Z, distribution, *args, name='Y')

    Q = VB(Y, Z, alpha)

    return Q


@bpplt.interactive
def run(N=200, maxiter=10, seed=42, std=2.0, plot=True):

    # Use deterministic random numbers
    if seed is not None:
        np.random.seed(seed)

    #
    # Generate data
    #

    mu = np.array([ [0,0], [3,4], [6,0] ])
    K = 3
    p0 = np.ones(K) / K
    q = 0.9 # probability to stay in the same state
    r = (1-q)/(K-1)
    P = q*np.identity(K) + r*(np.ones((3,3))-np.identity(3))

    y = np.zeros((N,2))
    z = np.zeros(N)
    state = np.random.choice(K, p=p0)
    for n in range(N):
        z[n] = state
        y[n,:] = std*np.random.randn(2) + mu[state]
        state = \
            np.random.choice(K, p=P[state])

    plt.figure()

    # Plot data
    plt.subplot(1,3,1)
    plt.axis('equal')
    plt.title('True classification')
    colors = [ [[1,0,0], [0,1,0], [0,0,1]][int(state)] for state in z ]
    plt.plot(y[:,0], y[:,1], 'k-', zorder=-10)
    plt.scatter(y[:,0], y[:,1], c=colors, s=40)

    #
    # Use HMM
    #

    # Run VB inference for HMM
    Q_hmm = hidden_markov_model(Gaussian,
                                mu,
                                K*[std**(-2)*np.identity(2)],
                                K=K,
                                N=N)
    Q_hmm['Y'].observe(y)
    Q_hmm.update(repeat=maxiter)

    # Plot results
    plt.subplot(1,3,2)
    plt.axis('equal')
    plt.title('Classification with HMM')
    colors = Q_hmm['Y'].parents[0]._message_to_child()[0]
    plt.plot(y[:,0], y[:,1], 'k-', zorder=-10)
    plt.scatter(y[:,0], y[:,1], c=colors, s=40)

    #
    # Use mixture model
    #

    # For comparison, run VB for Gaussian mixture
    Q_mix = mixture_model(Gaussian,
                          mu,
                          K*[std**(-2)*np.identity(2)],
                          K=K,
                          N=N)
    Q_mix['Y'].observe(y)
    Q_mix.update(repeat=maxiter)

    # Plot results
    plt.subplot(1,3,3)
    plt.axis('equal')
    plt.title('Classification with mixture')
    colors = Q_mix['Y'].parents[0]._message_to_child()[0]
    plt.plot(y[:,0], y[:,1], 'k-', zorder=-10)
    plt.scatter(y[:,0], y[:,1], c=colors, s=40)


if __name__ == '__main__':
    import sys, getopt, os
    try:
        opts, args = getopt.getopt(sys.argv[1:],
                                   "",
                                   ["n=",
                                    "seed=",
                                    "std=",
                                    "maxiter="])
    except getopt.GetoptError:
        print('python hmm.py ')
        print('--n=        Number of data vectors')
        print('--std=      Standard deviation of the Gaussians')
        print('--maxiter=  Maximum number of VB iterations')
        print('--seed=     Seed (integer) for the random number generator')
        sys.exit(2)

    kwargs = {}
    for opt, arg in opts:
        if opt == "--maxiter":
            kwargs["maxiter"] = int(arg)
        elif opt == "--std":
            kwargs["std"] = float(arg)
        elif opt == "--seed":
            kwargs["seed"] = int(arg)
        elif opt in ("--n",):
            kwargs["N"] = int(arg)
        else:
            raise ValueError("Unhandled option given")

    run(**kwargs)
    plt.show()
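# hmm.py above builds its K-state transition matrix with "stay" probability
# q on the diagonal and the remaining mass spread evenly over the other
# states: P = q*I + r*(ones - I) with r = (1-q)/(K-1). A small
# dependency-free sketch of the same construction; every row is a proper
# probability distribution, so the chain is well defined.

```python
def transition_matrix(K, q):
    """Sticky transition matrix: stay with probability q, move uniformly
    to one of the K-1 other states otherwise."""
    r = (1 - q) / (K - 1)
    return [[q if i == j else r for j in range(K)] for i in range(K)]

# With the demo's values K=3, q=0.9 the off-diagonal entries are 0.05 each.
P = transition_matrix(3, 0.9)
```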
# File: bayespy-0.6.2/bayespy/demos/lda.py

################################################################################
# Copyright (C) 2015 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

import numpy as np

from bayespy import nodes
from bayespy.inference import VB
from bayespy.inference.vmp.nodes.constant import Constant
from bayespy.inference.vmp.nodes.categorical import CategoricalMoments
import bayespy.plot as bpplt


def model(n_documents, n_topics, n_vocabulary, corpus, word_documents,
          plates_multiplier=1):
    '''
    Construct Latent Dirichlet Allocation model.

    Parameters
    ----------

    n_documents : int
        The number of documents

    n_topics : int
        The number of topics

    n_vocabulary : int
        The number of words in the vocabulary

    corpus : integer array
        The vocabulary index of each word in the corpus

    word_documents : integer array
        The document index of each word in the corpus
    '''

    # Topic distributions for each document
    p_topic = nodes.Dirichlet(np.ones(n_topics),
                              plates=(n_documents,),
                              name='p_topic')

    # Word distributions for each topic
    p_word = nodes.Dirichlet(np.ones(n_vocabulary),
                             plates=(n_topics,),
                             name='p_word')

    # Use a simple wrapper node so that the value of this can be changed if one
    # uses stochastic variational inference
    word_documents = Constant(CategoricalMoments(n_documents), word_documents,
                              name='word_documents')

    # Choose a topic for each word in the corpus
    topics = nodes.Categorical(nodes.Gate(word_documents, p_topic),
                               plates=(len(corpus),),
                               plates_multiplier=(plates_multiplier,),
                               name='topics')

    # Choose each word in the corpus from the vocabulary
    words = nodes.Categorical(nodes.Gate(topics, p_word),
                              name='words')

    # Observe the corpus
    words.observe(corpus)

    # Break symmetry by random initialization
    p_topic.initialize_from_random()
    p_word.initialize_from_random()

    return VB(words, topics, p_word,
              p_topic, word_documents)


def generate_data(n_documents, n_topics, n_vocabulary, n_words):

    # Generate random data from the generative model

    # Generate document assignments for the words
    word_documents = nodes.Categorical(np.ones(n_documents)/n_documents,
                                       plates=(n_words,)).random()

    # Topic distribution for each document
    p_topic = nodes.Dirichlet(1e-1*np.ones(n_topics),
                              plates=(n_documents,)).random()

    # Word distribution for each topic
    p_word = nodes.Dirichlet(1e-1*np.ones(n_vocabulary),
                             plates=(n_topics,)).random()

    # Topic for each word in each document
    topic = nodes.Categorical(p_topic[word_documents],
                              plates=(n_words,)).random()

    # Each word in each document
    corpus = nodes.Categorical(p_word[topic],
                               plates=(n_words,)).random()

    bpplt.pyplot.figure()
    bpplt.hinton(p_topic)
    bpplt.pyplot.title("True topic distribution for each document")
    bpplt.pyplot.xlabel("Topics")
    bpplt.pyplot.ylabel("Documents")

    bpplt.pyplot.figure()
    bpplt.hinton(p_word)
    bpplt.pyplot.title("True word distributions for each topic")
    bpplt.pyplot.xlabel("Words")
    bpplt.pyplot.ylabel("Topics")

    return (corpus, word_documents)


def run(n_documents=30, n_topics=5, n_vocabulary=10, n_words=50000,
        stochastic=False, maxiter=1000, seed=None):

    if seed is not None:
        np.random.seed(seed)

    (corpus, word_documents) = generate_data(n_documents,
                                             n_topics,
                                             n_vocabulary,
                                             n_words)

    if not stochastic:

        Q = model(n_documents=n_documents,
                  n_topics=n_topics,
                  n_vocabulary=n_vocabulary,
                  corpus=corpus,
                  word_documents=word_documents)

        Q.update(repeat=maxiter)

    else:

        subset_size = 1000

        Q = model(n_documents=n_documents,
                  n_topics=n_topics,
                  n_vocabulary=n_vocabulary,
                  corpus=corpus[:subset_size],
                  word_documents=word_documents[:subset_size],
                  plates_multiplier=n_words/subset_size)

        Q.ignore_bound_checks = True

        delay = 1
        forgetting_rate = 0.7

        for n in range(maxiter):

            # Observe a mini-batch
            subset = np.random.choice(n_words, subset_size)
            Q['words'].observe(corpus[subset])
            Q['word_documents'].set_value(word_documents[subset])

            # Learn
            # intermediate variables
            Q.update('topics')

            # Set step length
            step = (n + delay) ** (-forgetting_rate)

            # Stochastic gradient for the global variables
            Q.gradient_step('p_topic', 'p_word', scale=step)

    bpplt.pyplot.figure()
    bpplt.pyplot.plot(Q.L)

    bpplt.pyplot.figure()
    bpplt.hinton(Q['p_topic'])
    bpplt.pyplot.title("Posterior topic distribution for each document")
    bpplt.pyplot.xlabel("Topics")
    bpplt.pyplot.ylabel("Documents")

    bpplt.pyplot.figure()
    bpplt.hinton(Q['p_word'])
    bpplt.pyplot.title("Posterior word distributions for each topic")
    bpplt.pyplot.xlabel("Words")
    bpplt.pyplot.ylabel("Topics")

    return


if __name__ == '__main__':
    import sys, getopt, os
    try:
        opts, args = getopt.getopt(sys.argv[1:],
                                   "",
                                   ["documents=",
                                    "topics=",
                                    "vocabulary=",
                                    "words=",
                                    "stochastic",
                                    "seed=",
                                    "maxiter="])
    except getopt.GetoptError:
        print('python lda.py ')
        print('--documents=   The number of documents')
        print('--topics=      The number of topics')
        print('--vocabulary=  The size of the vocabulary')
        print('--words=       The size of the corpus')
        print('--maxiter=     Maximum number of VB iterations')
        print('--seed=        Seed (integer) for the RNG')
        print('--stochastic   Use stochastic variational inference')
        sys.exit(2)

    kwargs = {}
    for opt, arg in opts:
        if opt == "--maxiter":
            kwargs["maxiter"] = int(arg)
        elif opt == "--seed":
            kwargs["seed"] = int(arg)
        elif opt == "--documents":
            kwargs["n_documents"] = int(arg)
        elif opt == "--topics":
            kwargs["n_topics"] = int(arg)
        elif opt == "--vocabulary":
            kwargs["n_vocabulary"] = int(arg)
        elif opt == "--words":
            kwargs["n_words"] = int(arg)
        elif opt == "--stochastic":
            kwargs["stochastic"] = True

    #raise NotImplementedError("Work in progress..
    #    This demo is not yet finished")

    run(**kwargs)

    bpplt.pyplot.show()


# File: bayespy-0.6.2/bayespy/demos/lssm.py

################################################################################
# Copyright (C) 2013-2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
Demonstrate linear Gaussian state-space model.

Some of the functions in this module are re-usable:

  * ``model`` can be used to construct the classical linear state-space model.

  * ``infer`` can be used to apply linear state-space model to given data.
"""

import numpy as np
import scipy
import matplotlib.pyplot as plt

from bayespy.nodes import GaussianMarkovChain
from bayespy.nodes import Gaussian, GaussianARD
from bayespy.nodes import Gamma
from bayespy.nodes import SumMultiply

from bayespy.inference.vmp.nodes.gamma import diagonal

from bayespy.utils import random

from bayespy.inference.vmp.vmp import VB
from bayespy.inference.vmp import transformations

import bayespy.plot as bpplt


def model(M=10, N=100, D=3):
    """
    Construct linear state-space model.
    See, for instance, the following publication:
    "Fast variational Bayesian linear state-space model"
    Luttinen (ECML 2013)
    """

    # Dynamics matrix with ARD
    alpha = Gamma(1e-5,
                  1e-5,
                  plates=(D,),
                  name='alpha')
    A = GaussianARD(0,
                    alpha,
                    shape=(D,),
                    plates=(D,),
                    plotter=bpplt.GaussianHintonPlotter(rows=0,
                                                        cols=1,
                                                        scale=0),
                    name='A')
    A.initialize_from_value(np.identity(D))

    # Latent states with dynamics
    X = GaussianMarkovChain(np.zeros(D),         # mean of x0
                            1e-3*np.identity(D), # prec of x0
                            A,                   # dynamics
                            np.ones(D),          # innovation
                            n=N,                 # time instances
                            plotter=bpplt.GaussianMarkovChainPlotter(scale=2),
                            name='X')
    X.initialize_from_value(np.random.randn(N,D))

    # Mixing matrix from latent space to observation space using ARD
    gamma = Gamma(1e-5,
                  1e-5,
                  plates=(D,),
                  name='gamma')
    gamma.initialize_from_value(1e-2*np.ones(D))
    C = GaussianARD(0,
                    gamma,
                    shape=(D,),
                    plates=(M,1),
                    plotter=bpplt.GaussianHintonPlotter(rows=0,
                                                        cols=2,
                                                        scale=0),
                    name='C')
    C.initialize_from_value(np.random.randn(M,1,D))

    # Observation noise
    tau = Gamma(1e-5,
                1e-5,
                name='tau')
    tau.initialize_from_value(1e2)

    # Underlying noiseless function
    F = SumMultiply('i,i',
                    C,
                    X,
                    name='F')

    # Noisy observations
    Y = GaussianARD(F,
                    tau,
                    name='Y')

    Q = VB(Y, F, C, gamma, X, A, alpha, tau, C)

    return Q


def infer(y, D,
          mask=True,
          maxiter=100,
          rotate=True,
          debug=False,
          precompute=False,
          update_hyper=0,
          start_rotating=0,
          plot_C=True,
          monitor=True,
          autosave=None):
    """
    Apply linear state-space model for the given data.
""" (M, N) = np.shape(y) # Construct the model Q = model(M, N, D) if not plot_C: Q['C'].set_plotter(None) if autosave is not None: Q.set_autosave(autosave, iterations=10) # Observe data Q['Y'].observe(y, mask=mask) # Set up rotation speed-up if rotate: # Initial rotate the D-dimensional state space (X, A, C) # Does not update hyperparameters rotA_init = transformations.RotateGaussianARD(Q['A'], axis=0, precompute=precompute) rotX_init = transformations.RotateGaussianMarkovChain(Q['X'], rotA_init) rotC_init = transformations.RotateGaussianARD(Q['C'], axis=0, precompute=precompute) R_X_init = transformations.RotationOptimizer(rotX_init, rotC_init, D) # Rotate the D-dimensional state space (X, A, C) rotA = transformations.RotateGaussianARD(Q['A'], Q['alpha'], axis=0, precompute=precompute) rotX = transformations.RotateGaussianMarkovChain(Q['X'], rotA) rotC = transformations.RotateGaussianARD(Q['C'], Q['gamma'], axis=0, precompute=precompute) R_X = transformations.RotationOptimizer(rotX, rotC, D) # Keyword arguments for the rotation if debug: rotate_kwargs = {'maxiter': 10, 'check_bound': True, 'check_gradient': True} else: rotate_kwargs = {'maxiter': 10} # Plot initial distributions if monitor: Q.plot() # Run inference using rotations for ind in range(maxiter): if ind < update_hyper: # It might be a good idea to learn the lower level nodes a bit # before starting to learn the upper level nodes. Q.update('X', 'C', 'A', 'tau', plot=monitor) if rotate and ind >= start_rotating: # Use the rotation which does not update alpha nor beta R_X_init.rotate(**rotate_kwargs) else: Q.update(plot=monitor) if rotate and ind >= start_rotating: # It might be a good idea to not rotate immediately because it # might lead to pruning out components too efficiently before # even estimating them roughly R_X.rotate(**rotate_kwargs) # Return the posterior approximation return Q def simulate_data(M, N): """ Generate a dataset using linear state-space model. 
    The process has two latent oscillation components and one random walk
    component.
    """

    # Simulate some data
    D = 3
    c = np.random.randn(M, D)
    w = 0.3
    a = np.array([[np.cos(w), -np.sin(w), 0],
                  [np.sin(w), np.cos(w),  0],
                  [0,         0,          1]])
    x = np.empty((N,D))
    f = np.empty((M,N))
    y = np.empty((M,N))
    x[0] = 10*np.random.randn(D)
    f[:,0] = np.dot(c,x[0])
    y[:,0] = f[:,0] + 3*np.random.randn(M)
    for n in range(N-1):
        x[n+1] = np.dot(a,x[n]) + np.random.randn(D)
        f[:,n+1] = np.dot(c,x[n+1])
        y[:,n+1] = f[:,n+1] + 3*np.random.randn(M)

    return (y, f)


@bpplt.interactive
def demo(M=6, N=200, D=3, maxiter=100, debug=False, seed=42, rotate=True,
         precompute=False, plot=True, monitor=True):
    """
    Run the demo for linear state-space model.
    """

    # Use deterministic random numbers
    if seed is not None:
        np.random.seed(seed)

    # Get data
    (y, f) = simulate_data(M, N)

    # Add missing values randomly
    mask = random.mask(M, N, p=0.3)
    # Add missing values to a period of time
    mask[:,30:80] = False
    y[~mask] = np.nan # BayesPy doesn't require this. Just for plotting.
    # Run inference
    Q = infer(y, D,
              mask=mask,
              rotate=rotate,
              debug=debug,
              monitor=monitor,
              maxiter=maxiter)

    if plot:
        # Show results
        plt.figure()
        bpplt.timeseries_normal(Q['F'], scale=2)
        bpplt.timeseries(f, linestyle='-', color='b')
        bpplt.timeseries(y, linestyle='None', color='r', marker='.')


if __name__ == '__main__':
    import sys, getopt, os
    try:
        opts, args = getopt.getopt(sys.argv[1:],
                                   "",
                                   ["m=",
                                    "n=",
                                    "d=",
                                    "seed=",
                                    "maxiter=",
                                    "debug",
                                    "precompute",
                                    "no-plot",
                                    "no-monitor",
                                    "no-rotation"])
    except getopt.GetoptError:
        print('python lssm.py ')
        print('--m=           Dimensionality of data vectors')
        print('--n=           Number of data vectors')
        print('--d=           Dimensionality of the latent vectors in the model')
        print('--no-rotation  Do not apply speed-up rotations')
        print('--maxiter=     Maximum number of VB iterations')
        print('--seed=        Seed (integer) for the random number generator')
        print('--debug        Check that the rotations are implemented correctly')
        print('--no-plot      Do not plot the results')
        print('--no-monitor   Do not plot distributions during learning')
        print('--precompute   Precompute some moments when rotating. May '
              'speed up or slow down.')
        sys.exit(2)

    kwargs = {}
    for opt, arg in opts:
        if opt == "--no-rotation":
            kwargs["rotate"] = False
        elif opt == "--maxiter":
            kwargs["maxiter"] = int(arg)
        elif opt == "--debug":
            kwargs["debug"] = True
        elif opt == "--precompute":
            kwargs["precompute"] = True
        elif opt == "--seed":
            kwargs["seed"] = int(arg)
        elif opt in ("--m",):
            kwargs["M"] = int(arg)
        elif opt in ("--n",):
            kwargs["N"] = int(arg)
        elif opt in ("--d",):
            kwargs["D"] = int(arg)
        elif opt in ("--no-plot",):
            kwargs["plot"] = False
        elif opt in ("--no-monitor",):
            kwargs["monitor"] = False
        else:
            raise ValueError("Unhandled option given")

    demo(**kwargs)
    plt.show()


# File: bayespy-0.6.2/bayespy/demos/lssm_sd.py

################################################################################
# Copyright (C) 2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
Demonstrate the linear state-space model with switching dynamics.

The model differs from the classical linear state-space model in that it has a
set of state dynamics matrices of which one is used at each time instance. A
hidden Markov model is used to select the dynamics matrix.

Some functions in this module are re-usable:

  * ``model`` can be used to construct the LSSM with switching dynamics.

  * ``infer`` can be used to apply the model to given data.
"""

import numpy as np
import matplotlib.pyplot as plt

from bayespy.nodes import (GaussianARD,
                           SwitchingGaussianMarkovChain,
                           CategoricalMarkovChain,
                           Dirichlet,
                           Mixture,
                           Gamma,
                           SumMultiply)

from bayespy.inference.vmp.vmp import VB
from bayespy.inference.vmp import transformations

import bayespy.plot as bpplt


def model(M=20, N=100, D=10, K=3):
    """
    Construct the linear state-space model with switching dynamics.
""" # # Switching dynamics (HMM) # # Prior for initial state probabilities rho = Dirichlet(1e-3*np.ones(K), name='rho') # Prior for state transition probabilities V = Dirichlet(1e-3*np.ones(K), plates=(K,), name='V') v = 10*np.identity(K) + 1*np.ones((K,K)) v /= np.sum(v, axis=-1, keepdims=True) V.initialize_from_value(v) # Hidden states (with unknown initial state probabilities and state # transition probabilities) Z = CategoricalMarkovChain(rho, V, states=N-1, name='Z', plotter=bpplt.CategoricalMarkovChainPlotter(), initialize=False) Z.u[0] = np.random.dirichlet(np.ones(K)) Z.u[1] = np.reshape(np.random.dirichlet(0.5*np.ones(K*K), size=(N-2)), (N-2, K, K)) # # Linear state-space models # # Dynamics matrix with ARD # (K,D) x () alpha = Gamma(1e-5, 1e-5, plates=(K,1,D), name='alpha') # (K,1,1,D) x (D) A = GaussianARD(0, alpha, shape=(D,), plates=(K,D), name='A', plotter=bpplt.GaussianHintonPlotter()) A.initialize_from_value(np.identity(D)*np.ones((K,D,D)) + 0.1*np.random.randn(K,D,D)) # Latent states with dynamics # (K,1) x (N,D) X = SwitchingGaussianMarkovChain(np.zeros(D), # mean of x0 1e-3*np.identity(D), # prec of x0 A, # dynamics Z, # dynamics selection np.ones(D), # innovation n=N, # time instances name='X', plotter=bpplt.GaussianMarkovChainPlotter()) X.initialize_from_value(10*np.random.randn(N,D)) # Mixing matrix from latent space to observation space using ARD # (K,1,1,D) x () gamma = Gamma(1e-5, 1e-5, plates=(D,), name='gamma') # (K,M,1) x (D) C = GaussianARD(0, gamma, shape=(D,), plates=(M,1), name='C', plotter=bpplt.GaussianHintonPlotter(rows=-3,cols=-1)) C.initialize_from_value(np.random.randn(M,1,D)) # Underlying noiseless function # (K,M,N) x () F = SumMultiply('i,i', C, X, name='F') # # Mixing the models # # Observation noise tau = Gamma(1e-5, 1e-5, name='tau') tau.initialize_from_value(1e2) # Emission/observation distribution Y = GaussianARD(F, tau, name='Y') Q = VB(Y, F, Z, rho, V, C, gamma, X, A, alpha, tau) return Q def infer(y, D, K, 
          rotate=False,
          debug=False,
          maxiter=100,
          mask=True,
          plot_C=True,
          monitor=False,
          update_hyper=0,
          autosave=None):
    """
    Apply LSSM with switching dynamics to the given data.
    """

    (M, N) = np.shape(y)

    # Construct model
    Q = model(M=M, K=K, N=N, D=D)
    if not plot_C:
        Q['C'].set_plotter(None)

    if autosave is not None:
        Q.set_autosave(autosave, iterations=10)

    Q['Y'].observe(y, mask=mask)

    # Set up rotation speed-up
    if rotate:
        raise NotImplementedError()

        # Initial rotate the D-dimensional state space (X, A, C)
        # Do not update hyperparameters
        rotA_init = transformations.RotateGaussianARD(Q['A'])
        rotX_init = transformations.RotateSwitchingMarkovChain(Q['X'],
                                                               Q['A'],
                                                               Q['Z'],
                                                               rotA_init)
        rotC_init = transformations.RotateGaussianARD(Q['C'])
        R_init = transformations.RotationOptimizer(rotX_init, rotC_init, D)

        # Rotate the D-dimensional state space (X, A, C)
        rotA = transformations.RotateGaussianARD(Q['A'],
                                                 Q['alpha'])
        rotX = transformations.RotateSwitchingMarkovChain(Q['X'],
                                                          Q['A'],
                                                          Q['Z'],
                                                          rotA)
        rotC = transformations.RotateGaussianARD(Q['C'],
                                                 Q['gamma'])
        R = transformations.RotationOptimizer(rotX, rotC, D)

        if debug:
            rotate_kwargs = {'maxiter': 10,
                             'check_bound': True,
                             'check_gradient': True}
        else:
            rotate_kwargs = {'maxiter': 10}

    # Run inference
    if monitor:
        Q.plot()

    for n in range(maxiter):
        if n < update_hyper:
            Q.update('X', 'C', 'A', 'tau', 'Z', plot=monitor)
            if rotate:
                R_init.rotate(**rotate_kwargs)
        else:
            Q.update(plot=monitor)
            if rotate:
                R.rotate(**rotate_kwargs)

    return Q


def simulate_data(N):
    """
    Generate time-series data with switching dynamics.
""" # Two states: 1) oscillation, 2) random walk w1 = 0.02 * 2*np.pi A = [ [[np.cos(w1), -np.sin(w1)], [np.sin(w1), np.cos(w1)]], [[ 1.0, 0.0], [ 0.0, 0.0]] ] C = [[1.0, 0.0]] # State switching probabilities q = 0.993 # probability to stay in the same state r = (1-q)/(2-1) # probability to switch P = q*np.identity(2) + r*(np.ones((2,2))-np.identity(2)) X = np.zeros((N, 2)) Z = np.zeros(N) Y = np.zeros(N) F = np.zeros(N) z = np.random.randint(2) x = np.random.randn(2) Z[0] = z X[0,:] = x for n in range(1,N): x = np.dot(A[z], x) + np.random.randn(2) f = np.dot(C, x) y = f + 5*np.random.randn() z = np.random.choice(2, p=P[z]) Z[n] = z X[n,:] = x Y[n] = y F[n] = f Y = Y[None,:] return (Y, F) @bpplt.interactive def demo(N=1000, maxiter=100, D=3, K=2, seed=42, plot=True, debug=False, rotate=False, monitor=True): """ Run the demo for linear state-space model with switching dynamics. """ # Use deterministic random numbers if seed is not None: np.random.seed(seed) # Generate data (Y, F) = simulate_data(N) # Plot observations if plot: plt.figure() bpplt.timeseries(F, linestyle='-', color='b') bpplt.timeseries(Y, linestyle='None', color='r', marker='x') # Apply the linear state-space model with switching dynamics Q = infer(Y, D, K, debug=debug, maxiter=maxiter, monitor=monitor, rotate=rotate, update_hyper=5) # Show results if plot: Q.plot() return if __name__ == '__main__': import sys, getopt, os try: opts, args = getopt.getopt(sys.argv[1:], "", ["n=", "d=", "k=", "seed=", "debug", "no-rotation", "no-monitor", "no-plot", "maxiter="]) except getopt.GetoptError: print('python lssm_sd.py ') print('--n= Number of data vectors') print('--d= Latent space dimensionality') print('--k= Number of mixed models') print('--maxiter= Maximum number of VB iterations') print('--seed= Seed (integer) for the random number generator') print('--no-rotation Do not peform rotation speed ups') print('--no-plot Do not plot results') print('--no-monitor Do not plot distributions during VB learning') 
        print('--debug Check that the rotations are implemented correctly')
        sys.exit(2)

    kwargs = {}
    for opt, arg in opts:
        if opt == "--maxiter":
            kwargs["maxiter"] = int(arg)
        elif opt == "--d":
            kwargs["D"] = int(arg)
        elif opt == "--k":
            kwargs["K"] = int(arg)
        elif opt == "--seed":
            kwargs["seed"] = int(arg)
        elif opt == "--no-rotation":
            kwargs["rotate"] = False
        elif opt == "--no-monitor":
            kwargs["monitor"] = False
        elif opt == "--no-plot":
            kwargs["plot"] = False
        elif opt == "--debug":
            kwargs["debug"] = True
        elif opt in ("--n",):
            kwargs["N"] = int(arg)
        else:
            raise ValueError("Unhandled option given")

    demo(**kwargs)
    plt.show()


# File: bayespy-0.6.2/bayespy/demos/lssm_tvd.py

################################################################################
# Copyright (C) 2013-2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
Demonstrate the linear state-space model with time-varying dynamics.

The observation is a 1-D signal with changing frequency. The frequency
oscillates, so it can be learnt too. Missing values are used to create a few
gaps in the data, so the task is to reconstruct the gaps.

For reference, see the following publication: (TODO)

Some functions in this module are re-usable:

  * ``model`` can be used to construct the LSSM with time-varying dynamics.

  * ``infer`` can be used to apply the model to given data.
""" import numpy as np import scipy import matplotlib.pyplot as plt from bayespy.nodes import (GaussianMarkovChain, VaryingGaussianMarkovChain, GaussianARD, Gamma, SumMultiply) from bayespy.utils import misc from bayespy.utils import random from bayespy.inference.vmp.vmp import VB from bayespy.inference.vmp import transformations from bayespy.inference.vmp.nodes.gaussian import GaussianMoments import bayespy.plot as bpplt def model(M, N, D, K): """ Construct the linear state-space model with time-varying dynamics For reference, see the following publication: (TODO) """ # # The model block for the latent mixing weight process # # Dynamics matrix with ARD # beta : (K) x () beta = Gamma(1e-5, 1e-5, plates=(K,), name='beta') # B : (K) x (K) B = GaussianARD(np.identity(K), beta, shape=(K,), plates=(K,), name='B', plotter=bpplt.GaussianHintonPlotter(rows=0, cols=1, scale=0), initialize=False) B.initialize_from_value(np.identity(K)) # Mixing weight process, that is, the weights in the linear combination of # state dynamics matrices # S : () x (N,K) S = GaussianMarkovChain(np.ones(K), 1e-6*np.identity(K), B, np.ones(K), n=N, name='S', plotter=bpplt.GaussianMarkovChainPlotter(scale=2), initialize=False) s = 10*np.random.randn(N,K) s[:,0] = 10 S.initialize_from_value(s) # # The model block for the latent states # # Projection matrix of the dynamics matrix # alpha : (K) x () alpha = Gamma(1e-5, 1e-5, plates=(D,K), name='alpha') alpha.initialize_from_value(1*np.ones((D,K))) # A : (D) x (D,K) A = GaussianARD(0, alpha, shape=(D,K), plates=(D,), name='A', plotter=bpplt.GaussianHintonPlotter(rows=0, cols=1, scale=0), initialize=False) # Initialize S and A such that A*S is almost an identity matrix a = np.zeros((D,D,K)) a[np.arange(D),np.arange(D),np.zeros(D,dtype=int)] = 1 a[:,:,0] = np.identity(D) / s[0,0] a[:,:,1:] = 0.1/s[0,0]*np.random.randn(D,D,K-1) A.initialize_from_value(a) # Latent states with dynamics # X : () x (N,D) X = VaryingGaussianMarkovChain(np.zeros(D), # mean of 
x0 1e-3*np.identity(D), # prec of x0 A, # dynamics matrices S._ensure_moments(S, GaussianMoments, ndim=1)[1:], # temporal weights np.ones(D), # innovation n=N, # time instances name='X', plotter=bpplt.GaussianMarkovChainPlotter(scale=2), initialize=False) X.initialize_from_value(np.random.randn(N,D)) # # The model block for observations # # Mixing matrix from latent space to observation space using ARD # gamma : (D) x () gamma = Gamma(1e-5, 1e-5, plates=(D,), name='gamma') gamma.initialize_from_value(1e-2*np.ones(D)) # C : (M,1) x (D) C = GaussianARD(0, gamma, shape=(D,), plates=(M,1), name='C', plotter=bpplt.GaussianHintonPlotter(rows=0, cols=2, scale=0)) C.initialize_from_value(np.random.randn(M,1,D)) # Noiseless process # F : (M,N) x () F = SumMultiply('d,d', C, X, name='F') # Observation noise # tau : () x () tau = Gamma(1e-5, 1e-5, name='tau') tau.initialize_from_value(1e2) # Observations # Y: (M,N) x () Y = GaussianARD(F, tau, name='Y') # Construct inference machine Q = VB(Y, F, C, gamma, X, A, alpha, tau, S, B, beta) return Q def infer(y, D, K, mask=True, maxiter=100, rotate=False, debug=False, precompute=False, update_hyper=0, start_rotating=0, start_rotating_weights=0, plot_C=True, monitor=True, autosave=None): """ Run VB inference for linear state-space model with time-varying dynamics. 
""" y = misc.atleast_nd(y, 2) (M, N) = np.shape(y) # Construct the model Q = model(M, N, D, K) if not plot_C: Q['C'].set_plotter(None) if autosave is not None: Q.set_autosave(autosave, iterations=10) # Observe data Q['Y'].observe(y, mask=mask) # Set up rotation speed-up if rotate: raise NotImplementedError() # Initial rotate the D-dimensional state space (X, A, C) # Does not update hyperparameters rotA_init = transformations.RotateGaussianARD(Q['A'], axis=0, precompute=precompute) rotX_init = transformations.RotateVaryingMarkovChain(Q['X'], Q['A'], Q['S']._convert(GaussianMoments)[...,1:,None], rotA_init) rotC_init = transformations.RotateGaussianARD(Q['C'], axis=0, precompute=precompute) R_X_init = transformations.RotationOptimizer(rotX_init, rotC_init, D) # Rotate the D-dimensional state space (X, A, C) rotA = transformations.RotateGaussianARD(Q['A'], Q['alpha'], axis=0, precompute=precompute) rotX = transformations.RotateVaryingMarkovChain(Q['X'], Q['A'], Q['S']._convert(GaussianMoments)[...,1:,None], rotA) rotC = transformations.RotateGaussianARD(Q['C'], Q['gamma'], axis=0, precompute=precompute) R_X = transformations.RotationOptimizer(rotX, rotC, D) # Rotate the K-dimensional latent dynamics space (S, A, C) rotB = transformations.RotateGaussianARD(Q['B'], Q['beta'], precompute=precompute) rotS = transformations.RotateGaussianMarkovChain(Q['S'], rotB) rotA = transformations.RotateGaussianARD(Q['A'], Q['alpha'], axis=-1, precompute=precompute) R_S = transformations.RotationOptimizer(rotS, rotA, K) if debug: rotate_kwargs = {'maxiter': 10, 'check_bound': True, 'check_gradient': True} else: rotate_kwargs = {'maxiter': 10} # Plot initial distributions if monitor: Q.plot() # Run inference using rotations for ind in range(maxiter): if ind < update_hyper: # It might be a good idea to learn the lower level nodes a bit # before starting to learn the upper level nodes. 
Q.update('X', 'C', 'A', 'tau', plot=monitor) if rotate and ind >= start_rotating: # Use the rotation which does not update alpha nor beta R_X_init.rotate(**rotate_kwargs) else: Q.update(plot=monitor) if rotate and ind >= start_rotating: # It might be a good idea to not rotate immediately because it # might lead to pruning out components too efficiently before # even estimating them roughly R_X.rotate(**rotate_kwargs) if ind >= start_rotating_weights: R_S.rotate(**rotate_kwargs) # Return the posterior approximation return Q def simulate_data(N): """ Generate a signal with changing frequency """ t = np.arange(N) a = 0.1 * 2*np.pi # base frequency b = 0.01 * 2*np.pi # frequency of the frequency change c = 8 # magnitude of the frequency change f = np.sin( a * (t + c*np.sin(b*t)) ) y = f + 0.1*np.random.randn(N) return (y, f) @bpplt.interactive def demo(N=1000, D=5, K=4, seed=42, maxiter=200, rotate=False, debug=False, precompute=False, plot=True): # Seed for random number generator if seed is not None: np.random.seed(seed) # Create data (y, f) = simulate_data(N) # Create some gaps mask_gaps = misc.trues(N) for m in range(100, N, 140): start = m end = min(m+15, N-1) mask_gaps[start:end] = False # Randomly missing values mask_random = np.logical_or(random.mask(N, p=0.8), np.logical_not(mask_gaps)) # Remove the observations mask = np.logical_and(mask_gaps, mask_random) y[~mask] = np.nan # BayesPy doesn't require NaNs, they're just for plotting. # Add row axes y = y[None,...] f = f[None,...] mask = mask[None,...] mask_gaps = mask_gaps[None,...] mask_random = mask_random[None,...] 
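The missing-value construction above (periodic gaps plus random dropout) can be sketched in plain NumPy. Here `np.random.rand` stands in for `bayespy.utils.random.mask`, which is an assumption about that helper's behaviour:

```python
import numpy as np

np.random.seed(0)
N = 1000

# Periodic gaps: a 15-sample window starting every 140 samples.
mask_gaps = np.ones(N, dtype=bool)
for m in range(100, N, 140):
    mask_gaps[m:min(m + 15, N - 1)] = False

# Keep each remaining sample with probability 0.8; samples already inside a
# gap count as "kept" here so that the gaps alone define those missing values.
mask_random = np.logical_or(np.random.rand(N) < 0.8,
                            np.logical_not(mask_gaps))

# A sample is observed only if it survives both masks.
mask = np.logical_and(mask_gaps, mask_random)
print(mask.shape, int(mask.sum()))
```

Separating `mask_gaps` from `mask_random` is what lets the demo report two RMSE numbers: reconstruction inside long gaps versus interpolation of isolated missing values.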
    # Run the method
    Q = infer(y, D, K,
              mask=mask,
              maxiter=maxiter,
              rotate=rotate,
              debug=debug,
              precompute=precompute,
              update_hyper=10,
              start_rotating_weights=20,
              monitor=True)

    if plot:
        # Plot observations
        plt.figure()
        bpplt.timeseries_normal(Q['F'], scale=2)
        bpplt.timeseries(f, linestyle='-', color='b')
        bpplt.timeseries(y, linestyle='None', color='r', marker='.')
        plt.ylim([-2, 2])

        # Plot latent space
        Q.plot('X')

        # Plot mixing weight space
        Q.plot('S')

    # Compute RMSE
    rmse_random = misc.rmse(Q['Y'].get_moments()[0][~mask_random],
                            f[~mask_random])
    rmse_gaps = misc.rmse(Q['Y'].get_moments()[0][~mask_gaps],
                          f[~mask_gaps])
    print("RMSE for randomly missing values: %f" % rmse_random)
    print("RMSE for gap values: %f" % rmse_gaps)


if __name__ == '__main__':
    import sys, getopt, os
    try:
        opts, args = getopt.getopt(sys.argv[1:],
                                   "",
                                   ["n=",
                                    "d=",
                                    "k=",
                                    "seed=",
                                    "maxiter=",
                                    "debug",
                                    "precompute",
                                    "no-plot",
                                    "no-rotation"])
    except getopt.GetoptError:
        print('python lssm_tvd.py ')
        print('--n= Number of data vectors')
        print('--d= Dimensionality of the latent vectors in the model')
        print('--k= Dimensionality of the latent mixing weights')
        print('--no-rotation Do not apply speed-up rotations')
        print('--maxiter= Maximum number of VB iterations')
        print('--seed= Seed (integer) for the random number generator')
        print('--debug Check that the rotations are implemented correctly')
        print('--no-plot Do not plot results')
        print('--precompute Precompute some moments when rotating. May '
              'speed up or slow down.')
        sys.exit(2)

    kwargs = {}
    for opt, arg in opts:
        if opt == "--no-rotation":
            kwargs["rotate"] = False
        elif opt == "--maxiter":
            kwargs["maxiter"] = int(arg)
        elif opt == "--debug":
            kwargs["debug"] = True
        elif opt == "--precompute":
            kwargs["precompute"] = True
        elif opt == "--seed":
            kwargs["seed"] = int(arg)
        elif opt == "--n":
            kwargs["N"] = int(arg)
        elif opt == "--d":
            kwargs["D"] = int(arg)
        elif opt == "--k":
            if int(arg) == 0:
                kwargs["K"] = None
            else:
                kwargs["K"] = int(arg)
        elif opt == "--no-plot":
            kwargs["plot"] = False
        else:
            raise ValueError("Unhandled argument given")

    demo(**kwargs)
    plt.show()


# File: bayespy-0.6.2/bayespy/demos/mog.py

################################################################################
# Copyright (C) 2011-2012 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################ import numpy as np import matplotlib.pyplot as plt import time from bayespy.utils import misc import bayespy.plot as bpplt from bayespy.inference.vmp import nodes from bayespy.inference.vmp.vmp import VB def gaussianmix_model(N, K, D, covariance='full'): # N = number of data vectors # K = number of clusters # D = dimensionality # Construct the Gaussian mixture model # K prior weights (for components) alpha = nodes.Dirichlet(1e-3*np.ones(K), name='alpha') # N K-dimensional cluster assignments (for data) z = nodes.Categorical(alpha, plates=(N,), name='z') if covariance.lower() == 'full': # K D-dimensional component means X = nodes.GaussianARD(0, 1e-3, shape=(D,), plates=(K,), name='X') # K D-dimensional component covariances Lambda = nodes.Wishart(D, 0.01*np.identity(D), plates=(K,), name='Lambda') # N D-dimensional observation vectors Y = nodes.Mixture(z, nodes.Gaussian, X, Lambda, plates=(N,), name='Y') elif covariance.lower() == 'diagonal': # K D-dimensional component means X = nodes.GaussianARD(0, 1e-3, plates=(D,K,), name='X') # Inverse variances Lambda = nodes.Gamma(1e-3, 1e-3, plates=(D,K), name='Lambda') # N D-dimensional observation vectors Y = nodes.Mixture(z[...,None], nodes.GaussianARD, X, Lambda, plates=(N,D), name='Y') elif covariance.lower() == 'isotropic': # K D-dimensional component means X = nodes.GaussianARD(0, 1e-3, plates=(D,K,), name='X') # Inverse variances Lambda = nodes.Gamma(1e-3, 1e-3, plates=(D,K), name='Lambda') # N D-dimensional observation vectors Y = nodes.Mixture(z[...,None], nodes.GaussianARD, X, Lambda, plates=(N,D), name='Y') z.initialize_from_random() return VB(Y, X, Lambda, z, alpha) @bpplt.interactive def run(N=50, K=5, D=2): # Generate data N1 = int(np.floor(0.5*N)) N2 = N - N1 y = np.vstack([np.random.normal(0, 0.5, size=(N1,D)), np.random.normal(10, 0.5, size=(N2,D))]) # Construct model Q = gaussianmix_model(N,K,D) # Observe data 
Q['Y'].observe(y) # Run inference Q.update(repeat=30) # Run predictive model zh = nodes.Categorical(Q['alpha'], name='zh') Yh = nodes.Mixture(zh, nodes.Gaussian, Q['X'], Q['Lambda'], name='Yh') zh.update() # Plot predictive pdf N1 = 400 N2 = 400 x1 = np.linspace(-3, 15, N1) x2 = np.linspace(-3, 15, N2) xh = misc.grid(x1, x2) lpdf = Yh.integrated_logpdf_from_parents(xh, 0) pdf = np.reshape(np.exp(lpdf), (N2,N1)) plt.clf() plt.contourf(x1, x2, pdf, 100) plt.scatter(y[:,0], y[:,1]) print('integrated pdf:', np.sum(pdf)*(18*18)/(N1*N2)) #Q['X'].show() #Q['alpha'].show() if __name__ == '__main__': run() plt.show() ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/demos/pattern_search.py0000644000175100001770000000755000000000000022147 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2015 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ """ Demonstration of the pattern search method for PCA. The pattern searches are compared to standard VB-EM algorithm in CPU time. For more info on the pattern search method, see :cite:`Honkela:2002`. """ import numpy as np import scipy import matplotlib.pyplot as plt import bayespy.plot as myplt from bayespy.utils import misc from bayespy.utils import random from bayespy import nodes from bayespy.inference.vmp.vmp import VB from bayespy.inference.vmp import transformations import bayespy.plot as bpplt from bayespy.demos import pca def run(M=40, N=100, D_y=6, D=8, seed=42, rotate=False, maxiter=1000, debug=False, plot=True): """ Run pattern search demo for PCA. 
""" if seed is not None: np.random.seed(seed) # Generate data w = np.random.normal(0, 1, size=(M,1,D_y)) x = np.random.normal(0, 1, size=(1,N,D_y)) f = misc.sum_product(w, x, axes_to_sum=[-1]) y = f + np.random.normal(0, 0.2, size=(M,N)) # Construct model Q = pca.model(M, N, D) # Data with missing values mask = random.mask(M, N, p=0.5) # randomly missing y[~mask] = np.nan Q['Y'].observe(y, mask=mask) # Initialize some nodes randomly Q['X'].initialize_from_random() Q['W'].initialize_from_random() # Use a few VB-EM updates at the beginning Q.update(repeat=10) Q.save() # Standard VB-EM as a baseline Q.update(repeat=maxiter) if plot: bpplt.pyplot.plot(np.cumsum(Q.cputime), Q.L, 'k-') # Restore initial state Q.load() # Pattern search method for comparison for n in range(maxiter): Q.pattern_search('W', 'tau', maxiter=3, collapsed=['X', 'alpha']) Q.update(repeat=20) if Q.has_converged(): break if plot: bpplt.pyplot.plot(np.cumsum(Q.cputime), Q.L, 'r:') bpplt.pyplot.xlabel('CPU time (in seconds)') bpplt.pyplot.ylabel('VB lower bound') bpplt.pyplot.legend(['VB-EM', 'Pattern search'], loc='lower right') if __name__ == '__main__': import sys, getopt, os try: opts, args = getopt.getopt(sys.argv[1:], "", ["m=", "n=", "d=", "k=", "seed=", "maxiter=", "debug"]) except getopt.GetoptError: print('python demo_pca.py ') print('--m= Dimensionality of data vectors') print('--n= Number of data vectors') print('--d= Dimensionality of the latent vectors in the model') print('--k= Dimensionality of the true latent vectors') print('--maxiter= Maximum number of VB iterations') print('--seed= Seed (integer) for the random number generator') print('--debug Check that the rotations are implemented correctly') sys.exit(2) kwargs = {} for opt, arg in opts: if opt == "--rotate": kwargs["rotate"] = True elif opt == "--maxiter": kwargs["maxiter"] = int(arg) elif opt == "--debug": kwargs["debug"] = True elif opt == "--seed": kwargs["seed"] = int(arg) elif opt in ("--m",): kwargs["M"] = int(arg) elif 
opt in ("--n",): kwargs["N"] = int(arg) elif opt in ("--d",): kwargs["D"] = int(arg) elif opt in ("--k",): kwargs["D_y"] = int(arg) run(**kwargs) plt.show() ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/demos/pca.py0000644000175100001770000001106100000000000017700 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2011-2014 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ import numpy as np import matplotlib.pyplot as plt import bayespy.plot as myplt from bayespy.utils import misc from bayespy.utils import random from bayespy import nodes from bayespy.inference.vmp.vmp import VB from bayespy.inference.vmp import transformations import bayespy.plot as bpplt def model(M, N, D): # Construct the PCA model with ARD # ARD alpha = nodes.Gamma(1e-2, 1e-2, plates=(D,), name='alpha') # Loadings W = nodes.GaussianARD(0, alpha, shape=(D,), plates=(M,1), name='W') # States X = nodes.GaussianARD(0, 1, shape=(D,), plates=(1,N), name='X') # PCA F = nodes.SumMultiply('i,i', W, X, name='F') # Noise tau = nodes.Gamma(1e-2, 1e-2, name='tau') # Noisy observations Y = nodes.GaussianARD(F, tau, name='Y') # Initialize some nodes randomly X.initialize_from_random() W.initialize_from_random() return VB(Y, F, W, X, tau, alpha) @bpplt.interactive def run(M=10, N=100, D_y=3, D=5, seed=42, rotate=False, maxiter=1000, debug=False, plot=True): if seed is not None: np.random.seed(seed) # Generate data w = np.random.normal(0, 1, size=(M,1,D_y)) x = np.random.normal(0, 1, size=(1,N,D_y)) f = misc.sum_product(w, x, axes_to_sum=[-1]) y = f + np.random.normal(0, 0.1, size=(M,N)) # Construct model Q = model(M, N, D) # Data with missing values mask = random.mask(M, N, p=0.5) # randomly missing y[~mask] = np.nan Q['Y'].observe(y, mask=mask) # Run 
inference algorithm if rotate: # Use rotations to speed up learning rotW = transformations.RotateGaussianARD(Q['W'], Q['alpha']) rotX = transformations.RotateGaussianARD(Q['X']) R = transformations.RotationOptimizer(rotW, rotX, D) if debug: Q.callback = lambda : R.rotate(check_bound=True, check_gradient=True) else: Q.callback = R.rotate # Use standard VB-EM alone Q.update(repeat=maxiter) # Plot results if plot: plt.figure() bpplt.timeseries_normal(Q['F'], scale=2) bpplt.timeseries(f, color='g', linestyle='-') bpplt.timeseries(y, color='r', linestyle='None', marker='+') if __name__ == '__main__': import sys, getopt, os try: opts, args = getopt.getopt(sys.argv[1:], "", ["m=", "n=", "d=", "k=", "seed=", "maxiter=", "debug", "rotate"]) except getopt.GetoptError: print('python demo_pca.py ') print('--m= Dimensionality of data vectors') print('--n= Number of data vectors') print('--d= Dimensionality of the latent vectors in the model') print('--k= Dimensionality of the true latent vectors') print('--rotate Apply speed-up rotations') print('--maxiter= Maximum number of VB iterations') print('--seed= Seed (integer) for the random number generator') print('--debug Check that the rotations are implemented correctly') sys.exit(2) kwargs = {} for opt, arg in opts: if opt == "--rotate": kwargs["rotate"] = True elif opt == "--maxiter": kwargs["maxiter"] = int(arg) elif opt == "--debug": kwargs["debug"] = True elif opt == "--seed": kwargs["seed"] = int(arg) elif opt in ("--m",): kwargs["M"] = int(arg) elif opt in ("--n",): kwargs["N"] = int(arg) elif opt in ("--d",): kwargs["D"] = int(arg) elif opt in ("--k",): kwargs["D_y"] = int(arg) run(**kwargs) plt.show() ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/demos/saving.py0000644000175100001770000000767400000000000020443 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 
2011-2013 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ import numpy as np import matplotlib.pyplot as plt import h5py import tempfile import bayespy.plot as bpplt from bayespy.utils import misc from bayespy.utils import random from bayespy.inference.vmp import nodes from bayespy.inference.vmp.vmp import VB def pca_model(M, N, D): # Construct the PCA model with ARD # ARD alpha = nodes.Gamma(1e-2, 1e-2, plates=(D,), name='alpha') # Loadings W = nodes.Gaussian(np.zeros(D), alpha.as_diagonal_wishart(), name="W", plates=(M,1)) # States X = nodes.Gaussian(np.zeros(D), np.identity(D), name="X", plates=(1,N)) # PCA WX = nodes.Dot(W, X, name="WX") # Noise tau = nodes.Gamma(1e-2, 1e-2, name="tau", plates=()) # Noisy observations Y = nodes.GaussianARD(WX, tau, name="Y", plates=(M,N)) return (Y, WX, W, X, tau, alpha) @bpplt.interactive def run(M=10, N=100, D_y=3, D=5): seed = 45 print('seed =', seed) np.random.seed(seed) # Check HDF5 version. if h5py.version.hdf5_version_tuple < (1,8,7): print("WARNING! Your HDF5 version is %s. HDF5 versions <1.8.7 are not " "able to save empty arrays, thus you may experience problems if " "you for instance try to save before running any iteration steps." 
% str(h5py.version.hdf5_version_tuple)) # Generate data w = np.random.normal(0, 1, size=(M,1,D_y)) x = np.random.normal(0, 1, size=(1,N,D_y)) f = misc.sum_product(w, x, axes_to_sum=[-1]) y = f + np.random.normal(0, 0.5, size=(M,N)) # Construct model (Y, WX, W, X, tau, alpha) = pca_model(M, N, D) # Data with missing values mask = random.mask(M, N, p=0.9) # randomly missing mask[:,20:40] = False # gap missing y[~mask] = np.nan Y.observe(y, mask=mask) # Construct inference machine Q = VB(Y, W, X, tau, alpha, autosave_iterations=5) # Initialize some nodes randomly X.initialize_from_value(X.random()) W.initialize_from_value(W.random()) # Save the state into a HDF5 file filename = tempfile.NamedTemporaryFile(suffix='hdf5').name Q.update(X, W, alpha, tau, repeat=1) Q.save(filename=filename) # Inference loop. Q.update(X, W, alpha, tau, repeat=10) # Reload the state from the HDF5 file Q.load(filename=filename) # Inference loop again. Q.update(X, W, alpha, tau, repeat=10) # NOTE: Saving and loading requires that you have the model # constructed. "Save" does not store the model structure nor does "load" # read it. They are just used for reading and writing the contents of the # nodes. Thus, if you want to load, you first need to construct the same # model that was used for saving and then use load to set the states of the # nodes. 
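The NOTE above describes a contract worth emphasizing: ``save`` and ``load`` move node *contents* only, never model structure. A plain-NumPy analogue of that round-trip (using ``np.savez`` purely as an illustration; this is not BayesPy's actual HDF5 format, and the key names are hypothetical):

```python
import os
import tempfile
import numpy as np

# Snapshot only the numerical state (as Q.save does), not the model itself.
state = {"X_mean": np.zeros((3, 2)), "tau": np.array(100.0)}

fname = os.path.join(tempfile.mkdtemp(), "state.npz")
np.savez(fname, **state)

# Later: reconstruct the *same* model first, then restore the contents.
with np.load(fname) as data:
    restored = {key: data[key] for key in data.files}

print(sorted(restored.keys()))
```

As with BayesPy, loading into a differently structured snapshot would fail or silently mismatch, which is why the model must be rebuilt identically before calling load.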
    plt.clf()
    WX_params = WX.get_parameters()
    fh = WX_params[0] * np.ones(y.shape)
    err_fh = 2*np.sqrt(WX_params[1] + 1/tau.get_moments()[0]) * np.ones(y.shape)
    for m in range(M):
        plt.subplot(M,1,m+1)
        #errorplot(y, error=None, x=None, lower=None, upper=None):
        bpplt.errorplot(fh[m], x=np.arange(N), error=err_fh[m])
        plt.plot(np.arange(N), f[m], 'g')
        plt.plot(np.arange(N), y[m], 'r+')

    plt.figure()
    Q.plot_iteration_by_nodes()

    plt.figure()
    plt.subplot(2,2,1)
    bpplt.binary_matrix(W.mask)
    plt.subplot(2,2,2)
    bpplt.binary_matrix(X.mask)
    plt.subplot(2,2,3)
    #bpplt.binary_matrix(WX.get_mask())
    plt.subplot(2,2,4)
    bpplt.binary_matrix(Y.mask)


if __name__ == '__main__':
    run()
    plt.show()


# File: bayespy-0.6.2/bayespy/demos/stochastic_inference.py

################################################################################
# Copyright (C) 2015 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
Stochastic variational inference on mixture of Gaussians

Stochastic variational inference is a scalable variational Bayesian learning
method which utilizes stochastic gradient. For details, see :cite:`Hoffman:2013`.
"""

import numpy as np
import scipy
import matplotlib.pyplot as plt

import bayespy.plot as myplt
from bayespy.utils import misc
from bayespy.utils import random
from bayespy.nodes import Gaussian, Categorical, Mixture, Dirichlet
from bayespy.inference.vmp.vmp import VB
from bayespy.inference.vmp import transformations

import bayespy.plot as bpplt

from bayespy.demos import pca


def run(N=100000, N_batch=50, seed=42, maxiter=100, plot=True):
    """
    Run stochastic variational inference demo for a Gaussian mixture.
""" if seed is not None: np.random.seed(seed) # Number of clusters in the model K = 20 # Dimensionality of the data D = 5 # Generate data K_true = 10 spread = 5 means = spread * np.random.randn(K_true, D) z = random.categorical(np.ones(K_true), size=N) data = np.empty((N,D)) for n in range(N): data[n] = means[z[n]] + np.random.randn(D) # # Standard VB-EM algorithm # # Full model mu = Gaussian(np.zeros(D), np.identity(D), plates=(K,), name='means') alpha = Dirichlet(np.ones(K), name='class probabilities') Z = Categorical(alpha, plates=(N,), name='classes') Y = Mixture(Z, Gaussian, mu, np.identity(D), name='observations') # Break symmetry with random initialization of the means mu.initialize_from_random() # Put the data in Y.observe(data) # Run inference Q = VB(Y, Z, mu, alpha) Q.save(mu) Q.update(repeat=maxiter) if plot: bpplt.pyplot.plot(np.cumsum(Q.cputime), Q.L, 'k-') max_cputime = np.sum(Q.cputime[~np.isnan(Q.cputime)]) # # Stochastic variational inference # # Construct smaller model (size of the mini-batch) mu = Gaussian(np.zeros(D), np.identity(D), plates=(K,), name='means') alpha = Dirichlet(np.ones(K), name='class probabilities') Z = Categorical(alpha, plates=(N_batch,), plates_multiplier=(N/N_batch,), name='classes') Y = Mixture(Z, Gaussian, mu, np.identity(D), name='observations') # Break symmetry with random initialization of the means mu.initialize_from_random() # Inference engine Q = VB(Y, Z, mu, alpha, autosave_filename=Q.autosave_filename) Q.load(mu) # Because using mini-batches, messages need to be multiplied appropriately print("Stochastic variational inference...") Q.ignore_bound_checks = True maxiter *= int(N/N_batch) delay = 1 forgetting_rate = 0.7 for n in range(maxiter): # Observe a mini-batch subset = np.random.choice(N, N_batch) Y.observe(data[subset,:]) # Learn intermediate variables Q.update(Z) # Set step length step = (n + delay) ** (-forgetting_rate) # Stochastic gradient for the global variables Q.gradient_step(mu, alpha, scale=step) if 
np.sum(Q.cputime[:n]) > max_cputime: break if plot: bpplt.pyplot.plot(np.cumsum(Q.cputime), Q.L, 'r:') bpplt.pyplot.xlabel('CPU time (in seconds)') bpplt.pyplot.ylabel('VB lower bound') bpplt.pyplot.legend(['VB-EM', 'Stochastic inference'], loc='lower right') bpplt.pyplot.title('VB for Gaussian mixture model') return if __name__ == '__main__': import sys, getopt, os try: opts, args = getopt.getopt(sys.argv[1:], "", ["n=", "batch=", "seed=", "maxiter="]) except getopt.GetoptError: print('python stochastic_inference.py ') print('--n= Number of data points') print('--batch= Mini-batch size') print('--maxiter= Maximum number of VB iterations') print('--seed= Seed (integer) for the random number generator') sys.exit(2) kwargs = {} for opt, arg in opts: if opt == "--maxiter": kwargs["maxiter"] = int(arg) elif opt == "--seed": kwargs["seed"] = int(arg) elif opt in ("--n",): kwargs["N"] = int(arg) elif opt in ("--batch",): kwargs["N_batch"] = int(arg) run(**kwargs) plt.show() ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/discrete_example.py0000644000175100001770000000356600000000000021356 0ustar00runnerdocker00000000000000# This example could be simplified a little bit by using Bernoulli instead of # Categorical, but Categorical makes it possible to use more categories than # just TRUE and FALSE. 
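The conditional probability tables in this example are built with a small ``np.take`` trick; a standalone sketch of that trick for a two-parent OR gate (the function name ``or_table`` is hypothetical, mirroring the file's ``_or`` helper) is:

```python
import numpy as np

FALSE, TRUE = 0, 1

def or_table(p_false, p_true):
    # Entry [i, j] is the child's distribution given parent states (i, j):
    # both FALSE selects p_false, any other combination selects p_true.
    return np.take([p_false, p_true],
                   [[FALSE, TRUE],
                    [TRUE,  TRUE]],
                   axis=0)

table = or_table([0.96, 0.04], [0.115, 0.885])
print(table.shape)  # -> (2, 2, 2)
```

Because the index array has shape (2, 2) and the taken rows have length 2, the result is a (2, 2, 2) table: one length-2 distribution for each of the four parent-state combinations.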
import numpy as np from bayespy.nodes import Categorical, Mixture from bayespy.inference import VB # NOTE: Python's built-in booleans don't work nicely for indexing, thus define # own variables: FALSE = 0 TRUE = 1 def _or(p_false, p_true): """ Build probability table for OR-operation of two parents p_false: Probability table to use if both are FALSE p_true: Probability table to use if one or both is TRUE """ return np.take([p_false, p_true], [[FALSE, TRUE], [TRUE, TRUE]], axis=0) asia = Categorical([0.5, 0.5]) tuberculosis = Mixture(asia, Categorical, [[0.99, 0.01], [0.8, 0.2]]) smoking = Categorical([0.5, 0.5]) lung = Mixture(smoking, Categorical, [[0.98, 0.02], [0.25, 0.75]]) bronchitis = Mixture(smoking, Categorical, [[0.97, 0.03], [0.08, 0.92]]) xray = Mixture(tuberculosis, Mixture, lung, Categorical, _or([0.96, 0.04], [0.115, 0.885])) dyspnea = Mixture(bronchitis, Mixture, tuberculosis, Mixture, lung, Categorical, [_or([0.6, 0.4], [0.18, 0.82]), _or([0.11, 0.89], [0.04, 0.96])]) # Mark observations tuberculosis.observe(TRUE) smoking.observe(FALSE) bronchitis.observe(TRUE) # not a "chance" observation as in the original example # Run inference Q = VB(dyspnea, xray, bronchitis, lung, smoking, tuberculosis, asia) Q.update(repeat=100) # Show results print("P(asia):", asia.get_moments()[0][TRUE]) print("P(tuberculosis):", tuberculosis.get_moments()[0][TRUE]) print("P(smoking):", smoking.get_moments()[0][TRUE]) print("P(lung):", lung.get_moments()[0][TRUE]) print("P(bronchitis):", bronchitis.get_moments()[0][TRUE]) print("P(xray):", xray.get_moments()[0][TRUE]) print("P(dyspnea):", dyspnea.get_moments()[0][TRUE]) ././@PaxHeader0000000000000000000000000000003300000000000011451 xustar000000000000000027 mtime=1725273981.405372 bayespy-0.6.2/bayespy/inference/0000755000175100001770000000000000000000000017413 5ustar00runnerdocker00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 
bayespy-0.6.2/bayespy/inference/__init__.py

################################################################################
# Copyright (C) 2013 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################


"""
Package for Bayesian inference engines

Inference engines
-----------------

.. autosummary::
   :toctree: generated/

   VB

Parameter expansions
--------------------

.. autosummary::
   :toctree: generated/

   vmp.transformations.RotationOptimizer
   vmp.transformations.RotateGaussian
   vmp.transformations.RotateGaussianARD
   vmp.transformations.RotateGaussianMarkovChain
   vmp.transformations.RotateSwitchingMarkovChain
   vmp.transformations.RotateVaryingMarkovChain
   vmp.transformations.RotateMultiple
"""

from .vmp.vmp import VB

bayespy-0.6.2/bayespy/inference/vmp/__init__.py

bayespy-0.6.2/bayespy/inference/vmp/nodes/CovarianceFunctions.py

################################################################################
# Copyright (C) 2011-2012 Jaakko Luttinen
#
# This file is
licensed under the MIT License.
################################################################################


import itertools
import numpy as np
#import scipy as sp
import scipy.sparse as sp # prefer CSC format
#import scipy.linalg.decomp_cholesky as decomp
#import scipy.linalg as linalg
#import scipy.special as special
#import matplotlib.pyplot as plt
#import time
#import profile
import scipy.spatial.distance as dist
#import scikits.sparse.distance as spdist

from . import node as ef

from bayespy.utils import misc as utils

# Covariance matrices can be either arrays or matrices so be careful
# with products and powers! Use explicit multiply or dot instead of
# *-operator.


def gp_cov_se(D2, overwrite=False):
    if overwrite:
        K = D2
        K *= -0.5
        np.exp(K, out=K)
    else:
        K = np.exp(-0.5*D2)
    return K


def gp_cov_pp2_new(r, d, derivative=False):
    # Dimension dependent parameter
    q = 2
    j = np.floor(d/2) + q + 1

    # Polynomial coefficients
    a2 = j**2 + 4*j + 3
    a1 = 3*j + 6
    a0 = 3

    # Two parts of the covariance function
    k1 = (1-r) ** (j+2)
    k2 = (a2*r**2 + a1*r + 3)

    # TODO: Check that derivative is 0, 1 or 2!

    if derivative == 0:
        # Return covariance
        return k1 * k2 / 3

    dk1 = - (j+2) * (1-r)**(j+1)
    dk2 = 2*a2*r + a1

    if derivative == 1:
        # Return first derivative of the covariance
        return (k1 * dk2 + dk1 * k2) / 3

    ddk1 = (j+2) * (j+1) * (1-r)**j
    ddk2 = 2*a2

    if derivative == 2:
        # Return second derivative of the covariance
        return (ddk1*k2 + 2*dk1*dk2 + k1*ddk2) / 3


def gp_cov_pp2(r, d, gradient=False):
    # Dimension dependent parameter
    j = np.floor(d/2) + 2 + 1

    # Polynomial coefficients
    a2 = j**2 + 4*j + 3
    a1 = 3*j + 6
    a0 = 3

    # Two parts of the covariance function
    k1 = (1-r) ** (j+2)
    k2 = (a2*r**2 + a1*r + 3)

    # The covariance function
    k = k1 * k2 / 3

    if gradient:
        # The gradient w.r.t. r
        dk = k * (j+2) / (r-1) + k1 * (2*a2*r + a1) / 3
        return (k, dk)
    else:
        return k


def gp_cov_delta(N):
    # TODO: Use sparse matrices here!
    if N > 0:
        #print('in gpcovdelta', N, sp.identity(N).shape)
        return sp.identity(N)
    else:
        # Sparse matrices do not allow zero-length dimensions
        return np.identity(N)
    #return np.identity(N)
    #return np.asmatrix(np.identity(N))


def squared_distance(x1, x2):
    ## # Reshape arrays to 2-D arrays
    ## sh1 = np.shape(x1)[:-1]
    ## sh2 = np.shape(x2)[:-1]
    ## d = np.shape(x1)[-1]
    ## x1 = np.reshape(x1, (-1,d))
    ## x2 = np.reshape(x2, (-1,d))
    (m1,n1) = x1.shape
    (m2,n2) = x2.shape
    if m1 == 0 or m2 == 0:
        D2 = np.empty((m1,m2))
    else:
        # Compute squared Euclidean distance
        D2 = dist.cdist(x1, x2, metric='sqeuclidean')
        #D2 = np.asmatrix(D2)
    # Reshape the result
    #D2 = np.reshape(D2, sh1 + sh2)
    return D2

# General rule for the parameters for covariance functions:
#
# (value, [ [dvalue1, ...], [dvalue2, ...], [dvalue3, ...], ...])
#
# For instance,
#
# k = covfunc_se((1.0, []), (15, [ [1,update_grad] ]))
# K = k((x1, [ [dx1,update_grad] ]), (x2, []))
#
# Plain values are converted as:
# value -> (value, [])


def gp_standardize_input(x):
    if np.size(x) == 0:
        x = np.reshape(x, (0,0))
    elif np.ndim(x) == 0:
        x = np.reshape(x, (1,1))
    elif np.ndim(x) == 1:
        x = np.reshape(x, (-1,1))
    elif np.ndim(x) == 2:
        x = np.atleast_2d(x)
    else:
        raise Exception("Standard GP inputs must be 2-dimensional")
    return x


def gp_preprocess_inputs(x1, x2=None):
    if x2 is None:
        x1 = gp_standardize_input(x1)
        return x1
    else:
        if x1 is x2:
            x1 = gp_standardize_input(x1)
            x2 = x1
        else:
            x1 = gp_standardize_input(x1)
            x2 = gp_standardize_input(x2)
        return (x1, x2)

## def gp_preprocess_inputs(x1,x2=None):
##     #args = list(args)
##     #if len(args) < 1 or len(args) > 2:
##         #raise Exception("Number of inputs must be one or two")
##     if x2 is not None: len(args) == 2:
##         if args[0] is args[1]:
##             args[0] = gp_standardize_input(args[0])
##             args[1] = args[0]
##         else:
##             args[1] = gp_standardize_input(args[1])
##             args[0] =
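The `squared_distance` helper above is the workhorse for the squared-exponential covariance `covfunc_se` further down: the squared Euclidean distances it returns are divided by the lengthscale (before the distance computation) and pushed through exp(−½·), scaled by amplitude². A pure-NumPy sketch of that pipeline (the helper names here are illustrative, and SciPy's `cdist` is replaced by a broadcasting equivalent):

```python
import numpy as np

def squared_distance(x1, x2):
    # D2[i, j] = ||x1[i] - x2[j]||^2 -- broadcasting equivalent of
    # scipy.spatial.distance.cdist(x1, x2, metric='sqeuclidean')
    return np.sum((x1[:, None, :] - x2[None, :, :])**2, axis=-1)

def se_cov(amplitude, lengthscale, x1, x2):
    # Squared-exponential covariance, following the same formula as
    # covfunc_se: K = amplitude^2 * exp(-0.5 * D2 / lengthscale^2)
    D2 = squared_distance(x1 / lengthscale, x2 / lengthscale)
    return amplitude**2 * np.exp(-0.5 * D2)

x1 = np.array([[0.0], [1.5]])
x2 = np.array([[0.0]])
K = se_cov(2.0, 1.5, x1, x2)
# K[0, 0] = amplitude^2 = 4.0 (zero distance);
# K[1, 0] = 4.0 * exp(-0.5)   (one lengthscale apart)
```

Scaling the inputs by the lengthscale before computing distances (rather than scaling the distance matrix afterwards) is the same trick the module itself uses, since it lets the gradient w.r.t. the lengthscale be recovered from the already-computed distance matrix.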
gp_standardize_input(args[0])
##     else:
##         args[0] = gp_standardize_input(args[0])
##     return args


# TODO:
# General syntax for these covariance functions:
# covfunc(hyper1,
#         hyper2,
#         ...
#         hyperN,
#         x1,
#         x2=None,
#         gradient=list_of_booleans_for_each_hyperparameter)


def covfunc_zeros(x1, x2=None, gradient=False):

    # Compute distance and covariance matrix
    if x2 is None:
        x1 = gp_preprocess_inputs(x1)

        # Only variance vector asked
        N = np.shape(x1)[0]
        # TODO: Use sparse matrices!
        K = np.zeros(N)

    else:
        (x1,x2) = gp_preprocess_inputs(x1,x2)

        # Full covariance matrix asked
        # Number of inputs x1
        N1 = np.shape(x1)[0]
        N2 = np.shape(x2)[0]

        # TODO: Use sparse matrices!
        K = np.zeros((N1,N2))

    if gradient is not False:
        return (K, [])
    else:
        return K


def covfunc_delta(amplitude, x1, x2=None, gradient=False):

    # Make sure that amplitude is a scalar, not an array object
    amplitude = utils.array_to_scalar(amplitude)

    # Compute distance and covariance matrix
    if x2 is None:
        x1 = gp_preprocess_inputs(x1)

        # Only variance vector asked
        N = np.shape(x1)[0]
        K = np.ones(N) * amplitude**2

    else:
        (x1,x2) = gp_preprocess_inputs(x1,x2)

        # Full covariance matrix asked
        # Number of inputs x1
        N1 = np.shape(x1)[0]

        # x1 == x2?
        if x1 is x2:
            delta = True
            # Delta covariance
            #
            # FIXME: Broadcasting doesn't work with sparse matrices,
            # so must use scalar multiplication
            K = gp_cov_delta(N1) * amplitude**2
        else:
            delta = False
            # Number of inputs x2
            N2 = np.shape(x2)[0]

            # Zero covariance
            if N1 > 0 and N2 > 0:
                K = sp.csc_matrix((N1,N2))
            else:
                K = np.zeros((N1,N2))

    # Gradient w.r.t.
amplitude
    if gradient:
        # FIXME: Broadcasting doesn't work with sparse matrices,
        # so must use scalar multiplication
        gradient_amplitude = K*(2/amplitude)
        print("noise grad", gradient_amplitude)
        return (K, (gradient_amplitude,))
    else:
        return K


def covfunc_pp2(amplitude, lengthscale, x1, x2=None, gradient=False):

    # Make sure that hyperparameters are scalars, not an array objects
    amplitude = utils.array_to_scalar(amplitude)
    lengthscale = utils.array_to_scalar(lengthscale)

    # Compute covariance matrix
    if x2 is None:

        x1 = gp_preprocess_inputs(x1)

        # Compute variance vector
        K = np.ones(np.shape(x1)[:-1])
        K *= amplitude**2
        # Compute gradient w.r.t. lengthscale
        if gradient:
            gradient_lengthscale = np.zeros(np.shape(x1)[:-1])

    else:

        (x1,x2) = gp_preprocess_inputs(x1,x2)

        # Compute (sparse) distance matrix
        #
        # NOTE: spdist refers to the scikits.sparse.distance module whose
        # import is commented out at the top of this file, so this sparse
        # branch is not functional as-is.
        if x1 is x2:
            x1 = x1 / (lengthscale)
            x2 = x1
            D2 = spdist.pdist(x1, 1.0, form="full", format="csc")
        else:
            x1 = x1 / (lengthscale)
            x2 = x2 / (lengthscale)
            D2 = spdist.cdist(x1, x2, 1.0, format="csc")
        r = np.sqrt(D2.data)

        N1 = np.shape(x1)[0]
        N2 = np.shape(x2)[0]

        # Compute the covariances
        if gradient:
            (k, dk) = gp_cov_pp2(r, np.shape(x1)[-1], gradient=True)
        else:
            k = gp_cov_pp2(r, np.shape(x1)[-1])
        k *= amplitude**2

        # Compute gradient w.r.t. lengthscale
        if gradient:
            if N1 >= 1 and N2 >= 1:
                dk *= r * (-amplitude**2 / lengthscale)
                gradient_lengthscale = sp.csc_matrix((dk, D2.indices, D2.indptr),
                                                     shape=(N1,N2))
            else:
                gradient_lengthscale = np.empty((N1,N2))

        # Form sparse covariance matrix
        if N1 >= 1 and N2 >= 1:
            K = sp.csc_matrix((k, D2.indices, D2.indptr),
                              shape=(N1,N2))
        else:
            K = np.empty((N1, N2))

    # Gradient w.r.t.
amplitude
    if gradient:
        gradient_amplitude = K * (2 / amplitude)

    # Return values
    if gradient:
        print("pp2 grad", gradient_lengthscale)
        return (K, (gradient_amplitude, gradient_lengthscale))
    else:
        return K


def covfunc_se(amplitude, lengthscale, x1, x2=None, gradient=False):

    # Make sure that hyperparameters are scalars, not an array objects
    amplitude = utils.array_to_scalar(amplitude)
    lengthscale = utils.array_to_scalar(lengthscale)

    # Compute covariance matrix
    if x2 is None:

        x1 = gp_preprocess_inputs(x1)

        # Compute variance vector
        N = np.shape(x1)[0]
        K = np.ones(N)
        np.multiply(K, amplitude**2, out=K)
        # Compute gradient w.r.t. lengthscale
        if gradient:
            # TODO: Use sparse matrices?
            gradient_lengthscale = np.zeros(N)

    else:

        (x1,x2) = gp_preprocess_inputs(x1,x2)

        x1 = x1 / (lengthscale)
        x2 = x2 / (lengthscale)
        # Compute distance matrix
        K = squared_distance(x1, x2)
        # Compute gradient partly
        if gradient:
            gradient_lengthscale = np.divide(K, lengthscale)
        # Compute covariance matrix
        gp_cov_se(K, overwrite=True)
        np.multiply(K, amplitude**2, out=K)
        # Compute gradient w.r.t. lengthscale
        if gradient:
            gradient_lengthscale *= K

    # Gradient w.r.t. amplitude
    if gradient:
        gradient_amplitude = K * (2 / amplitude)

    # Return values
    if gradient:
        print("se grad", gradient_amplitude, gradient_lengthscale)
        return (K, (gradient_amplitude, gradient_lengthscale))
    else:
        return K


class CovarianceFunctionWrapper():
    def __init__(self, covfunc, *params):
        # Parse parameter values and their gradients to separate lists
        self.covfunc = covfunc
        self.params = list(params)
        self.gradient_params = list()
        for ind in range(len(params)):
            if isinstance(params[ind], tuple):
                # Parse the value and the list of gradients from the
                # form:
                # ([value, ...], [ [grad1, ...], [grad2, ...], ... ])
                self.gradient_params.append(params[ind][1])
                self.params[ind] = params[ind][0][0]
            else:
                # No gradients, parse from the form:
                # [value, ...]
self.gradient_params.append([]) self.params[ind] = params[ind][0] def fixed_covariance_function(self, *inputs, gradient=False): # What if this is called several times?? if gradient: ## grads = [[grad[0] for grad in self.gradient_params[ind]] ## for ind in range(len(self.gradient_params))] ## (K, dK) = self.covfunc(self.params, ## *inputs, ## gradient=self.gradient_params) arguments = tuple(self.params) + tuple(inputs) (K, dK) = self.covfunc(*arguments, gradient=True) ## (K, dK) = self.covfunc(self.params, ## *inputs, ## gradient=grads) DK = [] for ind in range(len(dK)): # Gradient w.r.t. covariance function's ind-th # hyperparameter dk = dK[ind] # Chain rule: Multiply by the gradient of the # hyperparameter w.r.t. parent node and append the # list DK: # DK = [ (dx1_1, callback), ..., (dx1_n, callback) ] for grad in self.gradient_params[ind]: #print(grad[0]) #print(grad[1:]) #print(dk) if sp.issparse(dk): print(dk.shape) print(grad[0].shape) DK += [ [dk.multiply(grad[0])] + grad[1:] ] else: DK += [ [np.multiply(dk,grad[0])] + grad[1:] ] #DK += [ [np.multiply(grad[0], dk)] + grad[1:] ] ## DK += [ (np.multiply(grad, dk),) + grad[1:] ## for grad in self.gradient_params[ind] ] ## for grad in self.gradient_params[ind]: ## DK += ( (np.multiply(grad, dk),) + grad[1:] ) ## DK = [] ## for ind in range(len(dK)): ## for (grad, dk) in zip(self.gradient_params[ind], dK[ind]): ## DK += [ [dk] + grad[1:] ] K = [K] return (K, DK) else: arguments = tuple(self.params) + tuple(inputs) #print(arguments) K = self.covfunc(*arguments, gradient=False) return [K] class CovarianceFunction(ef.Node): def __init__(self, covfunc, *args, **kwargs): self.covfunc = covfunc params = list(args) for i in range(len(args)): # Check constant parameters if utils.is_numeric(args[i]): params[i] = ef.NodeConstant([np.asanyarray(args[i])], dims=[np.shape(args[i])]) # TODO: Parameters could be constant functions? 
:) ef.Node.__init__(self, *params, dims=[(np.inf, np.inf)], **kwargs) def __call__(self, x1, x2): """ Compute covariance matrix for inputs x1 and x2. """ covfunc = self.message_to_child() return covfunc(x1, x2)[0] def message_to_child(self, gradient=False): params = [parent.message_to_child(gradient=gradient) for parent in self.parents] covfunc = self.get_fixed_covariance_function(*params) return covfunc def get_fixed_covariance_function(self, *params): get_cov_func = CovarianceFunctionWrapper(self.covfunc, *params) return get_cov_func.fixed_covariance_function ## def covariance_function(self, *params): ## # Parse parameter values and their gradients to separate lists ## params = list(params) ## gradient_params = list() ## print(params) ## for ind in range(len(params)): ## if isinstance(params[ind], tuple): ## # Parse the value and the list of gradients from the ## # form: ## # ([value, ...], [ [grad1, ...], [grad2, ...], ... ]) ## gradient_params.append(params[ind][1]) ## params[ind] = params[ind][0][0] ## else: ## # No gradients, parse from the form: ## # [value, ...] ## gradient_params.append([]) ## params[ind] = params[ind][0] ## # This gradient_params changes mysteriously.. 
## print('grad_params before') ## if isinstance(self, SquaredExponential): ## print(gradient_params) ## def cov(*inputs, gradient=False): ## if gradient: ## print('grad_params after') ## print(gradient_params) ## grads = [[grad[0] for grad in gradient_params[ind]] ## for ind in range(len(gradient_params))] ## print('CovarianceFunction.cov') ## #if isinstance(self, SquaredExponential): ## #print(self.__class__) ## #print(grads) ## (K, dK) = self.covfunc(params, ## *inputs, ## gradient=grads) ## for ind in range(len(dK)): ## for (grad, dk) in zip(gradient_params[ind], dK[ind]): ## grad[0] = dk ## K = [K] ## dK = [] ## for grad in gradient_params: ## dK += grad ## return (K, dK) ## else: ## K = self.covfunc(params, ## *inputs, ## gradient=False) ## return [K] ## return cov class Sum(CovarianceFunction): def __init__(self, *args, **kwargs): CovarianceFunction.__init__(self, None, *args, **kwargs) def get_fixed_covariance_function(self, *covfunc_parents): def covfunc(*inputs, gradient=False): K_sum = None if gradient: dK_sum = list() for k in covfunc_parents: if gradient: (K, dK) = k(*inputs, gradient=gradient) print("dK in sum", dK) dK_sum += dK #print("dK_sum in sum", dK_sum) else: K = k(*inputs, gradient=gradient) if K_sum is None: K_sum = K[0] else: try: K_sum += K[0] except: # You have to do this way, for instance, if # K_sum is sparse and K[0] is dense. 
K_sum = K_sum + K[0] if gradient: #print("dK_sum on: ", dK_sum) #print('covsum', dK_sum) return ([K_sum], dK_sum) else: return [K_sum] return covfunc class Delta(CovarianceFunction): def __init__(self, amplitude, **kwargs): CovarianceFunction.__init__(self, covfunc_delta, amplitude, **kwargs) class Zeros(CovarianceFunction): def __init__(self, **kwargs): CovarianceFunction.__init__(self, covfunc_zeros, **kwargs) class SquaredExponential(CovarianceFunction): def __init__(self, amplitude, lengthscale, **kwargs): CovarianceFunction.__init__(self, covfunc_se, amplitude, lengthscale, **kwargs) class PiecewisePolynomial2(CovarianceFunction): def __init__(self, amplitude, lengthscale, **kwargs): CovarianceFunction.__init__(self, covfunc_pp2, amplitude, lengthscale, **kwargs) # TODO: Rename to Blocks or Joint ? class Multiple(CovarianceFunction): def __init__(self, covfuncs, **kwargs): self.d = len(covfuncs) #self.sparse = sparse parents = [covfunc for row in covfuncs for covfunc in row] CovarianceFunction.__init__(self, None, *parents, **kwargs) def get_fixed_covariance_function(self, *covfuncs): def cov(*inputs, gradient=False): # Computes the covariance matrix from blocks which all # have their corresponding covariance functions if len(inputs) < 2: # For one input, return the variance vector instead of # the covariance matrix x1 = inputs[0] # Collect variance vectors from the covariance # functions corresponding to the diagonal blocks K = [covfuncs[i*self.d+i](x1[i], gradient=gradient)[0] for i in range(self.d)] # Form the variance vector from the collected vectors if gradient: raise Exception('Gradient not yet implemented.') else: ## print("in cov multiple") ## for (k,kf) in zip(K,covfuncs): ## print(np.shape(k), k.__class__, kf) #K = np.vstack(K) K = np.concatenate(K) else: x1 = inputs[0] x2 = inputs[1] # Collect the covariance matrix (and possibly # gradients) from each block. 
#print('cov mat collection begins') K = [[covfuncs[i*self.d+j](x1[i], x2[j], gradient=gradient) for j in range(self.d)] for i in range(self.d)] #print('cov mat collection ends') # Remove matrices that have zero length dimensions? if gradient: K = [[K[i][j] for j in range(self.d) if np.shape(K[i][j][0][0])[1] != 0] for i in range(self.d) if np.shape(K[i][0][0][0])[0] != 0] else: K = [[K[i][j] for j in range(self.d) if np.shape(K[i][j][0])[1] != 0] for i in range(self.d) if np.shape(K[i][0][0])[0] != 0] n_blocks = len(K) #print("nblocks", n_blocks) #print("K", K) # Check whether all blocks are sparse is_sparse = True for i in range(n_blocks): for j in range(n_blocks): if gradient: A = K[i][j][0][0] else: A = K[i][j][0] if not sp.issparse(A): is_sparse = False if gradient: ## Compute the covariance matrix and the gradients # Create block matrices of zeros. This helps in # computing the gradient. if is_sparse: # Empty sparse matrices. Some weird stuff here # because sparse matrices can't have zero # length dimensions. Z = [[sp.csc_matrix(np.shape(K[i][j][0][0])) for j in range(n_blocks)] for i in range(n_blocks)] else: # Empty dense matrices Z = [[np.zeros(np.shape(K[i][j][0][0])) for j in range(n_blocks)] for i in range(n_blocks)] ## for j in range(self.d)] ## for i in range(self.d)] # Compute gradients block by block dK = list() for i in range(n_blocks): for j in range(n_blocks): # Store the zero block z_old = Z[i][j] # Go through the gradients for the (i,j) # block for dk in K[i][j][1]: # Keep other blocks at zero and set # the gradient to (i,j) block. 
Form # the matrix from blocks if is_sparse: Z[i][j] = dk[0] dk[0] = sp.bmat(Z).tocsc() else: if sp.issparse(dk[0]): Z[i][j] = dk[0].toarray() else: Z[i][j] = dk[0] #print("Z on:", Z) dk[0] = np.asarray(np.bmat(Z)) # Append the computed gradient matrix # to the list of gradients dK.append(dk) # Restore the zero block Z[i][j] = z_old ## Compute the covariance matrix but not the ## gradients if is_sparse: # Form the full sparse covariance matrix from # blocks. Ignore blocks having a zero-length # axis because sparse matrices consider zero # length as an invalid shape (BUG IN SCIPY?). K = [[K[i][j][0][0] for j in range(n_blocks)] for i in range(n_blocks)] K = sp.bmat(K).tocsc() else: # Form the full dense covariance matrix from # blocks. Transform sparse blocks to dense # blocks. K = [[K[i][j][0][0] if not sp.issparse(K[i][j][0][0]) else K[i][j][0][0].toarray() for j in range(n_blocks)] for i in range(n_blocks)] K = np.asarray(np.bmat(K)) else: ## Compute the covariance matrix but not the ## gradients if is_sparse: # Form the full sparse covariance matrix from # blocks. Ignore blocks having a zero-length # axis because sparse matrices consider zero # length as an invalid shape (BUG IN SCIPY?). K = [[K[i][j][0] for j in range(n_blocks)] for i in range(n_blocks)] K = sp.bmat(K).tocsc() else: # Form the full dense covariance matrix from # blocks. Transform sparse blocks to dense # blocks. 
K = [[K[i][j][0] if not sp.issparse(K[i][j][0]) else K[i][j][0].toarray() for j in range(n_blocks)] for i in range(n_blocks)] K = np.asarray(np.bmat(K)) if gradient: return ([K], dK) else: return [K] return cov ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/GaussianProcesses.py0000644000175100001770000006254100000000000025350 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2011-2012 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ import itertools import numpy as np #import scipy as sp #import scipy.linalg.decomp_cholesky as decomp import scipy.linalg as linalg #import scipy.special as special #import matplotlib.pyplot as plt #import time #import profile #import scipy.spatial.distance as distance import scipy.sparse as sp from bayespy.utils import misc as utils from . import node as EF from . import CovarianceFunctions as CF class CovarianceMatrix: def cholesky(self): pass def multiply(A, B): return np.multiply(A,B) # m prior mean function # k prior covariance function # x data inputs # z processed data outputs (z = inv(Cov) * (y-m(x))) # U data covariance Cholesky factor def gp_posterior_moment_function(m, k, x, y, k_sparse=None, pseudoinputs=None, noise=None): # Prior # FIXME: We are ignoring the covariance of mu now.. 
    mu = m(x)[0]

    K_noise = None

    if noise is not None:
        if K_noise is None:
            K_noise = noise
        else:
            K_noise += noise

    if k_sparse is not None:
        if K_noise is None:
            K_noise = k_sparse(x,x)[0]
        else:
            K_noise += k_sparse(x,x)[0]

    if pseudoinputs is not None:
        p = pseudoinputs

        K_pp = k(p,p)[0]
        K_xp = k(x,p)[0]
        U = utils.chol(K_noise)

        # Compute Lambda
        Lambda = K_pp + np.dot(K_xp.T,
                               utils.chol_solve(U, K_xp))
        U_lambda = utils.chol(Lambda)

        # Compute statistics for posterior predictions
        z = utils.chol_solve(U_lambda,
                             np.dot(K_xp.T,
                                    utils.chol_solve(U,
                                                     y - mu)))
        U = utils.chol(K_pp)

        # Now we can forget the location of the observations and
        # consider only the pseudoinputs when predicting.
        x = p

    else:
        K = K_noise
        if K is None:
            K = k(x,x)[0]
        else:
            try:
                K += k(x,x)[0]
            except:
                K = K + k(x,x)[0]

        # Compute posterior GP
        N = len(y)
        U = None
        z = None
        if N > 0:
            U = utils.chol(K)
            z = utils.chol_solve(U, y-mu)

    def get_moments(h, covariance=1, mean=True):

        K_xh = k(x, h)[0]
        if k_sparse is not None:
            try:
                # This may not work, for instance, if either one is a
                # sparse matrix.
                K_xh += k_sparse(x, h)[0]
            except:
                K_xh = K_xh + k_sparse(x, h)[0]

        # NumPy has problems when mixing matrices and arrays.
        # Matrices may appear, for instance, when you sum an array and
        # a sparse matrix.
Make sure the result is either an array or # a sparse matrix (not dense matrix!), because matrix objects # cause lots of problems: # # array.dot(array) = array # matrix.dot(array) = matrix # sparse.dot(array) = array if not sp.issparse(K_xh): K_xh = np.asarray(K_xh) # Function for computing posterior moments if mean: # Mean vector # FIXME: Ignoring the covariance of prior mu m_h = m(h)[0] if z != None: m_h += K_xh.T.dot(z) else: m_h = None # Compute (co)variance matrix/vector if covariance: if covariance == 1: ## Compute variance vector k_h = k(h)[0] if k_sparse != None: k_h += k_sparse(h)[0] if U != None: if isinstance(K_xh, np.ndarray): k_h -= np.einsum('i...,i...', K_xh, utils.chol_solve(U, K_xh)) else: # TODO: This isn't very efficient way, but # einsum doesn't work for sparse matrices.. # This may consume A LOT of memory for sparse # matrices. k_h -= np.asarray(K_xh.multiply(utils.chol_solve(U, K_xh))).sum(axis=0) if pseudoinputs != None: if isinstance(K_xh, np.ndarray): k_h += np.einsum('i...,i...', K_xh, utils.chol_solve(U_lambda, K_xh)) else: # TODO: This isn't very efficient way, but # einsum doesn't work for sparse matrices.. # This may consume A LOT of memory for sparse # matrices. 
k_h += np.asarray(K_xh.multiply(utils.chol_solve(U_lambda, K_xh))).sum(axis=0) # Ensure non-negative variances k_h[k_h<0] = 0 return (m_h, k_h) elif covariance == 2: ## Compute full covariance matrix K_hh = k(h,h)[0] if k_sparse != None: K_hh += k_sparse(h)[0] if U != None: K_hh -= K_xh.T.dot(utils.chol_solve(U,K_xh)) #K_hh -= np.dot(K_xh.T, utils.chol_solve(U,K_xh)) if pseudoinputs != None: K_hh += K_xh.T.dot(utils.chol_solve(U_lambda, K_xh)) #K_hh += np.dot(K_xh.T, utils.chol_solve(U_lambda, K_xh)) return (m_h, K_hh) else: return (m_h, None) return get_moments # Constant function using GP mean protocol class Constant(EF.Node): def __init__(self, f, **kwargs): self.f = f EF.Node.__init__(self, dims=[(np.inf,)], **kwargs) def message_to_child(self, gradient=False): # Wrapper def func(x, gradient=False): if gradient: return ([self.f(x), None], []) else: return [self.f(x), None] return func #class MultiDimensional(EF.NodeVariable): # """ A multi-dimensional Gaussian process f(x). """ ## class ToGaussian(EF.NodeVariable): ## """ Deterministic node which transform a Gaussian process into ## finite-dimensional Gaussian variable. """ ## def __init__(self, f, x, **kwargs): ## EF.NodeVariable.__init__(self, ## f, ## x, ## plates= ## dims= # Deterministic node for creating a set of GPs which can be used as a # mean function to a general GP node. class Multiple(EF.Node): def __init__(self, GPs, **kwargs): # Ignore plates EF.NodeVariable.__init__(self, *GPs, plates=(), dims=[(np.inf,), (np.inf,np.inf)], **kwargs) def message_to_parent(self, index): raise Exception("not implemented yet") def message_to_child(self, gradient=False): u = [parent.message_to_child() for parent in self.parents] def get_moments(xh, **kwargs): mh_all = [] khh_all = [] for i in range(len(self.parents)): xi = np.array(xh[i]) #print(xi) #print(np.shape(xi)) #print(xi) # FIXME: We are ignoring the covariance of mu now.. 
if gradient: ((mh, khh), dm) = u[i](xi, **kwargs) else: (mh, khh) = u[i](xi, **kwargs) #mh = u[i](xi, **kwargs)[0] #print(mh) #print(mh_all) ## print(mh) ## print(khh) ## print(np.shape(mh)) mh_all = np.concatenate([mh_all, mh]) #print(np.shape(mh_all)) if khh != None: print(khh) raise Exception('Not implemented yet for covariances') #khh_all = np.concatenate([khh_all, khh]) # FIXME: Compute gradients! if gradient: return ([mh_all, khh_all], []) else: return [mh_all, khh_all] #return [mh_all, khh_all] return get_moments # Gaussian process distribution class GaussianProcess(EF.Node): def __init__(self, m, k, k_sparse=None, pseudoinputs=None, **kwargs): self.x = np.array([]) self.f = np.array([]) ## self.x_obs = np.zeros((0,1)) ## self.f_obs = np.zeros((0,)) if pseudoinputs != None: pseudoinputs = EF.NodeConstant([pseudoinputs], dims=[np.shape(pseudoinputs)]) # By default, posterior == prior self.m = None #m self.k = None #k if isinstance(k, list) and isinstance(m, list): if len(k) != len(m): raise Exception('The number of mean and covariance functions must be equal.') k = CF.Multiple(k) m = Multiple(m) elif isinstance(k, list): D = len(k) k = CF.Multiple(k) m = Multiple(D*[m]) elif isinstance(m, list): D = len(m) k = CF.Multiple(D*[k]) m = Multiple(m) # Ignore plates EF.NodeVariable.__init__(self, m, k, k_sparse, pseudoinputs, plates=(), dims=[(np.inf,), (np.inf,np.inf)], **kwargs) def __call__(self, x, covariance=None): if not covariance: return self.u(x, covariance=False)[0] elif covariance.lower() == 'vector': return self.u(x, covariance=1) elif covariance.lower() == 'matrix': return self.u(x, covariance=2) else: raise Exception("Unknown covariance type requested") def message_to_parent(self, index): if index == 0: k = self.parents[1].message_to_child()[0] K = k(self.x, self.x) return [self.x, self.mu, K] if index == 1: raise Exception("not implemented yet") def message_to_child(self): if self.observed: raise Exception("Observable GP should not have children.") 
return self.u def get_parameters(self): return self.u def observe(self, x, f): self.observed = True self.x = x self.f = f ## if np.ndim(f) == 1: ## self.f = np.asmatrix(f).T ## else: ## self.f = np.asmatrix(f) # You might want: # - mean for x # - covariance (and mean) for x # - variance (and mean) for x # - i.e., mean and/or (co)variance for x # - covariance for x1 and x2 def lower_bound_contribution(self, gradient=False): # Get moment functions from parents m = self.parents[0].message_to_child(gradient=gradient) k = self.parents[1].message_to_child(gradient=gradient) if self.parents[2]: k_sparse = self.parents[2].message_to_child(gradient=gradient) else: k_sparse = None if self.parents[3]: pseudoinputs = self.parents[3].message_to_child(gradient=gradient) #pseudoinputs = self.parents[3].message_to_child(gradient=gradient)[0] else: pseudoinputs = None ## m = self.parents[0].message_to_child(gradient=gradient)[0] ## k = self.parents[1].message_to_child(gradient=gradient)[0] # Compute the parameters (covariance matrices etc) using # parents' moment functions DKs_xx = [] DKd_xx = [] DKd_xp = [] DKd_pp = [] Dxp = [] Dmu = [] if gradient: # FIXME: We are ignoring the covariance of mu now.. ((mu, _), Dmu) = m(self.x, gradient=True) ## if k_sparse: ## ((Ks_xx,), DKs_xx) = k_sparse(self.x, self.x, gradient=True) if pseudoinputs: ((Ks_xx,), DKs_xx) = k_sparse(self.x, self.x, gradient=True) ((xp,), Dxp) = pseudoinputs ((Kd_pp,), DKd_pp) = k(xp,xp, gradient=True) ((Kd_xp,), DKd_xp) = k(self.x, xp, gradient=True) else: ((K_xx,), DKd_xx) = k(self.x, self.x, gradient=True) if k_sparse: ((Ks_xx,), DKs_xx) = k_sparse(self.x, self.x, gradient=True) try: K_xx += Ks_xx except: K_xx = K_xx + Ks_xx else: # FIXME: We are ignoring the covariance of mu now.. 
(mu, _) = m(self.x) ## if k_sparse: ## (Ks_xx,) = k_sparse(self.x, self.x) if pseudoinputs: (Ks_xx,) = k_sparse(self.x, self.x) (xp,) = pseudoinputs (Kd_pp,) = k(xp, xp) (Kd_xp,) = k(self.x, xp) else: (K_xx,) = k(self.x, self.x) if k_sparse: (Ks_xx,) = k_sparse(self.x, self.x) try: K_xx += Ks_xx except: K_xx = K_xx + Ks_xx mu = mu[0] #K = K[0] # Log pdf if self.observed: ## Log pdf for directly observed GP f0 = self.f - mu #print('hereiam') #print(K) if pseudoinputs: ## Pseudo-input approximation # Decompose the full-rank sparse/noise covariance matrix try: Us_xx = utils.cholesky(Ks_xx) except linalg.LinAlgError: print('Noise/sparse covariance not positive definite') return -np.inf # Use Woodbury-Sherman-Morrison formula with the # following notation: # # y2 = f0' * inv(Kd_xp*inv(Kd_pp)*Kd_xp' + Ks_xx) * f0 # # z = Ks_xx \ f0 # Lambda = Kd_pp + Kd_xp'*inv(Ks_xx)*Kd_xp # nu = inv(Lambda) * (Kd_xp' * (Ks_xx \ f0)) # rho = Kd_xp * inv(Lambda) * (Kd_xp' * (Ks_xx \ f0)) # # y2 = f0' * z - z' * rho z = Us_xx.solve(f0) Lambda = Kd_pp + np.dot(Kd_xp.T, Us_xx.solve(Kd_xp)) ## z = utils.chol_solve(Us_xx, f0) ## Lambda = Kd_pp + np.dot(Kd_xp.T, ## utils.chol_solve(Us_xx, Kd_xp)) try: U_Lambda = utils.cholesky(Lambda) #U_Lambda = utils.chol(Lambda) except linalg.LinAlgError: print('Lambda not positive definite') return -np.inf nu = U_Lambda.solve(np.dot(Kd_xp.T, z)) #nu = utils.chol_solve(U_Lambda, np.dot(Kd_xp.T, z)) rho = np.dot(Kd_xp, nu) y2 = np.dot(f0, z) - np.dot(z, rho) # Use matrix determinant lemma # # det(Kd_xp*inv(Kd_pp)*Kd_xp' + Ks_xx) # = det(Kd_pp + Kd_xp'*inv(Ks_xx)*Kd_xp) # * det(inv(Kd_pp)) * det(Ks_xx) # = det(Lambda) * det(Ks_xx) / det(Kd_pp) try: Ud_pp = utils.cholesky(Kd_pp) #Ud_pp = utils.chol(Kd_pp) except linalg.LinAlgError: print('Covariance of pseudo inputs not positive definite') return -np.inf logdet = (U_Lambda.logdet() + Us_xx.logdet() - Ud_pp.logdet()) ## logdet = (utils.logdet_chol(U_Lambda) ## + utils.logdet_chol(Us_xx) ## - 
utils.logdet_chol(Ud_pp)) # Compute the log pdf L = gaussian_logpdf(y2, 0, 0, logdet, np.size(self.f)) # Add the variational cost of the pseudo-input # approximation # Compute gradients for (dmu, func) in Dmu: # Derivative w.r.t. mean vector d = np.nan # Send the derivative message func(d) for (dKs_xx, func) in DKs_xx: # Compute derivative w.r.t. covariance matrix d = np.nan # Send the derivative message func(d) for (dKd_xp, func) in DKd_xp: # Compute derivative w.r.t. covariance matrix d = np.nan # Send the derivative message func(d) V = Ud_pp.solve(Kd_xp.T) Z = Us_xx.solve(V.T) ## V = utils.chol_solve(Ud_pp, Kd_xp.T) ## Z = utils.chol_solve(Us_xx, V.T) for (dKd_pp, func) in DKd_pp: # Compute derivative w.r.t. covariance matrix d = (0.5 * np.trace(Ud_pp.solve(dKd_pp)) - 0.5 * np.trace(U_Lambda.solve(dKd_pp)) + np.dot(nu, np.dot(dKd_pp, nu)) + np.trace(np.dot(dKd_pp, np.dot(V,Z)))) ## d = (0.5 * np.trace(utils.chol_solve(Ud_pp, dKd_pp)) ## - 0.5 * np.trace(utils.chol_solve(U_Lambda, dKd_pp)) ## + np.dot(nu, np.dot(dKd_pp, nu)) ## + np.trace(np.dot(dKd_pp, ## np.dot(V,Z)))) # Send the derivative message func(d) for (dxp, func) in Dxp: # Compute derivative w.r.t. covariance matrix d = np.nan # Send the derivative message func(d) else: ## Full exact (no pseudo approximations) try: U = utils.cholesky(K_xx) #U = utils.chol(K_xx) except linalg.LinAlgError: print('non positive definite, return -inf') return -np.inf z = U.solve(f0) #z = utils.chol_solve(U, f0) #print(K) L = utils.gaussian_logpdf(np.dot(f0, z), 0, 0, U.logdet(), ## utils.logdet_chol(U), np.size(self.f)) for (dmu, func) in Dmu: # Derivative w.r.t. mean vector d = -np.sum(z) # Send the derivative message func(d) for (dK, func) in DKd_xx: # Compute derivative w.r.t. covariance matrix # # TODO: trace+chol_solve should be handled better # for sparse matrices. Use sparse-inverse! 
d = 0.5 * (dK.dot(z).dot(z) - U.trace_solve_gradient(dK)) ## - np.trace(U.solve(dK))) ## d = 0.5 * (dK.dot(z).dot(z) ## - np.trace(utils.chol_solve(U, dK))) #print('derivate', d, dK) ## d = 0.5 * (np.dot(z, np.dot(dK, z)) ## - np.trace(utils.chol_solve(U, dK))) # # Send the derivative message func(d) for (dK, func) in DKs_xx: # Compute derivative w.r.t. covariance matrix d = 0.5 * (dK.dot(z).dot(z) - U.trace_solve_gradient(dK)) ## - np.trace(U.solve(dK))) ## d = 0.5 * (dK.dot(z).dot(z) ## - np.trace(utils.chol_solve(U, dK))) ## d = 0.5 * (np.dot(z, np.dot(dK, z)) ## - np.trace(utils.chol_solve(U, dK))) # Send the derivative message func(d) else: ## Log pdf for latent GP raise Exception('Not implemented yet') return L ## Let f1 be observed and f2 latent function values. # Compute #L = gaussian_logpdf(sum_product(np.outer(self.f,self.f) + self.Cov, # Compute def update(self): # Messages from parents m = self.parents[0].message_to_child() k = self.parents[1].message_to_child() if self.parents[2]: k_sparse = self.parents[2].message_to_child() else: k_sparse = None if self.parents[3]: pseudoinputs = self.parents[3].message_to_child()[0] else: pseudoinputs = None ## m = self.parents[0].message_to_child()[0] ## k = self.parents[1].message_to_child()[0] if self.observed: # Observations of this node self.u = gp_posterior_moment_function(m, k, self.x, self.f, k_sparse=k_sparse, pseudoinputs=pseudoinputs) else: x = np.array([]) y = np.array([]) # Messages from children for (child,index) in self.children: (msg, mask) = child.message_to_parent(index) # Ignoring masks and plates.. # m[0] is the inputs x = np.concatenate((x, msg[0]), axis=-2) # m[1] is the observations y = np.concatenate((y, msg[1])) # m[2] is the covariance matrix V = linalg.block_diag(V, msg[2]) self.u = gp_posterior_moment_function(m, k, x, y, covariance=V) self.x = x self.f = y # At least for now, simplify this GP node such that a GP is either # observed or latent. 
If it is observed, it doesn't take messages from # children, actually, it should not even have children! ## # Pseudo for GPFA: ## k1 = gp_cov_se(magnitude=theta1, lengthscale=theta2) ## k2 = gp_cov_periodic(magnitude=.., lengthscale=.., period=..) ## k3 = gp_cov_rq(magnitude=.., lengthscale=.., alpha=..) ## f = NodeGPSet(0, [k1,k2,k3]) # assumes block diagonality ## # f = NodeGPSet(0, [[k11,k12,k13],[k21,k22,k23],[k31,k32,k33]]) ## X = GaussianFromGP(f, [ [[t0,0],[t0,1],[t0,2]], [t1,0],[t1,1],[t1,2], ..]) ## ... ## # Construct a sum of GPs if interested only in the sum term ## k1 = gp_cov_se(magnitude=theta1, lengthscale=theta2) ## k2 = gp_cov_periodic(magnitude=.., lengthscale=.., period=..) ## k = gp_cov_sum(k1, k2) ## f = NodeGP(0, k) ## f.observe(x, y) ## f.update() ## (mp, kp) = f.get_parameters() ## # Construct a sum of GPs when interested also in the individual ## # GPs: ## k1 = gp_cov_se(magnitude=theta1, lengthscale=theta2) ## k2 = gp_cov_periodic(magnitude=.., lengthscale=.., period=..) ## k3 = gp_cov_delta(magnitude=theta3) ## f = NodeGPSum(0, [k1,k2,k3]) ## x = np.array([1,2,3,4,5,6,7,8,9,10]) ## y = np.sin(x[0]) + np.random.normal(0, 0.1, (10,)) ## # Observe the sum (index 0) ## f.observe((0,x), y) ## # Inference ## f.update() ## (mp, kp) = f.get_parameters() ## # Mean of the sum ## mp[0](...) ## # Mean of the individual terms ## mp[1](...) ## mp[2](...) ## mp[3](...) ## # Covariance of the sum ## kp[0][0](..., ...) ## # Other covariances ## kp[1][1](..., ...) ## kp[2][2](..., ...) ## kp[3][3](..., ...) ## kp[1][2](..., ...) ## kp[1][3](..., ...) ## kp[2][3](..., ...) 
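The pseudo-input branch of `lower_bound_contribution` above relies on the Woodbury (Sherman-Morrison-Woodbury) identity and the matrix determinant lemma, exactly as spelled out in its comments (`Lambda = Kd_pp + Kd_xp'*inv(Ks_xx)*Kd_xp`, `det(...) = det(Lambda)*det(Ks_xx)/det(Kd_pp)`). A standalone NumPy sanity check of both identities on small random matrices (the matrix names mirror the comments; the sizes and values are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)

# Small test case: Kd_xp (n x p), Kd_pp (p x p, SPD), Ks_xx (n x n, SPD noise)
n, p = 5, 2
Kd_xp = rng.standard_normal((n, p))
A = rng.standard_normal((p, p))
Kd_pp = A @ A.T + p * np.eye(p)             # symmetric positive definite
Ks_xx = np.diag(rng.uniform(0.5, 1.5, n))   # diagonal noise covariance

# Full covariance of the pseudo-input approximation
K = Kd_xp @ np.linalg.solve(Kd_pp, Kd_xp.T) + Ks_xx

# Woodbury identity:
#   inv(K) = inv(Ks) - inv(Ks) Kd_xp inv(Lambda) Kd_xp' inv(Ks)
# with Lambda = Kd_pp + Kd_xp' inv(Ks) Kd_xp
Ks_inv = np.linalg.inv(Ks_xx)
Lam = Kd_pp + Kd_xp.T @ Ks_inv @ Kd_xp
K_inv = Ks_inv - Ks_inv @ Kd_xp @ np.linalg.solve(Lam, Kd_xp.T) @ Ks_inv
assert np.allclose(K_inv, np.linalg.inv(K))

# Matrix determinant lemma:
#   det(K) = det(Lambda) * det(Ks_xx) / det(Kd_pp)
lhs = np.linalg.det(K)
rhs = np.linalg.det(Lam) * np.linalg.det(Ks_xx) / np.linalg.det(Kd_pp)
assert np.allclose(lhs, rhs)
```

This is why the code only ever factorizes the small `p x p` matrices `Kd_pp` and `Lambda` plus the cheap noise covariance, never the full `n x n` kernel matrix.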
bayespy-0.6.2/bayespy/inference/vmp/nodes/__init__.py
################################################################################
# Copyright (C) 2011-2012 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

# Import some most commonly used nodes

from . import *

from .bernoulli import Bernoulli
from .binomial import Binomial
from .categorical import Categorical
from .multinomial import Multinomial
from .poisson import Poisson

from .beta import Beta
from .beta import Complement
from .dirichlet import Dirichlet, Concentration
DirichletConcentration = Concentration
BetaConcentration = lambda **kwargs: Concentration(2, **kwargs)
from .exponential import Exponential

from .gaussian import Gaussian, GaussianARD
from .wishart import Wishart
from .gamma import Gamma, GammaShape
from .gaussian import (GaussianGamma, GaussianWishart)

from .gaussian_markov_chain import GaussianMarkovChain
from .gaussian_markov_chain import VaryingGaussianMarkovChain
from .gaussian_markov_chain import SwitchingGaussianMarkovChain
from .categorical_markov_chain import CategoricalMarkovChain

from .mixture import Mixture, MultiMixture
from .gate import Gate
from .gate import Choose
from .concatenate import Concatenate

from .dot import Dot
from .dot import SumMultiply
from .add import Add
from .take import Take
from .concat_gaussian import ConcatGaussian

from .logpdf import LogPDF

from .constant import Constant

from .ml import MaximumLikelihood
from .ml import Function

bayespy-0.6.2/bayespy/inference/vmp/nodes/add.py
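The `Add` node defined in the file that follows sums independent Gaussian parents by combining their moments: the mean moments add directly, and the second moments pick up cross terms `<x_i><x_j>^T` that factorize because the parents are independent in the posterior approximation. A minimal NumPy sketch of that computation, reproducing the numbers from the `Add` docstring example (X with zero mean, Y with unit mean, both with identity covariance):

```python
import numpy as np

# Moments of two independent Gaussians x and y (values from the Add example)
mx, Cx = np.array([0.0, 0.0]), np.eye(2)
my, Cy = np.array([1.0, 1.0]), np.eye(2)

# Parent moments as the node sees them: <x> and <x x^T> = Cov(x) + <x><x>^T
u_x = [mx, Cx + np.outer(mx, mx)]
u_y = [my, Cy + np.outer(my, my)]

# Moments of z = x + y, combined as in Add._compute_moments:
#   <z>       = <x> + <y>
#   <z z^T>   = <x x^T> + <y y^T> + <x><y>^T + <y><x>^T
u0 = u_x[0] + u_y[0]
u1 = u_x[1] + u_y[1] + np.outer(u_x[0], u_y[0]) + np.outer(u_y[0], u_x[0])

print(u0)  # [1. 1.]
print(u1)  # [[3. 1.] [1. 3.]]

# Consistency check: <z z^T> = Cov(z) + <z><z>^T with Cov(z) = Cx + Cy
assert np.allclose(u1, Cx + Cy + np.outer(u0, u0))
```

The same cross-term structure is why the node's own docstring warns that the posterior coupling between the summands is lost: each `<x_i><x_j>^T` term assumes zero posterior covariance between parents.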
0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2015 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ import numpy as np import functools from .deterministic import Deterministic from .gaussian import Gaussian, GaussianMoments from bayespy.utils import linalg class Add(Deterministic): r""" Node for computing sums of Gaussian nodes: :math:`X+Y+Z`. Examples -------- >>> import numpy as np >>> from bayespy import nodes >>> X = nodes.Gaussian(np.zeros(2), np.identity(2), plates=(3,)) >>> Y = nodes.Gaussian(np.ones(2), np.identity(2)) >>> Z = nodes.Add(X, Y) >>> print("Mean:\n", Z.get_moments()[0]) Mean: [[1. 1.]] >>> print("Second moment:\n", Z.get_moments()[1]) Second moment: [[[3. 1.] [1. 3.]]] Notes ----- Shapes of the nodes must be identical. Plates are broadcasted. This node sums nodes that are independent in the posterior approximation. However, summing variables puts a strong coupling among the variables, which is lost in this construction. Thus, it is usually better to use a single Gaussian node to represent the set of the summed variables and use SumMultiply node to compute the sum. In that way, the correlation between the variables is not lost. However, in some cases it is necessary or useful to use Add node. See also -------- Dot, SumMultiply """ def __init__(self, *nodes, **kwargs): """ Add(X1, X2, ...) 
""" ndim = None for node in nodes: try: node = self._ensure_moments(node, GaussianMoments, ndim=None) except ValueError: pass else: ndim = node._moments.ndim break nodes = [self._ensure_moments(node, GaussianMoments, ndim=ndim) for node in nodes] N = len(nodes) if N < 2: raise ValueError("Give at least two parents") nodes = list(nodes) for n in range(N-1): if nodes[n].dims != nodes[n+1].dims: raise ValueError("Nodes do not have identical shapes") ndim = len(nodes[0].dims[0]) dims = tuple(nodes[0].dims) shape = dims[0] self._moments = GaussianMoments(shape) self._parent_moments = N * [GaussianMoments(shape)] self.ndim = ndim self.N = N super().__init__(*nodes, dims=dims, **kwargs) def _compute_moments(self, *u_parents): """ Compute the moments of the sum """ u0 = functools.reduce(np.add, (u_parent[0] for u_parent in u_parents)) u1 = functools.reduce(np.add, (u_parent[1] for u_parent in u_parents)) for i in range(self.N): for j in range(i+1, self.N): xi_xj = linalg.outer(u_parents[i][0], u_parents[j][0], ndim=self.ndim) xj_xi = linalg.transpose(xi_xj, ndim=self.ndim) u1 = u1 + xi_xj + xj_xi return [u0, u1] def _compute_message_to_parent(self, index, m, *u_parents): """ Compute the message to a parent node. .. math:: (\sum_i \mathbf{x}_i)^T \mathbf{M}_2 (\sum_j \mathbf{x}_j) + (\sum_i \mathbf{x}_i)^T \mathbf{m}_1 Moments of the parents are .. math:: u_1^{(i)} = \langle \mathbf{x}_i \rangle \\ u_2^{(i)} = \langle \mathbf{x}_i \mathbf{x}_i^T \rangle Thus, the message for :math:`i`-th parent is .. 
math:: \phi_{x_i}^{(1)} = \mathbf{m}_1 + 2 \mathbf{M}_2 \sum_{j\neq i} \mathbf{x}_j \\ \phi_{x_i}^{(2)} = \mathbf{M}_2 """ # Remove the moments of the parent that receives the message u_parents = u_parents[:index] + u_parents[(index+1):] m0 = (m[0] + linalg.mvdot( 2*m[1], functools.reduce(np.add, (u_parent[0] for u_parent in u_parents)), ndim=self.ndim)) m1 = m[1] return [m0, m1] ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/bernoulli.py0000644000175100001770000001050200000000000023670 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2014 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ """ A module for the Bernoulli distribution node """ import numpy as np from .binomial import (BinomialMoments, BinomialDistribution) from .expfamily import ExponentialFamily from .beta import BetaMoments from .node import Moments class BernoulliMoments(BinomialMoments): """ Class for the moments of Bernoulli variables. """ def __init__(self): super().__init__(1) class BernoulliDistribution(BinomialDistribution): """ Class for the VMP formulas of Bernoulli variables. """ def __init__(self): super().__init__(1) class Bernoulli(ExponentialFamily): r""" Node for Bernoulli random variables. The node models a binary random variable :math:`z \in \{0,1\}` with prior probability :math:`p \in [0,1]` for value one: .. math:: z \sim \mathrm{Bernoulli}(p). 
Parameters ---------- p : beta-like node Probability of a successful trial Examples -------- >>> import warnings >>> warnings.filterwarnings('ignore', category=RuntimeWarning) >>> from bayespy.nodes import Bernoulli, Beta >>> p = Beta([1e-3, 1e-3]) >>> z = Bernoulli(p, plates=(10,)) >>> z.observe([0, 1, 1, 1, 0, 1, 1, 1, 0, 1]) >>> p.update() >>> import bayespy.plot as bpplt >>> import numpy as np >>> bpplt.pdf(p, np.linspace(0, 1, num=100)) [] """ _moments = BernoulliMoments() _distribution = BernoulliDistribution() def __init__(self, p, **kwargs): """ Create Bernoulli node. """ super().__init__(p, **kwargs) @classmethod def _constructor(cls, p, **kwargs): """ Constructs distribution and moments objects. """ p = cls._ensure_moments(p, BetaMoments) parent_moments = (p._moments,) parents = [p] return ( parents, kwargs, ( (), ), cls._total_plates(kwargs.get('plates'), cls._distribution.plates_from_parent(0, p.plates)), cls._distribution, cls._moments, parent_moments) def __str__(self): """ Print the distribution using standard parameterization. """ p = 1 / (1 + np.exp(-self.phi[0])) return ("%s ~ Bernoulli(p)\n" " p = \n" "%s\n" % (self.name, p)) from .deterministic import Deterministic from .categorical import Categorical, CategoricalMoments class CategoricalToBernoulli(Deterministic): """ A node for converting 2-class categorical moments to Bernoulli moments. """ def __init__(self, Z, **kwargs): """ Create a categorical MC moments to categorical moments conversion node. """ # Convert parent to proper type. Z must be a node. 
if not isinstance(Z._moments, CategoricalMoments): raise ValueError("Input node must be categorical") K = Z.dims[0][-1] if K != 2: raise Moments.NoConverterError("Only 2-class categorical can be converted to " "Bernoulli") dims = ( (), ) self._moments = BernoulliMoments() self._parent_moments = (CategoricalMoments(2),) super().__init__(Z, dims=dims, **kwargs) def _compute_moments(self, u_Z): """ Compute the moments given the moments of the parents. """ u0 = u_Z[0][...,0] u = [u0] return u def _compute_message_to_parent(self, index, m, u_Z): """ Compute the message to a parent. """ if index == 0: m0 = np.concatenate([m[0][...,None], np.zeros(np.shape(m[0]))[...,None]], axis=-1) return [m0] else: raise ValueError("Incorrect parent index") # Make use of the conversion node CategoricalMoments.add_converter(BernoulliMoments, CategoricalToBernoulli) ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/beta.py0000644000175100001770000001215600000000000022617 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2014 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ """ A module for the beta distribution node """ import numpy as np import scipy.special as special from .deterministic import Deterministic from .dirichlet import (DirichletMoments, DirichletDistribution, Dirichlet) from .node import Moments, ensureparents class BetaMoments(DirichletMoments): """ Class for the moments of beta variables. 
""" def __init__(self): super().__init__(2) def compute_fixed_moments(self, p): """ Compute the moments for a fixed value """ p = np.asanyarray(p)[...,None] * [1,-1] + [0,1] self.dims = ( (2,), ) return super().compute_fixed_moments(p) @classmethod def from_values(cls, p): """ Return the shape of the moments for a fixed value. """ return cls() class BetaDistribution(DirichletDistribution): """ Class for the VMP formulas of beta variables. Although the realizations are scalars (probability p), the moments is a two-dimensional vector: [log(p), log(1-p)]. """ def compute_message_to_parent(self, parent, index, u_self, u_alpha): """ Compute the message to a parent node. """ return super().compute_message_to_parent(parent, index, u_self, u_alpha) def compute_phi_from_parents(self, u_alpha, mask=True): """ Compute the natural parameter vector given parent moments. """ return super().compute_phi_from_parents(u_alpha, mask=mask) def compute_moments_and_cgf(self, phi, mask=True): """ Compute the moments and :math:`g(\phi)`. """ return super().compute_moments_and_cgf(phi, mask) def compute_cgf_from_parents(self, u_alpha): """ Compute :math:`\mathrm{E}_{q(p)}[g(p)]` """ return super().compute_cgf_from_parents(u_alpha) def compute_fixed_moments_and_f(self, p, mask=True): """ Compute the moments and :math:`f(x)` for a fixed value. """ p = np.asanyarray(p)[...,None] * [1,-1] + [0,1] return super().compute_fixed_moments_and_f(p, mask=mask) def random(self, *phi, plates=None): """ Draw a random sample from the distribution. """ p = super().random(*phi, plates=plates) return p[...,0] class Beta(Dirichlet): r""" Node for beta random variables. The node models a probability variable :math:`p \in [0,1]` as .. math:: p \sim \mathrm{Beta}(a, b) where :math:`a` and :math:`b` are prior counts for success and failure, respectively. 
Parameters ---------- alpha : (...,2)-shaped array Two-element vector containing :math:`a` and :math:`b` Examples -------- >>> import warnings >>> warnings.filterwarnings('ignore', category=RuntimeWarning) >>> from bayespy.nodes import Bernoulli, Beta >>> p = Beta([1e-3, 1e-3]) >>> z = Bernoulli(p, plates=(10,)) >>> z.observe([0, 1, 1, 1, 0, 1, 1, 1, 0, 1]) >>> p.update() >>> import bayespy.plot as bpplt >>> import numpy as np >>> bpplt.pdf(p, np.linspace(0, 1, num=100)) [] """ _moments = BetaMoments() _distribution = BetaDistribution() def __init__(self, alpha, **kwargs): """ Create beta node """ super().__init__(alpha, **kwargs) @classmethod def _constructor(cls, alpha, **kwargs): """ Constructs distribution and moments objects. """ retval = super()._constructor(alpha, **kwargs) if retval[2] != cls._moments.dims: raise ValueError("Parent has wrong dimensionality. Must be a " "two-dimensional vector.") return ( retval[0], retval[1], retval[2], retval[3], cls._distribution, cls._moments, retval[6] ) def complement(self): return Complement(self) def __str__(self): """ Print the distribution using standard parameterization. """ a = self.phi[0][...,0] b = self.phi[0][...,1] return ("%s ~ Beta(a, b)\n" " a = \n" "%s\n" " b = \n" "%s\n" % (self.name, a, b)) class Complement(Deterministic): """ Perform 1-p where p is a Beta node. 
""" _moments = BetaMoments() _parent_moments = (BetaMoments(),) def __init__(self, p, **kwargs): super().__init__(p, dims=p.dims, **kwargs) def _compute_message_to_parent(self, index, m, u_p): if index != 0: raise IndexError() m0 = m[0][...,-1::-1] return [m0] def _compute_moments(self, u_p): u0 = u_p[0][...,-1::-1] return [u0] ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/binomial.py0000644000175100001770000001507300000000000023477 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2014 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ """ A module for the binomial distribution node """ import numpy as np import scipy.special as special from .expfamily import (ExponentialFamily, ExponentialFamilyDistribution, useconstructor) from .beta import BetaMoments from .poisson import PoissonMoments from .node import (Moments, ensureparents) from bayespy.utils import misc, random class BinomialMoments(PoissonMoments): """ Class for the moments of binomial variables """ def __init__(self, N): self.N = N super().__init__() def compute_fixed_moments(self, x): """ Compute the moments for a fixed value """ # Make sure the values are integers in valid range x = np.asanyarray(x) if np.any(x > self.N): raise ValueError("Invalid count") return super().compute_fixed_moments() def compute_dims_from_values(self, x): """ Return the shape of the moments for a fixed value. The realizations are scalars, thus the shape of the moment is (). """ raise DeprecationWarning() return super().compute_dims_from_values() class BinomialDistribution(ExponentialFamilyDistribution): """ Class for the VMP formulas of binomial variables. 
""" def __init__(self, N): N = np.asanyarray(N) if not misc.isinteger(N): raise ValueError("Number of trials must be integer") if np.any(N < 0): raise ValueError("Number of trials must be non-negative") self.N = np.asanyarray(N) super().__init__() def compute_message_to_parent(self, parent, index, u_self, u_p): """ Compute the message to a parent node. """ if index == 0: x = u_self[0][...,None] n = self.N[...,None] m0 = x*[1, -1] + n*[0, 1] m = [m0] return m else: raise ValueError("Incorrect parent index") def compute_phi_from_parents(self, u_p, mask=True): """ Compute the natural parameter vector given parent moments. """ logp0 = u_p[0][...,0] logp1 = u_p[0][...,1] phi0 = logp0 - logp1 return [phi0] def compute_moments_and_cgf(self, phi, mask=True): """ Compute the moments and :math:`g(\phi)`. """ u0 = self.N / (1 + np.exp(-phi[0])) g = -self.N * np.log1p(np.exp(phi[0])) return ( [u0], g ) def compute_cgf_from_parents(self, u_p): """ Compute :math:`\mathrm{E}_{q(p)}[g(p)]` """ logp0 = u_p[0][...,0] logp1 = u_p[0][...,1] return self.N * logp1 def compute_fixed_moments_and_f(self, x, mask=True): """ Compute the moments and :math:`f(x)` for a fixed value. """ # Make sure the values are integers in valid range x = np.asanyarray(x) if not misc.isinteger(x): raise ValueError("Counts must be integer") if np.any(x < 0) or np.any(x > self.N): raise ValueError("Invalid count") # Now, the moments are just the counts u = [x] f = (special.gammaln(self.N+1) - special.gammaln(x+1) - special.gammaln(self.N-x+1)) return (u, f) def random(self, *phi, plates=None): """ Draw a random sample from the distribution. """ p = random.logodds_to_probability(phi[0]) return np.random.binomial(self.N, p, size=plates) def squeeze(self, axis): try: N_squeezed = np.squeeze(self.N, axis) except ValueError as err: raise ValueError( "The number of trials must be constant over a squeezed axis, " "so the corresponding array axis must be singleton. 
" "Cannot squeeze axis {0} from a binomial distribution " "because the number of trials arrays has shape {2}, so " "the given axis has length {1} != 1. ".format( axis, np.shape(self.N)[axis], np.shape(self.N), ) ) from err else: return BinomialDistribution(N_squeezed) class Binomial(ExponentialFamily): r""" Node for binomial random variables. The node models the number of successes :math:`x \in \{0, \ldots, n\}` in :math:`n` trials with probability :math:`p` for success: .. math:: x \sim \mathrm{Binomial}(n, p). Parameters ---------- n : scalar or array Number of trials p : beta-like node or scalar or array Probability of a success in a trial Examples -------- >>> import warnings >>> warnings.filterwarnings('ignore', category=RuntimeWarning) >>> from bayespy.nodes import Binomial, Beta >>> p = Beta([1e-3, 1e-3]) >>> x = Binomial(10, p) >>> x.observe(7) >>> p.update() >>> import bayespy.plot as bpplt >>> import numpy as np >>> bpplt.pdf(p, np.linspace(0, 1, num=100)) [] See also -------- Bernoulli, Multinomial, Beta """ def __init__(self, n, p, **kwargs): """ Create binomial node """ super().__init__(n, p, **kwargs) @classmethod def _constructor(cls, n, p, **kwargs): """ Constructs distribution and moments objects. """ p = cls._ensure_moments(p, BetaMoments) parents = [p] moments = BinomialMoments(n) parent_moments = (p._moments,) distribution = BinomialDistribution(n) return ( parents, kwargs, ( (), ), cls._total_plates(kwargs.get('plates'), distribution.plates_from_parent(0, p.plates), np.shape(n)), distribution, moments, parent_moments) def __str__(self): """ Print the distribution using standard parameterization. 
""" p = 1 / (1 + np.exp(-self.phi[0])) n = self._distribution.N return ("%s ~ Binomial(n, p)\n" " n = \n" "%s\n" " p = \n" "%s\n" % (self.name, n, p)) ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/categorical.py0000644000175100001770000001272300000000000024161 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2011-2012,2014 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ """ Module for the categorical distribution node. """ import numpy as np from .node import ensureparents from .expfamily import (ExponentialFamily, useconstructor) from .multinomial import (MultinomialMoments, MultinomialDistribution, Multinomial) from .dirichlet import DirichletMoments from bayespy.utils import random from bayespy.utils import misc class CategoricalMoments(MultinomialMoments): """ Class for the moments of categorical variables. """ def compute_fixed_moments(self, x): """ Compute the moments for a fixed value """ # Check that x is valid x = np.asanyarray(x) if not misc.isinteger(x): raise ValueError("Values must be integers") if np.any(x < 0) or np.any(x >= self.categories): raise ValueError("Invalid category index") u0 = np.zeros((np.size(x), self.categories)) u0[(np.arange(np.size(x)), np.ravel(x))] = 1 u0 = np.reshape(u0, np.shape(x) + (self.categories,)) return [u0] @classmethod def from_values(cls, x, categories): """ Return the shape of the moments for a fixed value. The observations are scalar. 
""" return cls(categories) raise DeprecationWarning() return ( (self.D,), ) def get_instance_conversion_kwargs(self): return dict(categories=self.categories) def get_instance_converter(self, categories): if categories is not None and categories != self.categories: raise ValueError( "No automatic conversion from CategoricalMoments to " "CategoricalMoments with different number of categories" ) return None class CategoricalDistribution(MultinomialDistribution): """ Class for the VMP formulas of categorical variables. """ def __init__(self, categories): """ Create VMP formula node for a categorical variable `categories` is the total number of categories. """ if not isinstance(categories, int): raise ValueError("Number of categories must be integer") if categories < 0: raise ValueError("Number of categoriess must be non-negative") self.D = categories super().__init__(1) def compute_fixed_moments_and_f(self, x, mask=True): """ Compute the moments and :math:`f(x)` for a fixed value. """ # Check the validity of x x = np.asanyarray(x) if not misc.isinteger(x): raise ValueError("Values must be integers") if np.any(x < 0) or np.any(x >= self.D): raise ValueError("Invalid category index") # Form a binary matrix with only one non-zero (1) in the last axis u0 = np.zeros((np.size(x), self.D)) u0[(np.arange(np.size(x)), np.ravel(x))] = 1 u0 = np.reshape(u0, np.shape(x) + (self.D,)) u = [u0] # f(x) is zero f = 0 return (u, f) def random(self, *phi, plates=None): """ Draw a random sample from the distribution. """ logp = phi[0] logp -= np.amax(logp, axis=-1, keepdims=True) p = np.exp(logp) return random.categorical(p, size=plates) def squeeze(self, axis): return self class Categorical(ExponentialFamily): r""" Node for categorical random variables. The node models a categorical random variable :math:`x \in \{0,\ldots,K-1\}` with prior probabilities :math:`\{p_0, \ldots, p_{K-1}\}` for each category: .. math:: p(x=k) = p_k \quad \text{for } k\in \{0,\ldots,K-1\}. 
Parameters ---------- p : Dirichlet-like node or (...,K)-array Probabilities for each category See also -------- Bernoulli, Multinomial, Dirichlet """ def __init__(self, p, **kwargs): """ Create Categorical node. """ super().__init__(p, **kwargs) @classmethod def _constructor(cls, p, **kwargs): """ Constructs distribution and moments objects. This method is called if useconstructor decorator is used for __init__. Becase the distribution and moments object depend on the number of categories, that is, they depend on the parent node, this method can be used to construct those objects. """ # Get the number of categories p = cls._ensure_moments(p, DirichletMoments) D = p.dims[0][0] parent_moments = (p._moments,) parents = [p] distribution = CategoricalDistribution(D) moments = CategoricalMoments(D) return (parents, kwargs, moments.dims, cls._total_plates(kwargs.get('plates'), distribution.plates_from_parent(0, p.plates)), distribution, moments, parent_moments) def __str__(self): """ Print the distribution using standard parameterization. """ p = self.u[0] return ("%s ~ Categorical(p)\n" " p = \n" "%s\n" % (self.name, p)) ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/categorical_markov_chain.py0000644000175100001770000003337100000000000026704 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2014 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ """ Module for the categorical Markov chain node. 
""" import numpy as np from .deterministic import Deterministic from .expfamily import (ExponentialFamily, ExponentialFamilyDistribution, useconstructor) from .node import (Moments, ensureparents) from .categorical import CategoricalMoments from .dirichlet import (Dirichlet, DirichletMoments) from bayespy.utils import misc, random class CategoricalMarkovChainMoments(Moments): """ Class for the moments of categorical Markov chain variables. """ def __init__(self, categories, length): """ Create moments object for categorical Markov chain variables. """ self.categories = categories self.length = length self.dims = ( (categories,), (length-1, categories, categories) ) return def compute_fixed_moments(self, x): """ Compute the moments for a fixed value """ # Check that x is valid x = np.asanyarray(x) if not misc.isinteger(x): raise ValueError("Values must be integers") if np.any(x < 0) or np.any(x >= self.categories): raise ValueError("Invalid category index") plates = np.shape(x)[:-1] u0_size = np.prod(plates, dtype=int) u0 = np.zeros((u0_size, self.categories)) u0[(np.arange(u0_size), np.ravel(x[...,0]))] = 1.0 us_size = u0_size * (self.length - 1) us = np.zeros((us_size, self.categories, self.categories)) us[(np.arange(us_size), np.ravel(x[...,:-1]), np.ravel(x[...,1:]))] = 1.0 return [ np.reshape(u0, plates + (self.categories,)), np.reshape(us, plates + (self.length-1, self.categories, self.categories)), ] @classmethod def from_values(cls, x, categories): """ Return the shape of the moments for a fixed value. """ raise NotImplementedError("from_values not implemented " "for %s" % (self.__class__.__name__)) class CategoricalMarkovChainDistribution(ExponentialFamilyDistribution): """ Class for the VMP formulas of categorical Markov chain variables. """ def __init__(self, categories, states): """ Create VMP formula node for a categorical variable `categories` is the total number of categories. `states` is the length of the chain. 
""" self.K = categories self.N = states def compute_message_to_parent(self, parent, index, u, u_p0, u_P): """ Compute the message to a parent node. """ if index == 0: return [ u[0] ] elif index == 1: return [ u[1] ] else: raise ValueError("Parent index out of bounds") def compute_weights_to_parent(self, index, weights): """ Maps the mask to the plates of a parent. """ if index == 0: return weights elif index == 1: # Add plate axis for the time axis and row axis of the transition # matrix return np.asanyarray(weights)[...,None,None] else: raise ValueError("Parent index out of bounds") def compute_phi_from_parents(self, u_p0, u_P, mask=True): """ Compute the natural parameter vector given parent moments. """ phi0 = u_p0[0] phi1 = u_P[0] * np.ones((self.N-1,self.K,self.K)) return [phi0, phi1] def compute_moments_and_cgf(self, phi, mask=True): """ Compute the moments and :math:`g(\phi)`. """ logp0 = phi[0] logP = phi[1] (z0, zz, cgf) = random.alpha_beta_recursion(logp0, logP) u = [z0, zz] return (u, cgf) def compute_cgf_from_parents(self, u_p0, u_P): """ Compute :math:`\mathrm{E}_{q(p)}[g(p)]` """ return 0 def compute_fixed_moments_and_f(self, x, mask=True): """ Compute the moments and :math:`f(x)` for a fixed value. """ raise NotImplementedError() def plates_to_parent(self, index, plates): """ Resolves the plate mapping to a parent. Given the plates of the node's moments, this method returns the plates that the message to a parent has for the parent's distribution. """ if index == 0: return plates elif index == 1: return plates + (self.N-1, self.K) else: raise ValueError("Parent index out of bounds") def plates_from_parent(self, index, plates): """ Resolve the plate mapping from a parent. Given the plates of a parent's moments, this method returns the plates that the moments has for this distribution. 
""" if index == 0: return plates elif index == 1: return plates[:-2] else: raise ValueError("Parent index out of bounds") def random(self, *phi, plates=None): """ Draw a random sample from the distribution. """ # Convert natural parameters to transition probabilities p0 = np.exp(phi[0] - misc.logsumexp(phi[0], axis=-1, keepdims=True)) P = np.exp(phi[1] - misc.logsumexp(phi[1], axis=-1, keepdims=True)) # Explicit broadcasting P = P * np.ones(plates)[...,None,None,None] # Allocate memory Z = np.zeros(plates + (self.N,), dtype=np.int64) # Draw initial state Z[...,0] = random.categorical(p0, size=plates) # Create [0,1,2,...,len(plate_axis)] indices for each plate axis and # make them broadcast properly nplates = len(plates) plates_ind = [np.arange(plate)[(Ellipsis,)+(nplates-i-1)*(None,)] for (i, plate) in enumerate(plates)] plates_ind = tuple(plates_ind) # Draw next states iteratively for n in range(self.N-1): # Select the transition probabilities for the current state but take # into account the plates. This leads to complex NumPy # indexing.. :) time_ind = min(n, np.shape(P)[-3]-1) ind = plates_ind + (time_ind, Z[...,n], Ellipsis) p = P[ind] # Draw next state z = random.categorical(P[ind]) Z[...,n+1] = z return Z class CategoricalMarkovChain(ExponentialFamily): r""" Node for categorical Markov chain random variables. The node models a Markov chain which has a discrete set of K possible states and the next state depends only on the previous state and the state transition probabilities. The graphical model is shown below: .. 
       \tikzstyle{latent} += [minimum size=30pt];

       \node[latent] (x0) {$x_0$};
       \node[latent, right=of x0] (x1) {$x_1$};
       \node[right=of x1] (dots) {$\cdots$};
       \node[latent, right=of dots] (xn) {$x_{N-1}$};
       \edge {x0}{x1};
       \edge {x1}{dots};
       \edge {dots}{xn};

       \node[latent, above=of x0] (pi) {$\boldsymbol{\pi}$};
       \node[latent, above=of dots] (A) {$\mathbf{A}$};
       \edge {pi} {x0};
       \edge {A} {x1,dots,xn};

    where :math:`\boldsymbol{\pi}` contains the probabilities for the initial
    state and :math:`\mathbf{A}` is the state transition probability matrix.
    It is possible to have :math:`\mathbf{A}` varying in time.

    .. math::

        p(x_0, \ldots, x_{N-1}) &= p(x_0) \prod^{N-1}_{n=1} p(x_n|x_{n-1}),

    where

    .. math::

        p(x_0=k) &= \pi_k, \quad \text{for } k \in \{0,\ldots,K-1\},
        \\
        p(x_n=j|x_{n-1}=i) &= a_{ij}^{(n-1)} \quad \text{for } n=1,\ldots,N-1,
        \, i\in\{0,\ldots,K-1\},\, j\in\{0,\ldots,K-1\},
        \\
        a_{ij}^{(n)} &= [\mathbf{A}_n]_{ij}

    This node can be used to construct hidden Markov models by using
    :class:`Mixture` for the emission distribution.

    Parameters
    ----------

    pi : Dirichlet-like node or (...,K)-array
        :math:`\boldsymbol{\pi}`, probabilities for the first
        state. :math:`K`-dimensional Dirichlet.

    A : Dirichlet-like node or (K,K)-array or (...,1,K,K)-array or
        (...,N-1,K,K)-array
        :math:`\mathbf{A}`, probabilities for state
        transitions. :math:`K`-dimensional Dirichlet with plates (K,) or
        (...,1,K) or (...,N-1,K).

    states : int, optional
        :math:`N`, the length of the chain.

    See also
    --------

    Categorical, Dirichlet, GaussianMarkovChain, Mixture,
    SwitchingGaussianMarkovChain
    """


    def __init__(self, pi, A, states=None, **kwargs):
        """
        Create categorical Markov chain
        """
        super().__init__(pi, A, states=states, **kwargs)


    @classmethod
    def _constructor(cls, p0, P, states=None, **kwargs):
        """
        Constructs distribution and moments objects.

        This method is called if the useconstructor decorator is used for
        __init__.
        Because the distribution and moments objects depend on the number of
        categories, that is, they depend on the parent node, this method can
        be used to construct those objects.
        """
        p0 = cls._ensure_moments(p0, DirichletMoments)
        P = cls._ensure_moments(P, DirichletMoments)

        # Number of categories
        D = p0.dims[0][0]

        parent_moments = (p0._moments, P._moments)

        # Number of states
        if len(P.plates) < 2:
            if states is None:
                raise ValueError("Could not infer the length of the Markov "
                                 "chain")
            N = int(states)
        else:
            if P.plates[-2] == 1:
                if states is None:
                    N = 2
                else:
                    N = int(states)
            else:
                if states is not None and P.plates[-2]+1 != states:
                    raise ValueError("Given length of the Markov chain is "
                                     "inconsistent with the transition "
                                     "probability matrix")
                N = P.plates[-2] + 1

        if p0.dims != P.dims:
            raise ValueError("Initial state probability vector and state "
                             "transition probability matrix have different "
                             "size")

        if len(P.plates) < 1 or P.plates[-1] != D:
            raise ValueError("Transition probability matrix is not square")

        dims = ( (D,), (N-1,D,D) )

        parents = [p0, P]
        distribution = CategoricalMarkovChainDistribution(D, N)
        moments = CategoricalMarkovChainMoments(D, N)

        return (parents,
                kwargs,
                moments.dims,
                cls._total_plates(kwargs.get('plates'),
                                  distribution.plates_from_parent(0, p0.plates),
                                  distribution.plates_from_parent(1, P.plates)),
                distribution,
                moments,
                parent_moments)


class CategoricalMarkovChainToCategorical(Deterministic):
    """
    A node for converting categorical MC moments to categorical moments.
    """


    def __init__(self, Z, **kwargs):
        """
        Create a categorical MC moments to categorical moments conversion node.
        """
        # Convert the parent to a proper type. Z must be a node.
        Z = self._ensure_moments(Z, CategoricalMarkovChainMoments)
        K = Z.dims[0][-1]
        dims = ( (K,), )
        self._moments = CategoricalMoments(K)
        self._parent_moments = (Z._moments,)
        super().__init__(Z, dims=dims, **kwargs)


    def _compute_moments(self, u_Z):
        """
        Compute the moments given the moments of the parents.
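The marginalization performed here can be sketched with plain NumPy. This is a hypothetical standalone example (the probability values are made up) showing how the pairwise joint moments are reduced to per-step marginals:

```python
import numpy as np

# Hypothetical moments of a chain with N=3 states over K=2 categories:
# p0 has shape (K,) and zz has shape (N-1, K, K), where zz[n] is the joint
# probability table of (x_n, x_{n+1}).
p0 = np.array([0.6, 0.4])
zz = np.array([[[0.3, 0.3],
                [0.2, 0.2]],
               [[0.25, 0.25],
                [0.25, 0.25]]])

# Marginal of x_n for n >= 1: sum the joint of (x_{n-1}, x_n) over x_{n-1}.
p = zz.sum(axis=-2)                           # shape (N-1, K)

# Prepend the initial-state marginal to get one (K,) vector per time step.
P = np.concatenate((p0[None, :], p), axis=0)  # shape (N, K)
print(P.shape)  # (3, 2)
```

Each row of the result is a normalized categorical distribution over the K states at that time step, which is exactly the moment format expected by `CategoricalMoments`.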
""" # Add time axis to p0 p0 = u_Z[0][...,None,:] # Sum joint probability arrays to marginal probability vectors zz = u_Z[1] p = np.sum(zz, axis=-2) # Broadcast p0 and p to same shape, except the time axis plates_p0 = np.shape(p0)[:-2] plates_p = np.shape(p)[:-2] shape = misc.broadcasted_shape(plates_p0, plates_p) + (1,1) p0 = p0 * np.ones(shape) p = p * np.ones(shape) # Concatenate P = np.concatenate((p0,p), axis=-2) return [P] def _compute_message_to_parent(self, index, m, u_Z): """ Compute the message to a parent. """ m0 = m[0][...,0,:] m1 = m[0][...,1:,None,:] return [m0, m1] def _compute_weights_to_parent(self, index, weights): """ Compute the mask used for messages sent to a parent. """ if index == 0: # "Sum" over the last axis # TODO/FIXME: Check this. BUG I THINK. return np.sum(weights, axis=-1) else: raise ValueError("Parent index out of bounds") def _plates_to_parent(self, index): if index == 0: return self.plates[:-1] else: raise ValueError("Parent index out of bounds") def _plates_from_parent(self, index): if index == 0: N = self.parents[0].dims[1][0] return self.parents[0].plates + (N+1,) else: raise ValueError("Parent index out of bounds") # Make use of the conversion node CategoricalMarkovChainMoments.add_converter(CategoricalMoments, CategoricalMarkovChainToCategorical) ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/concat_gaussian.py0000644000175100001770000000746100000000000025050 0ustar00runnerdocker00000000000000import numpy as np from bayespy.utils import misc from bayespy.utils import linalg from .gaussian import GaussianMoments from .deterministic import Deterministic class ConcatGaussian(Deterministic): """Concatenate Gaussian vectors along the variable axis (not plate axis) NOTE: This concatenates on the variable axis! 
    That is, the dimensionality of the resulting Gaussian vector is the sum of
    the dimensionalities of the input Gaussian vectors.

    TODO: Add support for Gaussian arrays and arbitrary concatenation axis.
    """


    def __init__(self, *nodes, **kwargs):

        # Number of nodes to concatenate
        N = len(nodes)

        # This is stuff that will be useful when implementing arbitrary
        # concatenation. That is, first determine ndim.
        #
        # # Convert nodes to Gaussians (if they are not nodes, don't worry)
        # nodes_gaussian = []
        # for node in nodes:
        #     try:
        #         node_gaussian = node._convert(GaussianMoments)
        #     except AttributeError: # Moments.NoConverterError:
        #         nodes_gaussian.append(node)
        #     else:
        #         nodes_gaussian.append(node_gaussian)
        # nodes = nodes_gaussian
        #
        # # Determine shape from the first Gaussian node
        # shape = None
        # for node in nodes:
        #     try:
        #         shape = node.dims[0]
        #     except AttributeError:
        #         pass
        #     else:
        #         break
        # if shape is None:
        #     raise ValueError("Couldn't determine shape from the input nodes")
        #
        # ndim = len(shape)

        nodes = [self._ensure_moments(node, GaussianMoments, ndim=1)
                 for node in nodes]

        D = sum(node.dims[0][0] for node in nodes)
        shape = (D,)

        self._moments = GaussianMoments(shape)
        self._parent_moments = [node._moments for node in nodes]

        # Make sure all parents are Gaussian vectors
        if any(len(node.dims[0]) != 1 for node in nodes):
            raise ValueError("Input nodes must be (Gaussian) vectors")

        self.slices = tuple(np.cumsum([0] + [node.dims[0][0]
                                             for node in nodes]))

        D = self.slices[-1]

        return super().__init__(*nodes, dims=((D,), (D, D)), **kwargs)


    def _compute_moments(self, *u_nodes):
        x = misc.concatenate(*[u[0] for u in u_nodes], axis=-1)
        xx = misc.block_diag(*[u[1] for u in u_nodes])
        # Explicitly broadcast xx to the plates of x
        x_plates = np.shape(x)[:-1]
        xx = np.ones(x_plates)[...,None,None] * xx
        # Compute the cross-covariance terms using the means of each variable
        # (because covariances are zero for factorized nodes in the VB
        # approximation)
        i_start = 0
        for m in range(len(u_nodes)):
            i_end = i_start + np.shape(u_nodes[m][0])[-1]
            j_start = 0
            for n in range(m):
                j_end = j_start + np.shape(u_nodes[n][0])[-1]
                xm_xn = linalg.outer(u_nodes[m][0], u_nodes[n][0], ndim=1)
                xx[...,i_start:i_end,j_start:j_end] = xm_xn
                xx[...,j_start:j_end,i_start:i_end] = misc.T(xm_xn)
                j_start = j_end
            i_start = i_end
        return [x, xx]


    def _compute_message_to_parent(self, i, m, *u_nodes):
        r = self.slices
        # Pick the proper parts from the message array
        m0 = m[0][...,r[i]:r[i+1]]
        m1 = m[1][...,r[i]:r[i+1],r[i]:r[i+1]]
        # Handle the cross-covariance terms by using the mean of the covariate
        # node
        for (j, u) in enumerate(u_nodes):
            if j != i:
                m0 = m0 + 2 * np.einsum(
                    '...ij,...j->...i',
                    m[1][...,r[i]:r[i+1],r[j]:r[j+1]],
                    u[0]
                )
        return [m0, m1]


# File: bayespy-0.6.2/bayespy/inference/vmp/nodes/concatenate.py

################################################################################
# Copyright (C) 2015 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

import numpy as np

from bayespy.utils import misc

from .deterministic import Deterministic
from .node import Moments


class Concatenate(Deterministic):
    """
    Concatenate similar nodes along a plate axis.

    Nodes must be of same type and dimensionality. Also, plates must be
    identical except for the plate axis along which the concatenation is
    performed.
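    The broadcasting workaround this node uses internally can be sketched
    with plain NumPy (the shapes here are hypothetical). ``np.concatenate``
    does not broadcast its inputs, so an array that relies on broadcasting
    over a plate axis must be expanded explicitly before concatenation:

```python
import numpy as np

# Hypothetical moment arrays: `a` has a unit-length plate axis that stands
# for a plate of length 5 via broadcasting; `b` is a full (4, 3) array.
a = np.ones((1, 3))
b = 2 * np.ones((4, 3))

# np.concatenate((a, b), axis=0) would NOT expand `a` to 5 rows, so
# broadcast explicitly first.
a_full = a * np.ones((5, 3))

c = np.concatenate((a_full, b), axis=0)
print(c.shape)  # (9, 3)
```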
    See also
    --------
    numpy.concatenate
    """


    def __init__(self, *nodes, axis=-1, **kwargs):
        if axis >= 0:
            raise ValueError("Currently, only negative axis indices are "
                             "allowed.")
        self._axis = axis

        parent_moments = None
        for node in nodes:
            try:
                parent_moments = node._moments
            except:
                pass
            else:
                break

        if parent_moments is None:
            raise ValueError("Couldn't determine parent moments")

        # All parents must have the same moments
        self._parent_moments = (parent_moments,) * len(nodes)
        self._moments = parent_moments

        # Convert the nodes
        try:
            nodes = [
                self._ensure_moments(
                    node,
                    parent_moments.__class__,
                    **parent_moments.get_instance_conversion_kwargs()
                )
                for node in nodes
            ]
        except Moments.NoConverterError:
            raise ValueError("Parents have different moments")

        # Dimensionality of the node
        dims = tuple([dim for dim in nodes[0].dims])
        for node in nodes:
            if node.dims != dims:
                raise ValueError("Parents have different dimensionalities")

        super().__init__(
            *nodes,
            dims=dims,
            allow_dependent_parents=True, # because parent plates are kept separate
            **kwargs
        )

        # Compute start indices for each parent on the concatenated plate axis
        self._indices = np.zeros(len(nodes)+1, dtype=np.int64)
        self._indices[1:] = np.cumsum([int(parent.plates[axis])
                                       for parent in self.parents])
        self._lengths = [parent.plates[axis] for parent in self.parents]

        return


    def _get_id_list(self):
        """
        Parents don't need to be independent for this node so remove duplicates
        """
        return list(set(super()._get_id_list()))


    def _compute_plates_to_parent(self, index, plates):
        plates = list(plates)
        plates[self._axis] = self.parents[index].plates[self._axis]
        return tuple(plates)


    def _compute_plates_from_parent(self, index, plates):
        plates = list(plates)
        plates[self._axis] = 0
        for parent in self.parents:
            plates[self._axis] += parent.plates[self._axis]
        return tuple(plates)


    def _plates_multiplier_from_parent(self, index):
        multipliers = [parent.plates_multiplier for parent in self.parents]
        for m in multipliers:
            if np.any(np.array(m) != 1):
                raise ValueError("Concatenation node does not support plate "
                                 "multipliers.")
ValueError("Concatenation node does not support plate " "multipliers.") return () def _compute_weights_to_parent(self, index, weights): axis = self._axis indices = self._indices[index:(index+1)] if np.ndim(weights) >= abs(axis) and np.shape(weights)[axis] > 1: # Take the middle one of the returned three arrays return np.split(weights, indices, axis=axis)[1] else: return weights def _compute_message_to_parent(self, index, m, *u_parents): msg = [] indices = self._indices[index:(index+2)] for i in range(len(m)): # Fix plate axis to array axis axis = self._axis - len(self.dims[i]) # Find the slice from the message if np.ndim(m[i]) >= abs(axis) and np.shape(m[i])[axis] > 1: mi = np.split(m[i], indices, axis=axis)[1] else: mi = m[i] msg.append(mi) return msg def _compute_moments(self, *u_parents): # TODO/FIXME: Unfortunately, np.concatenate doesn't support # broadcasting but moment messages may use broadcasting. # # WORKAROUND: Broadcast the arrays explcitly to have same shape # except for the concatenated axis. 
        u = []
        for i in range(len(self.dims)):
            # Fix the plate axis to an array axis
            axis = self._axis - len(self.dims[i])
            # Find the broadcasted shape
            ui_parents = [u_parent[i] for u_parent in u_parents]
            shapes = [list(np.shape(uip)) for uip in ui_parents]
            for j in range(len(shapes)):
                if len(shapes[j]) >= abs(axis):
                    shapes[j][axis] = 1
            ## shapes = [np.shape(uip[:axis]) + (1,) + np.shape(uip[(axis+1)])
            ##           if np.ndim(uip) >= abs(self._axis) else
            ##           np.shape(uip)
            ##           for uip in ui_parents]
            bc_shape = misc.broadcasted_shape(*shapes)

            # The concatenated axis must be broadcasted explicitly
            bc_shapes = [misc.broadcasted_shape(bc_shape,
                                                (length,) + (1,)*(abs(axis)-1))
                         for length in self._lengths]

            # Broadcast explicitly
            ui_parents = [uip * np.ones(shape)
                          for (uip, shape) in zip(ui_parents, bc_shapes)]

            # Concatenate
            ui = np.concatenate(ui_parents, axis=axis)

            u.append(ui)

        return u


# File: bayespy-0.6.2/bayespy/inference/vmp/nodes/constant.py

################################################################################
# Copyright (C) 2011-2012,2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

import numpy as np

from bayespy.utils import misc

from .node import Node, Moments


class Constant(Node):
    r"""
    Node for presenting constant values.

    The node wraps arrays into proper node type.
""" def __init__(self, moments, x, **kwargs): if not isinstance(moments, Moments) and issubclass(moments, Moments): raise ValueError("Give moments as an object instance instead of a class") self._moments = moments self.x = x # Compute moments self.u = self._moments.compute_fixed_moments(x) # Dimensions of the moments dims = self._moments.dims # Resolve plates D = len(dims[0]) if D > 0: plates = np.shape(self.u[0])[:-D] else: plates = np.shape(self.u[0]) kwargs.setdefault('plates', plates) self._parent_moments = () # Parent constructor super().__init__(dims=dims, **kwargs) def _get_id_list(self): """ Returns the stochastic ID list. This method is used to check that same stochastic nodes are not direct parents of a node several times. It is only valid if there are intermediate stochastic nodes. To put it another way: each ID corresponds to one factor q(..) in the posterior approximation. Different IDs mean different factors, thus they mean independence. The parents must have independent factors. Stochastic nodes should return their unique ID. Deterministic nodes should return the IDs of their parents. Constant nodes should return empty list of IDs. """ return [] def get_moments(self): return self.u def set_value(self, x): x = np.asanyarray(x) #shapes = [np.shape(ui) for ui in self.u] self.u = self._moments.compute_fixed_moments(x) for (i, dimsi) in enumerate(self.dims): correct_shape = tuple(self.plates) + tuple(dimsi) given_shape = np.shape(self.u[i]) if not misc.is_shape_subset(given_shape, correct_shape): raise ValueError( "Incorrect shape {0} for the array, expected {1}" .format(given_shape, correct_shape) ) return def lower_bound_contribution(self, gradient=False, **kwargs): # Deterministic functions are delta distributions so the lower bound # contribuion is zero. 
        return 0


    def random(self):
        return self.x


# File: bayespy-0.6.2/bayespy/inference/vmp/nodes/converters.py

# Copyright (c) 2016 Jaakko Luttinen
# MIT License


from .deterministic import Deterministic


class NodeConverter(Deterministic):
    """
    Simple wrapper to transform moment converters into nodes
    """


    def __init__(self, moments_converter, node):
        self.moments_converter = moments_converter
        self._parent_moments = (node._moments,)
        self._moments = moments_converter.moments
        super().__init__(node, dims=self._moments.dims)


    def _compute_moments(self, u_node):
        return self.moments_converter.compute_moments(u_node)


    def _compute_message_to_parent(self, index, m_child, u_node):
        if index != 0:
            raise IndexError()
        return self.moments_converter.compute_message_to_parent(m_child,
                                                                u_node)


    def _compute_weights_to_parent(self, index, weights):
        if index != 0:
            raise IndexError()
        return self.moments_converter.compute_weights_to_parent(weights)


    def _compute_plates_to_parent(self, index, plates):
        if index != 0:
            raise IndexError()
        return self.moments_converter.plates_to_parent(plates)


    def _compute_plates_from_parent(self, index, plates):
        if index != 0:
            raise IndexError()
        return self.moments_converter.plates_from_parent(plates)


    def _compute_plates_multiplier_from_parent(self, index, plates_multiplier):
        if index != 0:
            raise IndexError()
        return self.moments_converter.plates_multiplier_from_parent(
            plates_multiplier
        )


# File: bayespy-0.6.2/bayespy/inference/vmp/nodes/deterministic.py

################################################################################
# Copyright (C) 2013-2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

import functools

import numpy as np

from bayespy.utils import misc

from .node import Node, Moments


class Deterministic(Node):
    """
    Base class for deterministic nodes.

    Sub-classes must implement:

    1. For implementing the deterministic function:
       _compute_moments(self, *u)
    2. One of the following options:
       a) Simple methods:
          _compute_message_to_parent(self, index, m, *u)
       b) More control with:
          _compute_message_and_mask_to_parent(self, index, m, *u)

    Sub-classes may need to re-implement:

    1. If they manipulate plates:
       _compute_weights_to_parent(index, mask)
       _compute_plates_to_parent(self, index, plates)
       _compute_plates_from_parent(self, index, plates)
    """


    def __init__(self, *args, **kwargs):
        super().__init__(*args,
                         plates=None,
                         notify_parents=False,
                         **kwargs)


    def _get_id_list(self):
        """
        Returns the stochastic ID list.

        This method is used to check that same stochastic nodes are not direct
        parents of a node several times. It is only valid if there are
        intermediate stochastic nodes.

        To put it another way: each ID corresponds to one factor q(..) in the
        posterior approximation. Different IDs mean different factors, thus
        they mean independence. The parents must have independent factors.

        Stochastic nodes should return their unique ID. Deterministic nodes
        should return the IDs of their parents. Constant nodes should return
        empty list of IDs.
        """
        id_list = []
        for parent in self.parents:
            id_list = id_list + parent._get_id_list()
        return id_list


    def get_moments(self):
        u_parents = self._message_from_parents()
        return self._compute_moments(*u_parents)


    def _compute_message_and_mask_to_parent(self, index, m_children, *u_parents):
        # The following methods should be implemented by sub-classes.
        m = self._compute_message_to_parent(index, m_children, *u_parents)
        mask = self._compute_weights_to_parent(index, self.mask) != 0
        return (m, mask)


    def _get_message_and_mask_to_parent(self, index, u_parent=None):
        u_parents = self._message_from_parents(exclude=index)
        u_parents[index] = u_parent
        if u_parent is not None:
            u_self = self._compute_moments(*u_parents)
        else:
            u_self = None
        m_children = self._message_from_children(u_self=u_self)
        return self._compute_message_and_mask_to_parent(index,
                                                        m_children,
                                                        *u_parents)


    def _compute_moments(self, *u_parents):
        """
        Compute the moments given the moments of the parents.
        """
        raise NotImplementedError()


    def _compute_message_to_parent(self, index, m_children, *u_parents):
        """
        Compute the message to a parent.
        """
        raise NotImplementedError()


    def _add_child(self, child, index):
        """
        Add a child node.

        Only child nodes that are stochastic (or have stochastic children
        recursively) are counted as children because deterministic nodes
        without stochastic children do not have any messages to send so the
        parents do not need to know about the deterministic node.

        A deterministic node does not notify its parents when created, but if
        it gets a stochastic child node, then notify the parents. This method
        is called only if a stochastic child node is added (possibly
        recursively), thus there is at least one stochastic node below this
        deterministic node.

        Parameters
        ----------
        child : node
        index : int
            The parent index of this node for the child node.  The child node
            recognizes its parents by their index number.
        """
        super()._add_child(child, index)
        # Now that this deterministic node has non-deterministic children,
        # notify the parents
        for (ind, parent) in enumerate(self.parents):
            parent._add_child(self, ind)


    def _remove_child(self, child, index):
        """
        Remove a child node.
        Only child nodes that are stochastic (or have stochastic children
        recursively) are counted as children because deterministic nodes
        without stochastic children do not have any messages to send so the
        parents do not need to know about the deterministic node.

        So, if the deterministic node does not have any stochastic children
        left after the removal, remove it from its parents.
        """
        super()._remove_child(child, index)
        # Check whether there are any children left. If not, remove this node
        # from its parents.
        if len(self.children) == 0:
            for (ind, parent) in enumerate(self.parents):
                parent._remove_child(self, ind)


    def lower_bound_contribution(self, gradient=False, **kwargs):
        # Deterministic functions are delta distributions so the lower bound
        # contribution is zero.
        return 0


    def random(self):
        samples = [parent.random() for parent in self.parents]
        return self._compute_function(*samples)


def tile(X, tiles):
    """
    Tile the plates of the input node.

    x = [a,b,c]
    y = tile(x, 2) = [a,b,c,a,b,c]

    There should be no need to tile plates that have unit length because they
    are handled properly by the broadcasting rules already.

    Parameters
    ----------
    X : Node
        Input node to be tiled.
    tiles : int, tuple
        Tiling of the plates (broadcasting rules for plates apply).

    See also
    --------
    numpy.tile
    """

    # Make sure `tiles` is a tuple (even if an integer is given)
    tiles = tuple(np.ravel(tiles))


    class _Tile(Deterministic):

        _parent_moments = (Moments(),)


        def __init__(self, X, **kwargs):
            self._moments = X._moments
            super().__init__(X, dims=X.dims, **kwargs)


        def _compute_plates_to_parent(self, index, plates):
            plates = list(plates)
            for i in range(-len(tiles), 0):
                plates[i] = plates[i] // tiles[i]
            return tuple(plates)


        def _compute_plates_from_parent(self, index, plates):
            return tuple(misc.multiply_shapes(plates, tiles))


        def _compute_weights_to_parent(self, index, weights):
            # Idea: Reshape the message array such that every other axis
            # will be summed and every other kept.
            # Make the plates equal length
            plates = self._plates_to_parent(index)
            shape_m = np.shape(weights)
            (plates, tiles_m, shape_m) = misc.make_equal_length(
                plates,
                tiles,
                shape_m
            )

            # Handle broadcasting rules for axes that have unit length in the
            # message (although the plate may be non-unit length). Also,
            # compute the corresponding broadcasting multiplier.
            plates = list(plates)
            tiles_m = list(tiles_m)
            for j in range(len(plates)):
                if shape_m[j] == 1:
                    plates[j] = 1
                    tiles_m[j] = 1

            # Combine the tuples by picking every other from tiles_m and
            # every other from plates
            shape = functools.reduce(lambda x,y: x+y, zip(tiles_m, plates))

            # ..and reshape the array, that is, every other axis corresponds
            # to tiles and every other to plates/dimensions in parents
            weights = np.reshape(weights, shape)

            # Sum over every other axis
            axes = tuple(range(0,len(shape),2))
            weights = np.sum(weights, axis=axes)

            # Remove extra leading axes
            ndim_parent = len(self.parents[index].plates)
            weights = misc.squeeze_to_dim(weights, ndim_parent)

            return weights


        def _compute_message_to_parent(self, index, m, u_X):
            m = list(m)
            for ind in range(len(m)):

                # Idea: Reshape the message array such that every other axis
                # will be summed and every other kept.

                shape_ind = self._plates_to_parent(index) + self.dims[ind]

                # Add variable dimensions to tiles
                tiles_ind = tiles + (1,)*len(self.dims[ind])

                # Make the shape tuples equal length
                shape_m = np.shape(m[ind])
                (tiles_ind, shape, shape_m) = misc.make_equal_length(tiles_ind,
                                                                     shape_ind,
                                                                     shape_m)

                # Handle broadcasting rules for axes that have unit length in
                # the message (although the plate may be non-unit length).
                # Also, compute the corresponding broadcasting multiplier.
                r = 1
                shape = list(shape)
                tiles_ind = list(tiles_ind)
                for j in range(len(shape)):
                    if shape_m[j] == 1:
                        r *= tiles_ind[j]
                        shape[j] = 1
                        tiles_ind[j] = 1

                # Combine the tuples by picking every other from tiles_ind and
                # every other from shape
                shape = functools.reduce(lambda x,y: x+y,
                                         zip(tiles_ind, shape))

                # ..and reshape the array, that is, every other axis
                # corresponds to tiles and every other to plates/dimensions in
                # parents
                m[ind] = np.reshape(m[ind], shape)

                # Sum over every other axis
                axes = tuple(range(0,len(shape),2))
                m[ind] = r * np.sum(m[ind], axis=axes)

                # Remove extra leading axes
                ndim_parent = len(self.parents[index].get_shape(ind))
                m[ind] = misc.squeeze_to_dim(m[ind], ndim_parent)

            return m


        def _compute_moments(self, u_X):
            """
            Tile the plates of the parent's moments.
            """
            # Utilize broadcasting: If a tiled axis is unit length in u_X,
            # there is no need to tile it.
            u = list()
            for ind in range(len(u_X)):
                ui = u_X[ind]
                shape_u = np.shape(ui)
                if np.ndim(ui) > 0:
                    # Add variable dimensions
                    tiles_ind = tiles + (1,)*len(self.dims[ind])
                    # Utilize broadcasting: Do not tile leading empty axes
                    nd = min(len(tiles_ind), np.ndim(ui))
                    tiles_ind = tiles_ind[(-nd):]
                    # For simplicity, make tiles and shape equal length
                    (tiles_ind, shape_u) = misc.make_equal_length(tiles_ind,
                                                                  shape_u)
                    # Utilize broadcasting: Use tiling only if the parent's
                    # moment has non-unit axis length.
                    tiles_ind = [tile if sh > 1 else 1
                                 for (tile, sh) in zip(tiles_ind, shape_u)]

                    # Tile
                    ui = np.tile(ui, tiles_ind)
                u.append(ui)
            return u


    return _Tile(X, name="tile(%s, %s)" % (X.name, tiles))


# File: bayespy-0.6.2/bayespy/inference/vmp/nodes/dirichlet.py

################################################################################
# Copyright (C) 2011-2012,2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################


"""
Module for the Dirichlet distribution node.
"""

import numpy as np
from scipy import special

from bayespy.utils import random
from bayespy.utils import misc
from bayespy.utils import linalg

from .stochastic import Stochastic
from .expfamily import ExponentialFamily, ExponentialFamilyDistribution
from .constant import Constant
from .node import Node, Moments, ensureparents


class ConcentrationMoments(Moments):
    """
    Class for the moments of Dirichlet conjugate-prior variables.
    """

    def __init__(self, categories):
        self.categories = categories
        self.dims = ( (categories,), () )
        return


    def compute_fixed_moments(self, alpha):
        """
        Compute the moments for a fixed value
        """
        alpha = np.asanyarray(alpha)
        if np.ndim(alpha) < 1:
            raise ValueError("The prior sample sizes must be a vector")
        if np.any(alpha < 0):
            raise ValueError("The prior sample sizes must be non-negative")
        gammaln_sum = special.gammaln(np.sum(alpha, axis=-1))
        sum_gammaln = np.sum(special.gammaln(alpha), axis=-1)
        z = gammaln_sum - sum_gammaln
        return [alpha, z]


    @classmethod
    def from_values(cls, alpha):
        """
        Return the shape of the moments for a fixed value.
        """
        if np.ndim(alpha) < 1:
            raise ValueError("The array must be at least 1-dimensional.")
        categories = np.shape(alpha)[-1]
        return cls(categories)


class DirichletMoments(Moments):
    """
    Class for the moments of Dirichlet variables.
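    The expected sufficient statistic of a Dirichlet variable is the expected
    log-probability vector, which has the closed form
    :math:`\mathrm{E}[\log p_k] = \psi(\alpha_k) - \psi(\sum_d \alpha_d)`.
    A hypothetical standalone sketch (the ``alpha`` values are made up, and
    the Monte Carlo check is only for illustration):

```python
import numpy as np
from scipy import special

# For p ~ Dirichlet(alpha), E[log p_k] = psi(alpha_k) - psi(sum_d alpha_d)
alpha = np.array([2.0, 3.0, 5.0])
E_logp = special.psi(alpha) - special.psi(alpha.sum())

# Sanity check against a Monte Carlo estimate
rng = np.random.default_rng(0)
samples = rng.dirichlet(alpha, size=200000)
mc = np.log(samples).mean(axis=0)
print(np.round(E_logp, 3))
```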
""" def __init__(self, categories): self.categories = categories self.dims = ( (categories,), ) def compute_fixed_moments(self, p): """ Compute the moments for a fixed value """ # Check that probabilities are non-negative p = np.asanyarray(p) if np.ndim(p) < 1: raise ValueError("Probabilities must be given as a vector") if np.any(p < 0) or np.any(p > 1): raise ValueError("Probabilities must be in range [0,1]") if not np.allclose(np.sum(p, axis=-1), 1.0): raise ValueError("Probabilities must sum to one") # Normalize probabilities p = p / np.sum(p, axis=-1, keepdims=True) # Message is log-probabilities logp = np.log(p) u = [logp] return u @classmethod def from_values(cls, x): """ Return the shape of the moments for a fixed value. """ if np.ndim(x) < 1: raise ValueError("Probabilities must be given as a vector") categories = np.shape(x)[-1] return cls(categories) class DirichletDistribution(ExponentialFamilyDistribution): """ Class for the VMP formulas of Dirichlet variables. """ def compute_message_to_parent(self, parent, index, u_self, u_alpha): r""" Compute the message to a parent node. """ logp = u_self[0] m0 = logp m1 = 1 return [m0, m1] def compute_phi_from_parents(self, u_alpha, mask=True): r""" Compute the natural parameter vector given parent moments. """ return [u_alpha[0]] def compute_moments_and_cgf(self, phi, mask=True): r""" Compute the moments and :math:`g(\phi)`. .. 
            \overline{\mathbf{u}} (\boldsymbol{\phi})
            &=
            \begin{bmatrix}
                \psi(\phi_1) - \psi(\sum_d \phi_{1,d})
            \end{bmatrix}
            \\
            g_{\boldsymbol{\phi}} (\boldsymbol{\phi})
            &=
            \log \Gamma \left( \sum_d \phi_{1,d} \right)
            - \sum_d \log \Gamma (\phi_{1,d})
        """
        if np.any(np.asanyarray(phi) <= 0):
            raise ValueError("Natural parameters should be positive")
        sum_gammaln = np.sum(special.gammaln(phi[0]), axis=-1)
        gammaln_sum = special.gammaln(np.sum(phi[0], axis=-1))
        psi_sum = special.psi(np.sum(phi[0], axis=-1, keepdims=True))

        # Moments
        u0 = special.psi(phi[0]) - psi_sum
        u = [u0]

        # G
        g = gammaln_sum - sum_gammaln

        return (u, g)


    def compute_cgf_from_parents(self, u_alpha):
        r"""
        Compute :math:`\mathrm{E}_{q(p)}[g(p)]`
        """
        return u_alpha[1]


    def compute_fixed_moments_and_f(self, p, mask=True):
        r"""
        Compute the moments and :math:`f(x)` for a fixed value.

        .. math::

            u(p)
            =
            \begin{bmatrix}
                \log(p_1)
                \\
                \vdots
                \\
                \log(p_D)
            \end{bmatrix}

        .. math::

            f(p) = - \sum_d \log(p_d)
        """
        # Check that the probabilities are valid
        p = np.asanyarray(p)
        if np.ndim(p) < 1:
            raise ValueError("Probabilities must be given as a vector")
        if np.any(p < 0) or np.any(p > 1):
            raise ValueError("Probabilities must be in range [0,1]")
        if not np.allclose(np.sum(p, axis=-1), 1.0):
            raise ValueError("Probabilities must sum to one")
        # Normalize the probabilities
        p = p / np.sum(p, axis=-1, keepdims=True)
        # The moment is the log-probabilities
        logp = np.log(p)
        u = [logp]
        f = - np.sum(logp, axis=-1)
        return (u, f)


    def random(self, *phi, plates=None):
        r"""
        Draw a random sample from the distribution.
        """
        return random.dirichlet(phi[0], size=plates)


    def compute_gradient(self, g, u, phi):
        r"""
        Compute the standard gradient with respect to the natural parameters.

        The moments are

        .. math::

            \psi(\phi_1) - \psi(\sum_d \phi_{1,d})

        so the standard gradient, given the gradient with respect to the
        moments, that is, given the Riemannian gradient
        :math:`\tilde{\nabla}`, is:

        .. math::
math:: \nabla &= \begin{bmatrix} \left( \psi^{(1)}(\phi_1) - \psi^{(1)}(\sum_d \phi_{1,d}) \right) \tilde{\nabla}_1 \end{bmatrix} """ sum_phi = np.sum(phi[0], axis=-1, keepdims=True) d0 = g[0] * (special.polygamma(1, phi[0]) - special.polygamma(1, sum_phi)) return [d0] class Concentration(Stochastic): _parent_moments = () def __init__(self, D, regularization=True, **kwargs): """ ML estimation node for concentration parameters. Parameters ---------- D : int Number of categories regularization : 2-tuple of arrays (optional) "Prior" log-probability and "prior" sample number """ self.D = D self.dims = ( (D,), () ) self._moments = ConcentrationMoments(D) super().__init__(dims=self.dims, initialize=False, **kwargs) self.u = self._moments.compute_fixed_moments(np.ones(D)) if regularization is None or regularization is False: regularization = [0, 0] elif regularization is True: # Decent default regularization? regularization = [np.log(1/D), 1] self.regularization = regularization return @property def regularization(self): return self.__regularization @regularization.setter def regularization(self, regularization): if len(regularization) != 2: raise ValueError("Regularization must be a 2-tuple") if not misc.is_shape_subset(np.shape(regularization[0]), self.get_shape(0)): raise ValueError("Wrong shape") if not misc.is_shape_subset(np.shape(regularization[1]), self.get_shape(1)): raise ValueError("Wrong shape") self.__regularization = regularization return def _update_distribution_and_lowerbound(self, m): r""" Find the maximum likelihood estimate for the concentration parameter """ a = np.ones(self.D) da = np.inf logp = m[0] + self.regularization[0] N = m[1] + self.regularization[1] # Compute sufficient statistic mean_logp = logp / N[...,None] # It is difficult to estimate values lower than 0.02 because the # Dirichlet distributed probability vector starts to give numerically # zero random samples for lower values.
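As a quick sanity check of the Dirichlet formulas, the cgf term `g` from `compute_moments_and_cgf`, the `f` term from `compute_fixed_moments_and_f`, and the inner product of the natural parameters with the fixed moments add up to the familiar Dirichlet log-density. A minimal NumPy/SciPy sketch (SciPy is used only for the comparison and is not part of the node code):

```python
import numpy as np
from scipy import special, stats

alpha = np.array([2.5, 1.0, 3.0])   # natural parameters phi[0] of the Dirichlet
p = np.array([0.2, 0.3, 0.5])       # a fixed probability vector

# cgf as in compute_moments_and_cgf: g(phi) = ln Gamma(sum phi) - sum ln Gamma(phi)
g = special.gammaln(np.sum(alpha)) - np.sum(special.gammaln(alpha))

# Fixed moments as in compute_fixed_moments_and_f: u(p) = log p, f(p) = -sum log p
logp = np.log(p)
f = -np.sum(logp)

# log p(p | alpha) = g + f + <phi, u(p)>
logpdf = g + f + np.dot(alpha, logp)

assert np.allclose(logpdf, stats.dirichlet.logpdf(p, alpha))
```

This is the same decomposition that `ExponentialFamily.logpdf` uses further below for any exponential-family node.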
if np.any(np.isinf(mean_logp)): raise ValueError( "Cannot estimate DirichletConcentration because of infs. This " "means that there are numerically zero probabilities in the " "child Dirichlet node." ) # Fixed-point iteration while np.any(np.abs(da / a) > 1e-5): a_new = misc.invpsi( special.psi(np.sum(a, axis=-1, keepdims=True)) + mean_logp ) da = a_new - a a = a_new self.u = self._moments.compute_fixed_moments(a) return def initialize_from_value(self, x): self.u = self._moments.compute_fixed_moments(x) return def lower_bound_contribution(self): return ( linalg.inner(self.u[0], self.regularization[0], ndim=1) + self.u[1] * self.regularization[1] ) class Dirichlet(ExponentialFamily): r""" Node for Dirichlet random variables. The node models a set of probabilities :math:`\{\pi_0, \ldots, \pi_{K-1}\}` which satisfy :math:`\sum_{k=0}^{K-1} \pi_k = 1` and :math:`\pi_k \in [0,1] \ \forall k=0,\ldots,K-1`. .. math:: p(\pi_0, \ldots, \pi_{K-1}) = \mathrm{Dirichlet}(\alpha_0, \ldots, \alpha_{K-1}) where :math:`\alpha_k` are concentration parameters. The posterior approximation has the same functional form but with different concentration parameters. Parameters ---------- alpha : (...,K)-shaped array Prior counts :math:`\alpha_k` See also -------- Beta, Categorical, Multinomial, CategoricalMarkovChain """ _distribution = DirichletDistribution() @classmethod def _constructor(cls, alpha, **kwargs): """ Constructs distribution and moments objects. 
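The fixed-point ML update in `Concentration._update_distribution_and_lowerbound` can be sketched standalone. The `invpsi` below is a hand-rolled Newton inverse of the digamma function, a stand-in for `bayespy.utils.misc.invpsi` (assumed behavior, not the library's exact code); when fed an exact sufficient statistic, the iteration recovers the true concentration:

```python
import numpy as np
from scipy import special

def invpsi(y, iters=10):
    # Inverse digamma via Newton's method with Minka's initialization;
    # a stand-in for bayespy.utils.misc.invpsi
    y = np.asarray(y, dtype=float)
    x = np.where(y >= -2.22, np.exp(y) + 0.5, -1.0 / (y - special.psi(1)))
    for _ in range(iters):
        x = x - (special.psi(x) - y) / special.polygamma(1, x)
    return x

# Sufficient statistic E[log p] implied by a known concentration; the
# ML fixed point should then recover that concentration.
alpha_true = np.array([0.5, 2.0, 5.0])
mean_logp = special.psi(alpha_true) - special.psi(np.sum(alpha_true))

a = np.ones(3)
for _ in range(1000):
    a_new = invpsi(special.psi(np.sum(a, axis=-1, keepdims=True)) + mean_logp)
    converged = np.max(np.abs(a_new - a) / a) < 1e-10
    a = a_new
    if converged:
        break

assert np.allclose(a, alpha_true, rtol=1e-3)
```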
""" # Number of categories alpha = cls._ensure_moments(alpha, ConcentrationMoments) parent_moments = (alpha._moments,) parents = [alpha] categories = alpha.dims[0][0] moments = DirichletMoments(categories) return ( parents, kwargs, moments.dims, cls._total_plates(kwargs.get('plates'), alpha.plates), cls._distribution, moments, parent_moments ) def __str__(self): """ Show distribution as a string """ alpha = self.phi[0] return ("%s ~ Dirichlet(alpha)\n" " alpha =\n" "%s" % (self.name, alpha)) ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/dot.py0000644000175100001770000006035100000000000022472 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2011-2014 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ import numpy as np from bayespy.utils import misc from .node import Node from .deterministic import Deterministic from .gaussian import Gaussian, GaussianMoments from .gaussian import GaussianGammaMoments from .ml import DeltaMoments class SumMultiply(Deterministic): r""" Node for computing general products and sums of Gaussian nodes. The node is similar to `numpy.einsum`, which is a very general function for computing dot products, sums, products and other sums of products of arrays. For instance, consider the following arrays: >>> import numpy as np >>> X = np.random.randn(2, 3, 4) >>> Y = np.random.randn(3, 5) >>> Z = np.random.randn(4, 2) Then, the Einstein summation can be used as: >>> np.einsum('abc,bd,ca->da', X, Y, Z) array([[...]]) SumMultiply node can be used similarly for Gaussian nodes. 
For instance, consider the following Gaussian nodes: >>> from bayespy.nodes import GaussianARD >>> X = GaussianARD(0, 1, shape=(2, 3, 4)) >>> Y = GaussianARD(0, 1, shape=(3, 5)) >>> Z = GaussianARD(0, 1, shape=(4, 2)) Then, similarly to `numpy.einsum`, SumMultiply could be used as: >>> from bayespy.nodes import SumMultiply >>> SumMultiply('abc,bd,ca->da', X, Y, Z) or >>> SumMultiply(X, [0,1,2], Y, [1,3], Z, [2,0], [3,0]) which is similar to the alternative syntax of numpy.einsum. This node operates similarly to numpy.einsum. However, you must use all the elements of each node, that is, an operation like np.einsum('ii->i',X) is not allowed. Thus, each axis of each node must be given a unique id. The id identifies which axes of the different nodes correspond to each other. Also, Ellipsis ('...') is not yet supported for simplicity. It would also cause some problems with constant inputs (because it is unclear how to determine ndim), so let us just forget it for now. Each output axis must appear in the input mappings. The keys must refer to variable dimension axes only, not plate axes. The input nodes may be Gaussian-gamma (isotropic) nodes. The output message is Gaussian-gamma (isotropic) if any of the input nodes is Gaussian-gamma. Examples -------- Sum over the rows: 'ij->j' Inner product of three vectors: 'i,i,i' Matrix-vector product: 'ij,j->i' Matrix-matrix product: 'ik,kj->ij' Outer product: 'i,j->ij' Vector-matrix-vector product: 'i,ij,j' Notes ----- This operation can be extremely slow if not used wisely. For large and complex operations, it is sometimes more efficient to split the operation into multiple nodes. For instance, the example above could probably be computed faster by >>> XZ = SumMultiply(X, [0,1,2], Z, [2,0], [0,1]) >>> F = SumMultiply(XZ, [0,1], Y, [1,2], [2,0]) because the third axis ('c') can be summed out already in the first operation. This same effect applies to numpy.einsum in general.
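The claim in the Notes section — that splitting the contraction into two steps gives the same result — can be verified with plain NumPy arrays standing in for the Gaussian nodes:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((2, 3, 4))
Y = rng.standard_normal((3, 5))
Z = rng.standard_normal((4, 2))

# One-shot contraction, as in SumMultiply(X, [0,1,2], Y, [1,3], Z, [2,0], [3,0])
direct = np.einsum(X, [0, 1, 2], Y, [1, 3], Z, [2, 0], [3, 0])

# Two-step version, summing out axis 'c' (key 2) first, as in
# XZ = SumMultiply(X, [0,1,2], Z, [2,0], [0,1])
# F  = SumMultiply(XZ, [0,1], Y, [1,2], [2,0])
XZ = np.einsum(X, [0, 1, 2], Z, [2, 0], [0, 1])
two_step = np.einsum(XZ, [0, 1], Y, [1, 2], [2, 0])

assert np.allclose(direct, two_step)
```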
""" def __init__(self, *args, iterator_axis=None, **kwargs): """ SumMultiply(Node1, map1, Node2, map2, ..., NodeN, mapN [, map_out]) """ args = list(args) if len(args) < 2: raise ValueError("Not enough inputs") if iterator_axis is not None: raise NotImplementedError("Iterator axis not implemented yet") if iterator_axis is not None and not isinstance(iterator_axis, int): raise ValueError("Iterator axis must be integer") # Two different parsing methods, depends on how the arguments are given if misc.is_string(args[0]): # This is the format: # SumMultiply('ik,k,kj->ij', X, Y, Z) strings = args[0] nodes = args[1:] # Remove whitespace strings = misc.remove_whitespace(strings) # Split on '->' (should contain only one '->' or none) strings = strings.split('->') if len(strings) > 2: raise ValueError('The string contains too many ->') strings_in = strings[0] if len(strings) == 2: string_out = strings[1] else: string_out = '' # Split former part on ',' (the number of parts should be equal to # nodes) strings_in = strings_in.split(',') if len(strings_in) != len(nodes): raise ValueError('Number of given input nodes is different ' 'from the input keys in the string') # Split strings into key lists using single character keys keysets = [list(string_in) for string_in in strings_in] keys_out = list(string_out) else: # This is the format: # SumMultiply(X, [0,2], Y, [2], Z, [2,1], [0,1]) # If given, the output mapping is the last argument if len(args) % 2 == 0: keys_out = [] else: keys_out = args.pop(-1) # Node and axis mapping are given in turns nodes = args[::2] keysets = args[1::2] # Find all the keys (store only once each) full_keyset = [] for keyset in keysets: full_keyset += keyset #full_keyset += list(keyset.keys()) full_keyset = list(set(full_keyset)) # Input and output messages are Gaussian unless there is at least one # Gaussian-gamma message from the parents self.gaussian_gamma = False self.is_constant = [] for i in range(len(nodes)): try: # First, try to handle the node 
as a "constant" with so-called # delta moments. These consume much less memory as only the # "value" (or the first moment) is used. nodes[i] = self._ensure_moments( nodes[i], DeltaMoments, ndim=len(keysets[i]) ) except DeltaMoments.NoConverterError: self.is_constant.append(False) try: # Second, try to convert to gaussian moments. These moments # will consume more memory as it calculates the second # moment too. nodes[i] = self._ensure_moments( nodes[i], GaussianMoments, ndim=len(keysets[i]) ) except GaussianMoments.NoConverterError: self.gaussian_gamma = True else: self.is_constant.append(True) if self.gaussian_gamma: nodes = [ ( self._ensure_moments( node, GaussianGammaMoments, ndim=len(keyset) ) if not is_const else node ) for (node, keyset, is_const) in zip(nodes, keysets, self.is_constant) ] self._parent_moments = tuple(node._moments for node in nodes) # # Check the validity of each node # for n in range(len(nodes)): # Check that the maps and the size of the variable are consistent if len(nodes[n].dims[0]) != len(keysets[n]): raise ValueError("Wrong number of keys (%d) for the node " "number %d with %d dimensions" % (len(keysets[n]), n, len(nodes[n].dims[0]))) # Check that the keys are unique if len(set(keysets[n])) != len(keysets[n]): raise ValueError("Axis keys for node number %d are not unique" % n) # Check the validity of output keys: each output key must be included in # the input keys if len(keys_out) != len(set(keys_out)): raise ValueError("Output keys are not unique") for key in keys_out: if key not in full_keyset: raise ValueError("Output key %s does not appear in any input" % key) # Check the validity of the nodes with respect to the key mapping. # Check that the node dimensions map and broadcast properly, that is, # all the nodes using the same key for axes must have equal size for # those axes (or size 1). 
broadcasted_size = {} for key in full_keyset: broadcasted_size[key] = 1 for (node, keyset) in zip(nodes, keysets): try: # Find the axis for the key index = keyset.index(key) except ValueError: # OK, this node doesn't use this key for any axis pass else: # Length of the axis for that key node_size = node.dims[0][index] if node_size != broadcasted_size[key]: if broadcasted_size[key] == 1: # Apply broadcasting broadcasted_size[key] = node_size elif node_size != 1: # Different sizes and neither has size 1 raise ValueError("Axes using key %s do not " "broadcast properly" % key) # Compute the shape of the output shape = tuple([broadcasted_size[key] for key in keys_out]) if self.gaussian_gamma: self._moments = GaussianGammaMoments(shape) else: self._moments = GaussianMoments(shape) # Rename the keys to [0,1,...,N-1] where N is the total number of keys self.N_keys = len(full_keyset) self.out_keys = [full_keyset.index(key) for key in keys_out] self.in_keys = [ [full_keyset.index(key) for key in keyset] for keyset in keysets ] super().__init__(*nodes, dims=self._moments.dims, **kwargs) def _compute_function(self, *x_parents): # TODO: Add unit tests for this function (xs, alphas) = ( (x_parents, 1) if not self.gaussian_gamma else zip(*x_parents) ) # Add Ellipsis for the plates in_keys = [[Ellipsis] + k for k in self.in_keys] out_keys = [Ellipsis] + self.out_keys samples_and_keys = misc.zipper_merge(xs, in_keys) y = np.einsum(*(samples_and_keys + [out_keys])) return ( y if not self.gaussian_gamma else (y, misc.multiply(*alphas)) ) def _compute_moments(self, *u_parents): # Compute the number of plate axes for each node plate_counts0 = [ (np.ndim(u_parent[0]) - len(keys)) for (keys,u_parent) in zip(self.in_keys, u_parents) ] plate_counts1 = [ ( # Gaussian moments: Use second moments "matrix" (np.ndim(u_parent[1]) - 2*len(keys)) if not is_const else # Delta moments: Use first moment "vector" (np.ndim(u_parent[0]) - len(keys)) ) for (keys, u_parent, is_const) in zip( self.in_keys, 
u_parents, self.is_constant ) ] # The number of plate axes for the output N0 = max(plate_counts0) N1 = max(plate_counts1) # The total number of unique keys used (keys are 0,1,...,N_keys-1) D = self.N_keys # # Compute the mean # out_all_keys = list(range(D+N0-1, D-1, -1)) + self.out_keys #nodes_dim_keys = self.nodes_dim_keys in_all_keys = [list(range(D+plate_count-1, D-1, -1)) + keys for (plate_count, keys) in zip(plate_counts0, self.in_keys)] u0 = [u[0] for u in u_parents] args = misc.zipper_merge(u0, in_all_keys) + [out_all_keys] x0 = np.einsum(*args) # # Compute the covariance # out_all_keys = (list(range(2*D+N1-1, 2*D-1, -1)) + [D+key for key in self.out_keys] + self.out_keys) in_all_keys = [ x for (plate_count, node_keys, is_const) in zip( plate_counts1, self.in_keys, self.is_constant, ) for x in ( # Gaussian moments: Use the second moment [ list(range(2*D+plate_count-1, 2*D-1, -1)) + [D+key for key in node_keys] + node_keys ] if not is_const else # Delta moments: Use the first moment twice [ ( list(range(2*D+plate_count-1, 2*D-1, -1)) + [D+key for key in node_keys] ), ( list(range(2*D+plate_count-1, 2*D-1, -1)) + node_keys ), ] ) ] u1 = [ x for (u, is_const) in zip(u_parents, self.is_constant) for x in ( # Gaussian moments: Use the second moment [u[1]] if not is_const else # Delta moments: Use the first moment twice [u[0], u[0]] ) ] args = misc.zipper_merge(u1, in_all_keys) + [out_all_keys] x1 = np.einsum(*args) if not self.gaussian_gamma: return [x0, x1] # Compute Gaussian-gamma specific moments x2 = 1 x3 = 0 for i in range(len(u_parents)): x2 = x2 * (1 if self.is_constant[i] else u_parents[i][2]) x3 = x3 + (0 if self.is_constant[i] else u_parents[i][3]) return [x0, x1, x2, x3] def get_parameters(self): # Compute mean and variance u = self.get_moments() u[1] -= u[0]**2 return u def _message_to_parent(self, index, u_parent=None): """ Compute the message and mask to a parent node.
""" if self.is_constant[index]: raise NotImplementedError( "Message to DeltaMoments parent not yet implemented." ) # Check index if index >= len(self.parents): raise ValueError("Parent index larger than the number of parents") # Get messages from other parents and children u_parents = self._message_from_parents(exclude=index) m = self._message_from_children() mask = self.mask # Normally we don't need to care about masks when computing the # message. However, in this node we want to avoid computing huge message # arrays so we sum some axes already here. Thus, we need to apply the # mask. # # Actually, we don't need to care about masks because the message from # children has already been masked. parent = self.parents[index] # # Compute the first message # msg = [None, None] # Compute the two messages for ind in range(2): # The total number of keys for the non-plate dimensions N = (ind+1) * self.N_keys parent_num_dims = len(parent.dims[ind]) parent_num_plates = len(parent.plates) parent_plate_keys = list(range(N + parent_num_plates, N, -1)) parent_dim_keys = self.in_keys[index] if ind == 1: parent_dim_keys = ([key + self.N_keys for key in self.in_keys[index]] + parent_dim_keys) args = [] # This variable counts the maximum number of plates of the # arguments, thus it will tell the number of plates in the result # (if the artificially added plates above were ignored). 
result_num_plates = 0 result_plates = () # Mask and its keys mask_num_plates = np.ndim(mask) mask_plates = np.shape(mask) mask_plate_keys = list(range(N + mask_num_plates, N, -1)) result_num_plates = max(result_num_plates, mask_num_plates) result_plates = misc.broadcasted_shape(result_plates, mask_plates) # Moments and keys of other parents for (k, u) in enumerate(u_parents): if k != index: num_dims = ( (ind+1) * len(self.in_keys[k]) if not self.is_constant[k] else len(self.in_keys[k]) ) ui = ( u[ind] if not self.is_constant[k] else u[0] ) num_plates = np.ndim(ui) - num_dims plates = np.shape(ui)[:num_plates] plate_keys = list(range(N + num_plates, N, -1)) if ind == 0: args.append(ui) args.append(plate_keys + self.in_keys[k]) else: in_keys2 = [key + self.N_keys for key in self.in_keys[k]] if not self.is_constant[k]: # Gaussian moments: Use second moment once args.append(ui) args.append(plate_keys + in_keys2 + self.in_keys[k]) else: # Delta moments: Use first moment twice args.append(ui) args.append(plate_keys + self.in_keys[k]) args.append(ui) args.append(plate_keys + in_keys2) result_num_plates = max(result_num_plates, num_plates) result_plates = misc.broadcasted_shape(result_plates, plates) # Message and keys from children child_num_dims = (ind+1) * len(self.out_keys) child_num_plates = np.ndim(m[ind]) - child_num_dims child_plates = np.shape(m[ind])[:child_num_plates] child_plate_keys = list(range(N + child_num_plates, N, -1)) child_dim_keys = self.out_keys if ind == 1: child_dim_keys = ([key + self.N_keys for key in self.out_keys] + child_dim_keys) args.append(m[ind]) args.append(child_plate_keys + child_dim_keys) result_num_plates = max(result_num_plates, child_num_plates) result_plates = misc.broadcasted_shape(result_plates, child_plates) # Output keys, that is, the keys of the parent[index] parent_keys = parent_plate_keys + parent_dim_keys # Performance trick: Check which axes can be summed because they # have length 1 or are non-existing in parent[index].
Thus, remove # keys corresponding to unit length axes in parent[index] so that # einsum sums over those axes. After computations, these axes must # be added back in order to get the correct shape for the message. # Also, remove axes/keys that are in output (parent[index]) but not in # any inputs (children and other parents). parent_shape = parent.get_shape(ind) removed_axes = [] for j in range(len(parent_keys)): if parent_shape[j] == 1: # Remove the key (take into account the number of keys that # have already been removed) del parent_keys[j-len(removed_axes)] removed_axes.append(j) else: # Remove the key if it doesn't appear in any of the # messages from children or other parents. if not np.any([parent_keys[j-len(removed_axes)] in keys for keys in args[1::2]]): del parent_keys[j-len(removed_axes)] removed_axes.append(j) args.append(parent_keys) # THE BEEF: Compute the message msg[ind] = np.einsum(*args) # Find the correct shape for the message array message_shape = list(np.shape(msg[ind])) # First, add back the axes with length 1 for ax in removed_axes: message_shape.insert(ax, 1) # Second, remove leading axes for plates that were not present in # the child nor other parents' messages. This is not really # necessary, but it is just elegant to remove the leading unit # length axes that we added artificially at the beginning just # because we wanted the key mapping to be simple. if parent_num_plates > result_num_plates: del message_shape[:(parent_num_plates-result_num_plates)] # Then, the actual reshaping msg[ind] = np.reshape(msg[ind], message_shape) # Broadcasting is not supported for variable dimensions, thus force # explicit correct shape for variable dimensions var_dims = parent.dims[ind] msg[ind] = msg[ind] * np.ones(var_dims) # Apply plate multiplier: If this node has non-unit plates that are # unit plates in the parent, those plates are summed. 
However, if # the message has unit axis for that plate, it should be first # broadcasted to the plates of this node and then summed to the # plates of the parent. In order to avoid this broadcasting and # summing, it is more efficient to just multiply by the correct # factor. r = self.broadcasting_multiplier(self.plates, result_plates, parent.plates) if r != 1: msg[ind] *= r if self.gaussian_gamma: alphas = [ (u_parents[i][2] if not is_const else 1.0) for (i, is_const) in zip(range(len(u_parents)), self.is_constant) if i != index ] m2 = self._compute_message(m[2], mask, *alphas, ndim=0, plates_from=self.plates, plates_to=parent.plates) m3 = self._compute_message(m[3], mask, ndim=0, plates_from=self.plates, plates_to=parent.plates) msg = msg + [m2, m3] return msg def Dot(*args, **kwargs): """ Node for computing inner product of several Gaussian vectors. This is a simple wrapper of the much more general SumMultiply. For now, it is here for backward compatibility. """ einsum = 'i' + ',i'*(len(args)-1) return SumMultiply(einsum, *args, **kwargs) bayespy-0.6.2/bayespy/inference/vmp/nodes/expfamily.py ################################################################################ # Copyright (C) 2013-2014 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ import warnings import numpy as np from bayespy.utils import misc from .node import ensureparents from .stochastic import Stochastic, Distribution class ExponentialFamilyDistribution(Distribution): """ Sub-classes implement distribution specific computations.
""" # # The following methods are for ExponentialFamily distributions # def compute_message_to_parent(self, parent, index, u_self, *u_parents): raise NotImplementedError() def compute_phi_from_parents(self, *u_parents, mask=True): raise NotImplementedError() def compute_moments_and_cgf(self, phi, mask=True): raise NotImplementedError() # # The following methods are for Mixture class # def compute_cgf_from_parents(self, *u_parents): raise NotImplementedError() def compute_fixed_moments_and_f(self, x, mask=True): raise NotImplementedError() def compute_logpdf(self, u, phi, g, f, ndims): """ Compute E[log p(X)] given E[u], E[phi], E[g] and E[f]. Does not sum over plates.""" # TODO/FIXME: Should I take into account what is latent or # observed, or what is even totally ignored (by the mask). L = g + f for (phi_i, u_i, ndims_i) in zip(phi, u, ndims): # Axes to sum (dimensions of the variable, not the plates) axis_sum = tuple(range(-ndims_i,0)) # Compute the term # TODO/FIXME: Use einsum! L = L + np.sum( np.where(u_i != 0, phi_i, 0) * u_i, axis=axis_sum ) return L def compute_gradient(self, g, u, phi): r""" Compute the standard gradient with respect to the natural parameters. """ raise NotImplementedError("Standard gradient not yet implemented for %s" % (self.__class__.__name__)) def useconstructor(__init__): def constructor_decorator(self, *args, **kwargs): if (self.dims is None or self._distribution is None or self._moments is None or self._parent_moments is None): (args, kwargs, dims, plates, dist, stats, pstats) = \ self._constructor(*args, **kwargs) self.dims = dims self._distribution = dist self._moments = stats self._parent_moments = pstats self.plates = plates __init__(self, *args, **kwargs) return constructor_decorator class ExponentialFamily(Stochastic): """ A base class for nodes using natural parameterization `phi`. 
Sub-classes must implement the following static methods: _compute_message_to_parent(index, u_self, *u_parents) _compute_phi_from_parents(*u_parents, mask) _compute_moments_and_cgf(phi, mask) _compute_fixed_moments_and_f(x, mask=True) Sub-classes may need to re-implement: 1. If they manipulate plates: _compute_weights_to_parent(index, weights) _compute_plates_to_parent(self, index, plates) _compute_plates_from_parent(self, index, plates) """ # Sub-classes should overwrite this (possibly using _constructor) dims = None # Sub-classes should overwrite this _distribution = None @useconstructor def __init__(self, *parents, initialize=True, phi_bias=None, **kwargs): self.annealing = 1.0 # Terms for the lower bound (G for latent and F for observed) self.g = np.array(np.nan) self.f = np.array(np.nan) self._phi_bias = phi_bias if phi_bias is not None else len(self.dims) * [0.0] super().__init__(*parents, initialize=initialize, dims=self.dims, **kwargs) if not initialize: axes = len(self.plates)*(1,) self.phi = [misc.nans(axes+dim) for dim in self.dims]
""" parent_plates = [cls._distribution.plates_from_parent(ind, parent.plates) for (ind, parent) in enumerate(parents)] return (parents, kwargs, cls.dims, cls._total_plates(kwargs.get('plates'), *parent_plates), cls._distribution, cls._moments, cls._parent_moments) def _initialize_from_parent_moments(self, *u_parents): if not np.all(self.observed): # Update natural parameters using parents self._update_phi_from_parents(*u_parents) # Update moments mask = np.logical_not(self.observed) (u, g) = self._distribution.compute_moments_and_cgf(self.phi, mask=mask) # TODO/FIXME/BUG: You should use observation mask in order to not # overwrite them! self._set_moments_and_cgf(u, g, mask=mask) def initialize_from_prior(self): u_parents = self._message_from_parents() self._initialize_from_parent_moments(*u_parents) def initialize_from_parameters(self, *args): u_parents = [p_mom.compute_fixed_moments(x) for (p_mom, x) in zip(self._parent_moments, args)] self._initialize_from_parent_moments(*u_parents) def initialize_from_value(self, x, *args): # Update moments from value mask = np.logical_not(self.observed) u = self._moments.compute_fixed_moments(x, *args) # Check that the shape is correct for i in range(len(u)): ndim = len(self.dims[i]) if ndim > 0: if np.shape(u[i])[-ndim:] != self.dims[i]: raise ValueError("The initial value for node %s has invalid shape %s." % (np.shape(x))) self._set_moments_and_cgf(u, np.inf, mask=mask) def initialize_from_random(self): """ Set the variable to a random sample from the current distribution. """ #self.initialize_from_prior() X = self.random() self.initialize_from_value(X) def _update_phi_from_parents(self, *u_parents): # TODO/FIXME: Could this be combined to the function # _update_distribution_and_lowerbound ? # No, because some initialization methods may want to use this. 
# This makes correct broadcasting self.phi = [ a + b for (a, b) in zip( self._distribution.compute_phi_from_parents(*u_parents), self._phi_bias ) ] # Make sure phi has the correct number of axes. It makes life # a bit easier elsewhere. for i in range(len(self.phi)): axes = len(self.plates) + self.ndims[i] - np.ndim(self.phi[i]) if axes > 0: # Add axes self.phi[i] = misc.add_leading_axes(self.phi[i], axes) elif axes < 0: # Remove extra leading axes first = -(len(self.plates)+self.ndims[i]) sh = np.shape(self.phi[i])[first:] self.phi[i] = np.reshape(self.phi[i], sh) # Check that the shape is correct if not misc.is_shape_subset(np.shape(self.phi[i]), self.get_shape(i)): raise ValueError("Incorrect shape of phi[%d] in node class %s. " "Shape is %s but it should be broadcastable " "to shape %s." % (i, self.__class__.__name__, np.shape(self.phi[i]), self.get_shape(i))) def _set_moments_and_cgf(self, u, g, mask=True): self._set_moments(u, mask=mask) self.g = np.where(mask, g, self.g) return def get_riemannian_gradient(self): r""" Computes the Riemannian/natural gradient. """ u_parents = self._message_from_parents() m_children = self._message_from_children() # TODO/FIXME: Put observed plates to zero? # Compute the gradient phi = [ a + b for (a, b) in zip( self._distribution.compute_phi_from_parents(*u_parents), self._phi_bias ) ] for i in range(len(self.phi)): phi[i] = self.annealing * (phi[i] + m_children[i]) - self.phi[i] phi[i] = phi[i] * np.ones(self.get_shape(i)) return phi def get_gradient(self, rg): r""" Computes gradient with respect to the natural parameters. The function takes the Riemannian gradient as an input. This is for three reasons: 1) You probably want to use the Riemannian gradient anyway so this helps avoiding accidental use of this function. 2) The gradient is computed by using the Riemannian gradient and chain rules. 
3) Probably you need both Riemannian and normal gradients anyway so you can provide it to this function to avoid re-computing it.""" g = self._distribution.compute_gradient(rg, self.u, self.phi) for i in range(len(g)): g[i] /= self.annealing return g ## def update_parameters(self, d, scale=1.0): ## r""" ## Update the parameters of the VB distribution given a change. ## The parameters should be such that they can be used for ## optimization, that is, use log transformation for positive ## parameters. ## """ ## phi = self.get_parameters() ## for i in range(len(phi)): ## phi[i] = phi[i] + scale*d[i] ## self.set_parameters(phi) ## return def get_parameters(self): r""" Return parameters of the VB distribution. The parameters should be such that they can be used for optimization, that is, use log transformation for positive parameters. """ return [np.copy(p) for p in self.phi] def _decode_parameters(self, x): return [np.copy(p) for p in x] def set_parameters(self, x): r""" Set the parameters of the VB distribution. The parameters should be such that they can be used for optimization, that is, use log transformation for positive parameters. """ self.phi = self._decode_parameters(x) self._update_moments_and_cgf() return def _update_distribution_and_lowerbound(self, m_children, *u_parents): # Update phi first from parents.. self._update_phi_from_parents(*u_parents) # .. then just add children's message self.phi = [self.annealing * (phi + m) for (phi, m) in zip(self.phi, m_children)] # Update u and g self._update_moments_and_cgf() def _update_moments_and_cgf(self): """ Update moments and cgf based on current phi. """ # Mask for plates to update (i.e., unobserved plates) update_mask = np.logical_not(self.observed) # Compute the moments (u) and CGF (g)... (u, g) = self._distribution.compute_moments_and_cgf(self.phi, mask=update_mask) # ... 
and store them self._set_moments_and_cgf(u, g, mask=update_mask) def observe(self, x, *args, mask=True): """ Fix moments, compute f and propagate mask. """ # Compute fixed moments (u, f) = self._distribution.compute_fixed_moments_and_f(x, *args, mask=mask) # # Check the dimensionality of the observations # self._check_shape() # for (i,v) in enumerate(u): # # This is what the dimensionality "should" be # s = self.plates + self.dims[i] # t = np.shape(v) # if s != t: # msg = "Dimensionality of the observations incorrect." # msg += "\nShape of input: " + str(t) # msg += "\nExpected shape: " + str(s) # msg += "\nCheck plates." # raise Exception(msg) # Set the moments. Shape checking is done there. self._set_moments(u, mask=mask, broadcast=False) self.f = np.where(mask, f, self.f) # Observed nodes should not be ignored self.observed = mask self._update_mask() def lower_bound_contribution(self, gradient=False, ignore_masked=True): r"""Compute E[ log p(X|parents) - log q(X) ] If deterministic annealing is used, the term E[ -log q(X) ] is divided by the anneling coefficient. That is, phi and cgf of q are multiplied by the temperature (inverse annealing coefficient). """ # Annealing temperature T = 1 / self.annealing # Messages from parents u_parents = self._message_from_parents() phi = [ a # + b # TODO: Should the bias be added here or not? for (a, b) in zip( self._distribution.compute_phi_from_parents(*u_parents), self._phi_bias ) ] # G from parents L = self._distribution.compute_cgf_from_parents(*u_parents) # G for unobserved variables (ignored variables are handled properly # automatically) latent_mask = np.logical_not(self.observed) # G and F if np.all(self.observed): z = np.nan elif T == 1: z = -self.g else: z = -T * self.g ## TRIED THIS BUT IT WAS WRONG: ## z = -T * self.g + (1-T) * self.f ## if np.any(np.isnan(self.f)): ## warnings.warn("F(x) not implemented for node %s. This " ## "is required for annealed lower bound " ## "computation." 
% self.__class__.__name__) ## ## It was wrong because the optimal q distribution has f which is ## weighted by 1/T and here the f of q is weighted by T so the ## total weight is 1, thus it cancels out with f of p. L = L + np.where(self.observed, self.f, z) for (phi_p, phi_q, u_q, dims) in zip(phi, self.phi, self.u, self.dims): # Form a mask which puts observed variables to zero and # broadcasts properly latent_mask_i = misc.add_trailing_axes( misc.add_leading_axes( latent_mask, len(self.plates) - np.ndim(latent_mask)), len(dims)) axis_sum = tuple(range(-len(dims),0)) # Compute the term phi_q = np.where(latent_mask_i, phi_q, 0) # Apply annealing phi_diff = phi_p - T * phi_q # Handle 0 * -inf phi_diff = np.where(u_q != 0, phi_diff, 0) # TODO/FIXME: Use einsum here? Z = np.sum(phi_diff * u_q, axis=axis_sum) L = L + Z if ignore_masked: return (np.sum(np.where(self.mask, L, 0)) * self.broadcasting_multiplier(self.plates, np.shape(L), np.shape(self.mask)) * np.prod(self.plates_multiplier)) else: return (np.sum(L) * self.broadcasting_multiplier(self.plates, np.shape(L)) * np.prod(self.plates_multiplier)) def logpdf(self, X, mask=True): """ Compute the log probability density function Q(X) of this node. """ if mask is not True: raise NotImplementedError('Mask not yet implemented') (u, f) = self._distribution.compute_fixed_moments_and_f(X, mask=mask) Z = 0 for (phi_d, u_d, dims) in zip(self.phi, u, self.dims): axis_sum = tuple(range(-len(dims),0)) # TODO/FIXME: Use einsum here? Z = Z + np.sum(phi_d * u_d, axis=axis_sum) #Z = Z + misc.sum_multiply(phi_d, u_d, axis=axis_sum) return (self.g + f + Z) def pdf(self, X, mask=True): """ Compute the probability density function of this node. """ return np.exp(self.logpdf(X, mask=mask)) def _save(self, group): """ Save the state of the node into a HDF5 file. 
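The `logpdf` method above evaluates the exponential-family identity
log q(x) = g + f(x) + sum_i phi_i * u_i(x). As a sanity check, here is a self-contained sketch for the gamma case (natural parameters phi = [-b, a], sufficient statistics u = [x, log x], f(x) = -log x), compared against SciPy's parameterization. The helper name `gamma_logpdf_expfamily` is illustrative, not BayesPy API:

```python
import numpy as np
from scipy import special, stats

def gamma_logpdf_expfamily(x, a, b):
    # Natural parameters of Gamma(a, b): phi = [-b, a]
    phi = [-b, a]
    # Sufficient statistics of the fixed value: u = [x, log(x)]
    u = [x, np.log(x)]
    # f(x) = -log(x); g(phi) = a*log(b) - gammaln(a)
    f = -np.log(x)
    g = phi[1] * np.log(-phi[0]) - special.gammaln(phi[1])
    # log q(x) = g + f(x) + sum_i phi_i * u_i(x)
    return g + f + phi[0] * u[0] + phi[1] * u[1]

x = np.array([0.5, 1.0, 2.5])
lp = gamma_logpdf_expfamily(x, a=3.0, b=2.0)

# Reference: SciPy uses a scale (inverse rate) parameter
ref = stats.gamma.logpdf(x, 3.0, scale=1/2.0)
```

Both routes give (a-1)·log x - b·x + a·log b - gammaln(a), so `lp` and `ref` agree elementwise.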
        group can be the root
        """
        ## if name is None:
        ##     name = self.name
        ## subgroup = group.create_group(name)
        for i in range(len(self.phi)):
            misc.write_to_hdf5(group, self.phi[i], 'phi%d' % i)
        misc.write_to_hdf5(group, self.f, 'f')
        misc.write_to_hdf5(group, self.g, 'g')
        super()._save(group)

    def _load(self, group):
        """
        Load the state of the node from a HDF5 file.
        """
        # TODO/FIXME: Check that the shapes are correct!
        for i in range(len(self.phi)):
            phii = group['phi%d' % i][...]
            self.phi[i] = phii
        self.f = group['f'][...]
        self.g = group['g'][...]
        super()._load(group)

    def random(self):
        """
        Draw a random sample from the distribution.
        """
        return self._distribution.random(*(self.phi), plates=self.plates)


# bayespy/inference/vmp/nodes/exponential.py

################################################################################
# Copyright (C) 2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
Module for the exponential distribution node.
"""

from .gamma import (GammaMoments, Gamma)
from .expfamily import ExponentialFamily


ExponentialMoments = GammaMoments


class Exponential(Gamma):
    r"""
    Node for exponential random variables.

    .. warning:: Use :class:`Gamma` instead of this. `Exponential(l)` is
       equivalent to `Gamma(1, l)`.

    Parameters
    ----------

    l : gamma-like node or scalar or array
        Rate parameter

    See also
    --------

    Gamma, Poisson

    Notes
    -----

    For simplicity, this is just a gamma node with the first parent fixed to
    one. Note that this is a bit inconsistent with the BayesPy philosophy which
    states that the node does not only define the form of the prior
    distribution but more importantly the form of the posterior approximation.
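The warning above states that `Exponential(l)` is equivalent to `Gamma(1, l)`. The equivalence of the two densities can be checked directly with SciPy; note that SciPy parameterizes both distributions with a scale (inverse rate), so the rate `l` enters as `scale=1/l`. This check is illustrative and not part of BayesPy:

```python
import numpy as np
from scipy import stats

rate = 1.5
x = np.linspace(0.1, 5.0, 7)

# Exponential(rate) log-density via SciPy's scale parameterization
lp_expon = stats.expon.logpdf(x, scale=1/rate)

# Gamma(1, rate) log-density: shape a=1, scale=1/rate
lp_gamma = stats.gamma.logpdf(x, 1.0, scale=1/rate)
```

The two arrays agree elementwise, which is exactly why the node can be implemented as a gamma node with the first parent fixed to one.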
    Thus, one might expect that this node would have an exponential posterior
    approximation. However, it has a gamma distribution. Also, the moments are
    gamma moments, although only E[x] would be a moment of an exponential
    random variable. All this was done because: a) gamma was already
    implemented, so there was no need to implement anything, and b) people
    might easily use an Exponential node as a prior definition and expect to
    get a gamma posterior (which is what happens now). Maybe some day a pure
    Exponential node is implemented and users are advised to use Gamma(1, b)
    if they want an exponential prior distribution but a gamma posterior
    approximation.
    """

    def __init__(self, l, **kwargs):
        raise NotImplementedError("Not yet implemented. Use Gamma(1, lambda)")
        super().__init__(1, l, **kwargs)

    @classmethod
    def _constructor(cls, l, **kwargs):
        raise NotImplementedError("Not yet implemented. Use Gamma(1, lambda)")


# bayespy/inference/vmp/nodes/gamma.py

################################################################################
# Copyright (C) 2011-2012,2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
Module for the gamma distribution node.
"""

import numpy as np
import scipy.special as special

from .node import Node, Moments, ensureparents
from .deterministic import Deterministic
from .stochastic import Stochastic
from .expfamily import ExponentialFamily, ExponentialFamilyDistribution
from .constant import Constant

from bayespy.utils import misc
from bayespy.utils import random


def diagonal(alpha):
    """
    Create a diagonal Wishart node from a Gamma node.
""" return _GammaToDiagonalWishart(alpha, name=alpha.name + " as Wishart") class GammaPriorMoments(Moments): """ Class for the moments of the shape parameter in gamma distributions. """ dims = ( (), () ) def compute_fixed_moments(self, a): """ Compute the moments for a fixed value """ a = np.asanyarray(a) if np.any(a <= 0): raise ValueError("Shape parameter must be positive") u0 = a u1 = special.gammaln(a) return [u0, u1] @classmethod def from_values(cls, a): """ Return the shape of the moments for a fixed value. """ return cls() class GammaMoments(Moments): """ Class for the moments of gamma variables. """ dims = ( (), () ) def compute_fixed_moments(self, x): """ Compute the moments for a fixed value """ x = np.asanyarray(x) if np.any(x < 0): raise ValueError("Values must be positive") u0 = x u1 = np.log(x) return [u0, u1] @classmethod def from_values(cls, x): """ Return the shape of the moments for a fixed value. """ return cls() class GammaDistribution(ExponentialFamilyDistribution): """ Class for the VMP formulas of gamma variables. """ def compute_message_to_parent(self, parent, index, u_self, u_a, u_b): r""" Compute the message to a parent node. """ x = u_self[0] logx = u_self[1] if index == 0: b = u_b[0] logb = u_b[1] return [logx + logb, -1] elif index == 1: a = u_a[0] return [-x, a] else: raise ValueError("Index out of bounds") def compute_phi_from_parents(self, *u_parents, mask=True): r""" Compute the natural parameter vector given parent moments. """ return [-u_parents[1][0], 1*u_parents[0][0]] def compute_moments_and_cgf(self, phi, mask=True): r""" Compute the moments and :math:`g(\phi)`. .. 
math:: \overline{\mathbf{u}} (\boldsymbol{\phi}) &= \begin{bmatrix} - \frac{\phi_2} {\phi_1} \\ \psi(\phi_2) - \log(-\phi_1) \end{bmatrix} \\ g_{\boldsymbol{\phi}} (\boldsymbol{\phi}) &= TODO """ with np.errstate(invalid='raise', divide='raise'): log_b = np.log(-phi[0]) u0 = phi[1] / (-phi[0]) u1 = special.digamma(phi[1]) - log_b u = [u0, u1] g = phi[1] * log_b - special.gammaln(phi[1]) return (u, g) def compute_cgf_from_parents(self, *u_parents): r""" Compute :math:`\mathrm{E}_{q(p)}[g(p)]` """ a = u_parents[0][0] gammaln_a = u_parents[0][1] #special.gammaln(a) b = u_parents[1][0] log_b = u_parents[1][1] g = a * log_b - gammaln_a return g def compute_fixed_moments_and_f(self, x, mask=True): r""" Compute the moments and :math:`f(x)` for a fixed value. """ x = np.asanyarray(x) if np.any(x < 0): raise ValueError("Values must be positive") logx = np.log(x) u = [x, logx] f = -logx return (u, f) def random(self, *phi, plates=None): r""" Draw a random sample from the distribution. """ return random.gamma(phi[1], -1/phi[0], size=plates) def compute_gradient(self, g, u, phi): r""" Compute the moments and :math:`g(\phi)`. .. math:: \mathrm{d}\overline{\mathbf{u}} &= \begin{bmatrix} - \frac{\mathrm{d}\phi_2} {phi_1} + \frac{\phi_2}{\phi_1^2} \mathrm{d}\phi_1 \\ \psi^{(1)}(\phi_2) \mathrm{d}\phi_2 - \frac{1}{\phi_1} \mathrm{d}\phi_1 \end{bmatrix} Standard gradient given the gradient with respect to the moments, that is, given the Riemannian gradient :math:`\tilde{\nabla}`: .. math:: \nabla = \begin{bmatrix} \nabla_1 \frac{\phi_2}{\phi_1^2} - \nabla_2 \frac{1}{\phi_1} \\ \nabla_2 \psi^{(1)}(\phi_2) - \nabla_1 \frac {1} {\phi_1} \end{bmatrix} """ d0 = g[0] * phi[1] / phi[0]**2 - g[1] / phi[0] d1 = g[1] * special.polygamma(1, phi[1]) - g[0] / phi[0] return [d0, d1] class Gamma(ExponentialFamily): """ Node for gamma random variables. 
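    The `compute_moments_and_cgf` method above maps the natural parameters
    phi = [-b, a] to the moments u = [a/b, psi(a) - log b] and the CGF
    g = a*log(b) - gammaln(a). A standalone check of the moment formulas against
    numerical integration of SciPy's gamma density (the helper `gamma_moments`
    is illustrative, not BayesPy API):

    ```python
    import numpy as np
    from scipy import special, stats
    from scipy.integrate import quad

    def gamma_moments(phi):
        # Mirrors GammaDistribution.compute_moments_and_cgf:
        # u = [ phi1 / (-phi0), psi(phi1) - log(-phi0) ]
        log_b = np.log(-phi[0])
        u0 = phi[1] / (-phi[0])
        u1 = special.digamma(phi[1]) - log_b
        g = phi[1] * log_b - special.gammaln(phi[1])
        return ([u0, u1], g)

    a, b = 2.5, 4.0
    (u, g) = gamma_moments([-b, a])

    # Check E[x] and E[log x] by integrating against SciPy's pdf
    E_x = quad(lambda x: x * stats.gamma.pdf(x, a, scale=1/b), 0, np.inf)[0]
    E_logx = quad(lambda x: np.log(x) * stats.gamma.pdf(x, a, scale=1/b),
                  0, np.inf)[0]
    ```

    The integrals recover u[0] = a/b and u[1] = psi(a) - log(b) to numerical
    precision.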
Parameters ---------- a : scalar or array Shape parameter b : gamma-like node or scalar or array Rate parameter """ dims = ( (), () ) _distribution = GammaDistribution() _moments = GammaMoments() _parent_moments = (GammaPriorMoments(), GammaMoments()) def __init__(self, a, b, **kwargs): """ Create gamma random variable node """ super().__init__(a, b, **kwargs) def __str__(self): """ Print the distribution using standard parameterization. """ a = self.phi[1] b = -self.phi[0] return ("%s ~ Gamma(a, b)\n" " a =\n" "%s\n" " b =\n" "%s\n" % (self.name, a, b)) def as_wishart(self, ndim=0): if ndim != 0: raise NotImplementedError() return _GammaToScalarWishart(self, name=self.name + " as Wishart") def as_diagonal_wishart(self): return _GammaToDiagonalWishart(self, name=self.name + " as Wishart") def diag(self): return self.as_diagonal_wishart() class GammaShape(Stochastic): """ ML point estimator for the shape parameter of the gamma distribution """ dims = ( (), () ) _moments = GammaPriorMoments() _parent_moments = () def __init__(self, m0=0, m1=0, **kwargs): """ Create gamma random variable node """ super().__init__(dims=self.dims, initialize=False, **kwargs) self.u = self._moments.compute_fixed_moments(1) self._m0 = m0 self._m1 = m1 return def _update_distribution_and_lowerbound(self, m): r""" Find maximum likelihood estimate for the shape parameter Messages from children appear in the lower bound as .. math:: m_0 \cdot x + m_1 \cdot \log(\Gamma(x)) Take derivative, put it zero and solve: .. math:: m_0 + m_1 \cdot d\log(\Gamma(x)) &= 0 \\ m_0 + m_1 \cdot \psi(x) &= 0 \\ x &= \psi^{-1}(-\frac{m_0}{m_1}) where :math:`\psi^{-1}` is the inverse digamma function. 
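    The ML update described above relies on ``misc.invpsi``, the inverse
    digamma function. Since psi is monotone on the positive axis, the inverse
    can be sketched with a Minka-style initial guess refined by Newton's
    method. This is an assumption about a reasonable implementation, not
    BayesPy's exact ``misc.invpsi``:

    ```python
    import numpy as np
    from scipy import special

    def invpsi(y, iters=20):
        # Solve psi(x) = y for x > 0.
        # Initial guess: psi(x) ~ log(x) for large x, psi(x) ~ -1/x near 0
        # (note digamma(1) = -EulerGamma).
        x = np.where(y >= -2.22,
                     np.exp(y) + 0.5,
                     -1.0 / (y - special.digamma(1.0)))
        # Newton refinement: x <- x - (psi(x) - y) / psi'(x)
        for _ in range(iters):
            x = x - (special.digamma(x) - y) / special.polygamma(1, x)
        return x
    ```

    With this helper, the shape update is simply ``x = invpsi(-m0 / m1)``, as
    in the derivation above.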
""" # Maximum likelihood estimate m0 = self._m0 + m[0] m1 = self._m1 + m[1] x = misc.invpsi(-m0 / m1) # Compute moments self.u = self._moments.compute_fixed_moments(x) return def initialize_from_value(self, x): self.u = self._moments.compute_fixed_moments(x) return def lower_bound_contribution(self): return 0 class _GammaToDiagonalWishart(Deterministic): """ Transform a set of gamma scalars into a diagonal Wishart matrix. The last plate is used as the diagonal dimension. """ _parent_moments = [GammaMoments()] @ensureparents def __init__(self, alpha, **kwargs): # Check for constant if misc.is_numeric(alpha): alpha = Constant(Gamma)(alpha) if len(alpha.plates) == 0: raise Exception("Gamma variable needs to have plates in " "order to be used as a diagonal Wishart.") D = alpha.plates[-1] # FIXME: Put import here to avoid circular dependency import from .wishart import WishartMoments self._moments = WishartMoments((D,)) dims = ( (D,D), () ) # Construct the node super().__init__(alpha, dims=self._moments.dims, **kwargs) def _plates_to_parent(self, index): D = self.dims[0][0] return self.plates + (D,) def _plates_from_parent(self, index): return self.parents[index].plates[:-1] @staticmethod def _compute_weights_to_parent(index, weights): return weights[..., np.newaxis] def get_moments(self): u = self.parents[0].get_moments() # Form a diagonal matrix from the gamma variables return [np.identity(self.dims[0][0]) * u[0][...,np.newaxis], np.sum(u[1], axis=(-1))] @staticmethod def _compute_message_to_parent(index, m_children, *u_parents): # Take the diagonal m0 = np.einsum('...ii->...i', m_children[0]) m1 = np.reshape(m_children[1], np.shape(m_children[1]) + (1,)) return [m0, m1] class _GammaToScalarWishart(Deterministic): """ Transform gamma scalar moments to ndim=0 scalar Wishart moments """ _parent_moments = [GammaMoments()] @ensureparents def __init__(self, alpha, **kwargs): # Check for constant if misc.is_numeric(alpha): alpha = Constant(Gamma)(alpha) # FIXME: Put import 
here to avoid circular dependency import
        from .wishart import WishartMoments
        self._moments = WishartMoments(())
        dims = ( (), () )

        # Construct the node
        super().__init__(alpha,
                         dims=self._moments.dims,
                         **kwargs)

    def get_moments(self):
        return self.parents[0].get_moments()

    @staticmethod
    def _compute_message_to_parent(index, m_children, *u_parents):
        return m_children


# bayespy/inference/vmp/nodes/gate.py

################################################################################
# Copyright (C) 2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
"""

import numpy as np

from bayespy.utils import misc

from .node import Node, Moments
from .deterministic import Deterministic
from .categorical import CategoricalMoments
from .concatenate import Concatenate


class Gate(Deterministic):
    """
    Deterministic gating of one node.

    Gating is performed over one plate axis.

    Note: You should not use gating for several variables which are parents of
    the same node if the gates use the same gate assignments. In such a case,
    the results will be wrong. The reason is a general one: a stochastic node
    may not be a parent of another node via several paths unless at most one
    path has no other stochastic nodes between them.
    """

    def __init__(self, Z, X, gated_plate=-1, moments=None, **kwargs):
        """
        Constructor for the gating node.

        Parameters
        ----------

        Z : Categorical-like node
           A variable which chooses the index along the gated plate axis

        X : node
           The node whose plate axis is gated

        gated_plate : int (optional)
           The index of the plate axis to be gated (by default, -1, that is,
           the last axis).
""" if gated_plate >= 0: raise ValueError("Cluster plate must be negative integer") self.gated_plate = gated_plate if moments is not None: X = self._ensure_moments( X, moments.__class__, **moments.get_instance_conversion_kwargs() ) if not isinstance(X, Node): raise ValueError("X must be a node or moments should be provided") X_moments = X._moments self._moments = X_moments dims = X.dims if len(X.plates) < abs(gated_plate): raise ValueError("The gated node does not have a plate axis is " "gated") K = X.plates[gated_plate] Z = self._ensure_moments(Z, CategoricalMoments, categories=K) self._parent_moments = (Z._moments, X_moments) if Z.dims != ( (K,), ): raise ValueError("Inconsistent number of clusters") self.K = K super().__init__(Z, X, dims=dims, **kwargs) def _compute_moments(self, u_Z, u_X): """ """ u = [] for i in range(len(u_X)): # Make the moments of Z and X broadcastable and move the gated plate # to be the last axis in the moments, then sum-product over that # axis ndim = len(self.dims[i]) z = misc.add_trailing_axes(u_Z[0], ndim) z = misc.moveaxis(z, -ndim-1, -1) gated_axis = self.gated_plate - ndim if np.ndim(u_X[i]) < abs(gated_axis): x = misc.add_trailing_axes(u_X[i], 1) else: x = misc.moveaxis(u_X[i], gated_axis, -1) ui = misc.sum_product(z, x, axes_to_sum=-1) u.append(ui) return u def _compute_message_to_parent(self, index, m_child, u_Z, u_X): """ """ if index == 0: m0 = 0 # Compute Child * X, sum over variable axes and move the gated axis # to be the last. Need to do some shape changing in order to make # Child and X to broadcast properly. 
for i in range(len(m_child)): ndim = len(self.dims[i]) c = m_child[i][...,None] c = misc.moveaxis(c, -1, -ndim-1) gated_axis = self.gated_plate - ndim x = u_X[i] if np.ndim(x) < abs(gated_axis): x = np.expand_dims(x, -ndim-1) else: x = misc.moveaxis(x, gated_axis, -ndim-1) axes = tuple(range(-ndim, 0)) m0 = m0 + misc.sum_product(c, x, axes_to_sum=axes) # Make sure the variable axis does not use broadcasting m0 = m0 * np.ones(self.K) # Send the message m = [m0] return m elif index == 1: m = [] for i in range(len(m_child)): # Make the moments of Z and the message from children # broadcastable. The gated plate is handled as the last axis in # the arrays and moved to the correct position at the end. # Add variable axes to Z moments ndim = len(self.dims[i]) z = misc.add_trailing_axes(u_Z[0], ndim) z = misc.moveaxis(z, -ndim-1, -1) # Axis index of the gated plate gated_axis = self.gated_plate - ndim # Add the gate axis to the message from the children c = misc.add_trailing_axes(m_child[i], 1) # Compute the message to parent mi = z * c # Add extra axes if necessary if np.ndim(mi) < abs(gated_axis): mi = misc.add_leading_axes(mi, abs(gated_axis) - np.ndim(mi)) # Move the axis to the correct position mi = misc.moveaxis(mi, -1, gated_axis) m.append(mi) return m else: raise ValueError("Invalid parent index") def _compute_weights_to_parent(self, index, weights): """ """ if index == 0: return weights elif index == 1: if self.gated_plate >= 0: raise ValueError("Gated plate axis must be negative") return ( np.expand_dims(weights, axis=self.gated_plate) if np.ndim(weights) >= abs(self.gated_plate) else weights ) else: raise ValueError("Invalid parent index") def _compute_plates_to_parent(self, index, plates): """ """ if index == 0: return plates elif index == 1: plates = list(plates) # Add the cluster plate axis if self.gated_plate < 0: knd = len(plates) + self.gated_plate + 1 else: raise RuntimeError("Cluster plate axis must be negative") plates.insert(knd, self.K) return 
tuple(plates)
        else:
            raise ValueError("Invalid parent index")

    def _compute_plates_from_parent(self, index, plates):
        """
        """
        if index == 0:
            return plates
        elif index == 1:
            plates = list(plates)
            # Remove the cluster plate, if the parent has it
            if len(plates) >= abs(self.gated_plate):
                plates.pop(self.gated_plate)
            return tuple(plates)
        else:
            raise ValueError("Invalid parent index")


def Choose(z, *nodes):
    """Choose plate elements from nodes based on a categorical variable.

    For instance:

    .. testsetup::

       from bayespy.nodes import *

    .. code-block:: python

       >>> import bayespy as bp
       >>> z = [0, 0, 2, 1]
       >>> x0 = bp.nodes.GaussianARD(0, 1)
       >>> x1 = bp.nodes.GaussianARD(10, 1)
       >>> x2 = bp.nodes.GaussianARD(20, 1)
       >>> x = bp.nodes.Choose(z, x0, x1, x2)
       >>> print(x.get_moments()[0])
       [  0.   0.  20.  10.]

    This is basically just a thin wrapper over applying the Gate node to the
    concatenation of the nodes.
    """
    categories = len(nodes)
    z = Deterministic._ensure_moments(
        z,
        CategoricalMoments,
        categories=categories
    )
    nodes = [node[...,None] for node in nodes]
    combined = Concatenate(*nodes)
    return Gate(z, combined)


# bayespy/inference/vmp/nodes/gaussian.py

################################################################################
# Copyright (C) 2011-2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
Module for the Gaussian distribution and similar distributions.
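The Gate moments computed by `_compute_moments` above reduce to a sum-product over the gated plate axis: with one-hot gate weights, u = sum_k z_k * x_k. A plain-NumPy sketch of the same selection that the `Choose` doctest performs (array names are illustrative):

```python
import numpy as np

# Gate assignments for 4 samples over K=3 clusters, as one-hot weights
z = np.array([0, 0, 2, 1])
Z = np.zeros((4, 3))
Z[np.arange(4), z] = 1.0

# Means of the three gated nodes; the cluster axis is the last axis
X = np.array([0.0, 10.0, 20.0])

# Gated moment: sum-product over the gated axis, u_n = sum_k Z[n, k] * X[k]
u = np.einsum('nk,k->n', Z, X)
```

This reproduces the doctest result `[0, 0, 20, 10]`: each sample picks the mean of its assigned cluster.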
"""


import numpy as np

from scipy import special

import truncnorm

from bayespy.utils import (random,
                           misc,
                           linalg)
from bayespy.utils.linalg import dot, mvdot

from .expfamily import (ExponentialFamily,
                        ExponentialFamilyDistribution,
                        useconstructor)
from .wishart import (WishartMoments,
                      WishartPriorMoments)
from .gamma import (GammaMoments,
                    GammaDistribution,
                    GammaPriorMoments)
from .deterministic import Deterministic
from .node import (Moments,
                   ensureparents)


#
# MOMENTS
#


class GaussianMoments(Moments):
    r"""
    Class for the moments of Gaussian variables.
    """

    def __init__(self, shape):
        self.shape = shape
        self.ndim = len(shape)
        self.dims = (shape, 2*shape)
        super().__init__()

    def compute_fixed_moments(self, x):
        r"""
        Compute the moments for a fixed value
        """
        x = np.asanyarray(x)
        x = misc.atleast_nd(x, self.ndim)
        return [x, linalg.outer(x, x, ndim=self.ndim)]

    @classmethod
    def from_values(cls, x, ndim):
        r"""
        Return the shape of the moments for a fixed value.
        """
        if ndim == 0:
            return cls(())
        else:
            return cls(np.shape(x)[-ndim:])

    def get_instance_conversion_kwargs(self):
        return dict(ndim=self.ndim)

    def get_instance_converter(self, ndim):
        if ndim == self.ndim or ndim is None:
            return None
        return GaussianToGaussian(self, ndim)


class GaussianToGaussian():

    def __init__(self, moments_from, ndim_to):
        if not isinstance(moments_from, GaussianMoments):
            raise ValueError()
        if ndim_to < 0:
            raise ValueError("ndim_to must be non-negative")
        self.shape_from = moments_from.shape
        self.ndim_from = moments_from.ndim
        self.ndim_to = ndim_to

        if self.ndim_to > self.ndim_from:
            raise ValueError()

        if self.ndim_to == 0:
            self.moments = GaussianMoments(())
        else:
            self.moments = GaussianMoments(self.shape_from[-self.ndim_to:])

        return

    def compute_moments(self, u):
        if self.ndim_to == self.ndim_from:
            return u

        u0 = u[0]
        u1 = misc.get_diag(u[1], ndim=self.ndim_from, ndim_to=self.ndim_to)

        return [u0, u1]

    def compute_message_to_parent(self, m, u_parent):
        # Handle broadcasting in m_child
        m0 = m[0] *
np.ones(self.shape_from) m1 = ( misc.make_diag(m[1], ndim=self.ndim_from, ndim_from=self.ndim_to) * misc.identity(*self.shape_from) ) return [m0, m1] def compute_weights_to_parent(self, weights): diff = self.ndim_from - self.ndim_to if diff == 0: return weights return np.sum( weights * np.ones(self.shape_from[:diff]), #misc.atleast_nd(weights, diff), axis=tuple(range(-diff, 0)) ) def plates_multiplier_from_parent(self, plates_multiplier): diff = self.ndim_from - self.ndim_to return plates_multiplier + diff * (1,) def plates_from_parent(self, plates): diff = self.ndim_from - self.ndim_to if diff == 0: return plates return plates + self.shape_from[:diff] def plates_to_parent(self, plates): diff = self.ndim_from - self.ndim_to if diff == 0: return plates return plates[:-diff] class GaussianGammaMoments(Moments): r""" Class for the moments of Gaussian-gamma-ISO variables. """ def __init__(self, shape): r""" Create moments object for Gaussian-gamma isotropic variables ndim=0: scalar ndim=1: vector ndim=2: matrix ... """ self.shape = shape self.ndim = len(shape) self.dims = (shape, 2*shape, (), ()) super().__init__() def compute_fixed_moments(self, x_alpha): r""" Compute the moments for a fixed value `x` is a mean vector. `alpha` is a precision scale """ (x, alpha) = x_alpha x = np.asanyarray(x) alpha = np.asanyarray(alpha) u0 = x * misc.add_trailing_axes(alpha, self.ndim) u1 = (linalg.outer(x, x, ndim=self.ndim) * misc.add_trailing_axes(alpha, 2*self.ndim)) u2 = np.copy(alpha) u3 = np.log(alpha) u = [u0, u1, u2, u3] return u @classmethod def from_values(cls, x_alpha, ndim): r""" Return the shape of the moments for a fixed value. """ (x, alpha) = x_alpha if ndim == 0: shape = ( (), (), (), () ) else: shape = np.shape(x)[-ndim:] return cls(shape) def get_instance_conversion_kwargs(self): return dict(ndim=self.ndim) def get_instance_converter(self, ndim): # FIXME/TODO: IMPLEMENT THIS CORRECTLY! 
        if ndim != self.ndim:
            raise NotImplementedError(
                "Conversion to different ndim in GaussianMoments not yet "
                "implemented."
            )
        return None


class GaussianWishartMoments(Moments):
    r"""
    Class for the moments of Gaussian-Wishart variables.
    """

    def __init__(self, shape):
        self.shape = shape
        self.ndim = len(shape)
        self.dims = ( shape, (), 2*shape, () )
        super().__init__()

    def compute_fixed_moments(self, x, Lambda):
        r"""
        Compute the moments for a fixed value

        `x` is a vector.
        `Lambda` is a precision matrix
        """
        x = np.asanyarray(x)
        Lambda = np.asanyarray(Lambda)

        u0 = linalg.mvdot(Lambda, x, ndim=self.ndim)
        u1 = np.einsum(
            '...i,...ij,...j->...',
            misc.flatten_axes(x, self.ndim),
            misc.flatten_axes(Lambda, self.ndim, self.ndim),
            misc.flatten_axes(x, self.ndim)
        )
        u2 = np.copy(Lambda)
        u3 = linalg.logdet_cov(Lambda, ndim=self.ndim)

        return [u0, u1, u2, u3]

    @classmethod
    def from_values(cls, x, Lambda, ndim):
        r"""
        Return the shape of the moments for a fixed value.
        """
        if ndim == 0:
            return cls(())
        else:
            if np.ndim(x) < ndim:
                raise ValueError("Mean must be a vector")
            shape = np.shape(x)[-ndim:]
            if np.shape(Lambda)[-2*ndim:] != shape + shape:
                raise ValueError("Shapes inconsistent")
            return cls(shape)


#
# DISTRIBUTIONS
#


class GaussianDistribution(ExponentialFamilyDistribution):
    r"""
    Class for the VMP formulas of Gaussian variables.

    Currently, supports only vector variables.

    Notes
    -----

    Message passing equations:

    .. math::

        \mathbf{x} &\sim \mathcal{N}(\boldsymbol{\mu}, \mathbf{\Lambda}),

    .. math::

        \mathbf{x},\boldsymbol{\mu} \in \mathbb{R}^{D},
        \quad \mathbf{\Lambda} \in \mathbb{R}^{D \times D},
        \quad \mathbf{\Lambda} \text{ symmetric positive definite}

    ..
math:: \log\mathcal{N}( \mathbf{x} | \boldsymbol{\mu}, \mathbf{\Lambda} ) &= - \frac{1}{2} \mathbf{x}^{\mathrm{T}} \mathbf{\Lambda} \mathbf{x} + \mathbf{x}^{\mathrm{T}} \mathbf{\Lambda} \boldsymbol{\mu} - \frac{1}{2} \boldsymbol{\mu}^{\mathrm{T}} \mathbf{\Lambda} \boldsymbol{\mu} + \frac{1}{2} \log |\mathbf{\Lambda}| - \frac{D}{2} \log (2\pi) """ def __init__(self, shape): self.shape = shape self.ndim = len(shape) self.set_limits(None, None) super().__init__() def set_limits(self, minimum=None, maximum=None): self.minimum = minimum self.maximum = maximum self.has_limits = minimum is not None or maximum is not None return def compute_message_to_parent(self, parent, index, u, u_mu_Lambda): r""" Compute the message to a parent node. .. math:: \boldsymbol{\phi}_{\boldsymbol{\mu}} (\mathbf{x}, \mathbf{\Lambda}) &= \left[ \begin{matrix} \mathbf{\Lambda} \mathbf{x} \\ - \frac{1}{2} \mathbf{\Lambda} \end{matrix} \right] \\ \boldsymbol{\phi}_{\mathbf{\Lambda}} (\mathbf{x}, \boldsymbol{\mu}) &= \left[ \begin{matrix} - \frac{1}{2} \mathbf{xx}^{\mathrm{T}} + \frac{1}{2} \mathbf{x}\boldsymbol{\mu}^{\mathrm{T}} + \frac{1}{2} \boldsymbol{\mu}\mathbf{x}^{\mathrm{T}} - \frac{1}{2} \boldsymbol{\mu\mu}^{\mathrm{T}} \\ \frac{1}{2} \end{matrix} \right] """ if index == 0: x = u[0] xx = u[1] m0 = x m1 = -0.5 m2 = -0.5*xx m3 = 0.5 return [m0, m1, m2, m3] else: raise ValueError("Index out of bounds") def compute_phi_from_parents(self, u_mu_Lambda, mask=True): r""" Compute the natural parameter vector given parent moments. .. math:: \boldsymbol{\phi} (\boldsymbol{\mu}, \mathbf{\Lambda}) &= \left[ \begin{matrix} \mathbf{\Lambda} \boldsymbol{\mu} \\ - \frac{1}{2} \mathbf{\Lambda} \end{matrix} \right] """ Lambda_mu = u_mu_Lambda[0] Lambda = u_mu_Lambda[2] return [Lambda_mu, -0.5 * Lambda] def compute_moments_and_cgf(self, phi, mask=True): r""" Compute the moments and :math:`g(\phi)`. .. 
math:: \overline{\mathbf{u}} (\boldsymbol{\phi}) &= \left[ \begin{matrix} - \frac{1}{2} \boldsymbol{\phi}^{-1}_2 \boldsymbol{\phi}_1 \\ \frac{1}{4} \boldsymbol{\phi}^{-1}_2 \boldsymbol{\phi}_1 \boldsymbol{\phi}^{\mathrm{T}}_1 \boldsymbol{\phi}^{-1}_2 - \frac{1}{2} \boldsymbol{\phi}^{-1}_2 \end{matrix} \right] \\ g_{\boldsymbol{\phi}} (\boldsymbol{\phi}) &= \frac{1}{4} \boldsymbol{\phi}^{\mathrm{T}}_1 \boldsymbol{\phi}^{-1}_2 \boldsymbol{\phi}_1 + \frac{1}{2} \log | -2 \boldsymbol{\phi}_2 | """ # TODO: Compute -2*phi[1] and simplify the formulas L = linalg.chol(-2*phi[1], ndim=self.ndim) k = np.shape(phi[0])[-1] Cov = linalg.chol_inv(L, ndim=self.ndim) mu = linalg.chol_solve(L, phi[0], ndim=self.ndim) # G g = (-0.5 * linalg.inner(mu, phi[0], ndim=self.ndim) + 0.5 * linalg.chol_logdet(L, ndim=self.ndim)) if self.has_limits: if self.ndim != 1: raise NotImplementedError("Limits for ndim!=1 not yet supported") (p, u0, u1)= truncnorm.moments( mu, Cov, self.minimum, self.maximum, 2, ) logp = np.log(p) else: u0 = mu u1 = Cov + linalg.outer(u0, u0, ndim=self.ndim) logp = 0 u = [u0, u1] return (u, g - logp) def compute_cgf_from_parents(self, u_mu_Lambda): r""" Compute :math:`\mathrm{E}_{q(p)}[g(p)]` .. math:: g (\boldsymbol{\mu}, \mathbf{\Lambda}) &= - \frac{1}{2} \operatorname{tr}(\boldsymbol{\mu\mu}^{\mathrm{T}} \mathbf{\Lambda} ) + \frac{1}{2} \log |\mathbf{\Lambda}| """ mu_Lambda_mu = u_mu_Lambda[1] logdet_Lambda = u_mu_Lambda[3] g = -0.5*mu_Lambda_mu + 0.5*logdet_Lambda return g def compute_fixed_moments_and_f(self, x, mask=True): r""" Compute the moments and :math:`f(x)` for a fixed value. .. math:: \mathbf{u} (\mathbf{x}) &= \left[ \begin{matrix} \mathbf{x} \\ \mathbf{xx}^{\mathrm{T}} \end{matrix} \right] \\ f(\mathbf{x}) &= - \frac{D}{2} \log(2\pi) """ k = np.shape(x)[-1] u = [x, linalg.outer(x, x, ndim=self.ndim)] f = -k/2*np.log(2*np.pi) return (u, f) def compute_gradient(self, g, u, phi): r""" Compute the standard gradient with respect to the natural parameters. 
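        Before the gradient formulas, it helps to see the mapping that
        `compute_moments_and_cgf` above performs: it inverts the natural
        parameters phi_1 = Lambda mu, phi_2 = -Lambda/2 back to the mean and
        covariance. A NumPy sketch using a plain matrix inverse instead of the
        Cholesky routines in ``linalg`` (the helper `gaussian_moments` is
        illustrative, not BayesPy API):

        ```python
        import numpy as np

        def gaussian_moments(phi0, phi1):
            # phi0 = Lambda @ mu, phi1 = -0.5 * Lambda, so:
            Lambda = -2.0 * phi1
            Cov = np.linalg.inv(Lambda)
            mu = Cov @ phi0
            u0 = mu
            u1 = Cov + np.outer(mu, mu)
            # g(phi) = -0.5 * mu' phi0 + 0.5 * log|Lambda|
            g = -0.5 * mu @ phi0 + 0.5 * np.linalg.slogdet(Lambda)[1]
            return ([u0, u1], g)

        mu = np.array([1.0, -2.0])
        Lambda = np.array([[2.0, 0.5],
                           [0.5, 1.0]])
        (u, g) = gaussian_moments(Lambda @ mu, -0.5 * Lambda)
        ```

        The moments come back as u[0] = mu and u[1] = Cov + mu mu^T, matching
        the formulas in the docstring above.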
        Gradient of the moments:

        .. math::

           \mathrm{d}\overline{\mathbf{u}} &=
           \begin{bmatrix}
             \frac{1}{2} \phi_2^{-1} \mathrm{d}\phi_2 \phi_2^{-1} \phi_1
             - \frac{1}{2} \phi_2^{-1} \mathrm{d}\phi_1
             \\
             - \frac{1}{4} \phi_2^{-1} \mathrm{d}\phi_2 \phi_2^{-1} \phi_1
               \phi_1^{\mathrm{T}} \phi_2^{-1}
             - \frac{1}{4} \phi_2^{-1} \phi_1 \phi_1^{\mathrm{T}} \phi_2^{-1}
               \mathrm{d}\phi_2 \phi_2^{-1}
             + \frac{1}{2} \phi_2^{-1} \mathrm{d}\phi_2 \phi_2^{-1}
             + \frac{1}{4} \phi_2^{-1} \mathrm{d}\phi_1 \phi_1^{\mathrm{T}}
               \phi_2^{-1}
             + \frac{1}{4} \phi_2^{-1} \phi_1 \mathrm{d}\phi_1^{\mathrm{T}}
               \phi_2^{-1}
           \end{bmatrix}
           \\
           &=
           \begin{bmatrix}
             2 (\overline{u}_2 - \overline{u}_1 \overline{u}_1^{\mathrm{T}})
               \mathrm{d}\phi_2 \overline{u}_1
             + (\overline{u}_2 - \overline{u}_1 \overline{u}_1^{\mathrm{T}})
               \mathrm{d}\phi_1
             \\
             \overline{u}_2 \mathrm{d}\phi_2 \overline{u}_2
             - 2 \overline{u}_1 \overline{u}_1^{\mathrm{T}} \mathrm{d}\phi_2
               \overline{u}_1 \overline{u}_1^{\mathrm{T}}
             + 2 (\overline{u}_2 - \overline{u}_1 \overline{u}_1^{\mathrm{T}})
               \mathrm{d}\phi_1 \overline{u}_1^{\mathrm{T}}
           \end{bmatrix}

        Standard gradient given the gradient with respect to the moments, that
        is, given the Riemannian gradient :math:`\tilde{\nabla}`:

        .. math::

           \nabla =
           \begin{bmatrix}
             (\overline{u}_2 - \overline{u}_1 \overline{u}_1^{\mathrm{T}})
               \tilde{\nabla}_1
             + 2 (\overline{u}_2 - \overline{u}_1 \overline{u}_1^{\mathrm{T}})
               \tilde{\nabla}_2 \overline{u}_1
             \\
             (\overline{u}_2 - \overline{u}_1 \overline{u}_1^{\mathrm{T}})
               \tilde{\nabla}_1 \overline{u}_1^{\mathrm{T}}
             + \overline{u}_1 \tilde{\nabla}_1^{\mathrm{T}}
               (\overline{u}_2 - \overline{u}_1 \overline{u}_1^{\mathrm{T}})
             + 2 \overline{u}_2 \tilde{\nabla}_2 \overline{u}_2
             - 2 \overline{u}_1 \overline{u}_1^{\mathrm{T}} \tilde{\nabla}_2
               \overline{u}_1 \overline{u}_1^{\mathrm{T}}
           \end{bmatrix}
        """
        x = u[0]
        xx = u[1]

        # Some helpful variables
        x_x = linalg.outer(x, x, ndim=self.ndim)
        Cov = xx - x_x
        cov_g0 = linalg.mvdot(Cov, g[0], ndim=self.ndim)
        cov_g0_x = linalg.outer(cov_g0, x, ndim=self.ndim)
        g1_x = linalg.mvdot(g[1], x, ndim=self.ndim)

        # Compute gradient terms
        d0 = cov_g0 + 2 * linalg.mvdot(Cov, g1_x, ndim=self.ndim)
        d1 = (cov_g0_x
              + linalg.transpose(cov_g0_x, ndim=self.ndim)
              + 2 * linalg.mmdot(xx,
                                 linalg.mmdot(g[1], xx, ndim=self.ndim),
                                 ndim=self.ndim)
              - 2 * x_x * misc.add_trailing_axes(
                  linalg.inner(g1_x, x, ndim=self.ndim),
                  2*self.ndim))

        return [d0, d1]

    def random(self, *phi, plates=None):
        r"""
        Draw a random sample from the distribution.
        """
        # TODO/FIXME: You shouldn't draw random values for
        # observed/fixed elements!

        # Note that phi[1] is -0.5*inv(Cov)
        U = linalg.chol(-2*phi[1], ndim=self.ndim)
        mu = linalg.chol_solve(U, phi[0], ndim=self.ndim)
        shape = plates + self.shape
        z = np.random.randn(*shape)
        # Denote Lambda = -2*phi[1]
        # Then, Cov = inv(Lambda) = inv(U'*U) = inv(U) * inv(U')
        # Thus, compute mu + U\z
        z = linalg.solve_triangular(U, z,
                                    trans='N',
                                    lower=False,
                                    ndim=self.ndim)
        return mu + z


class GaussianARDDistribution(ExponentialFamilyDistribution):
    r"""
    ...

    Log probability density function:

    .. math::

       \log p(x|\mu, \alpha) =
       - \frac{1}{2} x^T \mathrm{diag}(\alpha) x
       + x^T \mathrm{diag}(\alpha) \mu
       - \frac{1}{2} \mu^T \mathrm{diag}(\alpha) \mu
       + \frac{1}{2} \sum_i \log \alpha_i
       - \frac{D}{2} \log(2\pi)

    Parent has moments:

    .. math::

       \begin{bmatrix}
         \alpha \circ \mu
         \\
         \alpha \circ \mu \circ \mu
         \\
         \alpha
         \\
         \log(\alpha)
       \end{bmatrix}
    """

    def __init__(self, shape):
        self.shape = shape
        self.ndim = len(shape)
        super().__init__()

    def compute_message_to_parent(self, parent, index, u, u_mu_alpha):
        r"""
        ...

        .. math::

           m =
           \begin{bmatrix}
             x
             \\
             [-\frac{1}{2}, \ldots, -\frac{1}{2}]
             \\
             -\frac{1}{2} \mathrm{diag}(xx^T)
             \\
             [\frac{1}{2}, \ldots, \frac{1}{2}]
           \end{bmatrix}
        """
        if index == 0:
            x = u[0]
            x2 = misc.get_diag(u[1], ndim=self.ndim)

            m0 = x
            m1 = -0.5 * np.ones(self.shape)
            m2 = -0.5 * x2
            m3 = 0.5 * np.ones(self.shape)
            return [m0, m1, m2, m3]
        else:
            raise ValueError("Invalid parent index")

    def compute_weights_to_parent(self, index, weights):
        r"""
        Maps the mask to the plates of a parent.
        """
        if index != 0:
            raise IndexError()
        return misc.add_trailing_axes(weights, self.ndim)

    def compute_phi_from_parents(self, u_mu_alpha, mask=True):
        alpha_mu = u_mu_alpha[0]
        alpha = u_mu_alpha[2]
        phi0 = alpha_mu
        phi1 = -0.5 * alpha
        if self.ndim > 0:
            # Ensure that phi is not using broadcasting for variable
            # dimension axes
            ones = np.ones(self.shape)
            phi0 = ones * phi0
            phi1 = ones * phi1

        # Make a diagonal matrix
        phi1 = misc.diag(phi1, ndim=self.ndim)
        return [phi0, phi1]

    def compute_moments_and_cgf(self, phi, mask=True):
        if self.ndim == 0:
            # Use scalar equations
            u0 = -phi[0] / (2*phi[1])
            u1 = u0**2 - 1 / (2*phi[1])
            u = [u0, u1]
            g = (-0.5 * u[0] * phi[0] + 0.5 * np.log(-2*phi[1]))

            # TODO/FIXME: You could use these equations if phi is a scalar
            # in practice although ndim>0 (because the shape can be, e.g.,
            # (1,1,1,1) for ndim=4).
        else:
            # Reshape to standard vector and matrix
            D = np.prod(self.shape)
            phi0 = np.reshape(phi[0], phi[0].shape[:-self.ndim] + (D,))
            phi1 = np.reshape(phi[1], phi[1].shape[:-2*self.ndim] + (D,D))

            # Compute the moments
            L = linalg.chol(-2*phi1)
            Cov = linalg.chol_inv(L)
            u0 = linalg.chol_solve(L, phi0)
            u1 = linalg.outer(u0, u0) + Cov

            # Compute CGF
            g = (- 0.5 * np.einsum('...i,...i', u0, phi0)
                 + 0.5 * linalg.chol_logdet(L))

            # Reshape to arrays
            u0 = np.reshape(u0, u0.shape[:-1] + self.shape)
            u1 = np.reshape(u1, u1.shape[:-2] + self.shape + self.shape)
            u = [u0, u1]

        return (u, g)

    def compute_cgf_from_parents(self, u_mu_alpha):
        r"""
        Compute the value of the cumulant generating function.
        """
        # Compute sum(mu^2 * alpha) correctly for broadcasted shapes
        alpha_mu2 = u_mu_alpha[1]
        logdet_alpha = u_mu_alpha[3]
        axes = tuple(range(-self.ndim, 0))

        # TODO/FIXME: You could use plate multiplier type of correction instead
        # of explicitly broadcasting with ones.
        if self.ndim > 0:
            alpha_mu2 = misc.sum_multiply(alpha_mu2, np.ones(self.shape),
                                          axis=axes)
        if self.ndim > 0:
            logdet_alpha = misc.sum_multiply(logdet_alpha, np.ones(self.shape),
                                             axis=axes)

        # Compute g
        g = -0.5*alpha_mu2 + 0.5*logdet_alpha

        return g

    def compute_fixed_moments_and_f(self, x, mask=True):
        r"""
        Compute u(x) and f(x) for given x.
        """
        if self.ndim > 0 and np.shape(x)[-self.ndim:] != self.shape:
            raise ValueError("Invalid shape")

        k = np.prod(self.shape)
        u = [x, linalg.outer(x, x, ndim=self.ndim)]
        f = -k/2*np.log(2*np.pi)
        return (u, f)

    def plates_to_parent(self, index, plates):
        r"""
        Resolves the plate mapping to a parent.

        Given the plates of the node's moments, this method returns the plates
        that the message to a parent has for the parent's distribution.
        """
        if index != 0:
            raise IndexError()
        return plates + self.shape

    def plates_from_parent(self, index, plates):
        r"""
        Resolve the plate mapping from a parent.

        Given the plates of a parent's moments, this method returns the plates
        that the moments has for this distribution.
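        For example, this sketch (illustrative values, not the class itself)
        shows how the variable shape is stripped from or appended to the
        plates for ``ndim > 0``:

```python
# Illustrative sketch of the plate mapping, assuming a variable shape of
# (3,), i.e. ndim = 1. These helpers mirror plates_from_parent and
# plates_to_parent above but are standalone functions.
shape = (3,)
ndim = len(shape)

def plates_from_parent(plates):
    # The parent's moments carry the variable axes as trailing plate axes
    return plates if ndim == 0 else plates[:-ndim]

def plates_to_parent(plates):
    # The message to the parent gets the variable axes appended as plates
    return plates + shape

print(plates_from_parent((10, 3)))  # (10,)
print(plates_to_parent((10,)))      # (10, 3)
```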
""" if index != 0: raise IndexError() if self.ndim == 0: return plates else: return plates[:-self.ndim] def random(self, *phi, plates=None): r""" Draw a random sample from the Gaussian distribution. """ # TODO/FIXME: You shouldn't draw random values for # observed/fixed elements! D = self.ndim if D == 0: dims = () else: dims = np.shape(phi[0])[-D:] if np.prod(dims) == 1.0: # Scalar Gaussian phi1 = phi[1] if D > 0: # Because the covariance matrix has shape (1,1,...,1,1), # that is 2*D number of ones, remove the extra half of the # shape phi1 = np.reshape(phi1, np.shape(phi1)[:-2*D] + D*(1,)) var = -0.5 / phi1 std = np.sqrt(var) mu = var * phi[0] shape = plates + dims z = np.random.randn(*shape) x = mu + std * z else: N = np.prod(dims) dims_cov = dims + dims # Reshape precision matrix plates_cov = np.shape(phi[1])[:-2*D] V = -2 * np.reshape(phi[1], plates_cov + (N,N)) # Compute Cholesky U = linalg.chol(V) # Reshape mean vector plates_phi0 = np.shape(phi[0])[:-D] phi0 = np.reshape(phi[0], plates_phi0 + (N,)) mu = linalg.chol_solve(U, phi0) # Compute mu + U\z shape = plates + (N,) z = np.random.randn(*shape) # Denote Lambda = -2*phi[1] # Then, Cov = inv(Lambda) = inv(U'*U) = inv(U) * inv(U') # Thus, compute mu + U\z x = mu + linalg.solve_triangular(U, z, trans='N', lower=False) x = np.reshape(x, plates + dims) return x def compute_gradient(self, g, u, phi): r""" Compute the standard gradient with respect to the natural parameters. Gradient of the moments: .. 
math:: \mathrm{d}\overline{\mathbf{u}} &= \begin{bmatrix} \frac{1}{2} \phi_2^{-1} \mathrm{d}\phi_2 \phi_2^{-1} \phi_1 - \frac{1}{2} \phi_2^{-1} \mathrm{d}\phi_1 \\ - \frac{1}{4} \phi_2^{-1} \mathrm{d}\phi_2 \phi_2^{-1} \phi_1 \phi_1^{\mathrm{T}} \phi_2^{-1} - \frac{1}{4} \phi_2^{-1} \phi_1 \phi_1^{\mathrm{T}} \phi_2^{-1} \mathrm{d}\phi_2 \phi_2^{-1} + \frac{1}{2} \phi_2^{-1} \mathrm{d}\phi_2 \phi_2^{-1} + \frac{1}{4} \phi_2^{-1} \mathrm{d}\phi_1 \phi_1^{\mathrm{T}} \phi_2^{-1} + \frac{1}{4} \phi_2^{-1} \phi_1 \mathrm{d}\phi_1^{\mathrm{T}} \phi_2^{-1} \end{bmatrix} \\ &= \begin{bmatrix} 2 (\overline{u}_2 - \overline{u}_1 \overline{u}_1^{\mathrm{T}}) \mathrm{d}\phi_2 \overline{u}_1 + (\overline{u}_2 - \overline{u}_1 \overline{u}_1^{\mathrm{T}}) \mathrm{d}\phi_1 \\ u_2 d\phi_2 u_2 - 2 u_1 u_1^T d\phi_2 u_1 u_1^T + 2 (u_2 - u_1 u_1^T) d\phi_1 u_1^T \end{bmatrix} Standard gradient given the gradient with respect to the moments, that is, given the Riemannian gradient :math:`\tilde{\nabla}`: .. math:: \nabla = \begin{bmatrix} (\overline{u}_2 - \overline{u}_1 \overline{u}_1^{\mathrm{T}}) \tilde{\nabla}_1 + 2 (u_2 - u_1 u_1^T) \tilde{\nabla}_2 u_1 \\ (u_2 - u_1 u_1^T) \tilde{\nabla}_1 u_1^T + u_1 \tilde{\nabla}_1^T (u_2 - u_1 u_1^T) + 2 u_2 \tilde{\nabla}_2 u_2 - 2 u_1 u_1^T \tilde{\nabla}_2 u_1 u_1^T \end{bmatrix} """ ndim = self.ndim x = u[0] xx = u[1] # Some helpful variables x_x = linalg.outer(x, x, ndim=ndim) Cov = xx - x_x cov_g0 = linalg.mvdot(Cov, g[0], ndim=ndim) cov_g0_x = linalg.outer(cov_g0, x, ndim=ndim) g1_x = linalg.mvdot(g[1], x, ndim=ndim) # Compute gradient terms d0 = cov_g0 + 2 * linalg.mvdot(Cov, g1_x, ndim=ndim) d1 = (cov_g0_x + linalg.transpose(cov_g0_x, ndim=ndim) + 2 * linalg.mmdot(xx, linalg.mmdot(g[1], xx, ndim=ndim), ndim=ndim) - 2 * x_x * misc.add_trailing_axes(linalg.inner(g1_x, x, ndim=ndim), 2*ndim)) return [d0, d1] class GaussianGammaDistribution(ExponentialFamilyDistribution): r""" Class for the VMP formulas of Gaussian-Gamma-ISO variables. 
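    The mapping between standard parameters and the natural parameterization
    used below can be checked numerically. This is a standalone sketch with
    assumed example values (``mu``, ``Lam``, ``a``, ``b`` are illustrative,
    not library objects), mirroring the scalar case of
    ``compute_phi_from_parents`` and ``compute_moments_and_cgf``:

```python
# Assumed example values for a scalar Gaussian-gamma variable
mu, Lam, a, b = 1.0, 2.0, 3.0, 4.0

# Natural parameters, as in compute_phi_from_parents:
# phi = [Lambda mu, -Lambda/2, -mu' Lambda mu / 2 - b, a]
phi = [Lam * mu, -0.5 * Lam, -0.5 * mu * Lam * mu - b, a]

# Recover the standard parameters, as in compute_moments_and_cgf
V = -2 * phi[1]                     # = Lambda
mu_hat = phi[0] / V                 # = mu
b_hat = -phi[2] - 0.5 * mu_hat * phi[0]   # = b

# Moments <tau> = a/b and <tau x> = mu a/b
u2 = phi[3] / b_hat
u0 = mu_hat * u2

assert abs(mu_hat - mu) < 1e-12
assert abs(b_hat - b) < 1e-12
assert abs(u2 - a / b) < 1e-12
```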
    Currently, supports only vector variables.

    Log pdf of the prior:

    .. math::

       \log p(\mathbf{x}, \tau | \boldsymbol{\mu}, \mathbf{\Lambda}, a, b)
       =& - \frac{1}{2} \tau \mathbf{x}^T \mathbf{\Lambda} \mathbf{x}
       + \frac{1}{2} \tau \mathbf{x}^T \mathbf{\Lambda} \boldsymbol{\mu}
       + \frac{1}{2} \tau \boldsymbol{\mu}^T \mathbf{\Lambda} \mathbf{x}
       - \frac{1}{2} \tau \boldsymbol{\mu}^T \mathbf{\Lambda} \boldsymbol{\mu}
       + \frac{1}{2} \log|\mathbf{\Lambda}|
       + \frac{D}{2} \log\tau
       - \frac{D}{2} \log(2\pi)
       \\
       & - b \tau + a \log\tau - \log\tau + a \log b - \log \Gamma(a)

    Log pdf of the posterior approximation:

    .. math::

       \log q(\mathbf{x}, \tau)
       =& \tau \mathbf{x}^T \boldsymbol{\phi}_1
       + \tau \mathbf{x}^T \mathbf{\Phi}_2 \mathbf{x}
       + \tau \phi_3
       + \log\tau \phi_4
       + g(\boldsymbol{\phi}_1, \mathbf{\Phi}_2, \phi_3, \phi_4)
       + f(x, \tau)
    """

    def __init__(self, shape):
        self.shape = shape
        self.ndim = len(shape)
        super().__init__()

    def compute_message_to_parent(self, parent, index, u, u_mu_Lambda, u_a,
                                  u_b):
        r"""
        Compute the message to a parent node.

        - Parent :math:`(\boldsymbol{\mu}, \mathbf{\Lambda})`

          Moments:

          .. math::

             \begin{bmatrix}
               \mathbf{\Lambda}\boldsymbol{\mu}
               \\
               \boldsymbol{\mu}^T\mathbf{\Lambda}\boldsymbol{\mu}
               \\
               \mathbf{\Lambda}
               \\
               \log|\mathbf{\Lambda}|
             \end{bmatrix}

          Message:

          .. math::

             \begin{bmatrix}
               \langle \tau \mathbf{x} \rangle
               \\
               - \frac{1}{2} \langle \tau \rangle
               \\
               - \frac{1}{2} \langle \tau \mathbf{xx}^T \rangle
               \\
               \frac{1}{2}
             \end{bmatrix}

        - Parent :math:`a`:

          Moments:

          .. math::

             \begin{bmatrix}
               a
               \\
               \log \Gamma(a)
             \end{bmatrix}

          Message:

          .. math::

             \begin{bmatrix}
               \langle \log\tau \rangle + \langle \log b \rangle
               \\
               -1
             \end{bmatrix}

        - Parent :math:`b`:

          Moments:

          .. math::

             \begin{bmatrix}
               b
               \\
               \log b
             \end{bmatrix}

          Message:

          .. math::

             \begin{bmatrix}
               - \langle \tau \rangle
               \\
               \langle a \rangle
             \end{bmatrix}
        """
        x_tau = u[0]
        xx_tau = u[1]
        tau = u[2]
        logtau = u[3]

        if index == 0:
            m0 = x_tau
            m1 = -0.5 * tau
            m2 = -0.5 * xx_tau
            m3 = 0.5
            return [m0, m1, m2, m3]
        elif index == 1:
            logb = u_b[1]
            m0 = logtau + logb
            m1 = -1
            return [m0, m1]
        elif index == 2:
            a = u_a[0]
            m0 = -tau
            m1 = a
            return [m0, m1]
        else:
            raise ValueError("Index out of bounds")

    def compute_phi_from_parents(self, u_mu_Lambda, u_a, u_b, mask=True):
        r"""
        Compute the natural parameter vector given parent moments.
        """
        Lambda_mu = u_mu_Lambda[0]
        mu_Lambda_mu = u_mu_Lambda[1]
        Lambda = u_mu_Lambda[2]
        a = u_a[0]
        b = u_b[0]
        phi = [Lambda_mu,
               -0.5*Lambda,
               -0.5*mu_Lambda_mu - b,
               a]
        return phi

    def compute_moments_and_cgf(self, phi, mask=True):
        r"""
        Compute the moments and :math:`g(\phi)`.
        """
        # Compute helpful variables
        V = -2*phi[1]
        L_V = linalg.chol(V, ndim=self.ndim)
        logdet_V = linalg.chol_logdet(L_V, ndim=self.ndim)
        mu = linalg.chol_solve(L_V, phi[0], ndim=self.ndim)
        Cov = linalg.chol_inv(L_V, ndim=self.ndim)
        a = phi[3]
        b = -phi[2] - 0.5 * linalg.inner(mu, phi[0], ndim=self.ndim)
        log_b = np.log(b)

        # Compute moments
        u2 = a / b
        u3 = -log_b + special.psi(a)
        u0 = mu * misc.add_trailing_axes(u2, self.ndim)
        u1 = Cov + (
            linalg.outer(mu, mu, ndim=self.ndim)
            * misc.add_trailing_axes(u2, 2 * self.ndim)
        )
        u = [u0, u1, u2, u3]

        # Compute g
        g = 0.5*logdet_V + a*log_b - special.gammaln(a)

        return (u, g)

    def compute_cgf_from_parents(self, u_mu_Lambda, u_a, u_b):
        r"""
        Compute :math:`\mathrm{E}_{q(p)}[g(p)]`
        """
        logdet_Lambda = u_mu_Lambda[3]
        a = u_a[0]
        gammaln_a = u_a[1]
        log_b = u_b[1]
        g = 0.5*logdet_Lambda + a*log_b - gammaln_a
        return g

    def compute_fixed_moments_and_f(self, x_alpha, mask=True):
        r"""
        Compute the moments and :math:`f(x)` for a fixed value.
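        The fixed moments are :math:`[\alpha x, \alpha x x^T, \alpha,
        \log\alpha]`. A standalone NumPy sketch with assumed example values
        (``x`` and ``alpha`` below are illustrative):

```python
import numpy as np

# Assumed example values for a fixed (x, alpha) pair with shape (2,)
x = np.array([1.0, 2.0])
alpha = 0.5

# Fixed moments: [alpha*x, alpha*x x^T, alpha, log(alpha)]
u = [alpha * x,
     alpha * np.outer(x, x),
     alpha,
     np.log(alpha)]
```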
""" (x, alpha) = x_alpha logalpha = np.log(alpha) u0 = x * misc.add_trailing_axes(alpha, self.ndim) u1 = linalg.outer(x, x, ndim=self.ndim) * misc.add_trailing_axes(alpha, 2*self.ndim) u2 = alpha u3 = logalpha u = [u0, u1, u2, u3] if self.ndim > 0: D = np.prod(np.shape(x)[-self.ndim:]) else: D = 1 f = (D/2 - 1) * logalpha - D/2 * np.log(2*np.pi) return (u, f) def random(self, *phi, plates=None): r""" Draw a random sample from the distribution. """ # TODO/FIXME: This is incorrect, I think. Gamma distribution parameters # aren't directly those, because phi has some parts from the Gaussian # distribution. alpha = GammaDistribution().random( phi[2], phi[3], plates=plates ) mu = GaussianARDDistribution(self.shape).random( misc.add_trailing_axes(alpha, self.ndim) * phi[0], misc.add_trailing_axes(alpha, 2*self.ndim) * phi[1], plates=plates ) return (mu, alpha) class GaussianWishartDistribution(ExponentialFamilyDistribution): r""" Class for the VMP formulas of Gaussian-Wishart variables. Currently, supports only vector variables. .. math:: \log p(\mathbf{x}, \mathbf{\Lambda} | \boldsymbol{\mu}, \alpha, n, \mathbf{V}) =& - \frac{1}{2} \alpha \mathbf{x}^T \mathbf{\Lambda} \mathbf{x} + \frac{1}{2} \alpha \mathbf{x}^T \mathbf{\Lambda} \boldsymbol{\mu} + \frac{1}{2} \alpha \boldsymbol{\mu}^T \mathbf{\Lambda} \mathbf{x} - \frac{1}{2} \alpha \boldsymbol{\mu}^T \mathbf{\Lambda} \boldsymbol{\mu} + \frac{1}{2} \log|\mathbf{\Lambda}| + \frac{D}{2} \log\alpha - \frac{D}{2} \log(2\pi) \\ & - \frac{1}{2} \mathrm{tr}(\mathbf{V}\mathbf{\Lambda}) + \frac{n-d-1}{2} \log|\mathbf{\Lambda}| - \frac{nd}{2}\log 2 - \frac{n}{2} \log|\mathbf{V}| - \log\Gamma_d(\frac{n}{2}) Posterior approximation: .. 
math:: \log q(\mathbf{x}, \mathbf{\Lambda}) =& \mathbf{x}^T \mathbf{\Lambda} \boldsymbol{\phi}_1 + \phi_2 \mathbf{x}^T \mathbf{\Lambda} \mathbf{x} + \mathrm{tr}(\mathbf{\Lambda} \mathbf{\Phi}_3) + \phi_4 \log|\mathbf{\Lambda}| + g(\boldsymbol{\phi}_1, \phi_2, \mathbf{\Phi}_3, \phi_4) + f(\mathbf{x}, \mathbf{\Lambda}) """ def compute_message_to_parent(self, parent, index, u, u_mu_alpha, u_n, u_V): r""" Compute the message to a parent node. For parent :math:`q(\boldsymbol{\mu}, \alpha)`: .. math:: \alpha \boldsymbol{\mu}^T \mathbf{m}_1 \Rightarrow & \mathbf{m}_1 = \langle \mathbf{\Lambda x} \rangle \\ \alpha \boldsymbol{\mu}^T \mathbf{M}_2 \boldsymbol{\mu} \Rightarrow & \mathbf{M}_2 = - \frac{1}{2} \langle \mathbf{\Lambda} \rangle \\ \alpha m_3 \Rightarrow & m_3 = - \frac{1}{2} \langle \mathbf{x}^T \mathbf{\Lambda} \mathbf{x} \rangle \\ m_4 \log \alpha \Rightarrow & m_4 = \frac{d}{2} For parent :math:`q(\mathbf{V})`: .. math:: \mathbf{M}_1 &= \frac{\partial \langle \log p \rangle}{\partial \langle \mathbf{V} \rangle} = -\frac{1}{2} \langle \mathbf{\Lambda} \rangle \\ \mathbf{M}_2 &= \frac{\partial \langle \log p \rangle}{\partial \langle \log|\mathbf{V}| \rangle} = ... """ if index == 0: m0 m1 m2 m3 raise NotImplementedError() elif index == 1: raise NotImplementedError() elif index == 2: raise NotImplementedError() else: raise ValueError("Index out of bounds") def compute_phi_from_parents(self, u_mu_alpha, u_n, u_V, mask=True): r""" Compute the natural parameter vector given parent moments. """ alpha_mu = u_mu_alpha[0] alpha_mumu = u_mu_alpha[1] alpha = u_mu_alpha[2] V = u_V[0] n = u_n[0] phi0 = alpha_mu phi1 = -0.5 * alpha phi2 = -0.5 * (V + alpha_mumu) phi3 = 0.5 * n return [phi0, phi1, phi2, phi3] def compute_moments_and_cgf(self, phi, mask=True): r""" Compute the moments and :math:`g(\phi)`. """ # TODO/FIXME: This isn't probably correct. Phi[2:] has terms that are # related to the Gaussian also, not only Wishart. 
        #u_Lambda = WishartDistribution((D,)).compute_moments_and_cgf(phi[2:])
        raise NotImplementedError()

    def compute_cgf_from_parents(self, u_mu_alpha, u_n, u_V):
        r"""
        Compute :math:`\mathrm{E}_{q(p)}[g(p)]`
        """
        raise NotImplementedError()

    def compute_fixed_moments_and_f(self, x, Lambda, mask=True):
        r"""
        Compute the moments and :math:`f(x)` for a fixed value.
        """
        raise NotImplementedError()

    def random(self, *params, plates=None):
        r"""
        Draw a random sample from the distribution.
        """
        raise NotImplementedError()


#
# NODES
#


class _GaussianTemplate(ExponentialFamily):

    def translate(self, b, debug=False):
        """
        Transforms the current posterior by adding a bias to the mean

        Parameters
        ----------

        b : array
            Constant to add
        """
        ndim = len(self.dims[0])

        if ndim > 0 and np.shape(b)[-ndim:] != self.dims[0]:
            raise ValueError("Bias has incorrect shape")

        x = self.u[0]
        xb = linalg.outer(x, b, ndim=ndim)
        bx = linalg.transpose(xb, ndim=ndim)
        bb = linalg.outer(b, b, ndim=ndim)
        uh = [
            self.u[0] + b,
            self.u[1] + xb + bx + bb
        ]

        Lambda = -2 * self.phi[1]
        Lambda_b = linalg.mvdot(Lambda, b, ndim=ndim)
        dg = -0.5 * (
            linalg.inner(b, Lambda_b, ndim=ndim)
            + 2 * linalg.inner(x, Lambda_b, ndim=ndim)
        )

        phih = [
            self.phi[0] + Lambda_b,
            self.phi[1]
        ]

        self._check_shape(uh)
        self._check_shape(phih)

        self.u = uh
        self.phi = phih
        self.g = self.g + dg

        # TODO: This is all just debugging stuff and can be removed
        if debug:
            uh = [ui.copy() for ui in uh]
            gh = self.g.copy()
            self._update_moments_and_cgf()
            if any(not np.allclose(uih, ui, atol=1e-6)
                   for (uih, ui) in zip(uh, self.u)):
                raise RuntimeError("BUG")
            if not np.allclose(self.g, gh, atol=1e-6):
                raise RuntimeError("BUG")

        return


class Gaussian(_GaussianTemplate):
    r"""
    Node for Gaussian variables.

    The node represents a :math:`D`-dimensional vector from the Gaussian
    distribution:

    .. math::

       \mathbf{x} \sim \mathcal{N}(\boldsymbol{\mu}, \mathbf{\Lambda}),

    where :math:`\boldsymbol{\mu}` is the mean vector and
    :math:`\mathbf{\Lambda}` is the precision matrix (i.e., inverse of the
    covariance matrix).

    .. math::

       \mathbf{x},\boldsymbol{\mu} \in \mathbb{R}^{D},
       \quad \mathbf{\Lambda} \in \mathbb{R}^{D \times D},
       \quad \mathbf{\Lambda} \text{ symmetric positive definite}

    Parameters
    ----------

    mu : Gaussian-like node or GaussianGamma-like node or GaussianWishart-like node or array
        Mean vector

    Lambda : Wishart-like node or array
        Precision matrix

    See also
    --------

    Wishart, GaussianARD, GaussianWishart, GaussianGamma
    """

    def __init__(self, mu, Lambda, **kwargs):
        r"""
        Create Gaussian node
        """
        super().__init__(mu, Lambda, **kwargs)

    @classmethod
    def _constructor(cls, mu, Lambda, ndim=1, **kwargs):
        r"""
        Constructs distribution and moments objects.
        """
        mu_Lambda = WrapToGaussianWishart(mu, Lambda, ndim=ndim)

        shape = mu_Lambda._moments.shape

        moments = GaussianMoments(shape)
        parent_moments = (mu_Lambda._moments,)

        if mu_Lambda.dims != ( shape, (), shape+shape, () ):
            raise Exception("Parents have wrong dimensionality")

        distribution = GaussianDistribution(shape)

        parents = [mu_Lambda]

        return (parents,
                kwargs,
                moments.dims,
                cls._total_plates(kwargs.get('plates'),
                                  distribution.plates_from_parent(0, mu_Lambda.plates)),
                distribution,
                moments,
                parent_moments)

    def initialize_from_parameters(self, mu, Lambda):
        u = self._parent_moments[0].compute_fixed_moments(mu, Lambda)
        self._initialize_from_parent_moments(u)

    def observe_limits(self, minimum=-np.inf, maximum=np.inf):
        self._distribution.set_limits(minimum, maximum)
        self._update_mask()
        return

    def _set_mask(self, mask):
        self.mask = np.logical_or(
            mask,
            np.logical_or(
                self.observed,
                self._distribution.has_limits,
            ),
        )

    def __str__(self):
        ndim = len(self.dims[0])
        mu = self.u[0]
        Cov = self.u[1] - linalg.outer(mu, mu, ndim=ndim)
        return ("%s ~ Gaussian(mu, Cov)\n"
                "  mu = \n"
                "%s\n"
                "  Cov = \n"
                "%s\n"
                % (self.name, mu, Cov))

    def rotate(self, R, inv=None, logdet=None, Q=None):
        # TODO/FIXME: Combine and refactor all these rotation transformations
        # into _GaussianTemplate

        if self._moments.ndim != 1:
            raise NotImplementedError("Not implemented for ndim!=1 yet")

        if inv is not None:
            invR = inv
        else:
            invR = np.linalg.inv(R)

        if logdet is not None:
            logdetR = logdet
        else:
            logdetR = np.linalg.slogdet(R)[1]

        # It would be more efficient and simpler, if you just rotated the
        # moments and didn't touch phi. However, then you would need to call
        # update() before lower_bound_contribution. This is more error-safe.

        # Rotate plates, if plate rotation matrix is given. Assume that
        # there's only one plate-axis
        if Q is not None:
            # Rotate moments using Q
            self.u[0] = np.einsum('ik,kj->ij', Q, self.u[0])
            sumQ = np.sum(Q, axis=0)
            # Rotate natural parameters using Q
            self.phi[1] = np.einsum('d,dij->dij', sumQ**(-2), self.phi[1])
            self.phi[0] = np.einsum('dij,dj->di', -2*self.phi[1], self.u[0])

        # Transform parameters using R
        self.phi[0] = mvdot(invR.T, self.phi[0])
        self.phi[1] = dot(invR.T, self.phi[1], invR)

        if Q is not None:
            self._update_moments_and_cgf()
        else:
            # Transform moments and g using R
            self.u[0] = mvdot(R, self.u[0])
            self.u[1] = dot(R, self.u[1], R.T)
            self.g -= logdetR

    def rotate_matrix(self, R1, R2, inv1=None, logdet1=None, inv2=None,
                      logdet2=None, Q=None):
        r"""
        The vector is reshaped into a matrix by stacking the row vectors.

        Computes R1*X*R2', which is identical to kron(R1,R2)*x (??)

        Note that this is slightly different from the standard Kronecker
        product definition because Numpy stacks row vectors instead of column
        vectors.
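        The identity can be verified numerically (standalone sketch with
        assumed random shapes; ``.ravel()`` is NumPy's row-major vec):

```python
import numpy as np

# With row-major (NumPy default) reshaping, stacking rows means
# vec(R1 @ X @ R2.T) == kron(R1, R2) @ vec(X).
rng = np.random.default_rng(0)
R1 = rng.standard_normal((3, 3))
R2 = rng.standard_normal((4, 4))
X = rng.standard_normal((3, 4))

lhs = (R1 @ X @ R2.T).ravel()      # row-major vec of the rotated matrix
rhs = np.kron(R1, R2) @ X.ravel()  # Kronecker product acting on vec(X)

assert np.allclose(lhs, rhs)
```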
        Parameters
        ----------

        R1 : ndarray
            A matrix from the left

        R2 : ndarray
            A matrix from the right
        """
        if self._moments.ndim != 1:
            raise NotImplementedError("Not implemented for ndim!=1 yet")

        if Q is not None:
            # Rotate moments using Q
            self.u[0] = np.einsum('ik,kj->ij', Q, self.u[0])
            sumQ = np.sum(Q, axis=0)
            # Rotate natural parameters using Q
            self.phi[1] = np.einsum('d,dij->dij', sumQ**(-2), self.phi[1])
            self.phi[0] = np.einsum('dij,dj->di', -2*self.phi[1], self.u[0])

        if inv1 is None:
            inv1 = np.linalg.inv(R1)
        if logdet1 is None:
            logdet1 = np.linalg.slogdet(R1)[1]
        if inv2 is None:
            inv2 = np.linalg.inv(R2)
        if logdet2 is None:
            logdet2 = np.linalg.slogdet(R2)[1]

        D1 = np.shape(R1)[0]
        D2 = np.shape(R2)[0]

        # Reshape into matrices
        sh0 = np.shape(self.phi[0])[:-1] + (D1,D2)
        sh1 = np.shape(self.phi[1])[:-2] + (D1,D2,D1,D2)
        phi0 = np.reshape(self.phi[0], sh0)
        phi1 = np.reshape(self.phi[1], sh1)

        # Apply rotations to phi
        #phi0 = dot(inv1, phi0, inv2.T)
        phi0 = dot(inv1.T, phi0, inv2)
        phi1 = np.einsum('...ia,...abcd->...ibcd', inv1.T, phi1)
        phi1 = np.einsum('...ic,...abcd->...abid', inv1.T, phi1)
        phi1 = np.einsum('...ib,...abcd->...aicd', inv2.T, phi1)
        phi1 = np.einsum('...id,...abcd->...abci', inv2.T, phi1)

        # Reshape back into vectors
        self.phi[0] = np.reshape(phi0, self.phi[0].shape)
        self.phi[1] = np.reshape(phi1, self.phi[1].shape)

        # It'd be better to rotate the moments too..
        self._update_moments_and_cgf()


class GaussianARD(_GaussianTemplate):
    r"""
    Node for Gaussian variables with ARD prior.

    The node represents a :math:`D`-dimensional vector from the Gaussian
    distribution:

    .. math::

       \mathbf{x} \sim \mathcal{N}(\boldsymbol{\mu},
                                   \mathrm{diag}(\boldsymbol{\alpha})),

    where :math:`\boldsymbol{\mu}` is the mean vector and
    :math:`\mathrm{diag}(\boldsymbol{\alpha})` is the diagonal precision
    matrix (i.e., inverse of the covariance matrix).

    .. math::

       \mathbf{x},\boldsymbol{\mu} \in \mathbb{R}^{D},
       \quad \alpha_d > 0 \text{ for } d=0,\ldots,D-1

    *Note:* The form of the posterior approximation is a Gaussian
    distribution with full covariance matrix instead of a diagonal matrix.

    Parameters
    ----------

    mu : Gaussian-like node or GaussianGamma-like node or array
        Mean vector

    alpha : gamma-like node or array
        Diagonal elements of the precision matrix

    See also
    --------

    Gamma, Gaussian, GaussianGamma, GaussianWishart
    """

    def __init__(self, mu, alpha, ndim=None, shape=None, **kwargs):
        r"""
        Create GaussianARD node.
        """
        super().__init__(mu, alpha, ndim=ndim, shape=shape, **kwargs)

    @classmethod
    def _constructor(cls, mu, alpha, ndim=None, shape=None, **kwargs):
        r"""
        Constructs distribution and moments objects.

        If __init__ uses useconstructor decorator, this method is called to
        construct distribution and moments objects.

        The method is given the same inputs as __init__. For some nodes, some
        of these can't be "static" class attributes, then the node class must
        overwrite this method to construct the objects manually.

        The point of distribution class is to move general distribution but
        not-node specific code. The point of moments class is to define the
        messaging protocols.
        """
        mu_alpha = WrapToGaussianGamma(mu, alpha, ndim=0)

        if ndim is None:
            if shape is not None:
                ndim = len(shape)
            else:
                shape = ()
                ndim = 0
        else:
            if shape is not None:
                if ndim != len(shape):
                    raise ValueError("Given shape and ndim inconsistent")
            else:
                if ndim == 0:
                    shape = ()
                else:
                    if ndim > len(mu_alpha.plates):
                        raise ValueError(
                            "Cannot determine shape for ndim={0} because "
                            "parent full shape has ndim={1}."
                            .format(ndim, len(mu_alpha.plates))
                        )
                    shape = mu_alpha.plates[-ndim:]

        moments = GaussianMoments(shape)
        parent_moments = [GaussianGammaMoments(())]
        distribution = GaussianARDDistribution(shape)

        plates = cls._total_plates(kwargs.get('plates'),
                                   distribution.plates_from_parent(0, mu_alpha.plates))

        parents = [mu_alpha]

        return (parents,
                kwargs,
                moments.dims,
                plates,
                distribution,
                moments,
                parent_moments)

    def initialize_from_parameters(self, mu, alpha):
        # Explicit broadcasting so the shapes match
        mu = mu * np.ones(np.shape(alpha))
        alpha = alpha * np.ones(np.shape(mu))
        # Compute parent moments
        u = self._parent_moments[0].compute_fixed_moments([mu, alpha])
        # Initialize distribution
        self._initialize_from_parent_moments(u)

    def initialize_from_mean_and_covariance(self, mu, Cov):
        ndim = len(self._distribution.shape)
        u = [mu, Cov + linalg.outer(mu, mu, ndim=ndim)]
        mask = np.logical_not(self.observed)
        # TODO: You could compute the CGF but it requires Cholesky of
        # Cov. Do it later.
        self._set_moments_and_cgf(u, np.nan, mask=mask)
        return

    def __str__(self):
        mu = self.u[0]
        Cov = self.u[1] - linalg.outer(mu, mu)
        return ("%s ~ Gaussian(mu, Cov)\n"
                "  mu = \n"
                "%s\n"
                "  Cov = \n"
                "%s\n"
                % (self.name, mu, Cov))

    def rotate(self, R, inv=None, logdet=None, axis=-1, Q=None, subset=None,
               debug=False):

        if Q is not None:
            raise NotImplementedError()
        if subset is not None:
            raise NotImplementedError()

        # TODO/FIXME: Combine and refactor all these rotation transformations
        # into _GaussianTemplate

        ndim = len(self._distribution.shape)

        if inv is not None:
            invR = inv
        else:
            invR = np.linalg.inv(R)

        if logdet is not None:
            logdetR = logdet
        else:
            logdetR = np.linalg.slogdet(R)[1]

        self.phi[0] = rotate_mean(self.phi[0], invR.T,
                                  axis=axis,
                                  ndim=ndim)
        self.phi[1] = rotate_covariance(self.phi[1], invR.T,
                                        axis=axis,
                                        ndim=ndim)
        self.u[0] = rotate_mean(self.u[0], R,
                                axis=axis,
                                ndim=ndim)
        self.u[1] = rotate_covariance(self.u[1], R,
                                      axis=axis,
                                      ndim=ndim)
        s = list(self.dims[0])
        s.pop(axis)
        self.g -= logdetR * np.prod(s)
        # TODO: This is all just debugging stuff and can be removed
        if debug:
            uh = [ui.copy() for ui in self.u]
            gh = self.g.copy()
            self._update_moments_and_cgf()
            if any(not np.allclose(uih, ui, atol=1e-6)
                   for (uih, ui) in zip(uh, self.u)):
                raise RuntimeError("BUG")
            if not np.allclose(self.g, gh, atol=1e-6):
                raise RuntimeError("BUG")

        return

    def rotate_plates(self, Q, plate_axis=-1):
        r"""
        Approximate rotation of a plate axis.

        Mean is rotated exactly but covariance/precision matrix is rotated
        approximately.
        """
        ndim = len(self._distribution.shape)

        # Rotate moments using Q
        if not isinstance(plate_axis, int):
            raise ValueError("Plate axis must be integer")
        if plate_axis >= 0:
            plate_axis -= len(self.plates)
        if plate_axis < -len(self.plates) or plate_axis >= 0:
            raise ValueError("Axis out of bounds")

        u0 = rotate_mean(self.u[0], Q,
                         ndim=ndim+(-plate_axis),
                         axis=0)
        sumQ = misc.add_trailing_axes(np.sum(Q, axis=0),
                                      2*ndim-plate_axis-1)
        phi1 = sumQ**(-2) * self.phi[1]
        phi0 = -2 * matrix_dot_vector(phi1, u0, ndim=ndim)

        self.phi[0] = phi0
        self.phi[1] = phi1

        self._update_moments_and_cgf()

        return


class GaussianGamma(ExponentialFamily):
    r"""
    Node for Gaussian-gamma (isotropic) random variables.

    The prior:

    .. math::

       p(x, \alpha | \mu, \Lambda, a, b)
       &= p(x | \alpha, \mu, \Lambda) \, p(\alpha | a, b)
       \\
       p(x | \alpha, \mu, \Lambda)
       &= \mathcal{N}(x | \mu, \alpha \Lambda)
       \\
       p(\alpha | a, b)
       &= \mathcal{G}(\alpha | a, b)

    The posterior approximation :math:`q(x, \alpha)` has the same
    Gaussian-gamma form.

    Currently, supports only vector variables.
    """

    @classmethod
    def _constructor(cls, mu, Lambda, a, b, ndim=1, **kwargs):
        r"""
        Constructs distribution and moments objects.

        This method is called if useconstructor decorator is used for
        __init__.
        `mu` is the mean/location vector
        `Lambda` is the precision matrix
        `a` is the shape of the gamma distribution
        `b` is the rate of the gamma distribution
        """
        # Convert parent nodes
        mu_Lambda = WrapToGaussianWishart(mu, Lambda, ndim=ndim)
        a = cls._ensure_moments(a, GammaPriorMoments)
        b = cls._ensure_moments(b, GammaMoments)

        shape = mu_Lambda.dims[0]

        distribution = GaussianGammaDistribution(shape)

        moments = GaussianGammaMoments(shape)
        parent_moments = (
            mu_Lambda._moments,
            a._moments,
            b._moments,
        )

        # Check shapes
        if mu_Lambda.dims != ( shape, (), 2*shape, () ):
            raise ValueError("mu and Lambda have wrong shape")
        if a.dims != ( (), () ):
            raise ValueError("a has wrong shape")
        if b.dims != ( (), () ):
            raise ValueError("b has wrong shape")

        # List of parent nodes
        parents = [mu_Lambda, a, b]

        return (parents,
                kwargs,
                moments.dims,
                cls._total_plates(kwargs.get('plates'),
                                  distribution.plates_from_parent(0, mu_Lambda.plates),
                                  distribution.plates_from_parent(1, a.plates),
                                  distribution.plates_from_parent(2, b.plates)),
                distribution,
                moments,
                parent_moments)

    def translate(self, b, debug=False):

        if self._moments.ndim != 1:
            raise NotImplementedError("Only ndim=1 supported at the moment")

        tau = self.u[2]
        x = self.u[0] / tau[...,None]

        xb = linalg.outer(x, b, ndim=1)
        bx = linalg.transpose(xb, ndim=1)
        bb = linalg.outer(b, b, ndim=1)
        uh = [
            self.u[0] + tau[...,None] * b,
            self.u[1] + tau[...,None,None] * (xb + bx + bb),
            self.u[2],
            self.u[3]
        ]

        Lambda = -2 * self.phi[1]
        dtau = -0.5 * (
            np.einsum('...ij,...i,...j->...', Lambda, b, b)
            + 2 * np.einsum('...ij,...i,...j->...', Lambda, b, x)
        )
        phih = [
            self.phi[0] + np.einsum('...ij,...j->...i', Lambda, b),
            self.phi[1],
            self.phi[2] + dtau,
            self.phi[3]
        ]

        self._check_shape(uh)
        self._check_shape(phih)

        self.phi = phih
        self.u = uh

        # TODO: This is all just debugging stuff and can be removed
        if debug:
            uh = [ui.copy() for ui in uh]
            gh = self.g.copy()
            self._update_moments_and_cgf()
            if any(not np.allclose(uih, ui, atol=1e-6)
                   for (uih, ui) in zip(uh, self.u)):
                raise RuntimeError("BUG")
            if not np.allclose(self.g, gh, atol=1e-6):
                raise RuntimeError("BUG")

        return

    def rotate(self, R, inv=None, logdet=None, debug=False):

        if self._moments.ndim != 1:
            raise NotImplementedError("Only ndim=1 supported at the moment")

        if inv is None:
            inv = np.linalg.inv(R)

        if logdet is None:
            logdet = np.linalg.slogdet(R)[1]

        uh = [
            rotate_mean(self.u[0], R),
            rotate_covariance(self.u[1], R),
            self.u[2],
            self.u[3]
        ]
        phih = [
            rotate_mean(self.phi[0], inv.T),
            rotate_covariance(self.phi[1], inv.T),
            self.phi[2],
            self.phi[3]
        ]

        self._check_shape(uh)
        self._check_shape(phih)

        self.phi = phih
        self.u = uh
        self.g = self.g - logdet

        # TODO: This is all just debugging stuff and can be removed
        if debug:
            uh = [ui.copy() for ui in uh]
            gh = self.g.copy()
            self._update_moments_and_cgf()
            if any(not np.allclose(uih, ui, atol=1e-6)
                   for (uih, ui) in zip(uh, self.u)):
                raise RuntimeError("BUG")
            if not np.allclose(self.g, gh, atol=1e-6):
                raise RuntimeError("BUG")

        return

    def plotmatrix(self):
        r"""
        Creates a matrix of marginal plots.

        On diagonal, are marginal plots of each variable. Off-diagonal plot
        (i,j) shows the joint marginal density of x_i and x_j.
""" import bayespy.plot as bpplt if self.ndim != 1: raise NotImplementedError("Only ndim=1 supported at the moment") if np.prod(self.plates) != 1: raise ValueError("Currently, does not support plates in the node.") if len(self.dims[0]) != 1: raise ValueError("Currently, supports only vector variables") # Dimensionality of the Gaussian D = self.dims[0][0] # Compute standard parameters tau = self.u[2] mu = self.u[0] mu = mu / misc.add_trailing_axes(tau, 1) Cov = self.u[1] - linalg.outer(self.u[0], mu, ndim=1) Cov = Cov / misc.add_trailing_axes(tau, 2) a = self.phi[3] b = -self.phi[2] - 0.5*linalg.inner(self.phi[0], mu, ndim=1) # Create subplots (fig, axes) = bpplt.pyplot.subplots(D+1, D+1) # Plot marginal Student t distributions for i in range(D): for j in range(i+1): if i == j: bpplt._pdf_t(*(random.gaussian_gamma_to_t(mu[i], Cov[i,i], a, b, ndim=0)), axes=axes[i,i]) else: S = Cov[np.ix_([i,j],[i,j])] (m, S, nu) = random.gaussian_gamma_to_t(mu[[i,j]], S, a, b) bpplt._contour_t(m, S, nu, axes=axes[i,j]) bpplt._contour_t(m, S, nu, axes=axes[j,i], transpose=True) # Plot Gaussian-gamma marginal distributions for k in range(D): bpplt._contour_gaussian_gamma(mu[k], Cov[k,k], a, b, axes=axes[D,k]) bpplt._contour_gaussian_gamma(mu[k], Cov[k,k], a, b, axes=axes[k,D], transpose=True) # Plot gamma marginal distribution bpplt._pdf_gamma(a, b, axes=axes[D,D]) return axes def get_gaussian_location(self): r""" Return the mean and variance of the distribution """ if self._moments.ndim != 1: raise NotImplementedError("Only ndim=1 supported at the moment") tau = self.u[2] tau_mu = self.u[0] return tau_mu / tau[...,None] def get_gaussian_mean_and_variance(self): r""" Return the mean and variance of the distribution """ if self.ndim != 1: raise NotImplementedError("Only ndim=1 supported at the moment") a = self.phi[3] nu = 2*a if np.any(nu <= 1): raise ValueError("Mean not defined for degrees of freedom <= 1") if np.any(nu <= 2): raise ValueError("Variance not defined if degrees of 
freedom <= 2")

        tau = self.u[2]
        tau_mu = self.u[0]
        mu = tau_mu / misc.add_trailing_axes(tau, 1)
        var = misc.get_diag(self.u[1], ndim=1) - tau_mu*mu
        var = var / misc.add_trailing_axes(tau, 1)
        var = nu / (nu-2) * var
        return (mu, var)

    def get_marginal_logpdf(self, gaussian=None, gamma=None):
        r"""
        Get the (marginal) log pdf of a subset of the variables

        Parameters
        ----------
        gaussian : list or None
            Indices of the Gaussian variables to keep, or None
        gamma : bool or None
            True to keep the gamma variable, otherwise False or None

        Returns
        -------
        function
            A function which computes the log-pdf
        """
        if self.ndim != 1:
            raise NotImplementedError("Only ndim=1 supported at the moment")
        if gaussian is None and not gamma:
            raise ValueError("Must give some variables")

        # Compute standard parameters
        tau = self.u[2]
        mu = self.u[0]
        mu = mu / misc.add_trailing_axes(tau, 1)
        Cov = np.linalg.inv(-2*self.phi[1])
        if not np.allclose(Cov, self.u[1] - linalg.outer(self.u[0], mu, ndim=1)):
            raise RuntimeError(
                "Covariance moment is inconsistent with the natural parameters"
            )
        #Cov = Cov / misc.add_trailing_axes(tau, 2)
        a = self.phi[3]
        b = -self.phi[2] - 0.5*linalg.inner(self.phi[0], mu, ndim=1)

        if not gamma:
            # Student t distributions
            inds = list(gaussian)
            mu = mu[inds]
            Cov = Cov[np.ix_(inds, inds)]
            (mu, Cov, nu) = random.gaussian_gamma_to_t(mu, Cov, a, b, ndim=1)
            L = linalg.chol(Cov)
            logdet_Cov = linalg.chol_logdet(L)
            D = len(inds)
            def logpdf(x):
                y = x - mu
                v = linalg.chol_solve(L, y)
                z2 = linalg.inner(y, v, ndim=1)
                return random.t_logpdf(z2, logdet_Cov, nu, D)
            return logpdf

        elif gaussian is None:
            # Gamma distribution
            def logpdf(x):
                logx = np.log(x)
                return random.gamma_logpdf(b*x, logx, a*logx, a*np.log(b),
                                           special.gammaln(a))
            return logpdf

        else:
            # Gaussian-gamma distribution
            inds = list(gaussian)
            mu = mu[inds]
            Cov = Cov[np.ix_(inds, inds)]
            D = len(inds)
            L = linalg.chol(Cov)
            logdet_Cov = linalg.chol_logdet(L)
            def logpdf(x):
                tau = x[...,-1]
                logtau = np.log(tau)
                x = x[...,:-1]
                y = x - mu
                v = linalg.chol_solve(L, y) * tau[...,None]
                z2 = linalg.inner(y, v, ndim=1)
                return
(random.gaussian_logpdf(z2,
                                               0,
                                               0,
                                               logdet_Cov + D*logtau,
                                               D) +
                        random.gamma_logpdf(b*tau,
                                            logtau,
                                            a*logtau,
                                            a*np.log(b),
                                            special.gammaln(a)))
            return logpdf


class GaussianWishart(ExponentialFamily):
    r"""
    Node for Gaussian-Wishart random variables.

    The prior:

    .. math::

        p(x, \Lambda | \mu, \alpha, V, n)
        = p(x | \Lambda, \mu, \alpha) \, p(\Lambda | V, n)

        p(x | \Lambda, \mu, \alpha)
        = \mathcal{N}(x \mid \mu, \alpha^{-1} \Lambda^{-1})

        p(\Lambda | V, n)
        = \mathcal{W}(\Lambda \mid n, V)

    The posterior approximation :math:`q(x, \Lambda)` has the same
    Gaussian-Wishart form.

    Currently, supports only vector variables.
    """

    _distribution = GaussianWishartDistribution()

    @classmethod
    def _constructor(cls, mu, alpha, n, V, **kwargs):
        r"""
        Constructs distribution and moments objects.

        This method is called if useconstructor decorator is used for
        __init__.

        `mu` is the mean/location vector
        `alpha` is the scale
        `n` is the degrees of freedom
        `V` is the scale matrix
        """
        # Convert parent nodes
        mu_alpha = WrapToGaussianGamma(mu, alpha, ndim=1)
        D = mu_alpha.dims[0][0]
        shape = mu_alpha._moments.shape

        moments = GaussianWishartMoments(shape)

        n = cls._ensure_moments(n, WishartPriorMoments, d=D)
        V = cls._ensure_moments(V, WishartMoments, ndim=1)

        parent_moments = (
            mu_alpha._moments,
            n._moments,
            V._moments
        )

        # Check shapes
        if mu_alpha.dims != ( (D,), (D,D), (), () ):
            raise ValueError("mu and alpha have wrong shape")
        if V.dims != ( (D,D), () ):
            raise ValueError("Precision matrix has wrong shape")
        if n.dims != ( (), () ):
            raise ValueError("Degrees of freedom has wrong shape")

        parents = [mu_alpha, n, V]

        return (parents,
                kwargs,
                moments.dims,
                cls._total_plates(kwargs.get('plates'),
                                  cls._distribution.plates_from_parent(0, mu_alpha.plates),
                                  cls._distribution.plates_from_parent(1, n.plates),
                                  cls._distribution.plates_from_parent(2, V.plates)),
                cls._distribution,
                moments,
                parent_moments)


#
# CONVERTERS
#


class GaussianToGaussianGamma(Deterministic):
    r"""
    Converter for Gaussian moments to Gaussian-gamma isotropic moments

    Combines the Gaussian moments with gamma moments
for a fixed value 1. """ def __init__(self, X, **kwargs): r""" """ if not isinstance(X._moments, GaussianMoments): raise ValueError("Wrong moments, should be Gaussian") shape = X._moments.shape self.ndim = X._moments.ndim self._moments = GaussianGammaMoments(shape) self._parent_moments = [GaussianMoments(shape)] shape = X.dims[0] dims = ( shape, 2*shape, (), () ) super().__init__(X, dims=dims, **kwargs) def _compute_moments(self, u_X): r""" """ x = u_X[0] xx = u_X[1] u = [x, xx, 1, 0] return u def _compute_message_to_parent(self, index, m_child, u_X): r""" """ if index == 0: m = m_child[:2] return m else: raise ValueError("Invalid parent index") def _compute_function(self, x): return (x, 1) GaussianMoments.add_converter(GaussianGammaMoments, GaussianToGaussianGamma) class GaussianGammaToGaussianWishart(Deterministic): r""" """ def __init__(self, X_alpha, **kwargs): raise NotImplementedError() GaussianGammaMoments.add_converter(GaussianWishartMoments, GaussianGammaToGaussianWishart) # # WRAPPERS # # These wrappers form a single node from two nodes for messaging purposes. 
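The `GaussianToGaussianGamma` converter above has a very simple moment mapping: the Gaussian moments `<x>` and `<xx^T>` are augmented with *fixed* gamma moments `<tau> = 1` and `<log tau> = 0`. A standalone sketch (the helper name is illustrative, not bayespy API):

```python
import numpy as np

def to_gaussian_gamma_moments(x, xx):
    # Append the fixed gamma moments <tau> = 1 and <log tau> = 0.
    return [x, xx, 1, 0]

x = np.array([1.0, 2.0])
u = to_gaussian_gamma_moments(x, np.outer(x, x))
```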
# class WrapToGaussianGamma(Deterministic): r""" """ def __init__(self, X, alpha, ndim=None, **kwargs): r""" """ # In case X is a numerical array, convert it to Gaussian first try: X = self._ensure_moments(X, GaussianMoments, ndim=ndim) except Moments.NoConverterError: pass try: ndim = X._moments.ndim except AttributeError as err: raise TypeError("ndim needs to be given explicitly") from err X = self._ensure_moments(X, GaussianGammaMoments, ndim=ndim) if len(X.dims[0]) != ndim: raise RuntimeError("Conversion failed ndim.") shape = X.dims[0] dims = ( shape, 2 * shape, (), () ) self.shape = shape self.ndim = len(shape) self._moments = GaussianGammaMoments(shape) self._parent_moments = [ GaussianGammaMoments(shape), GammaMoments() ] super().__init__(X, alpha, dims=dims, **kwargs) def _compute_moments(self, u_X, u_alpha): r""" """ (tau_x, tau_xx, tau, logtau) = u_X (alpha, logalpha) = u_alpha u0 = tau_x * misc.add_trailing_axes(alpha, self.ndim) u1 = tau_xx * misc.add_trailing_axes(alpha, 2 * self.ndim) u2 = tau * alpha u3 = logtau + logalpha return [u0, u1, u2, u3] def _compute_message_to_parent(self, index, m_child, u_X, u_alpha): r""" """ if index == 0: alpha = u_alpha[0] m0 = m_child[0] * misc.add_trailing_axes(alpha, self.ndim) m1 = m_child[1] * misc.add_trailing_axes(alpha, 2 * self.ndim) m2 = m_child[2] * alpha m3 = m_child[3] return [m0, m1, m2, m3] elif index == 1: (tau_x, tau_xx, tau, logtau) = u_X m0 = ( linalg.inner(m_child[0], tau_x, ndim=self.ndim) + linalg.inner(m_child[1], tau_xx, ndim=2*self.ndim) + m_child[2] * tau ) m1 = m_child[3] return [m0, m1] else: raise ValueError("Invalid parent index") class WrapToGaussianWishart(Deterministic): r""" Wraps Gaussian and Wishart nodes into a Gaussian-Wishart node. 
The following node combinations can be wrapped: * Gaussian and Wishart * Gaussian-gamma and Wishart * Gaussian-Wishart and gamma """ def __init__(self, X, Lambda, ndim=1, **kwargs): r""" """ # Just in case X is an array, convert it to a Gaussian node first. try: X = self._ensure_moments(X, GaussianMoments, ndim=ndim) except Moments.NoConverterError: pass try: # Try combo Gaussian-Gamma and Wishart X = self._ensure_moments(X, GaussianGammaMoments, ndim=ndim) except Moments.NoConverterError: # Have to use Gaussian-Wishart and Gamma X = self._ensure_moments(X, GaussianWishartMoments, ndim=ndim) Lambda = self._ensure_moments(Lambda, GammaMoments, ndim=ndim) shape = X.dims[0] if Lambda.dims != ((), ()): raise ValueError( "Mean and precision have inconsistent shapes: {0} and {1}" .format( X.dims, Lambda.dims ) ) self.wishart = False else: # Gaussian-Gamma and Wishart shape = X.dims[0] Lambda = self._ensure_moments(Lambda, WishartMoments, ndim=ndim) if Lambda.dims != (2 * shape, ()): raise ValueError( "Mean and precision have inconsistent shapes: {0} and {1}" .format( X.dims, Lambda.dims ) ) self.wishart = True self.ndim = len(shape) self._parent_moments = ( X._moments, Lambda._moments, ) self._moments = GaussianWishartMoments(shape) super().__init__(X, Lambda, dims=self._moments.dims, **kwargs) def _compute_moments(self, u_X_alpha, u_Lambda): r""" """ if self.wishart: alpha_x = u_X_alpha[0] alpha_xx = u_X_alpha[1] alpha = u_X_alpha[2] log_alpha = u_X_alpha[3] Lambda = u_Lambda[0] logdet_Lambda = u_Lambda[1] D = np.prod(self.dims[0]) u0 = linalg.mvdot(Lambda, alpha_x, ndim=self.ndim) u1 = linalg.inner(Lambda, alpha_xx, ndim=2*self.ndim) u2 = Lambda * misc.add_trailing_axes(alpha, 2*self.ndim) u3 = logdet_Lambda + D * log_alpha u = [u0, u1, u2, u3] return u else: raise NotImplementedError() def _compute_message_to_parent(self, index, m_child, u_X_alpha, u_Lambda): r""" ... Message from the child is :math:`[m_0, m_1, m_2, m_3]`: .. 
math:: \alpha m_0^T \Lambda x + m_1 \alpha x^T \Lambda x + \mathrm{tr}(\alpha m_2 \Lambda) + m_3 (\log | \alpha \Lambda |) In case of Gaussian-gamma and Wishart parents: Message to the first parent (x, alpha): .. math:: \tilde{m_0} &= \Lambda m_0 \\ \tilde{m_1} &= m_1 \Lambda \\ \tilde{m_2} &= \mathrm{tr}(m_2 \Lambda) \\ \tilde{m_3} &= m_3 \cdot D Message to the second parent (Lambda): .. math:: \tilde{m_0} &= \alpha (\frac{1}{2} m_0 x^T + \frac{1}{2} x m_0^T + m_1 xx^T + m_2) \\ \tilde{m_1} &= m_3 """ if index == 0: if self.wishart: # Message to Gaussian-gamma (isotropic) Lambda = u_Lambda[0] D = np.prod(self.dims[0]) m0 = linalg.mvdot(Lambda, m_child[0], ndim=self.ndim) m1 = Lambda * misc.add_trailing_axes(m_child[1], 2*self.ndim) m2 = linalg.inner(Lambda, m_child[2], ndim=2*self.ndim) m3 = D * m_child[3] m = [m0, m1, m2, m3] return m else: # Message to Gaussian-Wishart raise NotImplementedError() elif index == 1: if self.wishart: # Message to Wishart alpha_x = u_X_alpha[0] alpha_xx = u_X_alpha[1] alpha = u_X_alpha[2] m0 = (0.5*linalg.outer(alpha_x, m_child[0], ndim=self.ndim) + 0.5*linalg.outer(m_child[0], alpha_x, ndim=self.ndim) + alpha_xx * misc.add_trailing_axes(m_child[1], 2*self.ndim) + misc.add_trailing_axes(alpha, 2*self.ndim) * m_child[2]) m1 = m_child[3] m = [m0, m1] return m else: # Message to gamma (isotropic) raise NotImplementedError() else: raise ValueError("Invalid parent index") def reshape_gaussian_array(dims_from, dims_to, x0, x1): r""" Reshape the moments Gaussian array variable. The plates remain unaffected. 
""" num_dims_from = len(dims_from) num_dims_to = len(dims_to) # Reshape the first moment / mean num_plates_from = np.ndim(x0) - num_dims_from plates_from = np.shape(x0)[:num_plates_from] shape = ( plates_from + (1,)*(num_dims_to-num_dims_from) + dims_from ) x0 = np.ones(dims_to) * np.reshape(x0, shape) # Reshape the second moment / covariance / precision num_plates_from = np.ndim(x1) - 2*num_dims_from plates_from = np.shape(x1)[:num_plates_from] shape = ( plates_from + (1,)*(num_dims_to-num_dims_from) + dims_from + (1,)*(num_dims_to-num_dims_from) + dims_from ) x1 = np.ones(dims_to+dims_to) * np.reshape(x1, shape) return (x0, x1) def transpose_covariance(Cov, ndim=1): r""" Transpose the covariance array of Gaussian array variable. That is, swap the last ndim axes with the ndim axes before them. This makes transposing easy for array variables when the covariance is not a matrix but a multidimensional array. """ axes_in = [Ellipsis] + list(range(2*ndim,0,-1)) axes_out = [Ellipsis] + list(range(ndim,0,-1)) + list(range(2*ndim,ndim,-1)) return np.einsum(Cov, axes_in, axes_out) def left_rotate_covariance(Cov, R, axis=-1, ndim=1): r""" Rotate the covariance array of Gaussian array variable. ndim is the number of axes for the Gaussian variable. For vector variable, ndim=1 and covariance is a matrix. """ if not isinstance(axis, int): raise ValueError("Axis must be an integer") if axis < -ndim or axis >= ndim: raise ValueError("Axis out of range") # Force negative axis if axis >= 0: axis -= ndim # Rotation from left axes_R = [Ellipsis, ndim+abs(axis)+1, ndim+abs(axis)] axes_Cov = [Ellipsis] + list(range(ndim+abs(axis), 0, -1)) axes_out = [Ellipsis, ndim+abs(axis)+1] + list(range(ndim+abs(axis)-1, 0, -1)) Cov = np.einsum(R, axes_R, Cov, axes_Cov, axes_out) return Cov def right_rotate_covariance(Cov, R, axis=-1, ndim=1): r""" Rotate the covariance array of Gaussian array variable. ndim is the number of axes for the Gaussian variable. 
For vector variable, ndim=1 and covariance is a matrix. """ if not isinstance(axis, int): raise ValueError("Axis must be an integer") if axis < -ndim or axis >= ndim: raise ValueError("Axis out of range") # Force negative axis if axis >= 0: axis -= ndim # Rotation from right axes_R = [Ellipsis, abs(axis)+1, abs(axis)] axes_Cov = [Ellipsis] + list(range(abs(axis), 0, -1)) axes_out = [Ellipsis, abs(axis)+1] + list(range(abs(axis)-1, 0, -1)) Cov = np.einsum(R, axes_R, Cov, axes_Cov, axes_out) return Cov def rotate_covariance(Cov, R, axis=-1, ndim=1): r""" Rotate the covariance array of Gaussian array variable. ndim is the number of axes for the Gaussian variable. For vector variable, ndim=1 and covariance is a matrix. """ # Rotate from left and right Cov = left_rotate_covariance(Cov, R, ndim=ndim, axis=axis) Cov = right_rotate_covariance(Cov, R, ndim=ndim, axis=axis) return Cov def rotate_mean(mu, R, axis=-1, ndim=1): r""" Rotate the mean array of Gaussian array variable. ndim is the number of axes for the Gaussian variable. For vector variable, ndim=1 and mu is a vector. 
"""
    if not isinstance(axis, int):
        raise ValueError("Axis must be an integer")
    if axis < -ndim or axis >= ndim:
        raise ValueError("Axis out of range")

    # Force negative axis
    if axis >= 0:
        axis -= ndim

    # Rotation from right
    axes_R = [Ellipsis, abs(axis)+1, abs(axis)]
    axes_mu = [Ellipsis] + list(range(abs(axis), 0, -1))
    axes_out = [Ellipsis, abs(axis)+1] + list(range(abs(axis)-1, 0, -1))
    mu = np.einsum(R, axes_R, mu, axes_mu, axes_out)

    return mu


def array_to_vector(x, ndim=1):
    if ndim == 0:
        return x
    shape_x = np.shape(x)
    D = np.prod(shape_x[-ndim:])
    return np.reshape(x, shape_x[:-ndim] + (D,))


def array_to_matrix(A, ndim=1):
    if ndim == 0:
        return A
    shape_A = np.shape(A)
    D = np.prod(shape_A[-ndim:])
    return np.reshape(A, shape_A[:-2*ndim] + (D,D))


def vector_to_array(x, shape):
    return np.reshape(x, np.shape(x)[:-1] + tuple(shape))


def matrix_dot_vector(A, x, ndim=1):
    if ndim < 0:
        raise ValueError("ndim must be a non-negative integer")
    if ndim == 0:
        return A*x
    dims_x = np.shape(x)[-ndim:]
    A = array_to_matrix(A, ndim=ndim)
    x = array_to_vector(x, ndim=ndim)
    y = np.einsum('...ik,...k->...i', A, x)
    return vector_to_array(y, dims_x)


bayespy-0.6.2/bayespy/inference/vmp/nodes/gaussian_markov_chain.py

################################################################################
# Copyright (C) 2012-2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
This module contains VMP nodes for Gaussian Markov chains.
""" import numpy as np import scipy from bayespy.utils import misc from bayespy.utils import linalg from .node import Node, message_sum_multiply from .deterministic import Deterministic from .expfamily import ExponentialFamily from .expfamily import ExponentialFamilyDistribution from .expfamily import useconstructor from .gaussian import (Gaussian, GaussianMoments, GaussianWishartMoments, GaussianGammaMoments, WrapToGaussianGamma, WrapToGaussianWishart) from .wishart import Wishart, WishartMoments from .gamma import Gamma, GammaMoments from .categorical import CategoricalMoments from .node import Moments, ensureparents class GaussianMarkovChainMoments(Moments): def __init__(self, N, D): self.N = N self.D = D return super().__init__() def compute_fixed_moments(self, x): u0 = x u1 = x[...,:,np.newaxis] * x[...,np.newaxis,:] u2 = x[...,:-1,:,np.newaxis] * x[...,1:,np.newaxis,:] return [u0, u1, u2] def rotate(self, u, R, logdet=None): if logdet is None: logdet = np.linalg.slogdet(R)[1] N = np.shape(u[0])[-2] # Transform moments and g u0 = linalg.mvdot(R, u[0]) u1 = linalg.dot(R, u[1], R.T) u2 = linalg.dot(R, u[2], R.T) u = [u0, u1, u2] dg = -N * logdet return (u, dg) class TemplateGaussianMarkovChainDistribution(ExponentialFamilyDistribution): """ Sub-classes implement distribution specific computations. """ def __init__(self, N, D): self.N = N self.D = D self.moments = GaussianMarkovChainMoments(N, D) super().__init__() def compute_message_to_parent(self, parent, index, u_self, *u_parents): raise NotImplementedError() def compute_weights_to_parent(self, index, weights): raise NotImplementedError() def compute_phi_from_parents(self, *u_parents, mask=True): raise NotImplementedError() def compute_moments_and_cgf(self, phi, mask=True): """ Compute the moments and the cumulant-generating function. This basically performs the filtering and smoothing for the variable. 
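`compute_moments_and_cgf` obtains the posterior means, covariances and one-step cross-covariances by solving a symmetric block-tridiagonal (block-banded) linear system built from the natural parameters. As a reference for what such a solve computes, here is a dense NumPy equivalent; all block sizes and values below are made up for illustration, and the real solver avoids ever forming the full matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
N, D = 4, 2  # chain length and state dimension (illustrative)

# Diagonal blocks A_n and superdiagonal blocks B_n of a symmetric,
# positive-definite block-tridiagonal precision matrix.
A = np.array([3.0 * np.eye(D) for _ in range(N)])
B = np.array([0.1 * rng.standard_normal((D, D)) for _ in range(N - 1)])
y = rng.standard_normal((N, D))

# Assemble the dense (N*D, N*D) precision matrix.
P = np.zeros((N * D, N * D))
for n in range(N):
    P[n*D:(n+1)*D, n*D:(n+1)*D] = A[n]
for n in range(N - 1):
    P[n*D:(n+1)*D, (n+1)*D:(n+2)*D] = B[n]
    P[(n+1)*D:(n+2)*D, n*D:(n+1)*D] = B[n].T

# Dense reference: invert the precision and solve P mean = y.
Cov = np.linalg.inv(P)
mean = (Cov @ y.ravel()).reshape(N, D)

# Marginal covariances Cov(x_n, x_n) are the diagonal D x D blocks.
CovXnXn = np.array([Cov[n*D:(n+1)*D, n*D:(n+1)*D] for n in range(N)])
```

A block-banded solver returns the same diagonal (and first off-diagonal) blocks and the log-determinant in O(N D^3) time instead of the O(N^3 D^3) dense inversion above.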
Parameters ---------- phi Returns ------- u g """ # Solve the Kalman filtering and smoothing problem y = phi[0] A = -2*phi[1] # Don't multiply phi[2] by two because it is a sum of the super- and # sub-diagonal blocks so we would need to divide by two anyway. B = -phi[2] (CovXnXn, CovXpXn, Xn, ldet) = linalg.block_banded_solve(A, B, y) # Compute moments u0 = Xn u1 = CovXnXn + Xn[...,:,np.newaxis] * Xn[...,np.newaxis,:] u2 = CovXpXn + Xn[...,:-1,:,np.newaxis] * Xn[...,1:,np.newaxis,:] u = [u0, u1, u2] # Compute cumulant-generating function g = -0.5 * np.einsum('...ij,...ij', u[0], phi[0]) + 0.5*ldet return (u, g) def compute_cgf_from_parents(self, *u_parents): raise NotImplementedError() def compute_fixed_moments_and_f(self, x, mask=True): """ Compute u(x) and f(x) for given x. """ u0 = x u1 = x[...,:,np.newaxis] * x[...,np.newaxis,:] u2 = x[...,:-1,:,np.newaxis] * x[...,1:,np.newaxis,:] u = [u0, u1, u2] f = -0.5 * np.shape(x)[-2] * np.shape(x)[-1] * np.log(2*np.pi) return (u, f) def plates_to_parent(self, index, plates): """ Computes the plates of this node with respect to a parent. Child classes must implement this. Parameters ----------- index : int The index of the parent node to use. """ raise NotImplementedError() def plates_from_parent(self, index, plates): """ Compute the plates using information of a parent node. Child classes must implement this. Parameters ---------- index : int Index of the parent to use. """ raise NotImplementedError() def rotate(self, u, phi, R, inv=None, logdet=None): (u, dg) = self.moments.rotate(u, R, logdet=logdet) # It would be more efficient and simpler, if you just rotated the # moments and didn't touch phi. However, then you would need to call # update() before lower_bound_contribution. This is more error-safe. 
if inv is None:
            inv = np.linalg.inv(R)

        # Transform parameters
        phi0 = linalg.mvdot(inv.T, phi[0])
        phi1 = linalg.dot(inv.T, phi[1], inv)
        phi2 = linalg.dot(inv.T, phi[2], inv)
        phi = [phi0, phi1, phi2]

        return (u, phi, dg)

    def compute_rotation_bound(self, u, u_mu_Lambda, u_A_V, R, inv=None,
                               logdet=None):
        # NOTE: This method is an unfinished sketch: several quantities it
        # referred to (self.XnXn, self.XpXp, self.X0X0, R_XpXn, N, M,
        # logdetR and the gradients dlogp and dlogH) were never defined, so
        # it could not have worked as originally written. The intended
        # structure is kept below as comments and the method now fails
        # explicitly instead of raising a NameError.
        (Lambda_mu, Lambda_mumu, Lambda, logdetLambda) = u_mu_Lambda
        (V_A, V_AA, V, logdetV) = u_A_V
        V = misc.make_diag(V, ndim=1)

        # Intended terms of the bound:
        #
        #   R_XnXn = linalg.dot(R, self.XnXn)
        #   R_XpXp = linalg.dot(R, self.XpXp)
        #   R_X0X0 = linalg.dot(R, self.X0X0)
        #
        #   Lambda_R_X0X0_R = tracedot(dot(Lambda, R_X0X0), R.T)
        #   V_R_XnXn_R = tracedot(dot(V, R_XnXn), R.T)
        #   V_AA_R_XpXp_R = tracedot(dot(V_AA, R_XpXp), R.T)
        #   V_A_R_XpXn_R = tracedot(dot(V_A, R_XpXn), R.T)
        #
        #   logp = random.gaussian_logpdf(
        #       Lambda_R_X0X0_R + V_R_XnXn_R,
        #       V_A_R_XpXn_R,
        #       V_AA_R_XpXp_R,
        #       (N - 1) * logdetV + 2 * N * logdetR
        #   )
        #   logH = random.gaussian_entropy(-2 * M * logdetR, 0)
        #
        # with the bound L = logp + logH and its gradient dL assembled from
        # dlogp and dlogH.
        raise NotImplementedError("compute_rotation_bound is not implemented")


class _TemplateGaussianMarkovChain(ExponentialFamily):
    r"""
    VMP abstract node for Gaussian Markov chain.

    This is a general base class for different Gaussian Markov chain nodes.
    Output is Gaussian variables with mean, covariance and one-step
    cross-covariance.

    self.phi and self.u are defined in a particular way but otherwise the
    parent nodes may vary.

    Child classes must implement the following methods:
        _plates_to_parent
        _plates_from_parent

    See also
    --------
    bayespy.inference.vmp.nodes.gaussian.Gaussian
    bayespy.inference.vmp.nodes.wishart.Wishart
    """

    def random(self, *phi, plates=None):
        raise NotImplementedError()


def _compute_cgf_for_gaussian_markov_chain(mumu_Lambda, logdet_Lambda,
                                           logdet_nu, N):
    """
    Compute CGF using the moments of the parents.
""" g0 = -0.5 * mumu_Lambda #np.einsum('...ij,...ij->...', mumu, Lambda) g1 = 0.5 * logdet_Lambda if np.ndim(logdet_nu) == 1: g1 = g1 + 0.5 * (N-1) * np.sum(logdet_nu, axis=-1) elif np.shape(logdet_nu)[-2] == 1: g1 = g1 + 0.5 * (N-1) * np.sum(logdet_nu, axis=(-1,-2)) else: g1 = g1 + 0.5 * np.sum(logdet_nu, axis=(-1,-2)) return g0 + g1 class GaussianMarkovChainDistribution(TemplateGaussianMarkovChainDistribution): r""" Implementation of VMP formulas for Gaussian Markov chain The log probability density function of the prior: .. todo:: Fix inputs and their weight matrix in the equations. .. math:: \log p(\mathbf{X} | \boldsymbol{\mu}, \mathbf{\Lambda}, \mathbf{A}, \mathbf{B}, \boldsymbol{\nu}) =& \log \mathcal{N}(\mathbf{x}_0|\boldsymbol{\mu}, \mathbf{\Lambda}) + \sum^N_{n=1} \log \mathcal{N}( \mathbf{x}_n | \mathbf{Ax}_{n-1} + \mathbf{Bu}_n, \mathrm{diag}(\boldsymbol{\nu})) \\ =& - \frac{1}{2} \mathbf{x}_0^T \mathbf{\Lambda} \mathbf{x}_0 + \frac{1}{2} \mathbf{x}_0^T \mathbf{\Lambda} \boldsymbol{\mu} + \frac{1}{2} \boldsymbol{\mu}^T \mathbf{\Lambda} \mathbf{x}_0 - \frac{1}{2} \boldsymbol{\mu}^T \mathbf{\Lambda} \boldsymbol{\mu} + \frac{1}{2} \log|\mathbf{\Lambda}| \\ & - \frac{1}{2} \sum^N_{n=1} \mathbf{x}_n^T \mathrm{diag}(\boldsymbol{\nu}) \mathbf{x}_n + \frac{1}{2} \sum^N_{n=1} \mathbf{x}_n^T \mathrm{diag}(\boldsymbol{\nu}) \mathbf{A} \mathbf{x}_{n-1} + \frac{1}{2} \sum^N_{n=1} \mathbf{x}_{n-1}^T\mathbf{A}^T \mathrm{diag}(\boldsymbol{\nu}) \mathbf{x}_n - \frac{1}{2} \sum^N_{n=1} \mathbf{x}_{n-1}^T\mathbf{A}^T \mathrm{diag}(\boldsymbol{\nu}) \mathbf{A} \mathbf{x}_{n-1} \\ & + \sum^N_{n=1} \sum^D_{d=1} \log\nu_d - \frac{1}{2} (N+1) D \log(2\pi) \\ =& \begin{bmatrix} \mathbf{x}_0 \\ \mathbf{x}_1 \\ \vdots \\ \mathbf{x}_{N-1} \\ \mathbf{x}_N \end{bmatrix}^T \begin{bmatrix} -\frac{1}{2}\mathbf{\Lambda} - \frac{1}{2}\mathbf{A}\mathrm{diag}(\boldsymbol{\nu})\mathbf{A}^T & \frac{1}{2} \mathbf{A}^T\mathrm{diag}(\boldsymbol{\nu}) & & & \\ \frac{1}{2} 
\mathrm{diag}(\boldsymbol{\nu}) \mathbf{A} & -\frac{1}{2} \mathrm{diag}(\boldsymbol{\nu}) - \frac{1}{2}\mathbf{A}^T\mathrm{diag}(\boldsymbol{\nu})\mathbf{A} & \frac{1}{2} \mathbf{A}^T\mathrm{diag}(\boldsymbol{\nu}) & & \\ & \ddots & \ddots & \ddots & \\ & & \frac{1}{2} \mathrm{diag}(\boldsymbol{\nu}) \mathbf{A} & -\frac{1}{2} \mathrm{diag}(\boldsymbol{\nu}) - \frac{1}{2}\mathbf{A}^T\mathrm{diag}(\boldsymbol{\nu})\mathbf{A} & \frac{1}{2} \mathbf{A}^T\mathrm{diag}(\boldsymbol{\nu}) \\ & & & \frac{1}{2} \mathrm{diag}(\boldsymbol{\nu}) \mathbf{A} & -\frac{1}{2} \mathrm{diag}(\boldsymbol{\nu}) \end{bmatrix} \begin{bmatrix} \mathbf{x}_0 \\ \mathbf{x}_1 \\ \vdots \\ \mathbf{x}_{N-1} \\ \mathbf{x}_N \end{bmatrix} \\ & + \frac{1}{2} \mathbf{x}_0^T \mathbf{\Lambda} \boldsymbol{\mu} + \frac{1}{2} \boldsymbol{\mu}^T \mathbf{\Lambda} \mathbf{x}_0 - \frac{1}{2} \boldsymbol{\mu}^T \mathbf{\Lambda} \boldsymbol{\mu} + \frac{1}{2} \log|\mathbf{\Lambda}| + \sum^N_{n=1} \sum^D_{d=1} \log\nu_d - \frac{1}{2} (N+1) D \log(2\pi) For simplicity, :math:`\boldsymbol{\nu}` and :math:`\mathbf{A}` are assumed not to depend on :math:`n` in the above equation, but this distribution class supports that dependency. One only needs to do the following replacements in the equations: :math:`\boldsymbol{\nu} \leftarrow \boldsymbol{\nu}_n` and :math:`\mathbf{A} \leftarrow \mathbf{A}_n`, where :math:`n=1,\ldots,N`. ..
math:: u(\mathbf{X}) &= \begin{bmatrix} \begin{bmatrix} \mathbf{x}_0 & \ldots & \mathbf{x}_N \end{bmatrix} \\ \begin{bmatrix} \mathbf{x}_0\mathbf{x}_0^T & \ldots & \mathbf{x}_N\mathbf{x}_N^T \end{bmatrix} \\ \begin{bmatrix} \mathbf{x}_0\mathbf{x}_1^T & \ldots & \mathbf{x}_{N-1}\mathbf{x}_N^T \end{bmatrix} \end{bmatrix} \\ \phi(\boldsymbol{\mu}, \mathbf{\Lambda}, \mathbf{A}, \boldsymbol{\nu}) &= \begin{bmatrix} \begin{bmatrix} \mathbf{\Lambda} \boldsymbol{\mu} & \mathbf{0} & \ldots & \mathbf{0} \end{bmatrix} \\ \begin{bmatrix} -\frac{1}{2}\mathbf{\Lambda} - \frac{1}{2} \mathbf{A}\mathrm{diag}(\boldsymbol{\nu})\mathbf{A}^T & -\frac{1}{2}\mathrm{diag}(\boldsymbol{\nu}) - \frac{1}{2} \mathbf{A}\mathrm{diag}(\boldsymbol{\nu})\mathbf{A}^T & \ldots & -\frac{1}{2}\mathrm{diag}(\boldsymbol{\nu}) - \frac{1}{2} \mathbf{A}\mathrm{diag}(\boldsymbol{\nu})\mathbf{A}^T & -\frac{1}{2}\mathrm{diag}(\boldsymbol{\nu}) \end{bmatrix} \\ \begin{bmatrix} \mathbf{A}^T \mathrm{diag}(\boldsymbol{\nu}) & \ldots & \mathbf{A}^T \mathrm{diag}(\boldsymbol{\nu}) \end{bmatrix} \end{bmatrix} \\ g(\boldsymbol{\mu}, \mathbf{\Lambda}, \mathbf{A}, \boldsymbol{\nu}) &= \frac{1}{2}\log|\mathbf{\Lambda}| + \frac{1}{2} \sum^N_{n=1}\sum^D_{d=1}\log\nu_d \\ f(\mathbf{X}) &= -\frac{1}{2} (N+1) D \log(2\pi) The log probability density function of the posterior approximation: ..
math:: \log q(\mathbf{X}) &= \begin{bmatrix} \mathbf{x}_0 \\ \mathbf{x}_1 \\ \vdots \\ \mathbf{x}_{N-1} \\ \mathbf{x}_N \end{bmatrix}^T \begin{bmatrix} \mathbf{\Phi}_0^{(2)} & \frac{1}{2}\mathbf{\Phi}_1^{(3)} & & & \\ \frac{1}{2}{\mathbf{\Phi}_1^{(3)}}^T & \mathbf{\Phi}_1^{(2)} & \frac{1}{2}\mathbf{\Phi}_2^{(3)} & & \\ & \ddots & \ddots & \ddots & \\ & & \frac{1}{2}{\mathbf{\Phi}_{N-1}^{(3)}}^T & \mathbf{\Phi}_{N-1}^{(2)} & \frac{1}{2}\mathbf{\Phi}_N^{(3)} \\ & & & \frac{1}{2}{\mathbf{\Phi}_N^{(3)}}^T & \mathbf{\Phi}_N^{(2)} \end{bmatrix} \begin{bmatrix} \mathbf{x}_0 \\ \mathbf{x}_1 \\ \vdots \\ \mathbf{x}_{N-1} \\ \mathbf{x}_N \end{bmatrix} + \ldots """ def compute_message_to_parent(self, parent, index, u, u_mu_Lambda, u_A_nu, *u_inputs): r""" Compute a message to a parent. Parameters ---------- index : int Index of the parent requesting the message. u : list of ndarrays Moments of this node. u_mu_Lambda : list of ndarrays Moments of parents :math:`(\boldsymbol{\mu}, \mathbf{\Lambda})`. u_A_nu : list of ndarrays Moments of parents :math:`(\mathbf{A}, \boldsymbol{\nu})`. u_inputs : list of ndarrays Moments of input signals. 
""" D = np.shape(u[0])[-1] if index == 0: # (mu, Lambda) -- GaussianWishartMoments x0 = u[0][...,0,:] x0x0 = u[1][...,0,:,:] m0 = x0 m1 = -0.5 m2 = -0.5 * x0x0 m3 = 0.5 return [m0, m1, m2, m3] elif index == 1: # (A, nu) -- GaussianGammaMoments XnXn = u[1] XpXn = u[2] # (..., N-1, D, D) m0 = XpXn.swapaxes(-1,-2) # (..., N-1, D, D, D) m1 = -0.5 * XnXn[..., :-1, None, :, :] # (..., N-1, D) m2 = -0.5 * np.einsum('...ii->...i', XnXn[...,1:,:,:]) # (..., N-1, D) m3 = 0.5 if len(u_inputs): Xn = u[0] z = u_inputs[0][0] zz = u_inputs[0][1] D_inputs = np.shape(z)[-1] m0_B = Xn[...,1:,:,None] * z[...,None,:] m1_BB = -0.5 * zz[..., None, :, :] m1_AB = -0.5 * Xn[..., :-1, None, :, None] * z[..., None, None, :] # Construct full message arrays from blocks m0 = np.concatenate([m0, m0_B], axis=-1) row1 = np.concatenate([m1, m1_AB], axis=-1) row2 = np.concatenate([m1_AB.swapaxes(-1,-2), m1_BB], axis=-1) m1 = np.concatenate([row1, row2], axis=-2) return [m0, m1, m2, m3] # m1 = 0.5 elif index == 2: # input signals # (..., N-1, D) Xn = u[0][...,1:,:] # (..., N-1, D) Xp = u[0][...,:-1,:] # (..., N-1, D, K) B = u_A_nu[0][...,D:] # (..., N-1, D, D, K) AB = u_A_nu[1][...,:D,D:] # (..., N-1, D, K, K) BB = u_A_nu[1][...,D:,D:] # (..., N-1, K) m0 = ( np.einsum('...dk,...d->...k', B, Xn) - np.einsum('...dk,...d->...k', np.sum(AB, axis=-3), Xp) ) # (..., N-1, K, K) m1 = -0.5 * np.sum(BB, axis=-3) return [m0, m1] raise IndexError("Parent index out of bounds") def compute_weights_to_parent(self, index, weights): if index == 0: # mu_Lambda return weights elif index == 1: # A_nu return weights[...,np.newaxis,np.newaxis] elif index == 2: # input signals return weights[...,np.newaxis] else: raise ValueError("Index out of bounds") def compute_phi_from_parents(self, u_mu_Lambda, u_A_nu, *u_inputs, mask=True): """ Compute the natural parameters using parents' moments. Parameters ---------- u_parents : list of list of arrays List of parents' lists of moments. 
Returns ------- phi : list of arrays Natural parameters. dims : tuple Shape of the variable part of phi. """ # Dimensionality of the Gaussian states D = np.shape(u_mu_Lambda[0])[-1] # Number of time instances in the process N = self.N # Helpful variables (show shapes in comments) Lambda_mu = u_mu_Lambda[0] # (..., D) Lambda = u_mu_Lambda[2] # (..., D, D) nu_A = u_A_nu[0][...,:D] # (..., N-1, D, D) nu_AA = u_A_nu[1][...,:D,:D] # (..., N-1, D, D, D) nu_B = u_A_nu[0][...,D:] # (..., N-1, D, inputs) nu_BB = u_A_nu[1][...,D:,D:] # (..., N-1, D, inputs, inputs) nu_AB = u_A_nu[1][...,:D,D:] # (..., N-1, D, D, inputs) nu = u_A_nu[2] * np.ones(D) # (..., N-1, D) # mu = u_mu[0] # (..., D) # Lambda = u_Lambda[0] # (..., D, D) # A = u_A[0][...,:D] # (..., N-1, D, D) # AA = u_A[1][...,:D,:D] # (..., N-1, D, D, D) # B = u_A[0][...,D:] # (..., N-1, D, inputs) # BB = u_A[1][...,D:,D:] # (..., N-1, D, inputs, inputs) # AB = u_A[1][...,:D,D:] # (..., N-1, D, D, inputs) # v = u_v[0] # (..., N-1, D) if len(u_inputs): inputs = u_inputs[0][0] else: inputs = None # Allocate memory (take into account effective plates) if inputs is not None: plates_phi0 = misc.broadcasted_shape(np.shape(Lambda_mu)[:-1], np.shape(nu_B)[:-3], np.shape(nu_AB)[:-4]) else: plates_phi0 = misc.broadcasted_shape(np.shape(Lambda_mu)[:-1]) plates_phi1 = misc.broadcasted_shape(np.shape(Lambda)[:-2], np.shape(nu_AA)[:-4]) plates_phi2 = misc.broadcasted_shape(np.shape(nu_A)[:-3]) phi0 = np.zeros(plates_phi0+(N,D)) phi1 = np.zeros(plates_phi1+(N,D,D)) phi2 = np.zeros(plates_phi2+(N-1,D,D)) # Parameters for x0 phi0[...,0,:] = Lambda_mu #np.einsum('...ik,...k->...i', Lambda, mu) phi1[...,0,:,:] = -0.5 * Lambda # Effect of the input signals if inputs is not None: phi0[...,1:,:] += np.einsum('...ij,...j->...i', nu_B, inputs) phi0[...,:-1,:] -= np.einsum( '...ij,...j->...i', np.sum(nu_AB, axis=-3), inputs ) # Diagonal blocks: -0.5 * (V_i + A_{i+1}' * V_{i+1} * A_{i+1}) phi1[..., 1:, :, :] = -0.5 * misc.diag(nu, ndim=1) 
phi1[..., :-1, :, :] += -0.5 * np.sum(nu_AA, axis=-3) #np.einsum('...kij,...k->...ij', AA, v) #phi1 *= -0.5 # Super-diagonal blocks: 0.5 * A.T * V # However, don't multiply by 0.5 because there are both super- and # sub-diagonal blocks (sum them together) phi2[..., :, :, :] = linalg.transpose(nu_A, ndim=1) # np.einsum('...ji,...j->...ij', A, v) return (phi0, phi1, phi2) def compute_cgf_from_parents(self, u_mu_Lambda, u_A_nu, *u_inputs): """ Compute CGF using the moments of the parents. """ g = _compute_cgf_for_gaussian_markov_chain(u_mu_Lambda[1], u_mu_Lambda[3], u_A_nu[3], self.N) if len(u_inputs): D = np.shape(u_mu_Lambda[0])[-1] uu = u_inputs[0][1] nu_BB = u_A_nu[1][...,D:,D:] nu = u_A_nu[2] #BB_v = np.einsum('...d,...dij->...ij', v, BB) g_inputs = -0.5 * np.einsum( '...ij,...ij->...', uu, np.sum(nu_BB, axis=-3) #BB_v ) # Sum over time axis if np.ndim(g_inputs) == 0 or np.shape(g_inputs)[-1] == 1: g_inputs *= self.N - 1 if np.ndim(g_inputs) > 0: g_inputs = np.sum(g_inputs, axis=-1) g = g + g_inputs return g def plates_to_parent(self, index, plates): """ Computes the plates of this node with respect to a parent. If this node has plates (...), the latent dimensionality is D and the number of time instances is N, the plates with respect to the parents are: (mu, Lambda): (...) (A, nu): (...,N-1,D) Parameters ---------- index : int The index of the parent node to use. """ if index == 0: # (mu, Lambda) return plates elif index == 1: # (A, nu) return plates + (self.N-1, self.D) elif index == 2: # input signals return plates + (self.N-1,) else: raise ValueError("Invalid parent index.") def plates_from_parent(self, index, plates): """ Compute the plates using information of a parent node. If the plates of the parents are: (mu, Lambda): (...) (A, nu): (...,N-1,D) the resulting plates of this node are (...) Parameters ---------- index : int Index of the parent to use. 
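A concrete instance of the plate bookkeeping described in `plates_to_parent` above (all sizes below are hypothetical): a chain with plates `(5,)`, length `N = 10` and state dimension `D = 3` exchanges messages with the `(A, nu)` parent over plates `(5, 9, 3)`:

```python
# Hypothetical sizes: plates (5,), chain length N = 10, state dimension D = 3.
plates, N, D = (5,), 10, 3

# plates_to_parent(index=1): towards the (A, nu) parent, the child's plates
# gain a time axis of length N-1 and a row axis of length D.
assert plates + (N - 1, D) == (5, 9, 3)

# plates_from_parent(index=1) drops those two axes again.
assert (5, 9, 3)[:-2] == plates
```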
""" if index == 0: # (mu, Lambda) return plates elif index == 1: # (A, nu) return plates[:-2] elif index == 2: # input signals return plates[:-1] else: raise ValueError("Invalid parent index.") class GaussianMarkovChain(_TemplateGaussianMarkovChain): r""" Node for Gaussian Markov chain random variables. In a simple case, the graphical model can be presented as: .. bayesnet:: \tikzstyle{latent} += [minimum size=30pt]; \node[latent] (x0) {$\mathbf{x}_0$}; \node[latent, right=of x0] (x1) {$\mathbf{x}_1$}; \node[right=of x1] (dots) {$\cdots$}; \node[latent, right=of dots] (xn) {$\mathbf{x}_{N-1}$}; \edge {x0}{x1}; \edge {x1}{dots}; \edge {dots}{xn}; \node[latent, above left=1 and 0.1 of x0] (mu) {$\boldsymbol{\mu}$}; \node[latent, above right=1 and 0.1 of x0] (Lambda) {$\mathbf{\Lambda}$}; \node[latent, above left=1 and 0.1 of dots] (A) {$\mathbf{A}$}; \node[latent, above right=1 and 0.1 of dots] (nu) {$\boldsymbol{\nu}$}; \edge {mu,Lambda} {x0}; \edge {A,nu} {x1,dots,xn}; where :math:`\boldsymbol{\mu}` and :math:`\mathbf{\Lambda}` are the mean and the precision matrix of the initial state, :math:`\mathbf{A}` is the state dynamics matrix and :math:`\boldsymbol{\nu}` is the precision of the innovation noise. It is possible that :math:`\mathbf{A}` and/or :math:`\boldsymbol{\nu}` are different for each transition instead of being constant. The probability distribution is .. math:: p(\mathbf{x}_0, \ldots, \mathbf{x}_{N-1}) = p(\mathbf{x}_0) \prod^{N-1}_{n=1} p(\mathbf{x}_n | \mathbf{x}_{n-1}) where .. math:: p(\mathbf{x}_0) &= \mathcal{N}(\mathbf{x}_0 | \boldsymbol{\mu}, \mathbf{\Lambda}) \\ p(\mathbf{x}_n|\mathbf{x}_{n-1}) &= \mathcal{N}(\mathbf{x}_n | \mathbf{A}_{n-1}\mathbf{x}_{n-1}, \mathrm{diag}(\boldsymbol{\nu}_{n-1})). Parameters ---------- mu : Gaussian-like node or (...,D)-array :math:`\boldsymbol{\mu}`, mean of :math:`x_0`, :math:`D`-dimensional with plates (...) 
Lambda : Wishart-like node or (...,D,D)-array :math:`\mathbf{\Lambda}`, precision matrix of :math:`x_0`, :math:`D\times D` -dimensional with plates (...) A : Gaussian-like node or (D,D)-array or (...,1,D,D)-array or (...,N-1,D,D)-array :math:`\mathbf{A}`, state dynamics matrix, :math:`D`-dimensional with plates (D,) or (...,1,D) or (...,N-1,D) nu : gamma-like node or (D,)-array or (...,1,D)-array or (...,N-1,D)-array :math:`\boldsymbol{\nu}`, diagonal elements of the precision of the innovation process, plates (D,) or (...,1,D) or (...,N-1,D) n : int, optional :math:`N`, the length of the chain. Must be given if :math:`\mathbf{A}` and :math:`\boldsymbol{\nu}` are constant over time. See also -------- Gaussian, GaussianARD, Wishart, Gamma, SwitchingGaussianMarkovChain, VaryingGaussianMarkovChain, CategoricalMarkovChain """ def __init__(self, mu, Lambda, A, nu, n=None, inputs=None, **kwargs): """ Create GaussianMarkovChain node. """ super().__init__(mu, Lambda, A, nu, n=n, inputs=inputs, **kwargs) @classmethod def _constructor(cls, mu, Lambda, A, nu, n=None, inputs=None, **kwargs): """ Constructs distribution and moments objects. Compute the dimensions of phi and u. The plates and dimensions of the parents should be: mu: (...) and D-dimensional Lambda: (...) and D-dimensional A: (...,1,D) or (...,N-1,D) and D-dimensional v: (...,1,D) or (...,N-1,D) and 0-dimensional N: () and 0-dimensional (dummy parent) Check that the dimensionalities of the parents are proper. For instance, A should be a collection of DxD matrices, thus the dimensionality and the last plate should both equal D. Similarly, `v` should be a collection of diagonal innovation matrix elements, thus the last plate should equal D. 
""" mu_Lambda = WrapToGaussianWishart(mu, Lambda) A_nu = WrapToGaussianGamma(A, nu, ndim=1) D = mu_Lambda.dims[0][0] if inputs is not None: inputs = cls._ensure_moments(inputs, GaussianMoments, ndim=1) # Check whether to use input signals or not if inputs is None: _parent_moments = (GaussianWishartMoments((D,)), GaussianGammaMoments((D,))) else: K = inputs.dims[0][0] _parent_moments = (GaussianWishartMoments((D,)), GaussianGammaMoments((D,)), GaussianMoments((K,))) # Time instances from input signals if inputs is not None and len(inputs.plates) >= 1: n_inputs = inputs.plates[-1] else: n_inputs = 1 # Time instances from state dynamics matrix if len(A_nu.plates) >= 2: n_A_nu = A_nu.plates[-2] else: n_A_nu = 1 # Check consistency of the number of time instances if n_inputs != n_A_nu and n_inputs != 1 and n_A_nu != 1: raise Exception("Plates of parents are giving different number of time instances") n_parents = max(n_A_nu, n_inputs) if n is None: if n_parents == 1: raise Exception("The number of time instances could not be " "determined automatically. 
Give the number of " "time instances.") n = n_parents + 1 elif n_parents != 1 and n_parents+1 != n: raise Exception("The number of time instances must match " "the number of last plates of parents: " "%d != %d+1" % (n, n_parents)) # Dimensionality of the states D = mu_Lambda.dims[0][0] # Number of states M = n # Dimensionality of the inputs if inputs is None: D_inputs = 0 else: D_inputs = inputs.dims[0][0] # Check (mu, Lambda) if mu_Lambda.dims != ( (D,), (), (D, D), () ): raise Exception("Initial state parameters have wrong dimensionality") # Check (A, nu) if A_nu.dims != ( (D+D_inputs,), (D+D_inputs,D+D_inputs), (), () ): raise Exception("Dynamics matrix has wrong dimensionality") if len(A_nu.plates) == 0 or A_nu.plates[-1] != D: raise Exception("Dynamics matrix should have a last plate " "equal to the dimensionality of the " "system.") if (len(A_nu.plates) >= 2 and A_nu.plates[-2] != 1 and A_nu.plates[-2] != M-1): raise ValueError("The second last plate of the dynamics matrix " "should have length equal to one or " "N-1, where N is the number of time " "instances.") # Check input signals if inputs is not None: if inputs.dims != ( (D_inputs,), (D_inputs, D_inputs) ): raise ValueError("Input signals have wrong dimensionality") moments = GaussianMarkovChainMoments(M, D) dims = ( (M,D), (M,D,D), (M-1,D,D) ) distribution = GaussianMarkovChainDistribution(M, D) if inputs is None: parents = [mu_Lambda, A_nu] else: parents = [mu_Lambda, A_nu, inputs] return ( parents, kwargs, dims, cls._total_plates(kwargs.get('plates'), distribution.plates_from_parent(0, mu_Lambda.plates), distribution.plates_from_parent(1, A_nu.plates)), distribution, moments, _parent_moments) def rotate(self, R, inv=None, logdet=None): # It would be more efficient and simpler, if you just rotated the # moments and didn't touch phi. However, then you would need to call # update() before lower_bound_contribution. This is more error-safe. 
(u, phi, dg) = self._distribution.rotate( self.u, self.phi, R, inv=inv, logdet=logdet ) self.u = u self.phi = phi self.g = self.g + dg return class VaryingGaussianMarkovChainDistribution(TemplateGaussianMarkovChainDistribution): """ Sub-classes implement distribution specific computations. """ def compute_message_to_parent(self, parent, index, u, u_mu, u_Lambda, u_B, u_S, u_v): """ Compute a message to a parent. Parameters ----------- index : int Index of the parent requesting the message. u : list of ndarrays Moments of this node. u_mu : list of ndarrays Moments of parent `mu`. u_Lambda : list of ndarrays Moments of parent `Lambda`. u_B : list of ndarrays Moments of parent `B`. u_S : list of ndarrays Moments of parent `S`. u_v : list of ndarrays Moments of parent `v`. """ if index == 0: # mu raise NotImplementedError() elif index == 1: # Lambda raise NotImplementedError() elif index == 2: # B, (...,D)x(D,K) XnXn = u[1] # (...,N,D,D) XpXn = u[2] # (...,N,D,D) S = misc.atleast_nd(u_S[0], 2) # (...,N,K) SS = misc.atleast_nd(u_S[1], 3) # (...,N,K,K) v = misc.atleast_nd(u_v[0], 2) # (...,N,D) # m0: (...,D,D,K) m0 = np.einsum('...nji,...nk,...ni->...ijk', XpXn, S, v) # m1: (...,D,D,K,D,K) if np.ndim(v) >= 2 and np.shape(v)[-2] > 1: raise ValueError("Innovation noise is time dependent") m1 = np.einsum('...nij,...nkl->...ikjl', XnXn[...,:-1,:,:], SS) m1 = -0.5 * np.einsum('...ikjl,...d->...dikjl', m1, v[...,0,:]) elif index == 3: # S, (...,N-1)x(K) XnXn = u[1] # (...,N,D,D) XpXn = u[2] # (...,N,D,D) B = u_B[0] # (...,D,D,K) BB = u_B[1] # (...,D,D,K,D,K) v = u_v[0] # (...,N,D) # m0: (...,N,K) m0 = np.einsum('...nji,...ijk,...ni->...nk', XpXn, B, np.atleast_2d(v)) # m1: (...,N,K,K) if np.ndim(v) >= 2 and np.shape(v)[-2] > 1: raise ValueError("Innovation noise is time dependent") m1 = np.einsum('...dikjl,...d->...ikjl', BB, np.atleast_2d(v)[...,0,:]) m1 = -0.5 * np.einsum('...nij,...ikjl->...nkl', XnXn[...,:-1,:,:], m1) elif index == 4: # v raise NotImplementedError() elif 
index == 5: # N raise NotImplementedError() return [m0, m1] def compute_weights_to_parent(self, index, weights): if index == 0: # mu return weights elif index == 1: # Lambda return weights elif index == 2: # B return weights[...,np.newaxis] # new plate axis for D elif index == 3: # S return weights[...,np.newaxis] # new plate axis for N elif index == 4: # v return weights[...,np.newaxis,np.newaxis] # new plate axis for N and D elif index == 5: # N return weights else: raise ValueError("Invalid index") def compute_phi_from_parents(self, u_mu, u_Lambda, u_B, u_S, u_v, mask=True): """ Compute the natural parameters using parents' moments. Parameters ---------- u_parents : list of list of arrays List of parents' lists of moments. Returns ------- phi : list of arrays Natural parameters. dims : tuple Shape of the variable part of phi. """ # Dimensionality of the Gaussian states D = np.shape(u_mu[0])[-1] # Number of time instances in the process N = self.N # Helpful variables (show shapes in comments) mu = u_mu[0] # (..., D) Lambda = u_Lambda[0] # (..., D, D) B = u_B[0] # (..., D, D, K) BB = u_B[1] # (..., D, D, K, D, K) S = u_S[0] # (..., N-1, K) or (..., 1, K) SS = u_S[1] # (..., N-1, K, K) v = u_v[0] # (..., N-1, D) or (..., 1, D) # TODO/FIXME: Take into account plates! 
plates_phi0 = misc.broadcasted_shape(np.shape(mu)[:-1], np.shape(Lambda)[:-2]) plates_phi1 = misc.broadcasted_shape(np.shape(Lambda)[:-2], np.shape(v)[:-2], np.shape(BB)[:-5], np.shape(SS)[:-3]) plates_phi2 = misc.broadcasted_shape(np.shape(B)[:-3], np.shape(S)[:-2], np.shape(v)[:-2]) phi0 = np.zeros(plates_phi0 + (N,D)) phi1 = np.zeros(plates_phi1 + (N,D,D)) phi2 = np.zeros(plates_phi2 + (N-1,D,D)) # Parameters for x0 phi0[...,0,:] = np.einsum('...ik,...k->...i', Lambda, mu) phi1[...,0,:,:] = Lambda # Diagonal blocks: -0.5 * (V_i + A_{i+1}' * V_{i+1} * A_{i+1}) phi1[..., 1:, :, :] = v[...,np.newaxis]*np.identity(D) if np.ndim(v) >= 2 and np.shape(v)[-2] > 1: raise Exception("This implementation is not efficient if " "innovation noise is time-dependent.") phi1[..., :-1, :, :] += np.einsum('...dikjl,...kl,...d->...ij', BB[...,None,:,:,:,:,:], SS, v) else: # We know that S does not have the D plate so we can sum that plate # axis out v_BB = np.einsum('...dikjl,...d->...ikjl', BB[...,None,:,:,:,:,:], v) phi1[..., :-1, :, :] += np.einsum('...ikjl,...kl->...ij', v_BB, SS) #phi1[..., :-1, :, :] += np.einsum('...kij,...k->...ij', AA, v) phi1 *= -0.5 # Super-diagonal blocks: 0.5 * A.T * V # However, don't multiply by 0.5 because there are both super- and # sub-diagonal blocks (sum them together) phi2[..., :, :, :] = np.einsum('...jik,...k,...j->...ij', B[...,None,:,:,:], S, v) #phi2[..., :, :, :] = np.einsum('...ji,...j->...ij', A, v) return (phi0, phi1, phi2) def compute_cgf_from_parents(self, u_mu, u_Lambda, u_B, u_S, u_v): """ Compute CGF using the moments of the parents. """ u_mumu_Lambda = linalg.inner(u_Lambda[0], u_mu[1], ndim=2) return _compute_cgf_for_gaussian_markov_chain(u_mumu_Lambda, u_Lambda[1], u_v[1], self.N) def plates_to_parent(self, index, plates): """ Computes the plates of this node with respect to a parent. 
If this node has plates (...), the latent dimensionality is D and the number of time instances is N, the plates with respect to the parents are: mu: (...) Lambda: (...) B: (...,D) S: (...,N-1) v: (...,N-1,D) Parameters ---------- index : int The index of the parent node to use. """ if index == 0: # mu return plates elif index == 1: # Lambda return plates elif index == 2: # B return plates + (self.D,) elif index == 3: # S return plates + (self.N-1,) elif index == 4: # v return plates + (self.N-1,self.D) else: raise ValueError("Invalid parent index.") def plates_from_parent(self, index, plates): """ Compute the plates using information of a parent node. If the plates of the parents are: mu: (...) Lambda: (...) B: (...,D) S: (...,N-1) v: (...,N-1,D) N: () the resulting plates of this node are (...) Parameters ---------- index : int Index of the parent to use. """ if index == 0: # mu return plates elif index == 1: # Lambda return plates elif index == 2: # B, remove last plate D return plates[:-1] elif index == 3: # S, remove last plate N-1 return plates[:-1] elif index == 4: # v, remove last plates N-1,D return plates[:-2] else: raise ValueError("Invalid parent index.") class VaryingGaussianMarkovChain(_TemplateGaussianMarkovChain): r""" Node for Gaussian Markov chain random variables with time-varying dynamics. The node models a sequence of Gaussian variables :math:`\mathbf{x}_0,\ldots,\mathbf{x}_{N-1}` with linear Markovian dynamics. The time variability of the dynamics is obtained by modelling the state dynamics matrix as a linear combination of a set of matrices with time-varying linear combination weights. The graphical model can be presented as: ..
bayesnet:: \tikzstyle{latent} += [minimum size=40pt]; \node[latent] (x0) {$\mathbf{x}_0$}; \node[latent, right=of x0] (x1) {$\mathbf{x}_1$}; \node[right=of x1] (dots) {$\cdots$}; \node[latent, right=of dots] (xn) {$\mathbf{x}_{N-1}$}; \edge {x0}{x1}; \edge {x1}{dots}; \edge {dots}{xn}; \node[latent, above left=1 and 0.1 of x0] (mu) {$\boldsymbol{\mu}$}; \node[latent, above right=1 and 0.1 of x0] (Lambda) {$\mathbf{\Lambda}$}; \node[det, below=of x1] (A0) {$\mathbf{A}_0$}; \node[right=of A0] (Adots) {$\cdots$}; \node[det, right=of Adots] (An) {$\mathbf{A}_{N-2}$}; \node[latent, above=of dots] (nu) {$\boldsymbol{\nu}$}; \edge {mu,Lambda} {x0}; \edge {nu} {x1,dots,xn}; \edge {A0} {x1}; \edge {Adots} {dots}; \edge {An} {xn}; \node[latent, below=of A0] (s0) {$s_{0,k}$}; \node[right=of s0] (sdots) {$\cdots$}; \node[latent, right=of sdots] (sn) {$\mathbf{s}_{N-2,k}$}; \node[latent, left=of s0] (B) {$\mathbf{B}_k$}; \edge {B} {A0, Adots, An}; \edge {s0} {A0}; \edge {sdots} {Adots}; \edge {sn} {An}; \plate {K} {(B)(s0)(sdots)(sn)} {$k=0,\ldots,K-1$}; where :math:`\boldsymbol{\mu}` and :math:`\mathbf{\Lambda}` are the mean and the precision matrix of the initial state, :math:`\boldsymbol{\nu}` is the precision of the innovation noise, and :math:`\mathbf{A}_n` are the state dynamics matrix obtained by mixing matrices :math:`\mathbf{B}_k` with weights :math:`s_{n,k}`. The probability distribution is .. math:: p(\mathbf{x}_0, \ldots, \mathbf{x}_{N-1}) = p(\mathbf{x}_0) \prod^{N-1}_{n=1} p(\mathbf{x}_n | \mathbf{x}_{n-1}) where .. math:: p(\mathbf{x}_0) &= \mathcal{N}(\mathbf{x}_0 | \boldsymbol{\mu}, \mathbf{\Lambda}) \\ p(\mathbf{x}_n|\mathbf{x}_{n-1}) &= \mathcal{N}(\mathbf{x}_n | \mathbf{A}_{n-1}\mathbf{x}_{n-1}, \mathrm{diag}(\boldsymbol{\nu})), \quad \text{for } n=1,\ldots,N-1, \\ \mathbf{A}_n & = \sum^{K-1}_{k=0} s_{n,k} \mathbf{B}_k, \quad \text{for } n=0,\ldots,N-2. 
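The mixing step above can be sketched in plain NumPy (an illustrative sketch, not part of this module's API; the shapes follow the parameter descriptions of this class: `B` stored as a (D, D, K) array of basis matrices and `S` as (N-1, K) mixing weights):

```python
import numpy as np

D, K, N = 3, 2, 6
rng = np.random.default_rng(0)
B = rng.standard_normal((D, D, K))   # K basis matrices B_k, stored as (D, D, K)
S = rng.standard_normal((N - 1, K))  # time-varying mixing weights s_{n,k}

# A_n = sum_k s_{n,k} B_k  for n = 0, ..., N-2
A = np.einsum('nk,ijk->nij', S, B)   # (N-1, D, D)
```

The node itself never forms the `A` array explicitly; it works with the moments of `B` and `S` directly, which is exactly the efficiency gain mentioned in the Notes section.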
Parameters ---------- mu : Gaussian-like node or (...,D)-array :math:`\boldsymbol{\mu}`, mean of :math:`x_0`, :math:`D`-dimensional with plates (...) Lambda : Wishart-like node or (...,D,D)-array :math:`\mathbf{\Lambda}`, precision matrix of :math:`x_0`, :math:`D\times D` -dimensional with plates (...) B : Gaussian-like node or (...,D,D,K)-array :math:`\{\mathbf{B}_k\}_{k=0}^{K-1}`, a set of state dynamics matrices, :math:`D \times K`-dimensional with plates (...,D) S : Gaussian-like node or (...,N-1,K)-array :math:`\{\mathbf{s}_0,\ldots,\mathbf{s}_{N-2}\}`, time-varying weights of the linear combination, :math:`K`-dimensional with plates (...,N-1) nu : gamma-like node or (...,D)-array :math:`\boldsymbol{\nu}`, diagonal elements of the precision of the innovation process, plates (...,D) n : int, optional :math:`N`, the length of the chain. Must be given if :math:`\mathbf{S}` does not have plates over the time domain (which would not make sense). See also -------- Gaussian, GaussianARD, Wishart, Gamma, GaussianMarkovChain, SwitchingGaussianMarkovChain Notes ----- An equivalent model block can be constructed with :class:`GaussianMarkovChain` by explicitly using :class:`SumMultiply` to compute the linear combination. However, that approach is not very efficient for large datasets because it does not utilize the structure of :math:`\mathbf{A}_n`, thus it explicitly computes huge moment arrays. References ---------- :cite:`Luttinen:2014` """ def __init__(self, mu, Lambda, B, S, nu, n=None, **kwargs): """ Create VaryingGaussianMarkovChain node. """ super().__init__(mu, Lambda, B, S, nu, n=n, **kwargs) @classmethod def _constructor(cls, mu, Lambda, B, S, v, n=None, **kwargs): """ Constructs distribution and moments objects. Compute the dimensions of phi and u. The plates and dimensions of the parents should be: mu: (...) and D-dimensional Lambda: (...)
and D-dimensional B: (...,D) and (D,K)-dimensional S: (...,N-1) and K-dimensional v: (...,1,D) or (...,N-1,D) and 0-dimensional N: () and 0-dimensional (dummy parent) Check that the dimensionalities of the parents are proper. """ mu = cls._ensure_moments(mu, GaussianMoments, ndim=1) Lambda = cls._ensure_moments(Lambda, WishartMoments, ndim=1) B = cls._ensure_moments(B, GaussianMoments, ndim=2) S = cls._ensure_moments(S, GaussianMoments, ndim=1) v = cls._ensure_moments(v, GammaMoments) (D, K) = B.dims[0] parent_moments = ( GaussianMoments((D,)), WishartMoments((D,)), GaussianMoments((D, K)), GaussianMoments((K,)), GammaMoments() ) # A dummy wrapper for the number of time instances. n_S = 1 if len(S.plates) >= 1: n_S = S.plates[-1] n_v = 1 if len(v.plates) >= 2: n_v = v.plates[-2] if n_v != n_S and n_v != 1 and n_S != 1: raise Exception( "Plates of A and v are giving different number of time " "instances") n_S = max(n_v, n_S) if n is None: if n_S == 1: raise Exception( "The number of time instances could not be determined " "automatically. Give the number of time instances.") n = n_S + 1 elif n_S != 1 and n_S+1 != n: raise Exception( "The number of time instances must match the number of last " "plates of parents:" "%d != %d+1" % (n, n_S)) D = mu.dims[0][0] K = B.dims[0][-1] M = n #N.get_moments()[0] # Check mu if mu.dims != ( (D,), (D,D) ): raise ValueError("First parent has wrong dimensionality") # Check Lambda if Lambda.dims != ( (D,D), () ): raise ValueError("Second parent has wrong dimensionality") # Check B if B.dims != ( (D,K), (D,K,D,K) ): raise ValueError("Third parent has wrong dimensionality {0}. 
Should be {1}.".format(B.dims[0], (D,K))) if len(B.plates) == 0 or B.plates[-1] != D: raise ValueError("Third parent should have a last plate " "equal to the dimensionality of the " "system.") if S.dims != ( (K,), (K,K) ): raise ValueError("Fourth parent has wrong dimensionality %s, " "should be %s" % (S.dims, ( (K,), (K,K) ))) if (len(S.plates) >= 1 and S.plates[-1] != 1 and S.plates[-1] != M-1): raise ValueError("The last plate of the fourth " "parent should have length equal to one or " "N-1, where N is the number of time " "instances.") # Check v if v.dims != ( (), () ): raise Exception("Fifth parent has wrong dimensionality") if len(v.plates) == 0 or v.plates[-1] != D: raise Exception("Fifth parent should have a last plate " "equal to the dimensionality of the " "system.") if (len(v.plates) >= 2 and v.plates[-2] != 1 and v.plates[-2] != M-1): raise ValueError("The second last plate of the fifth " "parent should have length equal to one or " "N-1 where N is the number of time " "instances.") distribution = VaryingGaussianMarkovChainDistribution(M, D) moments = GaussianMarkovChainMoments(M, D) parents = [mu, Lambda, B, S, v] dims = ( (M,D), (M,D,D), (M-1,D,D) ) return (parents, kwargs, dims, cls._total_plates(kwargs.get('plates'), distribution.plates_from_parent(0, mu.plates), distribution.plates_from_parent(1, Lambda.plates), distribution.plates_from_parent(2, B.plates), distribution.plates_from_parent(3, S.plates), distribution.plates_from_parent(4, v.plates)), distribution, moments, parent_moments) class SwitchingGaussianMarkovChainDistribution(TemplateGaussianMarkovChainDistribution): """ Sub-classes implement distribution specific computations. """ def __init__(self, N, D, K): self.K = K super().__init__(N, D) def compute_message_to_parent(self, parent, index, u, u_mu, u_Lambda, u_B, u_Z, u_v): """ Compute a message to a parent. Parameters ---------- index : int Index of the parent requesting the message. u : list of ndarrays Moments of this node. 
u_mu : list of ndarrays Moments of parent `mu`. u_Lambda : list of ndarrays Moments of parent `Lambda`. u_B : list of ndarrays Moments of parent `B`. u_Z : list of ndarrays Moments of parent `Z`. u_v : list of ndarrays Moments of parent `v`. """ if index == 0: # mu raise NotImplementedError() elif index == 1: # Lambda raise NotImplementedError() elif index == 2: # B, (...,K,D)x(D) XnXn = u[1] # (...,N,D,D) XpXn = u[2] # (...,N-1,D,D) Z = u_Z[0] # (...,N-1,K) v = misc.atleast_nd(u_v[0], 2) # (...,N-1,D) # Check that there is no time-dependency in v and remove the axis if np.ndim(v) >= 2 and np.shape(v)[-2] > 1: raise ValueError("Innovation noise is time dependent") v = np.squeeze(v, axis=-2) # m0: (...,K,D,D) m0 = np.einsum('...nji,...nk,...i->...kij', XpXn, Z, v) # m1: (...,K,D,D,D) m1 = np.einsum('...nij,...nk->...kij', XnXn[...,:-1,:,:], Z) m1 = -0.5 * np.einsum('...kij,...d->...kdij', m1, v) return [m0, m1] elif index == 3: # Z, (...,N-1)x(K) XnXn = u[1] # (...,N,D,D) XpXn = u[2] # (...,N-1,D,D) B = u_B[0] # (...,K,D,D) BB = u_B[1] # (...,K,D,D,D) v = misc.atleast_nd(u_v[0], 2) # (...,N-1,D) logv = misc.atleast_nd(u_v[1], 2) # (...,N-1,D) # Check that there is no time-dependency in v and remove the axis if np.ndim(v) >= 2 and np.shape(v)[-2] > 1: raise ValueError("Innovation noise is time dependent") v = np.squeeze(v, axis=-2) if np.ndim(logv) >= 2 and np.shape(logv)[-2] > 1: raise ValueError("Innovation noise is time dependent") logv = np.squeeze(logv, axis=-2) XnXn_v = np.einsum('...nii,...i->...n', XnXn[...,1:,:,:], v) XpXn_v_B = np.einsum('...nil,...l,...kli->...nk', XpXn, v, B) BvB = np.einsum('...kdij,...d->...kij', BB, v) XpXp_BvB = np.einsum('...nij,...kij->...nk', XnXn[...,:-1,:,:], BvB) m0 = ( -0.5 * XnXn_v[...,None] + XpXn_v_B -0.5 * XpXp_BvB +0.5 * np.sum(logv, axis=-1)[...,None,None] -0.5 * self.D * np.log(2*np.pi) ) return [m0] elif index == 4: # v raise NotImplementedError() elif index == 5: # N raise NotImplementedError() def 
compute_weights_to_parent(self, index, weights): if index == 0: # mu: (...)x(N,D) -> (...)x(D) return weights elif index == 1: # Lambda: (...)x(N,D) -> (...)x(D,D) return weights elif index == 2: # B: (...)x(N,D) -> (...,K,D)x(D) return weights[...,None,None] elif index == 3: # Z: (...)x(N,D) -> (...,N-1)x(K) return weights[...,None] elif index == 4: # v: (...)x(N,D) -> (...,N-1,D)x() return weights[...,None,None] else: raise ValueError("Invalid index") def compute_phi_from_parents(self, u_mu, u_Lambda, u_B, u_Z, u_v, mask=True): """ Compute the natural parameters using parents' moments. Parameters ---------- u_parents : list of list of arrays List of parents' lists of moments. Returns ------- phi : list of arrays Natural parameters. dims : tuple Shape of the variable part of phi. """ # Dimensionality of the Gaussian states D = np.shape(u_mu[0])[-1] # Number of time instances in the process N = self.N # Helpful variables (show shapes in comments) mu = u_mu[0] # (..., D) Lambda = u_Lambda[0] # (..., D, D) B = u_B[0] # (..., K, D, D) BB = u_B[1] # (..., K, D, D, D) Z = u_Z[0] # (..., N-1, K) v = misc.atleast_nd(u_v[0], 2) # (..., N-1, D) or (..., 1, D) # TODO/FIXME: Take into account plates! 
plates_phi0 = misc.broadcasted_shape(np.shape(mu)[:-1], np.shape(Lambda)[:-2]) plates_phi1 = misc.broadcasted_shape(np.shape(Lambda)[:-2], np.shape(v)[:-2], np.shape(BB)[:-4], np.shape(Z)[:-2]) plates_phi2 = misc.broadcasted_shape(np.shape(B)[:-3], np.shape(Z)[:-2], np.shape(v)[:-2]) phi0 = np.zeros(plates_phi0 + (N,D)) phi1 = np.zeros(plates_phi1 + (N,D,D)) phi2 = np.zeros(plates_phi2 + (N-1,D,D)) # Parameters for x0 phi0[...,0,:] = np.einsum('...ik,...k->...i', Lambda, mu) phi1[...,0,:,:] = Lambda # Diagonal blocks: -0.5 * (V_i + A_{i+1}' * V_{i+1} * A_{i+1}) phi1[..., 1:, :, :] = v[...,None]*np.identity(D) if np.shape(v)[-2] > 1: raise Exception("This implementation is not efficient if " "innovation noise is time-dependent.") phi1[..., :-1, :, :] += np.einsum('...kdij,...nk,...nd->...nij', BB[...,:,:,:,:], Z, v) else: # We know that Z does not have the D plate so we can sum that plate # axis out v_BB = np.einsum('...kdij,...nd->...nkij', BB[...,:,:,:,:], v) phi1[..., :-1, :, :] += np.einsum('...nkij,...nk->...nij', v_BB, Z) phi1 *= -0.5 # Super-diagonal blocks: 0.5 * A.T * V # However, don't multiply by 0.5 because there are both super- and # sub-diagonal blocks (sum them together) phi2[..., :, :, :] = np.einsum('...kji,...nk,...nj->...nij', B[...,:,:,:], Z, v) return (phi0, phi1, phi2) def compute_cgf_from_parents(self, u_mu, u_Lambda, u_B, u_Z, u_v): """ Compute CGF using the moments of the parents. """ u_mumu_Lambda = linalg.inner(u_Lambda[0], u_mu[1], ndim=2) return _compute_cgf_for_gaussian_markov_chain(u_mumu_Lambda, u_Lambda[1], u_v[1], self.N) def plates_to_parent(self, index, plates): """ Computes the plates of this node with respect to a parent. If this node has plates (...), the latent dimensionality is D and the number of time instances is N, the plates with respect to the parents are: mu: (...) Lambda: (...) B: (...,K,D) Z: (...,N-1) v: (...,N-1,D) Parameters ---------- index : int The index of the parent node to use.
""" if index == 0: # mu: (...)x(N,D) -> (...)x(D) return plates elif index == 1: # Lambda: (...)x(N,D) -> (...)x(D,D) return plates elif index == 2: # B: (...)x(N,D) -> (...,K,D)x(D) return plates + (self.K,self.D) elif index == 3: # Z: (...)x(N,D) -> (...,N-1)x(K) return plates + (self.N-1,) elif index == 4: # v: (...)x(N,D) -> (...,N-1,D)x() return plates + (self.N-1,self.D) else: raise ValueError("Invalid parent index.") def plates_from_parent(self, index, plates): """ Compute the plates using information of a parent node. If the plates of the parents are: mu: (...) Lambda: (...) B: (...,D) S: (...,N-1) v: (...,N-1,D) N: () the resulting plates of this node are (...) Parameters ---------- index : int Index of the parent to use. """ if index == 0: # mu: (...)x(D) -> (...)x(N,D) return plates elif index == 1: # Lambda: (...)x(D,D) -> (...)x(N,D) return plates elif index == 2: # B: (...,K,D)x(D) -> (...)x(N,D) return plates[:-2] elif index == 3: # Z: (...,N-1)x(K) -> (...)x(N,D) return plates[:-1] elif index == 4: # v: (...,N-1,D)x() -> (...)x(N,D) return plates[:-2] else: raise ValueError("Invalid parent index.") class SwitchingGaussianMarkovChain(_TemplateGaussianMarkovChain): r""" Node for Gaussian Markov chain random variables with switching dynamics. The node models a sequence of Gaussian variables :math:`\mathbf{x}_0,\ldots,\mathbf{x}_{N-1}$ with linear Markovian dynamics. The dynamics may change in time, which is obtained by having a set of matrices and at each time selecting one of them as the state dynamics matrix. The graphical model can be presented as: .. 
bayesnet:: \tikzstyle{latent} += [minimum size=40pt]; \node[latent] (x0) {$\mathbf{x}_0$}; \node[latent, right=of x0] (x1) {$\mathbf{x}_1$}; \node[right=of x1] (dots) {$\cdots$}; \node[latent, right=of dots] (xn) {$\mathbf{x}_{N-1}$}; \edge {x0}{x1}; \edge {x1}{dots}; \edge {dots}{xn}; \node[latent, above left=1 and 0.1 of x0] (mu) {$\boldsymbol{\mu}$}; \node[latent, above right=1 and 0.1 of x0] (Lambda) {$\mathbf{\Lambda}$}; \node[det, below=of x1] (A0) {$\mathbf{A}_0$}; \node[right=of A0] (Adots) {$\cdots$}; \node[det, right=of Adots] (An) {$\mathbf{A}_{N-2}$}; \node[latent, above=of dots] (nu) {$\boldsymbol{\nu}$}; \edge {mu,Lambda} {x0}; \edge {nu} {x1,dots,xn}; \edge {A0} {x1}; \edge {Adots} {dots}; \edge {An} {xn}; \node[latent, below=of A0] (z0) {$z_0$}; \node[right=of z0] (zdots) {$\cdots$}; \node[latent, right=of zdots] (zn) {$z_{N-2}$}; \node[latent, left=of z0] (B) {$\mathbf{B}_k$}; \edge {B} {A0, Adots, An}; \edge {z0} {A0}; \edge {zdots} {Adots}; \edge {zn} {An}; \plate {K} {(B)} {$k=0,\ldots,K-1$}; where :math:`\boldsymbol{\mu}` and :math:`\mathbf{\Lambda}` are the mean and the precision matrix of the initial state, :math:`\boldsymbol{\nu}` is the precision of the innovation noise, and :math:`\mathbf{A}_n` are the state dynamics matrix obtained by selecting one of the matrices :math:`\{\mathbf{B}_k\}^{K-1}_{k=0}` at each time. The selections are provided by :math:`z_n\in\{0,\ldots,K-1\}`. The probability distribution is .. math:: p(\mathbf{x}_0, \ldots, \mathbf{x}_{N-1}) = p(\mathbf{x}_0) \prod^{N-1}_{n=1} p(\mathbf{x}_n | \mathbf{x}_{n-1}) where .. math:: p(\mathbf{x}_0) &= \mathcal{N}(\mathbf{x}_0 | \boldsymbol{\mu}, \mathbf{\Lambda}) \\ p(\mathbf{x}_n|\mathbf{x}_{n-1}) &= \mathcal{N}(\mathbf{x}_n | \mathbf{A}_{n-1}\mathbf{x}_{n-1}, \mathrm{diag}(\boldsymbol{\nu})), \quad \text{for } n=1,\ldots,N-1, \\ \mathbf{A}_n &= \mathbf{B}_{z_n}, \quad \text{for } n=0,\ldots,N-2. 
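The selection above can be sketched in plain NumPy (an illustrative sketch, not part of this module's API; `B` stacked as a (K, D, D) array of candidate matrices and `z` holding the integer selections):

```python
import numpy as np

K, D = 3, 2
rng = np.random.default_rng(0)
B = rng.standard_normal((K, D, D))  # K candidate dynamics matrices B_k
z = np.array([0, 2, 1, 0])          # selections z_n for n = 0, ..., N-2 (here N = 5)

# A_n = B_{z_n}: integer indexing picks one matrix per transition
A = B[z]                            # (N-1, D, D)
```

The node works with categorical moments of `z` (probabilities over the K choices) rather than hard selections, so this hard-indexing view is only the point-estimate analogue.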
Parameters ---------- mu : Gaussian-like node or (...,D)-array :math:`\boldsymbol{\mu}`, mean of :math:`x_0`, :math:`D`-dimensional with plates (...) Lambda : Wishart-like node or (...,D,D)-array :math:`\mathbf{\Lambda}`, precision matrix of :math:`x_0`, :math:`D\times D` -dimensional with plates (...) B : Gaussian-like node or (...,K,D,D)-array :math:`\{\mathbf{B}_k\}_{k=0}^{K-1}`, a set of state dynamics matrices, :math:`D`-dimensional with plates (...,K,D) Z : categorical-like node or (...,N-1)-array :math:`\{z_0,\ldots,z_{N-2}\}`, time-dependent selection, :math:`K`-categorical with plates (...,N-1) nu : gamma-like node or (...,D)-array :math:`\boldsymbol{\nu}`, diagonal elements of the precision of the innovation process, plates (...,D) n : int, optional :math:`N`, the length of the chain. Must be given if :math:`\mathbf{Z}` does not have plates over the time domain (which would not make sense). See also -------- Gaussian, GaussianARD, Wishart, Gamma, GaussianMarkovChain, VaryingGaussianMarkovChain, Categorical, CategoricalMarkovChain Notes ----- An equivalent model block can be constructed with :class:`GaussianMarkovChain` by explicitly using :class:`Gate` to select the state dynamics matrix. However, that approach is not very efficient for large datasets because it does not utilize the structure of :math:`\mathbf{A}_n`, thus it explicitly computes huge moment arrays. """ def __init__(self, mu, Lambda, B, Z, nu, n=None, **kwargs): """ Create SwitchingGaussianMarkovChain node. """ super().__init__(mu, Lambda, B, Z, nu, n=n, **kwargs) @classmethod def _constructor(cls, mu, Lambda, B, Z, v, n=None, **kwargs): """ Constructs distribution and moments objects. Compute the dimensions of phi and u. The plates and dimensions of the parents should be: mu: (...) and D-dimensional Lambda: (...)
and D-dimensional B: (...,K,D) and D-dimensional Z: (...,N-1) and K-dimensional v: (...,1,D) or (...,N-1,D) and 0-dimensional Check that the dimensionalities of the parents are proper. """ # Infer the number of dynamic matrices B = cls._ensure_moments(B, GaussianMoments, ndim=1) K = B.plates[-2] mu = cls._ensure_moments(mu, GaussianMoments, ndim=1) Lambda = cls._ensure_moments(Lambda, WishartMoments, ndim=1) Z = cls._ensure_moments(Z, CategoricalMoments, categories=K) v = cls._ensure_moments(v, GammaMoments) parent_moments = ( mu._moments, Lambda._moments, B._moments, Z._moments, v._moments ) # Infer the length of the chain n_Z = 1 if len(Z.plates) == 0: raise ValueError("Z must have temporal axis on plates") n_Z = Z.plates[-1] n_v = 1 if len(v.plates) >= 2: n_v = v.plates[-2] if n_v != n_Z and n_v != 1 and n_Z != 1: raise Exception( "Plates of Z and v are giving different number of time " "instances") n_Z = max(n_v, n_Z) if n is None: if n_Z == 1: raise Exception( "The number of time instances could not be determined " "automatically. 
Give the number of time instances.") n = n_Z + 1 elif n_Z != 1 and n_Z+1 != n: raise Exception( "The number of time instances must match the number of last " "plates of parents:" "%d != %d+1" % (n, n_Z)) D = mu.dims[0][0] K = Z.dims[0][0] M = n #N.get_moments()[0] # Check mu if mu.dims != ( (D,), (D,D) ): raise ValueError("First parent has wrong dimensionality") # Check Lambda if Lambda.dims != ( (D,D), () ): raise ValueError("Second parent has wrong dimensionality") # Check B if B.dims != ( (D,), (D,D) ): raise ValueError("Third parent has wrong dimensionality") if len(B.plates) < 2 or B.plates[-2:] != (K,D): raise ValueError("Third parent should have a last plate " "equal to the dimensionality of the " "system.") if Z.dims != ( (K,), ): raise ValueError("Fourth parent has wrong dimensionality %s, " "should be %s" % (Z.dims, ( (K,), ))) if Z.plates[-1] != M-1: raise ValueError("The last plate of the fourth " "parent should have length equal to one or " "N-1, where N is the number of time " "instances.") # Check v if v.dims != ( (), () ): raise Exception("Fifth parent has wrong dimensionality") if len(v.plates) == 0 or v.plates[-1] != D: raise Exception("Fifth parent should have a last plate " "equal to the dimensionality of the " "system.") if (len(v.plates) >= 2 and v.plates[-2] != 1 and v.plates[-2] != M-1): raise ValueError("The second last plate of the fifth " "parent should have length equal to one or " "N-1 where N is the number of time " "instances.") dims = ( (M,D), (M,D,D), (M-1,D,D) ) distribution = SwitchingGaussianMarkovChainDistribution(M, D, K) moments = GaussianMarkovChainMoments(M, D) parents = [mu, Lambda, B, Z, v] return (parents, kwargs, dims, cls._total_plates(kwargs.get('plates'), distribution.plates_from_parent(0, mu.plates), distribution.plates_from_parent(1, Lambda.plates), distribution.plates_from_parent(2, B.plates), distribution.plates_from_parent(3, Z.plates), distribution.plates_from_parent(4, v.plates)), distribution, moments, 
parent_moments) class _MarkovChainToGaussian(Deterministic): """ Transform a Gaussian Markov chain node into a Gaussian node. This node is deterministic. """ def __init__(self, X, **kwargs): X = self._ensure_moments(X, GaussianMarkovChainMoments) D = X.dims[0][-1] self._moments = GaussianMoments((D,)) self._parent_moments = (X._moments,) super().__init__(X, dims=self._moments.dims, **kwargs) def _plates_to_parent(self, index): """ Return the number of plates to the parent node. Normally, the parent sees the same number of plates as the node itself. However, now that one of the variable dimensions of the parents corresponds to a plate in this node, it is necessary to fix it here: the last plate is ignored when calculating plates with respect to the parent. Parent: Plates = (...) Dims = (N, ...) This node: Plates = (..., N) Dims = (...) """ return self.plates[:-1] def _plates_from_parent(self, index): # Sub-classes may want to overwrite this if they manipulate plates if index != 0: raise ValueError("Invalid parent index.") parent = self.parents[0] plates = parent.plates + (parent.dims[0][0],) return plates def _compute_moments(self, u): """ Transform the moments of a GMC to moments of a Gaussian. There is no need to worry about the plates and variable dimensions because the child node is free to interpret the axes as it pleases. However, the Gaussian moments contain only and but not , thus the last moment is discarded. """ # Get the moments from the parent Gaussian Markov Chain #u = self.parents[0].get_moments() #message_to_child() # Send only moments and but not return u[:2] def _compute_weights_to_parent(self, index, weights): # Remove the last axis of the mask if np.ndim(weights) >= 1: weights = np.sum(weights, axis=-1) return weights @staticmethod def _compute_message_to_parent(index, m_children, *u_parents): """ Transform a message to a Gaussian into a message to a GMC. 
        The messages to a Gaussian are almost correct; there are only two
        minor things to be done:

        1) The last plate is changed into a variable/time dimension. Because a
           message mask is applied for plates only, the last axis of the mask
           must be applied to the message, as that plate becomes a
           variable/time dimension.

        2) Because the message does not contain the <x_n x_{n+1}> cross-moment
           part, the last/third message is set to None, meaning that it is
           empty.

        Parameters
        ----------
        index : int
            Index of the parent requesting the message.
        m_children : list of ndarrays
            Message from the children.
        u_parents : list of list of ndarrays
            List of parents' moments.

        Returns
        -------
        m : list of ndarrays
            Message as a list of arrays.
        """
        # Add the third empty message
        return [m_children[0], m_children[1], None]


# Make use of the converter
GaussianMarkovChainMoments.add_converter(GaussianMoments,
                                         _MarkovChainToGaussian)


# bayespy-0.6.2/bayespy/inference/vmp/nodes/gp.py
################################################################################
# Copyright (C) 2011-2012 Jaakko Luttinen
#
# This file is licensed under the MIT License.
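The conversion registered above can be illustrated at the shape level. The following is a standalone sketch (the helper name is illustrative, not BayesPy API): the chain moments are <x> with shape (N,D), <xx^T> with shape (N,D,D) and the cross moment <x_n x_{n+1}^T> with shape (N-1,D,D); the Gaussian view keeps only the first two, with the time axis N reinterpreted as a plate.

```python
import numpy as np

def chain_moments_to_gaussian(u):
    # Keep <x> and <xx^T>; drop the cross-time moment <x_n x_{n+1}^T>,
    # mirroring _MarkovChainToGaussian._compute_moments above.
    return u[:2]

N, D = 5, 3
u_chain = [np.zeros((N, D)), np.zeros((N, D, D)), np.zeros((N - 1, D, D))]
u_gauss = chain_moments_to_gaussian(u_chain)

# Two moment arrays remain; the leading axis N now acts as a plate axis
assert len(u_gauss) == 2
assert u_gauss[0].shape == (N, D)
assert u_gauss[1].shape == (N, D, D)
```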
################################################################################ import itertools import numpy as np import scipy as sp import scipy.linalg.decomp_cholesky as decomp import scipy.linalg as linalg import scipy.special as special import time import profile import scipy.spatial.distance as distance from .node import Node from .stochastic import Stochastic from bayespy.utils.misc import * # Computes log probability density function of the Gaussian # distribution def gaussian_logpdf(y_invcov_y, y_invcov_mu, mu_invcov_mu, logdetcov, D): return (-0.5*D*np.log(2*np.pi) -0.5*logdetcov -0.5*y_invcov_y +y_invcov_mu -0.5*mu_invcov_mu) # m prior mean function # k prior covariance function # x data inputs # z processed data outputs (z = inv(Cov) * (y-m(x))) # U data covariance Cholesky factor def gp_posterior_moment_function(m, k, x, y, noise=None): # Prior mu = m(x)[0] K = k(x,x)[0] if noise != None: K += noise #print('hereiamagain') #print(K) # Compute posterior GP N = len(y) if N == 0: U = None z = None else: U = chol(K) z = chol_solve(U, y-mu) def get_moments(xh, covariance=1, mean=True): (kh,) = k(x, xh) # Function for computing posterior moments if mean: # Mean vector mh = m(xh) if z != None: mh += np.dot(kh.T, z) else: mh = None if covariance: if covariance == 1: # Variance vector khh = k(xh) if U != None: khh -= np.einsum('i...,i...', kh, chol_solve(U, kh)) elif covariance == 2: # Full covariance matrix khh = k(xh,xh) if U != None: khh -= np.dot(kh.T, chol_solve(U,kh)) else: khh = None return [mh, khh] return get_moments # m prior mean function # k prior covariance function # x data inputs # z processed data outputs (z = inv(Cov) * (y-m(x))) # U data covariance Cholesky factor ## def gp_multi_posterior_moment_function(m, k, x, y, noise=None): ## # Prior ## mu = m(x)[0] ## K = k(x,x)[0] ## if noise != None: ## K += noise ## #print('hereiamagain') ## #print(K) ## # Compute posterior GP ## N = len(y) ## if N == 0: ## U = None ## z = None ## else: ## U = 
chol(K) ## z = chol_solve(U, y-mu) ## def get_moments(xh, covariance=1, mean=True): ## (kh,) = k(x, xh) ## # Function for computing posterior moments ## if mean: ## # Mean vector ## mh = m(xh) ## if z != None: ## mh += np.dot(kh.T, z) ## else: ## mh = None ## if covariance: ## if covariance == 1: ## # Variance vector ## khh = k(xh) ## if U != None: ## khh -= np.einsum('i...,i...', kh, chol_solve(U, kh)) ## elif covariance == 2: ## # Full covariance matrix ## khh = k(xh,xh) ## if U != None: ## khh -= np.dot(kh.T, chol_solve(U,kh)) ## else: ## khh = None ## return [mh, khh] ## return get_moments def gp_cov_se(D2, overwrite=False): if overwrite: K = D2 K *= -0.5 np.exp(K, out=K) else: K = np.exp(-0.5*D2) return K def gp_cov_delta(N): return np.identity(N) def squared_distance(x1, x2): # Reshape arrays to 2-D arrays sh1 = np.shape(x1)[:-1] sh2 = np.shape(x2)[:-1] d = np.shape(x1)[-1] x1 = np.reshape(x1, (-1,d)) x2 = np.reshape(x2, (-1,d)) # Compute squared Euclidean distance D2 = distance.cdist(x1, x2, metric='sqeuclidean') # Reshape the result D2 = np.reshape(D2, sh1 + sh2) return D2 # General rule for the parameters for covariance functions: # # (value, [ [dvalue1, ...], [dvalue2, ...], [dvalue3, ...], ...]) # # For instance, # # k = covfunc_se((1.0, []), (15, [ [1,update_grad] ])) # K = k((x1, [ [dx1,update_grad] ]), (x2, [])) # # Plain values are converted as: # value -> (value, []) def gp_standardize_input(x): if np.ndim(x) == 0: x = add_trailing_axes(x, 2) elif np.ndim(x) == 1: x = add_trailing_axes(x, 1) return x def gp_preprocess_inputs(*args): args = list(args) if len(args) < 1 or len(args) > 2: raise Exception("Number of inputs must be one or two") if len(args) == 2: if args[0] is args[1]: args[0] = gp_standardize_input(args[0]) args[1] = args[0] else: args[1] = gp_standardize_input(args[1]) args[0] = gp_standardize_input(args[0]) else: args[0] = gp_standardize_input(args[0]) return args def covfunc_delta(theta, *inputs, gradient=False): amplitude = theta[0] 
if gradient: gradient_amplitude = gradient[0] else: gradient_amplitude = [] inputs = gp_preprocess_inputs(*inputs) # Compute distance and covariance matrix if len(inputs) == 1: # Only variance vector asked x = inputs[0] K = np.ones(np.shape(x)[:-1]) * amplitude**2 else: # Full covariance matrix asked x1 = inputs[0] x2 = inputs[1] # Number of inputs x1 N1 = np.shape(x1)[-2] # x1 == x2? if x1 is x2: delta = True # Delta covariance K = gp_cov_delta(N1) * amplitude**2 else: delta = False # Number of inputs x2 N2 = np.shape(x2)[-2] # Zero covariance K = np.zeros((N1,N2)) # Gradient w.r.t. amplitude if gradient: for ind in range(len(gradient_amplitude)): gradient_amplitude[ind] = K * (2 * gradient_amplitude[ind] / amplitude) if gradient: return (K, gradient) else: return K def covfunc_se(theta, *inputs, gradient=False): amplitude = theta[0] lengthscale = theta[1] ## print('in se') ## print(amplitude) ## print(lengthscale) if gradient: gradient_amplitude = gradient[0] gradient_lengthscale = gradient[1] else: gradient_amplitude = [] gradient_lengthscale = [] inputs = gp_preprocess_inputs(*inputs) # Compute covariance matrix if len(inputs) == 1: x = inputs[0] # Compute variance vector K = np.ones(np.shape(x)[:-1]) K *= amplitude**2 # Compute gradient w.r.t. lengthscale for ind in range(len(gradient_lengthscale)): gradient_lengthscale[ind] = np.zeros(np.shape(x)[:-1]) else: x1 = inputs[0] / (lengthscale) x2 = inputs[1] / (lengthscale) # Compute distance matrix K = squared_distance(x1, x2) # Compute gradient partly if gradient: for ind in range(len(gradient_lengthscale)): gradient_lengthscale[ind] = K * ((lengthscale**-1) * gradient_lengthscale[ind]) # Compute covariance matrix gp_cov_se(K, overwrite=True) K *= amplitude**2 # Compute gradient w.r.t. lengthscale if gradient: for ind in range(len(gradient_lengthscale)): gradient_lengthscale[ind] *= K # Gradient w.r.t. 
amplitude if gradient: for ind in range(len(gradient_amplitude)): gradient_amplitude[ind] = K * (2 * gradient_amplitude[ind] / amplitude) # Return values if gradient: return (K, gradient) else: return K class NodeCovarianceFunction(Node): def __init__(self, covfunc, *args, **kwargs): self.covfunc = covfunc params = list(args) for i in range(len(args)): # Check constant parameters if is_numeric(args[i]): params[i] = NodeConstant([np.asanyarray(args[i])], dims=[np.shape(args[i])]) # TODO: Parameters could be constant functions? :) Node.__init__(self, *params, dims=[(np.inf, np.inf)], **kwargs) def message_to_child(self, gradient=False): params = [parent.message_to_child(gradient=gradient) for parent in self.parents] return self.covariance_function(*params) def covariance_function(self, *params): params = list(params) gradient_params = list() for ind in range(len(params)): if isinstance(params[ind], tuple): gradient_params.append(params[ind][1]) params[ind] = params[ind][0][0] else: gradient_params.append([]) params[ind] = params[ind][0] def cov(*inputs, gradient=False): if gradient: grads = [[grad[0] for grad in gradient_params[ind]] for ind in range(len(gradient_params))] (K, dK) = self.covfunc(params, *inputs, gradient=grads) for ind in range(len(dK)): for (grad, dk) in zip(gradient_params[ind], dK[ind]): grad[0] = dk K = [K] dK = [] for grad in gradient_params: dK += grad return (K, dK) else: K = self.covfunc(params, *inputs, gradient=False) return [K] return cov class NodeCovarianceFunctionSum(NodeCovarianceFunction): def __init__(self, *args, **kwargs): NodeCovarianceFunction.__init__(self, None, *args, **kwargs) def covariance_function(self, *covfuncs): def cov(*inputs, gradient=False): K_sum = 0 if gradient: dK_sum = list() for k in covfuncs: if gradient: (K, dK) = k(*inputs, gradient=gradient) dK_sum += dK else: K = k(*inputs, gradient=gradient) K_sum += K[0] if gradient: return ([K_sum], dK_sum) else: return [K_sum] return cov class 
NodeCovarianceFunctionDelta(NodeCovarianceFunction): def __init__(self, amplitude, **kwargs): NodeCovarianceFunction.__init__(self, covfunc_delta, amplitude, **kwargs) class NodeCovarianceFunctionSquaredExponential(NodeCovarianceFunction): def __init__(self, amplitude, lengthscale, **kwargs): NodeCovarianceFunction.__init__(self, covfunc_se, amplitude, lengthscale, **kwargs) class NodeMultiCovarianceFunction(NodeCovarianceFunction): def __init__(self, *args, **kwargs): NodeCovarianceFunction.__init__(self, None, *args, **kwargs) def covfunc(self, *covfuncs): def cov(*inputs, gradient=False): K_sum = 0 if gradient: dK_sum = list() for k in covfuncs: if gradient: (K, dK) = k(*inputs, gradient=gradient) dK_sum += dK else: K = k(*inputs, gradient=gradient) K_sum += K[0] if gradient: return ([K_sum], dK_sum) else: return [K_sum] return cov class NodeConstantGaussianProcess(Node): def __init__(self, f, **kwargs): self.f = f Node.__init__(self, dims=[(np.inf,)], **kwargs) def message_to_child(self, gradient=False): # Wrapper def func(x, gradient=False): if gradient: return ([self.f(x)], []) else: return [self.f(x)] return func # At least for now, simplify this GP node such that a GP is either # observed or latent. If it is observed, it doesn't take messages from # children, actually, it should not even have children! 
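Before the GP node classes below, the posterior-moment computation performed by `gp_posterior_moment_function` can be summarized in a standalone NumPy sketch (the helper names here are illustrative, not BayesPy API): with `z = K^{-1}(y - mu)` computed via a Cholesky factor, the posterior mean at new inputs is `kh^T z` and the posterior variance is `khh - kh^T K^{-1} kh`.

```python
import numpy as np

def se_cov(x1, x2, amplitude=1.0, lengthscale=1.0):
    # Squared-exponential covariance, cf. gp_cov_se/squared_distance above
    d2 = (x1[:, None] - x2[None, :]) ** 2
    return amplitude**2 * np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(x, y, xh, noise=1e-6):
    # Zero prior mean for simplicity: z = K^{-1} y via Cholesky,
    # then mh = kh^T z and vh = khh - kh^T K^{-1} kh
    K = se_cov(x, x) + noise * np.eye(len(x))
    L = np.linalg.cholesky(K)
    z = np.linalg.solve(L.T, np.linalg.solve(L, y))
    kh = se_cov(x, xh)
    mh = kh.T @ z
    w = np.linalg.solve(L.T, np.linalg.solve(L, kh))  # K^{-1} kh
    vh = se_cov(xh, xh).diagonal() - np.einsum('ij,ij->j', kh, w)
    return mh, vh

x = np.array([0.0, 1.0, 2.0])
y = np.sin(x)
mh, vh = gp_posterior(x, y, x)
# At observed inputs the posterior mean interpolates and variance collapses
assert np.allclose(mh, y, atol=1e-3)
assert np.all(vh < 1e-4)
```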
#class NodeMultiGaussianProcess(NodeVariable):
class NodeMultiGaussianProcess(Stochastic):

    def __init__(self, m, k, **kwargs):

        self.x = []
        self.f = []

        # By default, posterior == prior
        self.m = m
        self.k = k

        # Ignore plates
        super().__init__(m,
                         k,
                         plates=(),
                         dims=[(np.inf,), (np.inf, np.inf)],
                         **kwargs)

    def message_to_parent(self, index):
        if index == 0:
            k = self.parents[1].message_to_child()[0]
            K = k(self.x, self.x)
            return [self.x,
                    self.mu,
                    K]
        if index == 1:
            raise Exception("not implemented yet")

    def message_to_child(self):
        if self.observed:
            raise Exception("Observable GP should not have children.")
        return self.u

    def get_parameters(self):
        return self.u

    def observe(self, x, f):
        if np.ndim(x) == 1:
            if np.shape(f) != np.shape(x):
                raise Exception("Number of inputs {0} and function values {1} "
                                "do not match".format(np.shape(x),
                                                      np.shape(f)))
        elif np.shape(f) != np.shape(x)[:-1]:
            raise Exception("Number of inputs {0} and function values {1} "
                            "do not match".format(np.shape(x),
                                                  np.shape(f)))

        self.observed = True
        self.x = x
        self.f = f

    # You might want:
    # - mean for x
    # - covariance (and mean) for x
    # - variance (and mean) for x
    # - i.e., mean and/or (co)variance for x
    # - covariance for x1 and x2

    def lower_bound_contribution(self, gradient=False):
        m = self.parents[0].message_to_child(gradient=gradient)
        k = self.parents[1].message_to_child(gradient=gradient)

        # Prior
        if gradient:
            (mu, dmus) = m(self.x, gradient=True)
            (K, dKs) = k(self.x, self.x, gradient=True)
        else:
            mu = m(self.x)
            K = k(self.x, self.x)
            dmus = []
            dKs = []

        mu = mu[0]
        K = K[0]

        # Log pdf
        if self.observed:

            # Vector of f-mu
            f0 = np.vstack([(f-m) for (f, m) in zip(self.f, mu)])

            # Full covariance matrix
            K_full = np.bmat(K)

            try:
                U = chol(K_full)
            except linalg.LinAlgError:
                print('non positive definite, return -inf')
                return -np.inf
            z = chol_solve(U, f0)
            L =
gaussian_logpdf(np.dot(f0, z), 0, 0, logdet_chol(U), np.size(self.f)) for (dmu, func) in dmus: # Derivative w.r.t. mean vector d = -np.sum(z) # Send the derivative message func += d #func(d) for (dK, func) in dKs: # Compute derivative w.r.t. covariance matrix d = 0.5 * (np.dot(z, np.dot(dK, z)) - np.trace(chol_solve(U, dK))) # Send the derivative message #print('add gradient') #func += d func(d) else: raise Exception('Not implemented yet') return L ## Let f1 be observed and f2 latent function values. # Compute #L = gaussian_logpdf(sum_product(np.outer(self.f,self.f) + self.Cov, # Compute def update(self): # Messages from parents m = self.parents[0].message_to_child() k = self.parents[1].message_to_child() ## m = self.parents[0].message_to_child()[0] ## k = self.parents[1].message_to_child()[0] if self.observed: # Observations of this node self.u = gp_posterior_moment_function(m, k, self.x, self.f) else: x = np.array([]) y = np.array([]) # Messages from children for (child,index) in self.children: (msg, mask) = child.message_to_parent(index) # Ignoring masks and plates.. 
                # m[0] is the inputs
                x = np.concatenate((x, msg[0]), axis=-2)
                # m[1] is the observations
                y = np.concatenate((y, msg[1]))
                # m[2] is the covariance matrix
                V = linalg.block_diag(V, msg[2])

            # The block-diagonal child covariance acts as a noise term added
            # to the prior covariance
            self.u = gp_posterior_moment_function(m, k, x, y, noise=V)
            self.x = x
            self.f = y


class NodeGaussianProcess(Stochastic):
#class NodeGaussianProcess(NodeVariable):

    def __init__(self, m, k, **kwargs):

        self.x = np.array([])
        self.f = np.array([])

        # By default, posterior == prior
        self.m = m
        self.k = k

        # Ignore plates
        super().__init__(m,
                         k,
                         plates=(),
                         dims=[(np.inf,), (np.inf, np.inf)],
                         **kwargs)

    def message_to_parent(self, index):
        if index == 0:
            k = self.parents[1].message_to_child()[0]
            K = k(self.x, self.x)
            return [self.x,
                    self.mu,
                    K]
        if index == 1:
            raise Exception("not implemented yet")

    def message_to_child(self):
        if self.observed:
            raise Exception("Observable GP should not have children.")
        return self.u

    def get_parameters(self):
        return self.u

    def observe(self, x, f):
        if np.ndim(x) == 1:
            if np.shape(f) != np.shape(x):
                raise Exception("Number of inputs {0} and function values {1} "
                                "do not match".format(np.shape(x),
                                                      np.shape(f)))
        elif np.shape(f) != np.shape(x)[:-1]:
            raise Exception("Number of inputs {0} and function values {1} "
                            "do not match".format(np.shape(x),
                                                  np.shape(f)))

        self.observed = True
        self.x = x
        self.f = f

    # You might want:
    # - mean for x
    # - covariance (and mean) for x
    # - variance (and mean) for x
    # - i.e., mean and/or (co)variance for x
    # - covariance for x1 and x2

    def lower_bound_contribution(self, gradient=False):
        m = self.parents[0].message_to_child(gradient=gradient)
        k = self.parents[1].message_to_child(gradient=gradient)

        # Prior
        if gradient:
            (mu, dmus) = m(self.x, gradient=True)
            (K, dKs) = k(self.x, self.x, gradient=True)
        else:
            mu = m(self.x)
            K
= k(self.x, self.x) dmus = [] dKs = [] mu = mu[0] K = K[0] # Log pdf if self.observed: f0 = self.f - mu #print('hereiam') #print(K) try: U = chol(K) except linalg.LinAlgError: print('non positive definite, return -inf') return -np.inf z = chol_solve(U, f0) #print(K) L = gaussian_logpdf(np.dot(f0, z), 0, 0, logdet_chol(U), np.size(self.f)) for (dmu, func) in dmus: # Derivative w.r.t. mean vector d = -np.sum(z) # Send the derivative message func += d #func(d) for (dK, func) in dKs: # Compute derivative w.r.t. covariance matrix d = 0.5 * (np.dot(z, np.dot(dK, z)) - np.trace(chol_solve(U, dK))) # Send the derivative message #print('add gradient') #func += d func(d) else: raise Exception('Not implemented yet') return L ## Let f1 be observed and f2 latent function values. # Compute #L = gaussian_logpdf(sum_product(np.outer(self.f,self.f) + self.Cov, # Compute def update(self): # Messages from parents m = self.parents[0].message_to_child() k = self.parents[1].message_to_child() ## m = self.parents[0].message_to_child()[0] ## k = self.parents[1].message_to_child()[0] if self.observed: # Observations of this node self.u = gp_posterior_moment_function(m, k, self.x, self.f) else: x = np.array([]) y = np.array([]) # Messages from children for (child,index) in self.children: (msg, mask) = child.message_to_parent(index) # Ignoring masks and plates.. 
                # m[0] is the inputs
                x = np.concatenate((x, msg[0]), axis=-2)
                # m[1] is the observations
                y = np.concatenate((y, msg[1]))
                # m[2] is the covariance matrix
                V = linalg.block_diag(V, msg[2])

            # The block-diagonal child covariance acts as a noise term added
            # to the prior covariance
            self.u = gp_posterior_moment_function(m, k, x, y, noise=V)
            self.x = x
            self.f = y


# bayespy-0.6.2/bayespy/inference/vmp/nodes/logistic.py
######################################################################
# Copyright (C) 2015 Jaakko Luttinen
#
# This file is licensed under the MIT License.
######################################################################


"""
Module for Bernoulli using the logistic function for Gaussian
"""

import numpy as np

from .node import ensureparents
from .expfamily import (ExponentialFamily,
                        useconstructor)
from .multinomial import (MultinomialMoments,
                          MultinomialDistribution,
                          Multinomial)
from .categorical import CategoricalMoments
from .dirichlet import DirichletMoments
from .gaussian import GaussianMoments

from bayespy.utils import random
from bayespy.utils import misc


class CategoricalDistribution(MultinomialDistribution):
    """
    Class for the VMP formulas of categorical variables.
    """

    def __init__(self, categories):
        """
        Create VMP formula node for a categorical variable

        `categories` is the total number of categories.
        """
        if not isinstance(categories, int):
            raise ValueError("Number of categories must be integer")
        if categories < 0:
            raise ValueError("Number of categories must be non-negative")
        self.D = categories
        super().__init__(1)

    def compute_message_to_parent(self, parent, index, u, u_p):
        """
        Compute the message to a parent node.
        """
        return super().compute_message_to_parent(parent, index, u, u_p)

    def compute_phi_from_parents(self, u_p, mask=True):
        """
        Compute the natural parameter vector given parent moments.
""" return super().compute_phi_from_parents(u_p, mask=mask) def compute_moments_and_cgf(self, phi, mask=True): """ Compute the moments and :math:`g(\phi)`. """ return super().compute_moments_and_cgf(phi, mask=mask) def compute_cgf_from_parents(self, u_p): """ Compute :math:`\mathrm{E}_{q(p)}[g(p)]` """ return super().compute_cgf_from_parents(u_p) def compute_fixed_moments_and_f(self, x, mask=True): """ Compute the moments and :math:`f(x)` for a fixed value. """ # Check the validity of x x = np.asanyarray(x) if not misc.isinteger(x): raise ValueError("Values must be integers") if np.any(x < 0) or np.any(x >= self.D): raise ValueError("Invalid category index") # Form a binary matrix with only one non-zero (1) in the last axis u0 = np.zeros((np.size(x), self.D)) u0[[np.arange(np.size(x)), np.ravel(x)]] = 1 u0 = np.reshape(u0, np.shape(x) + (self.D,)) u = [u0] # f(x) is zero f = 0 return (u, f) def random(self, *phi, plates=None): """ Draw a random sample from the distribution. """ logp = phi[0] logp -= np.amax(logp, axis=-1, keepdims=True) p = np.exp(logp) return random.categorical(p, size=plates) class Logistic(ExponentialFamily): r""" :cite:`Jaakkola:2000` The true probability density function: .. math:: p(z=1|x) = g(x) \\ p(z=0|x) = g(-x) which can be written as: .. math:: p(z|x) = g(H_z) where :math:`H_z=(2z-1)x` and :math:`g(x)` is the logistic function: .. math:: g(x) = \frac{1} {1 + e^{-x}} The log of the logistic function: .. math:: \log g(x) = -\log(1 + e^{-x}) = \frac{x}{2} - \log(e^{x/2} + e^{-x/2}) The latter term: .. math:: f(x) = -\log(e^{x/2} + e^{-x/2}) This is a convex function in the variable :math:`x^2`, thus it can be bounded globally with a first order Taylor expansion in the variable x^2: .. math:: f(x) &\geq f(\xi) + \frac {\partial f(\xi)}{\partial(\xi^2)} (x^2 - \xi^2) \\ &= -\frac{\xi}{2} + \log g(\xi) + \frac{1}{4\xi}\tanh(\frac{\xi}{2}) (x^2 - \xi^2) Thus, the variational lower bound for the probability density function is: .. 
math:: p(z|x) \geq g(xi) \exp( \frac{H_z-\xi}{2} - \lambda(\xi) (H_z^2 - \xi^2)) and in log form: .. math:: \log p(z|x) \geq \log g(xi) + ( \frac{H_z-\xi}{2} - \lambda(\xi) (H_z^2 - \xi^2) ) where .. math:: \lambda(\xi) = \frac {\tanh(\xi/2)} {4\xi} Now, this log lower bound is quadratic with respect to :math:`H_z`, thus it is quadratic with respect to :math:`x` and it is conjugate with the Gaussian distribution. Re-organize terms: .. math:: \log p(z|x) &\geq -\lambda(\xi)(2z-1)^2 x^2 + zx - \frac{1}{2}x - \frac{1}{2}\xi + \lambda(\xi) \xi^2 + \log g(\xi) \\ &= -\lambda(\xi)(2z+1) x^2 + zx - \frac{1}{2}x - \frac{1}{2}\xi + \lambda(\xi) \xi^2 + \log g(\xi) \\ &= z (-2\lambda(\xi) x^2 + x) - \lambda(\xi) x^2 - \frac{1}{2}x - \frac{1}{2}\xi + \lambda(\xi) \xi^2 + \log g(\xi) where we have used :math:`z^2=z`. See also -------- Bernoulli, GaussianARD """ _parent_moments = ( GaussianMoments(()), ) def __init__(self, x, **kwargs): """ """ super().__init__(x, **kwargs) @classmethod @ensureparents def _constructor(cls, x, **kwargs): """ Constructs distribution and moments objects. """ # Get the number of categories D = p.dims[0][0] parents = [p] moments = CategoricalMoments(D) distribution = CategoricalDistribution(D) return (parents, kwargs, ( (D,), ), cls._total_plates(kwargs.get('plates'), distribution.plates_from_parent(0, p.plates)), distribution, moments, cls._parent_moments) def __str__(self): """ Print the distribution using standard parameterization. """ raise NotImplementedError() ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/logpdf.py0000644000175100001770000000577200000000000023165 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2015 Jaakko Luttinen # # This file is licensed under the MIT License. 
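The variational bound in the Logistic docstring can be checked numerically. This standalone snippet (plain Python, not part of the module) evaluates lambda(xi) = tanh(xi/2)/(4*xi) and verifies that the bound never exceeds log g(x) and is tight at x = +/- xi:

```python
import math

def log_g(x):
    # log of the logistic function: log g(x) = -log(1 + exp(-x))
    return -math.log1p(math.exp(-x))

def lam(xi):
    # lambda(xi) = tanh(xi/2) / (4 xi), the coefficient in the bound
    return math.tanh(xi / 2.0) / (4.0 * xi)

def lower_bound(x, xi):
    # log g(x) >= log g(xi) + (x - xi)/2 - lambda(xi) * (x**2 - xi**2)
    return log_g(xi) + (x - xi) / 2.0 - lam(xi) * (x * x - xi * xi)

xi = 1.5
for x in [-3.0, -1.5, -0.1, 0.7, 1.5, 4.0]:
    assert lower_bound(x, xi) <= log_g(x) + 1e-12

# The bound is a tangent in the variable x^2, so it is tight at x = +/- xi
assert abs(lower_bound(xi, xi) - log_g(xi)) < 1e-12
assert abs(lower_bound(-xi, xi) - log_g(-xi)) < 1e-12
```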
################################################################################ import numpy as np from .expfamily import ExponentialFamily, useconstructor from .stochastic import Distribution from .node import Moments class LogPDFDistribution(Distribution): pass class LogPDF(ExponentialFamily): """ General node with arbitrary probability density function """ def __init__(self, logpdf, *parents, **kwargs): self._logpdf = logpdf super().__init__(logpdf, *parents, initialize=False, **kwargs) @classmethod def _constructor(cls, logpdf, *parents, approximation=None, shape=None, samples=10, **kwargs): r""" Constructs distribution and moments objects. """ if approximation is not None: raise NotImplementedError() #self._distribution = approximation._constructor dims = ( shape, ) _distribution = LogPDFDistribution() _moments = np.nan _parent_moments = [Moments()] * len(parents) parent_plates = [_distribution.plates_from_parent(i, parent.plates) for (i, parent) in enumerate(parents)] return (parents, kwargs, dims, cls._total_plates(kwargs.get('plates'), *parent_plates), _distribution, _moments, _parent_moments) def _get_message_and_mask_to_parent(self, index): def logpdf_sampler(x): inputs = [self.parents[j].random() if j != index else x for j in range(len(self.parents))] return self._logpdf(self.random(), *inputs) mask = self._distribution.compute_weights_to_parent(index, self.mask) != 0 return (logpdf_sampler, mask) def observe(self, x, *args, mask=True): """ Fix moments, compute f and propagate mask. """ # Compute fixed moments if not np.isnan(self._moments): u = self._moments.compute_fixed_moments(x, *args, mask=mask) else: u = (x,) + args # Check the dimensionality of the observations for (i,v) in enumerate(u): # This is what the dimensionality "should" be s = self.plates + self.dims[i] t = np.shape(v) if s != t: msg = "Dimensionality of the observations incorrect." msg += "\nShape of input: " + str(t) msg += "\nExpected shape: " + str(s) msg += "\nCheck plates." 
                raise Exception(msg)

        # Set the moments
        self._set_moments(u, mask=mask)

        # Observed nodes should not be ignored
        self.observed = mask
        self._update_mask()


# bayespy-0.6.2/bayespy/inference/vmp/nodes/mixture.py
################################################################################
# Copyright (C) 2011-2012,2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################


"""
Module for the mixture distribution node.
"""

import warnings

import numpy as np

from bayespy.utils import misc

from .node import Node
from .expfamily import ExponentialFamily, \
                       ExponentialFamilyDistribution, \
                       useconstructor
from .categorical import Categorical, \
                         CategoricalMoments


class MixtureDistribution(ExponentialFamilyDistribution):
    """
    Class for the VMP formulas of mixture variables.
    """

    def __init__(self, distribution, cluster_plate, n_clusters, ndims,
                 ndims_parents):
        """
        Create VMP formula node for a mixture variable
        """
        self.raw_distribution = distribution
        try:
            self.squeezed_distribution = distribution.squeeze(cluster_plate)
        except ValueError as err:
            raise ValueError(
                "Cannot mix over plate axis {0}: {1}".format(
                    cluster_plate,
                    str(err),
                )
            ) from err
        self.cluster_plate = cluster_plate
        self.ndims = ndims
        self.ndims_parents = ndims_parents
        self.K = n_clusters

    def compute_message_to_parent(self, parent, index, u, *u_parents):
        """
        Compute the message to a parent node.
""" if index == 0: # Shape(phi) = [Nn,..,K,..,N0,Dd,..,D0] # Shape(L) = [Nn,..,K,..,N0] # Shape(u) = [Nn,..,N0,Dd,..,D0] # Shape(result) = [Nn,..,N0,K] # Compute g: # Shape(g) = [Nn,..,K,..,N0] g = self.raw_distribution.compute_cgf_from_parents(*(u_parents[1:])) # Reshape(g): # Shape(g) = [Nn,..,N0,K] if np.ndim(g) < abs(self.cluster_plate): # Not enough axes, just add the cluster plate axis g = np.expand_dims(g, -1) else: # Move the cluster plate axis g = misc.moveaxis(g, self.cluster_plate, -1) # Compute phi: # Shape(phi) = [Nn,..,K,..,N0,Dd,..,D0] phi = self.raw_distribution.compute_phi_from_parents(*(u_parents[1:])) # Reshape u: # Shape(u) = = [Nn,..,1,..,N0,Dd,..,D0] u_reshaped = [ np.expand_dims(ui, self.cluster_plate - ndimi) if np.ndim(ui) >= abs(self.cluster_plate - ndimi) else ui for (ui, ndimi) in zip(u, self.ndims) ] # Compute logpdf: # Shape(L) = [Nn,..,K,..,N0] L = self.raw_distribution.compute_logpdf( u_reshaped, phi, g, 0, self.ndims, ) # Move axis: # Shape(L) = [Nn,..,N0,K] L = np.moveaxis(L, self.cluster_plate, -1) m = [L] return m elif index >= 1: # Parent index for the distribution used for the # mixture. index_for_parent = index - 1 # Reshape u: # Shape(u_self) = [Nn,..1,..,N0,Dd,..,D0] u_self = list() for ind in range(len(u)): if self.cluster_plate < 0: cluster_axis = self.cluster_plate - self.ndims[ind] else: raise ValueError("Cluster plate axis must be negative") u_self.append(np.expand_dims(u[ind], axis=cluster_axis)) # Message from the mixed distribution # Shape(m) = [Nn,..,K,..,N0,Dd,..,D0] m = self.raw_distribution.compute_message_to_parent( parent, index_for_parent, u_self, *(u_parents[1:]) ) # Note: The cluster assignment probabilities can be considered as # weights to plate elements. These weights need to mapped properly # via the plate mapping of self.distribution. Otherwise, nested # mixtures won't work, or possibly not any distribution that does # something to the plates. 
Thus, use compute_weights_to_parent to # compute the transformations to the weight array properly. # # See issue #39 for more details. # Compute weights (i.e., cluster assignment probabilities) and map # the plates properly. # Shape(p) = [Nn,..,K,..,N0] p = misc.atleast_nd(u_parents[0][0], abs(self.cluster_plate)) p = misc.moveaxis(p, -1, self.cluster_plate) p = self.raw_distribution.compute_weights_to_parent( index_for_parent, p, ) # Weigh the elements in the message array # # TODO/FIXME: This may result in huge intermediate arrays. Need to # use einsum! m = [mi * misc.add_trailing_axes(p, ndim) #for (mi, ndim) in zip(m, self.ndims)] for (mi, ndim) in zip(m, self.ndims_parents[index_for_parent])] return m def compute_weights_to_parent(self, index, weights): """ Maps the mask to the plates of a parent. """ if index == 0: return weights else: if self.cluster_plate >= 0: raise ValueError("Cluster plate axis must be negative") if np.ndim(weights) >= abs(self.cluster_plate): weights = np.expand_dims(weights, axis=self.cluster_plate) return self.raw_distribution.compute_weights_to_parent( index-1, weights ) def compute_phi_from_parents(self, *u_parents, mask=True): """ Compute the natural parameter vector given parent moments. """ # Compute weighted average of the parameters # Cluster parameters Phi = self.raw_distribution.compute_phi_from_parents(*(u_parents[1:])) # Contributions/weights/probabilities P = u_parents[0][0] phi = list() nans = False for ind in range(len(Phi)): # Compute element-wise product and then sum over K clusters. # Note that the dimensions aren't perfectly aligned because # the cluster dimension (K) may be arbitrary for phi, and phi # also has dimensions (Dd,..,D0) of the parameters. # Shape(phi) = [Nn,..,K,..,N0,Dd,..,D0] # Shape(p) = [Nn,..,N0,K] # Shape(result) = [Nn,..,N0,Dd,..,D0] # General broadcasting rules apply for Nn,..,N0, that is, # preceding dimensions may be missing or dimension may be # equal to one. 
Probably, shape(phi) has lots of missing # dimensions and/or dimensions that are one. if self.cluster_plate < 0: cluster_axis = self.cluster_plate - self.ndims[ind] else: raise RuntimeError("Cluster plate should be negative") # Move cluster axis to the last: # Shape(phi) = [Nn,..,N0,Dd,..,D0,K] if np.ndim(Phi[ind]) >= abs(cluster_axis): phi.append(misc.moveaxis(Phi[ind], cluster_axis, -1)) else: phi.append(Phi[ind][...,None]) # Add axes to p: # Shape(p) = [Nn,..,N0,K,1,..,1] p = misc.add_trailing_axes(P, self.ndims[ind]) # Move cluster axis to the last: # Shape(p) = [Nn,..,N0,1,..,1,K] p = misc.moveaxis(p, -(self.ndims[ind]+1), -1) # Handle zero probability cases. This avoids nans when p=0 and # phi=inf. phi[ind] = np.where(p != 0, phi[ind], 0) # Now the shapes broadcast perfectly and we can sum # p*phi over the last axis: # Shape(result) = [Nn,..,N0,Dd,..,D0] phi[ind] = misc.sum_product(p, phi[ind], axes_to_sum=-1) if np.any(np.isnan(phi[ind])): nans = True if nans: warnings.warn("The natural parameters of mixture distribution " "contain nans. This may happen if you use fixed " "parameters in your model. Technically, one possible " "reason is that the cluster assignment probability " "for some element is zero (p=0) and the natural " "parameter of that cluster is -inf, thus " "0*(-inf)=nan. Solution: Use parameters that assign " "non-zero probabilities for the whole domain.") return phi def compute_moments_and_cgf(self, phi, mask=True): """ Compute the moments and :math:`g(\phi)`. """ return self.squeezed_distribution.compute_moments_and_cgf(phi, mask=mask) def compute_cgf_from_parents(self, *u_parents): """ Compute :math:`\mathrm{E}_{q(p)}[g(p)]` """ # Compute weighted average of g over the clusters. 
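The nan warning issued above concerns exactly this weighted average of natural parameters: if cluster `k` has probability `p_k = 0` and natural parameter `-inf`, the product `0 * (-inf)` is nan, so the parameter is forced to zero before multiplying. A minimal plain-Python sketch with one scalar parameter per cluster (the helper `mix_phi` is hypothetical, mirroring the `np.where(p != 0, phi, 0)` guard):

```python
def mix_phi(p, phi):
    """Weighted average sum_k p_k * phi_k, treating phi as 0 where p == 0."""
    return sum(pk * (phik if pk != 0 else 0.0) for (pk, phik) in zip(p, phi))

# Cluster 0 has zero probability and phi = -inf; the guard keeps the
# average finite instead of producing 0 * (-inf) = nan.
phi_mix = mix_phi([0.0, 0.25, 0.75], [float("-inf"), -2.0, -4.0])
```

Without the guard the first term alone would poison the whole sum with nan.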
# Shape(g) = [Nn,..,K,..,N0] # Shape(p) = [Nn,..,N0,K] # Shape(result) = [Nn,..,N0] # Compute g for clusters: # Shape(g) = [Nn,..,K,..,N0] g = self.raw_distribution.compute_cgf_from_parents(*(u_parents[1:])) # Move cluster axis to last: # Shape(g) = [Nn,..,N0,K] if np.ndim(g) < abs(self.cluster_plate): # Not enough axes, just add the cluster plate axis g = np.expand_dims(g, -1) else: # Move the cluster plate axis g = misc.moveaxis(g, self.cluster_plate, -1) # Cluster assignments/contributions/probabilities/weights: # Shape(p) = [Nn,..,N0,K] p = u_parents[0][0] # Weighted average of g over the clusters. As p and g are # properly aligned, you can just sum p*g over the last # axis and utilize broadcasting: # Shape(result) = [Nn,..,N0] g = misc.sum_product(p, g, axes_to_sum=-1) return g def compute_fixed_moments_and_f(self, x, mask=True): """ Compute the moments and :math:`f(x)` for a fixed value. """ return self.squeezed_distribution.compute_fixed_moments_and_f(x, mask=mask) def plates_to_parent(self, index, plates): """ Resolve the plate mapping to a parent. Given the plates of the node's moments, this method returns the plates that the message to a parent has for the parent's distribution. """ if index == 0: return plates else: # Add the cluster plate axis plates = list(plates) if self.cluster_plate < 0: knd = len(plates) + self.cluster_plate + 1 else: raise RuntimeError("Cluster plate axis must be negative") plates.insert(knd, self.K) plates = tuple(plates) return self.raw_distribution.plates_to_parent(index-1, plates) def plates_from_parent(self, index, plates): """ Resolve the plate mapping from a parent. Given the plates of a parent's moments, this method returns the plates that the moments have for this distribution.
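A plain-Python sketch of this plate bookkeeping (a hypothetical standalone helper, not a BayesPy method): the parent's cluster plate, addressed by a negative axis index, is dropped from the plates seen by the mixture node itself.

```python
def plates_from_parent(parent_plates, cluster_plate=-1):
    """Drop the cluster plate (a negative axis index), if the parent has it."""
    plates = list(parent_plates)
    if len(plates) >= abs(cluster_plate):
        plates.pop(cluster_plate)
    return tuple(plates)

# A parameter with plates (100, 3), mixed over cluster_plate=-1 (K=3
# clusters), contributes plates (100,) to the mixture node itself.
```

A parent with fewer plates than `abs(cluster_plate)` broadcasts over the cluster plate, so its plates pass through unchanged.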
""" if index == 0: return plates else: plates = self.raw_distribution.plates_from_parent(index-1, plates) # Remove the cluster plate, if the parent has it plates = list(plates) if len(plates) >= abs(self.cluster_plate): plates.pop(self.cluster_plate) return tuple(plates) def random(self, *phi, plates=None): """ Draw a random sample from the distribution. """ return self.squeezed_distribution.random(*phi, plates=plates) def compute_gradient(self, g, u, phi): r""" Compute the standard gradient with respect to the natural parameters. """ return self.squeezed_distribution.compute_gradient(g, u, phi) class Mixture(ExponentialFamily): r""" Node for exponential family mixture variables. The node represents a random variable which is sampled from a mixture distribution. It is possible to mix any exponential family distribution. The probability density function is .. math:: p(x|z=k,\boldsymbol{\theta}_0,\ldots,\boldsymbol{\theta}_{K-1}) = \phi(x|\boldsymbol{\theta}_k), where :math:`\phi` is the probability density function of the mixed exponential family distribution and :math:`\boldsymbol{\theta}_0, \ldots, \boldsymbol{\theta}_{K-1}` are the parameters of each cluster. For instance, :math:`\phi` could be the Gaussian probability density function :math:`\mathcal{N}` and :math:`\boldsymbol{\theta}_k = \{\boldsymbol{\mu}_k, \mathbf{\Lambda}_k\}` where :math:`\boldsymbol{\mu}_k` and :math:`\mathbf{\Lambda}_k` are the mean vector and precision matrix for cluster :math:`k`. Parameters ---------- z : categorical-like node or array :math:`z`, cluster assignment node_class : stochastic exponential family node class Mixed distribution params : types specified by the mixed distribution Parameters of the mixed distribution. If some parameters should vary between clusters, those parameters' plate axis `cluster_plate` should have a size which equals the number of clusters. For parameters with shared values, that plate axis should have length 1. 
At least one parameter should vary between clusters. cluster_plate : int, optional Negative integer defining which plate axis is used for the clusters in the parameters. That plate axis is ignored from the parameters when considering the plates for this node. By default, mix over the last plate axis. See also -------- Categorical, CategoricalMarkovChain Examples -------- A simple 2-dimensional Gaussian mixture model with three clusters for 100 samples can be constructed, for instance, as: >>> import numpy as np >>> from bayespy.nodes import (Dirichlet, Categorical, Mixture, ... Gaussian, Wishart) >>> alpha = Dirichlet([1e-3, 1e-3, 1e-3]) >>> Z = Categorical(alpha, plates=(100,)) >>> mu = Gaussian(np.zeros(2), 1e-6*np.identity(2), plates=(3,)) >>> Lambda = Wishart(2, 1e-6*np.identity(2), plates=(3,)) >>> X = Mixture(Z, Gaussian, mu, Lambda) """ def __init__(self, z, node_class, *params, cluster_plate=-1, **kwargs): self.cluster_plate = cluster_plate super().__init__(z, node_class, *params, cluster_plate=cluster_plate, **kwargs) @classmethod def _constructor(cls, z, node_class, *args, cluster_plate=-1, **kwargs): """ Constructs distribution and moments objects. 
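The `integrated_logpdf_from_parents` method defined further below in this class averages the per-cluster densities in log scale; a hedged plain-Python sketch of the overflow-safe log-sum-exp pattern it relies on (shift by the maximum, exponentiate, shift back; the helper name `log_mixture_pdf` is hypothetical):

```python
import math

def log_mixture_pdf(log_pdfs, weights):
    """Compute log(sum_k w_k exp(L_k)) without over/underflow."""
    shift = max(log_pdfs)
    total = sum(w * math.exp(l - shift) for (w, l) in zip(weights, log_pdfs))
    return shift + math.log(total)

# Both cluster log-pdfs are far below the underflow threshold of exp(),
# yet the weighted average in log scale is computed exactly.
lp = log_mixture_pdf([-1000.0, -1001.0], [0.5, 0.5])
```

A naive `log(sum(w * exp(L)))` would evaluate `exp(-1000)` as 0 and then fail on `log(0)`.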
""" if cluster_plate >= 0: raise ValueError("Cluster plate axis must be negative") # Get the stuff for the mixed distribution (parents, _, dims, mixture_plates, distribution, moments, parent_moments) = \ node_class._constructor(*args) # Check that at least one of the parents has the cluster plate axis if len(mixture_plates) < abs(cluster_plate): raise ValueError("The mixed distribution does not have a plates " "axis for the cluster plate axis") # Resolve the number of clusters mixture_plates = list(mixture_plates) K = mixture_plates.pop(cluster_plate) # Convert a node to get the number of clusters z = cls._ensure_moments(z, CategoricalMoments, categories=K) if z.dims[0][0] != K: raise ValueError("Inconsistent number of clusters") plates = cls._total_plates(kwargs.get('plates'), mixture_plates, z.plates) ndims = [len(dim) for dim in dims] parents = [cls._ensure_moments(p_i, m_i.__class__, **m_i.get_instance_conversion_kwargs()) for (p_i, m_i) in zip(parents, parent_moments)] ndims_parents = [[len(dims_i) for dims_i in parent.dims] for parent in parents] # Convert the distribution to a mixture distribution = MixtureDistribution(distribution, cluster_plate, K, ndims, ndims_parents) # Add cluster assignments to parents parent_moments = [CategoricalMoments(K)] + list(parent_moments) parents = [z] + list(parents) return (parents, kwargs, dims, plates, distribution, moments, parent_moments) def integrated_logpdf_from_parents(self, x, index): """ Approximates the posterior predictive pdf \int p(x|parents) q(parents) dparents in log-scale as \int q(parents_i) exp( \int q(parents_\i) \log p(x|parents) dparents_\i ) dparents_i.""" if index == 0: # Integrate out the cluster assignments # First, integrate the cluster parameters in log-scale # compute_logpdf(cls, u, phi, g, f): # Shape(x) = [M1,..,Mm,N1,..,Nn,D1,..,Dd] u_parents = self._message_from_parents() # Shape(u) = [M1,..,Mm,N1,..,1,..,Nn,D1,..,Dd] # Shape(f) = [M1,..,Mm,N1,..,1,..,Nn] (u, f) = 
self._distribution.raw_distribution.compute_fixed_moments_and_f(x) f = np.expand_dims(f, axis=self.cluster_plate) for i in range(len(u)): ndim_i = len(self.dims[i]) cluster_axis = self.cluster_plate - ndim_i u[i] = np.expand_dims(u[i], axis=cluster_axis) # Shape(phi) = [N1,..,K,..,Nn,D1,..,Dd] phi = self._distribution.raw_distribution.compute_phi_from_parents(*(u_parents[1:])) # Shape(g) = [N1,..,K,..,Nn] g = self._distribution.raw_distribution.compute_cgf_from_parents(*(u_parents[1:])) # Shape(lpdf) = [M1,..,Mm,N1,..,K,..,Nn] lpdf = self._distribution.raw_distribution.compute_logpdf(u, phi, g, f, self.ndims) # From logpdf to pdf, but avoid over/underflow lpdf_max = np.max(lpdf, axis=self.cluster_plate, keepdims=True) pdf = np.exp(lpdf-lpdf_max) # Move cluster axis to be the last: # Shape(pdf) = [M1,..,Mm,N1,..,Nn,K] pdf = misc.moveaxis(pdf, self.cluster_plate, -1) # Cluster assignments/probabilities/weights # Shape(p) = [N1,..,Nn,K] p = u_parents[0][0] # Weighted average. TODO/FIXME: Use einsum! # Shape(pdf) = [M1,..,Mm,N1,..,Nn] pdf = np.sum(pdf * p, axis=self.cluster_plate) # Back to log-scale (add the overflow fix!) lpdf_max = np.squeeze(lpdf_max, axis=self.cluster_plate) lpdf = np.log(pdf) + lpdf_max return lpdf raise NotImplementedError() def MultiMixture(thetas, *mixture_args, **kwargs): """Creates a mixture over several axes using as many categorical variables. The mixings are assumed to be separate, that is, inner mixings don't affect the parameters of outer mixings. """ thetas = [theta if isinstance(theta, Node) else np.asanyarray(theta) for theta in thetas] N = len(thetas) # Add trailing plate axes to thetas because you assume that each # mixed axis is separate from the others. 
thetas = [theta[(Ellipsis,) + i*(None,)] for (i, theta) in enumerate(thetas)] args = ( thetas[:1] + list(misc.zipper_merge((N-1) * [Mixture], thetas[1:])) + list(mixture_args) ) return Mixture(*args, **kwargs) bayespy-0.6.2/bayespy/inference/vmp/nodes/ml.py ################################################################################ # Copyright (C) 2016 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ import numpy as np from .node import Moments from .deterministic import Deterministic from .stochastic import Stochastic class DeltaMoments(Moments): r""" Class for the moments of constants or delta distributed variables """ def __init__(self, shape): self.shape = shape self.dims = (shape,) return super().__init__() @classmethod def from_values(cls, x, ndim): if np.ndim(x) < ndim: raise ValueError("Broadcasting not (yet) supported in DeltaMoments") if ndim == 0: return cls(()) else: return cls(np.shape(x)[-ndim:]) def get_converter(self, moments_to): if issubclass(DeltaMoments, moments_to): return lambda x: x return get_delta_moments_class_converter(moments_to) def compute_fixed_moments(self, x): r""" Compute the moments for a fixed value """ return [x] def compute_dims_from_values(self, x): r""" Return the shape of the moments for a fixed value.
""" return ((),) def get_instance_conversion_kwargs(self): return dict(shape=self.shape) def get_instance_converter(self, shape): if shape != self.shape: raise ValueError() return None class DeltaClassConverterMoments(Moments): def __init__(self, x, moments_class): self.x = x self.moments_class = moments_class return def get_instance_conversion_kwargs(self): return dict(i_am_delta=True) def get_instance_converter(self, **kwargs): if kwargs.get('i_am_delta'): return None moments = self.moments_class.from_values( self.x.get_moments()[0], **kwargs ) return DeltaInstanceConverter(moments) def get_delta_moments_class_converter(moments_class): class DeltaClassConverter(Deterministic): def __init__(self, node): self._parent_moments = (node._moments,) self._moments = DeltaClassConverterMoments(node, moments_class) return super().__init__(node, dims=((),)) def _compute_moments(self, u): return u def _compute_message_to_parent(self, index, m, u): return m return DeltaClassConverter class DeltaInstanceConverter(): def __init__(self, moments): self.moments = moments return def compute_moments(self, u): return self.moments.compute_fixed_moments(u[0]) def compute_message_to_parent(self, m, u_parent): x = u_parent[0] (u, du) = self.moments.compute_fixed_moments(x, gradient=m) return [du] def compute_weights_to_parent(self, weights): return 1 def plates_multiplier_from_parent(self, plates_multiplier): return () def plates_from_parent(self, plates): return self.moments.plates_from_shape(plates) def plates_to_parent(self, plates): return self.moments.shape_from_plates(plates) class MaximumLikelihood(Stochastic): _parent_moments = () def __init__(self, array, regularization=None, **kwargs): self._x = array self._moments = DeltaMoments(np.shape(array)) self._regularization = regularization return super().__init__( plates=np.shape(array), dims=( (), ), initialize=False, **kwargs ) def _get_id_list(self): return [] def get_moments(self): return [self._x] def 
lower_bound_contribution(self, ignore_masked=None): if self._regularization is None: return 0 return -np.sum(self._regularization(self._x)) def get_riemannian_gradient(self): m_children = self._message_from_children(u_self=self.get_moments()) g = m_children # TODO/FIXME: REGULARIZATION GRADIENT!! return g def get_gradient(self, rg): return rg def get_parameters(self): return [self._x] def set_parameters(self, x): if len(x) != 1: raise Exception("Wrong number of parameters. Should be 1, is {0}".format(len(x))) self._x = x[0] return def _update_distribution_and_lowerbound(self, m): raise NotImplementedError() class Function(Deterministic): def __init__(self, function, *nodes_gradients, shape=None, **kwargs): self._function = function (nodes, gradients) = zip(*nodes_gradients) self._parent_moments = tuple(node._moments for node in nodes) self._gradients = gradients if shape is None: # Shape wasn't given explicitly. Computes the output value once to # determine the shape. y = self._compute_moments( *[ node.get_moments() for node in nodes ] ) shape = np.shape(y[0]) self._moments = DeltaMoments(shape) return super().__init__(*nodes, dims=((),), **kwargs) def _compute_moments(self, *u_nodes): x = [u[0] for u in u_nodes] return [self._function(*x)] def _compute_message_to_parent(self, index, m, *u_nodes): x = [u[0] for u in u_nodes] return [self._gradients[index](m[0], *x)] def _compute_weights_to_parent(self, index, mask): return 1 def _compute_plates_from_parent(self, index, plates): return self._moments.shape def _compute_plates_to_parent(self, index, plates): return self.parents[index].plates bayespy-0.6.2/bayespy/inference/vmp/nodes/multinomial.py ################################################################################ # Copyright (C) 2014 Jaakko Luttinen # # This file is
licensed under the MIT License. ################################################################################ """ Module for the multinomial distribution node. """ import numpy as np from scipy import special from .expfamily import ExponentialFamily from .expfamily import ExponentialFamilyDistribution from .expfamily import useconstructor from .dirichlet import Dirichlet, DirichletMoments from .node import Moments, ensureparents from bayespy.utils import random from bayespy.utils import misc from bayespy.utils import linalg class MultinomialMoments(Moments): """ Class for the moments of multinomial variables. """ def __init__(self, categories): self.categories = categories self.dims = ( (categories,), ) def compute_fixed_moments(self, x): """ Compute the moments for a fixed value `x` must be a vector of counts. """ # Check that counts are valid x = np.asanyarray(x) if not misc.isinteger(x): raise ValueError("Counts must be integer") if np.any(x < 0): raise ValueError("Counts must be non-negative") # Moments is just the counts vector u0 = x.copy() return [u0] @classmethod def from_values(cls, x): D = np.shape(x)[-1] return cls( (D,) ) class MultinomialDistribution(ExponentialFamilyDistribution): """ Class for the VMP formulas of multinomial variables. """ def __init__(self, trials): """ Create VMP formula node for a multinomial variable `trials` is the total number of trials. """ trials = np.asanyarray(trials) if not misc.isinteger(trials): raise ValueError("Number of trials must be integer") if np.any(trials < 0): raise ValueError("Number of trials must be non-negative") self.N = trials super().__init__() def compute_message_to_parent(self, parent, index, u, u_p): """ Compute the message to a parent node. """ if index == 0: return [ u[0].copy() ] else: raise ValueError("Index out of bounds") def compute_phi_from_parents(self, u_p, mask=True): """ Compute the natural parameter vector given parent moments. 
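The moment computation just below normalizes `exp(phi)` in a numerically stable way; a plain-Python sketch for a single multinomial variable (the helper `moments_and_cgf` is hypothetical): the moments are `u = N * softmax(phi)` and the cumulant generating function value is `g = -N * logsumexp(phi)`.

```python
import math

def moments_and_cgf(phi, N):
    """u = N * softmax(phi) and g = -N * logsumexp(phi), max-shifted."""
    shift = max(phi)                        # stabilize the exponentials
    exps = [math.exp(x - shift) for x in phi]
    total = sum(exps)
    u0 = [N * e / total for e in exps]
    g = -N * (shift + math.log(total))
    return (u0, g)

# phi = E[log p] for p = (0.2, 0.3, 0.5) and N = 10 trials
(u0, g) = moments_and_cgf([math.log(0.2), math.log(0.3), math.log(0.5)], 10)
```

Since the probabilities already sum to one here, `logsumexp(phi) = 0` and the expected counts are just `N * p`.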
""" logp = u_p[0] return [logp] def compute_moments_and_cgf(self, phi, mask=True): r""" Compute the moments and :math:`g(\phi)`. .. math:: \overline{\mathbf{u}} = \mathrm{E}[x] = N \cdot \begin{bmatrix} \frac{e^{\phi_1}}{\sum_i e^{\phi_i}} & \cdots & \frac{e^{\phi_D}}{\sum_i e^{\phi_i}} \end{bmatrix} """ # Compute the normalized probabilities in a numerically stable way (p, logsum_p) = misc.normalized_exp(phi[0]) N = np.expand_dims(self.N, -1) u0 = N * p u = [u0] g = -np.squeeze(N * logsum_p, axis=-1) return (u, g) def compute_cgf_from_parents(self, u_p): r""" Compute :math:`\mathrm{E}_{q(p)}[g(p)]` """ return 0 def compute_fixed_moments_and_f(self, x, mask=True): r""" Compute the moments and :math:`f(x)` for a fixed value. """ # Check that counts are valid x = np.asanyarray(x) if not misc.isinteger(x): raise ValueError("Counts must be integers") if np.any(x < 0): raise ValueError("Counts must be non-negative") if np.any(np.sum(x, axis=-1) != self.N): raise ValueError("Counts must sum to the number of trials") # Moments is just the counts vector u0 = x.copy() u = [u0] f = special.gammaln(self.N+1) - np.sum(special.gammaln(x+1), axis=-1) return (u, f) def random(self, *phi, plates=None): r""" Draw a random sample from the distribution. """ (p, _) = misc.normalized_exp(phi[0]) return random.multinomial(self.N, p, size=plates) def compute_gradient(self, g, u, phi): r""" Compute the Euclidean gradient. In order to compute the Euclidean gradient, we first need to derive the gradient of the moments with respect to the variational parameters: .. math:: \mathrm{d}\overline{u}_i = N \cdot \frac {e^{\phi_i} \mathrm{d}\phi_i \sum_j e^{\phi_j}} {(\sum_k e^{\phi_k})^2} - N \cdot \frac {e^{\phi_i} \sum_j e^\phi_j \mathrm{d}\phi_j} {(\sum_k e^{\phi_k})^2} = \overline{u}_i \mathrm{d}\phi_i - \overline{u}_i \sum_j \frac{\overline{u}_j}{N} \mathrm{d}\phi_j Now we can make use of the chain rule. 
Given the Riemannian gradient :math:`\tilde{\nabla}` of the variational lower bound :math:`\mathcal{L}` with respect to the variational parameters :math:`\phi`, put the above result into the derivative term and re-organize the terms to get the Euclidean gradient :math:`\nabla`: .. math:: \mathrm{d}\mathcal{L} = \tilde{\nabla}^T \mathrm{d}\overline{\mathbf{u}} = \sum_i \tilde{\nabla}_i \mathrm{d}\overline{u}_i = \sum_i \tilde{\nabla}_i ( \overline{u}_i \mathrm{d}\phi_i - \overline{u}_i \sum_j \frac {\overline{u}_j} {N} \mathrm{d}\phi_j ) = \sum_i \left(\tilde{\nabla}_i \overline{u}_i \mathrm{d}\phi_i - \frac{\overline{u}_i}{N} \mathrm{d}\phi_i \sum_j \tilde{\nabla}_j \overline{u}_j \right) \equiv \nabla^T \mathrm{d}\phi Thus, the Euclidean gradient is: .. math:: \nabla_i = \tilde{\nabla}_i \overline{u}_i - \frac{\overline{u}_i}{N} \sum_j \tilde{\nabla}_j \overline{u}_j See also -------- compute_moments_and_cgf : Computes the moments :math:`\overline{\mathbf{u}}` given the variational parameters :math:`\phi`. """ return u[0] * (g - linalg.inner(g, u[0])[...,None] / self.N) def squeeze(self, axis): try: N_squeezed = np.squeeze(self.N, axis) except ValueError as err: raise ValueError( "The number of trials must be constant over a squeezed axis, " "so the corresponding array axis must be singleton. " "Cannot squeeze axis {0} from a multinomial distribution " "because the number-of-trials array has shape {2}, so " "the given axis has length {1} != 1. ".format( axis, np.shape(self.N)[axis], np.shape(self.N), ) ) from err else: return MultinomialDistribution(N_squeezed) class Multinomial(ExponentialFamily): r""" Node for multinomial random variables. Assume there are :math:`K` categories and :math:`N` trials, each of which leads to a success for exactly one of the categories. Given the probabilities :math:`p_0,\ldots,p_{K-1}` for the categories, the multinomial distribution gives the probability of any combination of numbers of successes for the categories.
The node models the number of successes :math:`x_k \in \{0, \ldots, n\}` in :math:`n` trials with probability :math:`p_k` for success in :math:`K` categories. .. math:: \mathrm{Multinomial}(\mathbf{x}| N, \mathbf{p}) = \frac{N!}{x_0!\cdots x_{K-1}!} p_0^{x_0} \cdots p_{K-1}^{x_{K-1}} Parameters ---------- n : scalar or array :math:`N`, number of trials p : Dirichlet-like node or (...,K)-array :math:`\mathbf{p}`, probabilities of successes for the categories See also -------- Dirichlet, Binomial, Categorical """ def __init__(self, n, p, **kwargs): """ Create Multinomial node. """ super().__init__(n, p, **kwargs) @classmethod def _constructor(cls, n, p, **kwargs): """ Constructs distribution and moments objects. This method is called if the useconstructor decorator is used for __init__. Because the distribution and moments objects depend on the number of categories, that is, they depend on the parent node, this method can be used to construct those objects. """ # Get the number of categories p = cls._ensure_moments(p, DirichletMoments) D = p.dims[0][0] moments = MultinomialMoments(D) parent_moments = (p._moments,) parents = [p] distribution = MultinomialDistribution(n) return (parents, kwargs, moments.dims, cls._total_plates(kwargs.get('plates'), distribution.plates_from_parent(0, p.plates), np.shape(n)), distribution, moments, parent_moments) def __str__(self): """ Print the distribution using standard parameterization.
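The Euclidean gradient formula derived for `compute_gradient` above can be checked numerically: for `u(phi) = N * softmax(phi)` and a linear objective `f(phi) = sum_i c_i u_i(phi)`, the gradient `c_i u_i - (u_i / N) * sum_j c_j u_j` should match a central finite difference. A plain-Python sketch (all helper names hypothetical):

```python
import math

def u_of_phi(phi, N):
    """Moments u_i = N * softmax(phi)_i, max-shifted for stability."""
    shift = max(phi)
    e = [math.exp(x - shift) for x in phi]
    s = sum(e)
    return [N * x / s for x in e]

def euclidean_gradient(c, u, N):
    """nabla_i = c_i u_i - (u_i / N) * sum_j c_j u_j (formula above)."""
    inner = sum(cj * uj for (cj, uj) in zip(c, u))
    return [ci * ui - ui * inner / N for (ci, ui) in zip(c, u)]

phi = [0.1, -0.4, 0.7]
c = [1.0, 2.0, -0.5]     # plays the role of the Riemannian gradient
N = 4
u = u_of_phi(phi, N)
grad = euclidean_gradient(c, u, N)

# Central finite difference of f(phi) = sum_i c_i u_i(phi) w.r.t. phi[0]
eps = 1e-6
f = lambda p: sum(ci * ui for (ci, ui) in zip(c, u_of_phi(p, N)))
fd = (f([phi[0] + eps] + phi[1:]) - f([phi[0] - eps] + phi[1:])) / (2 * eps)
```

The agreement follows because the Jacobian of the softmax moments is `du_i/dphi_j = u_i delta_ij - u_i u_j / N`, exactly the matrix transposed in the derivation above.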
""" logsum_p = misc.logsumexp(self.phi[0], axis=-1, keepdims=True) p = np.exp(self.phi[0] - logsum_p) p /= np.sum(p, axis=-1, keepdims=True) return ("%s ~ Multinomial(p)\n" " p = \n" "%s\n" % (self.name, p)) ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/node.py0000644000175100001770000013366400000000000022641 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2013-2014 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ import numpy as np import functools from bayespy.utils import misc """ This module contains a sketch of a new implementation of the framework. """ def message_sum_multiply(plates_parent, dims_parent, *arrays): """ Compute message to parent and sum over plates. Divide by the plate multiplier. """ # The shape of the full message shapes = [np.shape(array) for array in arrays] shape_full = misc.broadcasted_shape(*shapes) # Find axes that should be summed shape_parent = plates_parent + dims_parent sum_axes = misc.axes_to_collapse(shape_full, shape_parent) # Compute the multiplier for cancelling the # plate-multiplier. Because we are summing over the # dimensions already in this function (for efficiency), we # need to cancel the effect of the plate-multiplier # applied in the message_to_parent function. r = 1 for j in sum_axes: if j >= 0 and j < len(plates_parent): r *= shape_full[j] elif j < 0 and j < -len(dims_parent): r *= shape_full[j] # Compute the sum-product m = misc.sum_multiply(*arrays, axis=sum_axes, sumaxis=True, keepdims=True) / r # Remove extra axes m = misc.squeeze_to_dim(m, len(shape_parent)) return m class Moments(): """ Base class for defining the expectation of the sufficient statistics. The benefits: * Write statistic-specific features in one place only. 
For instance, covariance from Gaussian message. * Different nodes may have identically defined statistic so you need to implement related features only once. For instance, Gaussian and GaussianARD differ on the prior but the moments are the same. * General processing nodes which do not change the type of the moments may "inherit" the features from the parent node. For instance, slicing operator. * Conversions can be done easily in both of the above cases if the message conversion is defined in the moments class. For instance, GaussianMarkovChain to Gaussian and VaryingGaussianMarkovChain to Gaussian. """ _converters = {} class NoConverterError(Exception): pass def get_instance_converter(self, **kwargs): """Default converter within a moments class is an identity. Override this method when moment class instances are not identical if they have different attributes. """ if len(kwargs) > 0: raise NotImplementedError( "get_instance_converter not implemented for class {0}" .format(self.__class__.__name__) ) return None def get_instance_conversion_kwargs(self): """ Override this method when moment class instances are not identical if they have different attributes. """ return {} @classmethod def add_converter(cls, moments_to, converter): cls._converters = cls._converters.copy() cls._converters[moments_to] = converter return def get_converter(self, moments_to): """ Finds conversion to another moments type if possible. Note that a conversion from moments A to moments B may require intermediate conversions. For instance: A->C->D->B. This method finds the path which uses the least amount of conversions and returns that path as a single conversion. If no conversion path is available, an error is raised. The search algorithm starts from the original moments class and applies all possible converters to get a new list of moments classes. This list is extended by adding recursively all parent classes because their converters are applicable. 
Then, all possible converters are applied to this list to get a new list of current moments classes. This is iterated until the algorithm hits the target moments class or its subclass. """ # Check if there is no need for a conversion # # TODO/FIXME: This isn't sufficient. Moments can have attributes that # make them incompatible (e.g., ndim in GaussianMoments). if isinstance(self, moments_to): return lambda X: X # Initialize variables visited = set() visited.add(self.__class__) converted_list = [(self.__class__, [])] # Each iteration step consists of two parts: # 1) form a set of the current classes and all their parent classes # recursively # 2) from the current set, apply possible conversions to get a new set # of classes # Repeat these two steps until in step (1) you hit the target class. while len(converted_list) > 0: # Go through all parents recursively so we can then use all # converters that are available current_list = [] for (moments_class, converter_path) in converted_list: if issubclass(moments_class, moments_to): # Shortest conversion path found, return the resulting total # conversion function return misc.composite_function(converter_path) current_list.append((moments_class, converter_path)) parents = list(moments_class.__bases__) for parent in parents: # Recursively add parents for p in parent.__bases__: if isinstance(p, Moments): parents.append(p) # Add un-visited parents if issubclass(parent, Moments) and parent not in visited: visited.add(parent) current_list.append((parent, converter_path)) # Find all converters and extend the converter paths converted_list = [] for (moments_class, converter_path) in current_list: for (conv_mom_cls, conv) in moments_class._converters.items(): if conv_mom_cls not in visited: visited.add(conv_mom_cls) converted_list.append((conv_mom_cls, converter_path + [conv])) raise self.NoConverterError("No conversion defined from %s to %s" % (self.__class__.__name__, moments_to.__name__)) def compute_fixed_moments(self, x): # 
This method can't be static because the computation of the moments may # depend on, for instance, ndim in Gaussian arrays. raise NotImplementedError("compute_fixed_moments not implemented for " "%s" % (self.__class__.__name__)) @classmethod def from_values(cls, x): raise NotImplementedError("from_values not implemented " "for %s" % (cls.__name__)) def ensureparents(func): @functools.wraps(func) def wrapper(self, *parents, **kwargs): # Convert parents to proper nodes if self._parent_moments is None: raise ValueError( "Parent moments must be defined for {0}" .format(self.__class__.__name__) ) parents = [ Node._ensure_moments( parent, moments.__class__, **moments.get_instance_conversion_kwargs() ) for (parent, moments) in zip(parents, self._parent_moments) ] # parents = list(parents) # for (ind, parent) in enumerate(parents): # parents[ind] = self._ensure_moments(parent, # self._parent_moments[ind]) # Run the function return func(self, *parents, **kwargs) return wrapper class Node(): """ Base class for all nodes. mask dims plates parents children name Sub-classes must implement: 1. For computing the message to children: get_moments(self): 2. For computing the message to parents: _get_message_and_mask_to_parent(self, index) Sub-classes may need to re-implement: 1. If they manipulate plates: _compute_weights_to_parent(index, weights) _plates_to_parent(self, index) _plates_from_parent(self, index) """ # These are objects of the _parent_moments_class. If the default way of # creating them is not correct, write your own creation code. 
_moments = None _parent_moments = None plates = None _id_counter = 0 @ensureparents def __init__(self, *parents, dims=None, plates=None, name="", notify_parents=True, plotter=None, plates_multiplier=None, allow_dependent_parents=False): self.parents = parents self.dims = dims self.name = name self._plotter = plotter if not allow_dependent_parents: parent_id_list = [] for parent in parents: parent_id_list = parent_id_list + list(parent._get_id_list()) if len(parent_id_list) != len(set(parent_id_list)): raise ValueError("Parent nodes are not independent") # Inform parent nodes if notify_parents: for (index,parent) in enumerate(self.parents): parent._add_child(self, index) # Check plates parent_plates = [self._plates_from_parent(index) for index in range(len(self.parents))] if any(p is None for p in parent_plates): raise ValueError("Method _plates_from_parent returned None") # Get and validate the plates for this node plates = self._total_plates(plates, *parent_plates) if self.plates is None: self.plates = plates # By default, ignore all plates self.mask = np.array(False) # Children self.children = set() # Get and validate the plate multiplier parent_plates_multiplier = [self._plates_multiplier_from_parent(index) for index in range(len(self.parents))] #if plates_multiplier is None: # plates_multiplier = parent_plates_multiplier plates_multiplier = self._total_plates(plates_multiplier, *parent_plates_multiplier) self.plates_multiplier = plates_multiplier def get_pdf_nodes(self): return tuple( node for (child, _) in self.children for node in child._get_pdf_nodes_conditioned_on_parents() ) def _get_pdf_nodes_conditioned_on_parents(self): return self.get_pdf_nodes() def _get_id_list(self): """ Returns the stochastic ID list. This method is used to check that same stochastic nodes are not direct parents of a node several times. It is only valid if there are intermediate stochastic nodes. To put it another way: each ID corresponds to one factor q(..) 
in the posterior approximation. Different IDs mean different factors, thus they mean independence. The parents must have independent factors. Stochastic nodes should return their unique ID. Deterministic nodes should return the IDs of their parents. Constant nodes should return empty list of IDs. """ raise NotImplementedError() @classmethod def _total_plates(cls, plates, *parent_plates): if plates is None: # By default, use the minimum number of plates determined # from the parent nodes try: return misc.broadcasted_shape(*parent_plates) except ValueError: raise ValueError( "The plates of the parents do not broadcast: {0}".format( parent_plates ) ) else: # Check that the parent_plates are a subset of plates. for (ind, p) in enumerate(parent_plates): if not misc.is_shape_subset(p, plates): raise ValueError("The plates %s of the parents " "are not broadcastable to the given " "plates %s." % (p, plates)) return plates @staticmethod def _ensure_moments(node, moments_class, **kwargs): try: converter = node._moments.get_converter(moments_class) except AttributeError: from .constant import Constant return Constant( moments_class.from_values(node, **kwargs), node ) else: node = converter(node) converter = node._moments.get_instance_converter(**kwargs) if converter is not None: from .converters import NodeConverter return NodeConverter(converter, node) return node def _compute_plates_to_parent(self, index, plates): # Sub-classes may want to overwrite this if they manipulate plates return plates def _compute_plates_from_parent(self, index, plates): # Sub-classes may want to overwrite this if they manipulate plates return plates def _compute_plates_multiplier_from_parent(self, index, plates_multiplier): # TODO/FIXME: How to handle this properly? 
return plates_multiplier def _plates_to_parent(self, index): return self._compute_plates_to_parent(index, self.plates) def _plates_from_parent(self, index): return self._compute_plates_from_parent(index, self.parents[index].plates) def _plates_multiplier_from_parent(self, index): return self._compute_plates_multiplier_from_parent( index, self.parents[index].plates_multiplier ) @property def plates_multiplier(self): """ Plate multiplier is applied to messages to parents """ return self.__plates_multiplier @plates_multiplier.setter def plates_multiplier(self, value): # TODO/FIXME: Check that multiplier is consistent with plates self.__plates_multiplier = value return def get_shape(self, ind): return self.plates + self.dims[ind] def _add_child(self, child, index): """ Add a child node. Parameters ---------- child : node index : int The parent index of this node for the child node. The child node recognizes its parents by their index number. """ self.children.add((child, index)) def _remove_child(self, child, index): """ Remove a child node. """ self.children.remove((child, index)) def get_mask(self): return self.mask ## def _get_message_mask(self): ## return self.mask def _set_mask(self, mask): # Sub-classes may overwrite this method if they have some other masks to # be combined (for instance, observation mask) self.mask = mask def _update_mask(self): # Combine masks from children mask = np.array(False) for (child, index) in self.children: mask = np.logical_or(mask, child._mask_to_parent(index)) # Set the mask of this node self._set_mask(mask) if not misc.is_shape_subset(np.shape(self.mask), self.plates): raise ValueError("The mask of the node %s has updated " "incorrectly. The plates in the mask %s are not a " "subset of the plates of the node %s." 
% (self.name, np.shape(self.mask), self.plates)) # Tell parents to update their masks for parent in self.parents: parent._update_mask() def _compute_weights_to_parent(self, index, weights): """Compute the mask used for messages sent to parent[index]. The mask tells which plates in the messages are active. This method is used for obtaining the mask which is used to set plates in the messages to parent to zero. Sub-classes may want to overwrite this method if they do something to plates so that the mask is somehow altered. """ return weights def _mask_to_parent(self, index): """ Get the mask with respect to parent[index]. The mask tells which plate connections are active. The mask is "summed" (logical or) and reshaped into the plate shape of the parent. Thus, it can't be used for masking messages, because some plates have been summed already. This method is used for propagating the mask to parents. """ mask = self._compute_weights_to_parent(index, self.mask) != 0 # Check the shape of the mask plates_to_parent = self._plates_to_parent(index) if not misc.is_shape_subset(np.shape(mask), plates_to_parent): raise ValueError("In node %s, the mask being sent to " "parent[%d] (%s) has invalid shape: The shape of " "the mask %s is not a sub-shape of the plates of " "the node with respect to the parent %s. It could " "be that this node (%s) is manipulating plates " "but has not overwritten the method " "_compute_weights_to_parent." % (self.name, index, self.parents[index].name, np.shape(mask), plates_to_parent, self.__class__.__name__)) # "Sum" (i.e., logical or) over the plates that have unit length in # the parent node. 
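The collapse performed by ``_mask_to_parent`` can be illustrated with plain NumPy (this sketch uses a hypothetical ``collapse_mask`` helper, not BayesPy's ``misc`` functions): plates that have unit length in the parent, and any extra leading plates, are "summed" with logical-or, and the surviving unit axes are squeezed so the result matches the parent's plate shape.

```python
import numpy as np

def collapse_mask(mask, parent_plates):
    mask = np.asarray(mask)
    # Collapse axes that either extend beyond the parent's plates or have
    # unit length in the parent (negative axis indexing)
    axes = tuple(
        ax for ax in range(-mask.ndim, 0)
        if -ax > len(parent_plates) or parent_plates[ax] == 1
    )
    if axes:
        mask = np.any(mask, axis=axes, keepdims=True)
    # Drop leading unit axes the parent does not have
    while mask.ndim > len(parent_plates):
        mask = np.squeeze(mask, axis=0)
    return mask

# Child mask has plates (3, 4); parent has plates (3, 1): the length-4
# plate is collapsed by logical-or.
child_mask = np.array([[True, False, False, False],
                       [False, False, False, False],
                       [False, True, True, False]])
print(collapse_mask(child_mask, (3, 1)).shape)   # (3, 1)
```

This is why the resulting mask can only be propagated, not used for masking messages directly: some plates have already been or-reduced away.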
parent_plates = self.parents[index].plates s = misc.axes_to_collapse(np.shape(mask), parent_plates) mask = np.any(mask, axis=s, keepdims=True) mask = misc.squeeze_to_dim(mask, len(parent_plates)) return mask def _message_to_child(self): u = self.get_moments() # Debug: Check that the message has appropriate shape for (ui, dim) in zip(u, self.dims): ndim = len(dim) if ndim > 0: if np.shape(ui)[-ndim:] != dim: raise RuntimeError( "A bug found by _message_to_child for %s: " "The variable axes of the moments %s are not equal to " "the axes %s defined by the node %s. A possible reason " "is that the plates of the node are inferred " "incorrectly from the parents, and the method " "_plates_from_parents should be implemented." % (self.__class__.__name__, np.shape(ui)[-ndim:], dim, self.name)) if not misc.is_shape_subset(np.shape(ui)[:-ndim], self.plates): raise RuntimeError( "A bug found by _message_to_child for %s: " "The plate axes of the moments %s are not a subset of " "the plate axes %s defined by the node %s." % (self.__class__.__name__, np.shape(ui)[:-ndim], self.plates, self.name)) else: if not misc.is_shape_subset(np.shape(ui), self.plates): raise RuntimeError( "A bug found by _message_to_child for %s: " "The plate axes of the moments %s are not a subset of " "the plate axes %s defined by the node %s." 
                        % (self.__class__.__name__,
                           np.shape(ui),
                           self.plates,
                           self.name))
        return u

    def _message_to_parent(self, index, u_parent=None):
        # Compute the message, check plates, apply mask and sum over some plates

        if index >= len(self.parents):
            raise ValueError("Parent index larger than the number of parents")

        # Compute the message and mask
        (m, mask) = self._get_message_and_mask_to_parent(index,
                                                         u_parent=u_parent)
        mask = misc.squeeze(mask)

        # Plates in the mask
        plates_mask = np.shape(mask)

        # The parent we're sending the message to
        parent = self.parents[index]

        # Plates with respect to the parent
        plates_self = self._plates_to_parent(index)

        # Plate multiplier of the parent
        multiplier_parent = self._plates_multiplier_from_parent(index)

        # If m is a log-pdf function (for black-box variational inference),
        # pass it through as is.
        if callable(m):
            return m

        # Compact the message to a proper shape
        for i in range(len(m)):

            # Empty messages are given as None. We can ignore those.
            if m[i] is not None:

                try:
                    r = self.broadcasting_multiplier(self.plates_multiplier,
                                                     multiplier_parent)
                except ValueError:
                    raise ValueError("The plate multipliers are incompatible. "
                                     "This node (%s) has %s and parent[%d] "
                                     "(%s) has %s"
                                     % (self.name,
                                        self.plates_multiplier,
                                        index,
                                        parent.name,
                                        multiplier_parent))

                ndim = len(parent.dims[i])

                # Source and target shapes
                if ndim > 0:
                    dims = misc.broadcasted_shape(np.shape(m[i])[-ndim:],
                                                  parent.dims[i])
                    from_shape = plates_self + dims
                else:
                    from_shape = plates_self
                to_shape = parent.get_shape(i)

                # Add variable axes to the mask
                mask_i = misc.add_trailing_axes(mask, ndim)

                # Apply mask and sum plate axes as necessary (and apply the
                # plate multiplier)
                m[i] = r * misc.sum_multiply_to_plates(np.where(mask_i, m[i], 0),
                                                       to_plates=to_shape,
                                                       from_plates=from_shape,
                                                       ndim=0)

        return m

    def _message_from_children(self, u_self=None):
        msg = [np.zeros(shape) for shape in self.dims]
        isfunction = None
        for (child, index) in self.children:
            m = child._message_to_parent(index, u_parent=u_self)
            if callable(m):
                if isfunction is False:
                    raise NotImplementedError(
                        "Cannot combine log-pdf function and array messages"
                    )
                elif isfunction is None:
                    msg = m
                else:
                    def join(m1, m2):
                        return (m1[0] + m2[0], m1[1] + m2[1])
                    # Bind the current functions through default arguments so
                    # that the lambda does not recurse into itself
                    msg = lambda x, f=m, g=msg: join(f(x), g(x))
                isfunction = True
            else:
                if isfunction is True:
                    raise NotImplementedError(
                        "Cannot combine array messages and log-pdf function"
                    )
                isfunction = False
                for i in range(len(self.dims)):
                    if m[i] is not None:
                        # Check broadcasting shapes
                        sh = misc.broadcasted_shape(self.get_shape(i),
                                                    np.shape(m[i]))
                        try:
                            # Try exploiting broadcasting rules
                            msg[i] += m[i]
                        except ValueError:
                            msg[i] = msg[i] + m[i]
        return msg

    def _message_from_parents(self, exclude=None):
        return [list(parent._message_to_child())
                if ind != exclude else None
                for (ind, parent) in enumerate(self.parents)]

    def get_moments(self):
        raise NotImplementedError()

    def delete(self):
        """
        Delete this node and the children
        """
        for (ind, parent) in enumerate(self.parents):
            parent._remove_child(self, ind)
        for (child, _) in self.children:
            child.delete()

    @staticmethod
    def broadcasting_multiplier(plates, *args):
        """
        Compute the plate multiplier for given shapes.

        The first shape is compared to all other shapes (using NumPy
        broadcasting rules). All the elements which are non-unit in the first
        shape but 1 in all other shapes are multiplied together.

        This method is used, for instance, for computing a correction factor
        for messages to parents: If this node has non-unit plates that are
        unit plates in the parent, those plates are summed. However, if the
        message has a unit axis for that plate, it should first be broadcasted
        to the plates of this node and then summed to the plates of the
        parent. In order to avoid this broadcasting and summing, it is more
        efficient to just multiply by the correct factor. This method computes
        that factor. The first argument is the full plate shape of this node
        (with respect to the parent). The other arguments are the shape of the
        message array and the plates of the parent (with respect to this
        node).
        """
        return misc.broadcasting_multiplier(plates, *args)

    def move_plates(self, from_plate, to_plate):
        return _MovePlate(self,
                          from_plate,
                          to_plate,
                          name=self.name + ".move_plates")

    def add_plate_axis(self, to_plate):
        return AddPlateAxis(to_plate)(self,
                                      name=self.name + ".add_plate_axis")

    def __getitem__(self, index):
        return Slice(self, index, name=(self.name + ".__getitem__"))

    def has_plotter(self):
        """
        Return True if the node has a plotter
        """
        return callable(self._plotter)

    def set_plotter(self, plotter):
        self._plotter = plotter

    def plot(self, fig=None, **kwargs):
        """
        Plot the node distribution using the plotter of the node

        Because the distributions are in general very difficult to plot, the
        user must specify some function which performs the plotting as
        wanted. See, for instance, bayespy.plot.plotting for available
        plotters, that is, functions that perform plotting for a node.
        """
        if fig is None:
            import matplotlib.pyplot as plt
            fig = plt.gcf()
        if callable(self._plotter):
            ax = self._plotter(self, fig=fig, **kwargs)
            fig.suptitle('q(%s)' % self.name)
            return ax
        else:
            raise Exception("No plotter defined, cannot plot")

    @staticmethod
    def _compute_message(*arrays, plates_from=(), plates_to=(), ndim=0):
        """
        A general function for computing messages by sum-multiply

        The function computes the product of the input arrays and then sums to
        the requested plates.
""" # Check that the plates broadcast properly if not misc.is_shape_subset(plates_to, plates_from): raise ValueError("plates_to must be broadcastable to plates_from") # Compute the explicit shape of the product shapes = [np.shape(array) for array in arrays] arrays_shape = misc.broadcasted_shape(*shapes) # Compute plates and dims that are present if ndim == 0: arrays_plates = arrays_shape dims = () else: arrays_plates = arrays_shape[:-ndim] dims = arrays_shape[-ndim:] # Compute the correction term. If some of the plates that should be # summed are actually broadcasted, one must multiply by the size of the # corresponding plate r = Node.broadcasting_multiplier(plates_from, arrays_plates, plates_to) # For simplicity, make the arrays equal ndim arrays = misc.make_equal_ndim(*arrays) # Keys for the input plates: (N-1, N-2, ..., 0) nplates = len(arrays_plates) in_plate_keys = list(range(nplates-1, -1, -1)) # Keys for the output plates out_plate_keys = [key for key in in_plate_keys if key < len(plates_to) and plates_to[-key-1] != 1] # Keys for the dims dim_keys = list(range(nplates, nplates+ndim)) # Total input and output keys in_keys = len(arrays) * [in_plate_keys + dim_keys] out_keys = out_plate_keys + dim_keys # Compute the sum-product with correction einsum_args = misc.zipper_merge(arrays, in_keys) + [out_keys] y = r * np.einsum(*einsum_args) # Reshape the result and apply correction nplates_result = min(len(plates_to), len(arrays_plates)) if nplates_result == 0: plates_result = [] else: plates_result = [min(plates_to[ind], arrays_plates[ind]) for ind in range(-nplates_result, 0)] y = np.reshape(y, plates_result + list(dims)) return y from .deterministic import Deterministic def slicelen(s, length=None): if length is not None: s = slice(*(s.indices(length))) return max(0, misc.ceildiv(s.stop - s.start, s.step)) class Slice(Deterministic): """ Basic slicing for plates. 
Slicing occurs when index is a slice object (constructed by start:stop:step notation inside of brackets), an integer, or a tuple of slice objects and integers. Currently, accept slices, newaxis, ellipsis and integers. For instance, does not accept lists/tuples to pick multiple indices of the same axis. Ellipsis expand to the number of : objects needed to make a selection tuple of the same length as x.ndim. Only the first ellipsis is expanded, any others are interpreted as :. Similar to: http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#basic-slicing """ def __init__(self, X, slices, **kwargs): self._moments = X._moments self._parent_moments = (X._moments,) # Force a list if not isinstance(slices, tuple): slices = [slices] else: slices = list(slices) # # Expand Ellipsis # # Compute the number of required axes and how Ellipsis is expanded num_axis = 0 ellipsis_index = None for (k, s) in enumerate(slices): if misc.is_scalar_integer(s) or isinstance(s, slice): num_axis += 1 elif s is None: pass elif s is Ellipsis: # Index is an ellipsis, e.g., [...] if ellipsis_index is None: # Expand ... ellipsis_index = k else: # Interpret ... 
as : num_axis += 1 slices[k] = slice(None) else: raise TypeError("Invalid argument type: {0}".format(s.__class__)) if num_axis > len(X.plates): raise IndexError("Too many indices") # The number of plates that were not given explicit slicing (either # Ellipsis was used or the number of slices was smaller than the number # of plate axes) expand_len = len(X.plates) - num_axis if ellipsis_index is not None: # Replace Ellipsis with correct number of : k = ellipsis_index del slices[k] slices = slices[:k] + [slice(None)] * expand_len + slices[k:] else: # Add trailing : so that each plate has explicit slicing slices = slices + [slice(None)] * expand_len # # Preprocess indexing: # - integer indices to non-negative values # - slice start/stop values to non-negative # - slice start/stop values based on the size of the plate # # Index for parent plates j = 0 for (k, s) in enumerate(slices): if misc.is_scalar_integer(s): # Index is an integer, e.g., [3] if s < 0: # Handle negative index s += X.plates[j] if s < 0 or s >= X.plates[j]: raise IndexError("Index out of range") # Store the preprocessed integer index slices[k] = s j += 1 elif isinstance(s, slice): # Index is a slice, e.g., [2:6] # Normalize the slice s = slice(*(s.indices(X.plates[j]))) if slicelen(s) <= 0: raise IndexError("Slicing leads to empty plates") slices[k] = s j += 1 self.slices = slices super().__init__(X, dims=X.dims, **kwargs) def _plates_to_parent(self, index): return self.parents[index].plates def _plates_from_parent(self, index): plates = list(self.parents[index].plates) # Compute the plates. 
Note that Ellipsis has already been preprocessed # to a proper number of : k = 0 for s in self.slices: # Then, each case separately: slice, newaxis, integer if isinstance(s, slice): # Slice, e.g., [2:5] N = slicelen(s) if N <= 0: raise IndexError("Slicing leads to empty plates") plates[k] = N k += 1 elif s is None: # [np.newaxis] plates = plates[:k] + [1] + plates[k:] k += 1 elif misc.is_scalar_integer(s): # Integer, e.g., [3] del plates[k] else: raise RuntimeError("BUG: Unknown index type. Should capture earlier.") return tuple(plates) @staticmethod def __reverse_indexing(slices, m_child, plates, dims): """ A helpful function for performing reverse indexing/slicing """ j = -1 # plate index for parent i = -1 # plate index for child child_slices = () parent_slices = () msg_plates = () # Compute plate axes in the message from children ndim = len(dims) if ndim > 0: m_plates = np.shape(m_child)[:-ndim] else: m_plates = np.shape(m_child) for s in reversed(slices): if misc.is_scalar_integer(s): # Case: integer parent_slices = (s,) + parent_slices msg_plates = (plates[j],) + msg_plates j -= 1 elif s is None: # Case: newaxis if -i <= len(m_plates): child_slices = (0,) + child_slices i -= 1 elif isinstance(s, slice): # Case: slice if -i <= len(m_plates): child_slices = (slice(None),) + child_slices parent_slices = (s,) + parent_slices if ((-i > len(m_plates) or m_plates[i] == 1) and slicelen(s) == plates[j]): # Broadcasting can be applied. The message does not need # to be explicitly shaped to the full size msg_plates = (1,) + msg_plates else: # No broadcasting. Must explicitly form the full size # axis msg_plates = (plates[j],) + msg_plates j -= 1 i -= 1 else: raise RuntimeError("BUG: Unknown index type. 
Should capture earlier.") # Set the elements of the message m_parent = np.zeros(msg_plates + dims) if np.ndim(m_parent) == 0 and np.ndim(m_child) == 0: m_parent = m_child elif np.ndim(m_parent) == 0: m_parent = m_child[child_slices] elif np.ndim(m_child) == 0: m_parent[parent_slices] = m_child else: m_parent[parent_slices] = m_child[child_slices] return m_parent def _compute_weights_to_parent(self, index, weights): """ Compute the mask to the parent node. """ if index != 0: raise ValueError("Invalid index") parent = self.parents[0] return self.__reverse_indexing(self.slices, weights, parent.plates, ()) def _compute_message_to_parent(self, index, m, u): """ Compute the message to a parent node. """ if index != 0: raise ValueError("Invalid index") parent = self.parents[0] # Apply reverse indexing for the message arrays msg = [self.__reverse_indexing(self.slices, m_child, parent.plates, dims) for (m_child, dims) in zip(m, parent.dims)] return msg def _compute_moments(self, u): """ Get the moments with an added plate axis. """ # Process each moment for n in range(len(u)): # Compute the effective plates in the message/moment ndim = len(self.dims[n]) if ndim > 0: shape = np.shape(u[n])[:-ndim] else: shape = np.shape(u[n]) # Construct a list of slice objects u_slices = [] # Index for the shape j = -len(self.parents[0].plates) for (k, s) in enumerate(self.slices): if s is None: # [np.newaxis] if -j < len(shape): # Only add newaxis if there are some axes before # this. It does not make any difference if you added # leading unit axes u_slices.append(s) else: # slice or integer index if -j <= len(shape): # The moment has this axis, so it is not broadcasting it if shape[j] != 1: # Use the slice as it is u_slices.append(s) elif isinstance(s, slice): # Slice. # The moment is using broadcasting, just pick the # first element but use slice in order to keep the # axis u_slices.append(slice(0,1,1)) else: # Integer. 
                        # The moment is using broadcasting, just pick the
                        # first element
                        u_slices.append(0)
                j += 1

            # Slice the message/moment
            u[n] = u[n][tuple(u_slices)]

        return u


def AddPlateAxis(to_plate):

    if to_plate >= 0:
        raise Exception("Give negative value for axis index to_plate.")

    class _AddPlateAxis(Deterministic):

        def __init__(self, X, **kwargs):
            nonlocal to_plate

            N = len(X.plates) + 1

            # Check the parameters
            if to_plate >= 0 or to_plate < -N:
                raise ValueError("Invalid plate position to add.")

            # Use negative indexing only
            if to_plate >= 0:
                to_plate -= N

            super().__init__(X, dims=X.dims, **kwargs)

        def _plates_to_parent(self, index):
            plates = list(self.plates)
            plates.pop(to_plate)
            return tuple(plates)

        def _plates_from_parent(self, index):
            plates = list(self.parents[index].plates)
            # Insert a unit plate so that it ends up at (negative) position
            # to_plate in the resulting plates
            plates.insert(len(plates) + to_plate + 1, 1)
            return tuple(plates)

        def _compute_weights_to_parent(self, index, weights):
            # Remove the added mask plate
            if abs(to_plate) <= np.ndim(weights):
                sh_weights = list(np.shape(weights))
                sh_weights.pop(to_plate)
                weights = np.reshape(weights, sh_weights)
            return weights

        def _compute_message_to_parent(self, index, m, *u_parents):
            """
            Compute the message to a parent node.
            """
            # Remove the added message plate
            for i in range(len(m)):
                # Remove the axis
                if np.ndim(m[i]) >= abs(to_plate) + len(self.dims[i]):
                    axis = to_plate - len(self.dims[i])
                    sh_m = list(np.shape(m[i]))
                    sh_m.pop(axis)
                    m[i] = np.reshape(m[i], sh_m)
            return m

        def _compute_moments(self, u):
            """
            Get the moments with an added plate axis.
""" # Get parents' moments #u = self.parents[0].message_to_child() # Move a plate axis u = list(u) for i in range(len(u)): # Make sure the moments have all the axes #diff = len(self.plates) + len(self.dims[i]) - np.ndim(u[i]) - 1 #u[i] = misc.add_leading_axes(u[i], diff) # The location of the new axis/plate: axis = np.ndim(u[i]) - abs(to_plate) - len(self.dims[i]) + 1 if axis > 0: # Add one axes to the correct position sh_u = list(np.shape(u[i])) sh_u.insert(axis, 1) u[i] = np.reshape(u[i], sh_u) return u return _AddPlateAxis class NodeConstantScalar(Node): @staticmethod def compute_fixed_u_and_f(x): """ Compute u(x) and f(x) for given x. """ return ([x], 0) def __init__(self, a, **kwargs): self.u = [a] super().__init__(self, plates=np.shape(a), dims=[()], **kwargs) def start_optimization(self): # FIXME: Set the plate sizes appropriately!! x0 = self.u[0] #self.gradient = np.zeros(np.shape(x0)) def transform(x): # E.g., for positive scalars you could have exp here. self.gradient = np.zeros(np.shape(x0)) self.u[0] = x def gradient(): # This would need to apply the gradient of the # transformation to the computed gradient return self.gradient return (x0, transform, gradient) def add_to_gradient(self, d): self.gradient += d def message_to_child(self, gradient=False): if gradient: return (self.u, [ [np.ones(np.shape(self.u[0])), #self.gradient] ]) self.add_to_gradient] ]) else: return self.u def stop_optimization(self): #raise Exception("Not implemented for " + str(self.__class__)) pass ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/point_estimate.py0000644000175100001770000001215400000000000024726 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2015 Jaakko Luttinen # # This file is licensed under the MIT License. 
################################################################################


import numpy as np

from .node import Node, Moments
from .stochastic import Stochastic
from .deterministic import Deterministic


class DeltaMoments(Moments):
    r"""
    Class for the moments of constants or delta distributed variables
    """

    def compute_fixed_moments(self, x):
        r"""
        Compute the moments for a fixed value
        """
        return [x]

    def compute_dims_from_values(self, x):
        r"""
        Return the shape of the moments for a fixed value.
        """
        return ((),)


class DeltaToAny(Deterministic):
    r"""
    Special converter of delta moments to any moments
    """

    def __init__(self, X, moments, **kwargs):
        self._moments = moments
        self._parent_moments = [DeltaMoments()]
        dims = moments.compute_dims_from_shape(X.get_shape(0))
        super().__init__(X, dims=dims, **kwargs)

    def _compute_moments(self, u_X):
        x = u_X[0]
        return self._moments.compute_fixed_moments(x)

    def _compute_message_to_parent(self, index, m_child, u_X):
        # Convert child message array to a gradient function
        raise NotImplementedError()

    def _compute_weights_to_parent(self, index, weights):
        raise NotImplementedError()

    def _plates_to_parent(self, index):
        raise NotImplementedError()

    def _plates_from_parent(self, index):
        raise NotImplementedError()


class Scalar(Stochastic):

    def __init__(self, plates=None):
        dims = [()]
        raise NotImplementedError()

    def get_riemannian_gradient(self):
        m_children = self._message_from_children()
        g = self.annealing * m_children[0]
        return g

    def get_gradient(self, rg):
        return rg

    def get_parameters(self):
        return self.u

    def set_parameters(self, x):
        if len(x) != 1:
            raise Exception("Wrong number of parameters. Should be 1, "
                            "is {0}".format(len(x)))
        self.u = [x[0]]
        return


class PositiveScalar(Stochastic):
    pass


class Constant(Node):
    r"""
    Node for presenting constant values.

    The node wraps arrays into proper node type.
    """

    def __init__(self, moments, x, **kwargs):
        if not isinstance(moments, Moments) and issubclass(moments, Moments):
            raise ValueError("Give moments as an object instance instead "
                             "of a class")
        self._moments = moments
        x = np.asanyarray(x)
        # Compute moments
        self.u = self._moments.compute_fixed_moments(x)
        # Dimensions of the moments
        dims = self._moments.compute_dims_from_values(x)
        # Resolve plates
        D = len(dims[0])
        if D > 0:
            plates = np.shape(self.u[0])[:-D]
        else:
            plates = np.shape(self.u[0])
        # Parent constructor
        super().__init__(dims=dims, plates=plates, **kwargs)

    def _get_id_list(self):
        """
        Returns the stochastic ID list.

        This method is used to check that same stochastic nodes are not direct
        parents of a node several times. It is only valid if there are
        intermediate stochastic nodes.

        To put it another way: each ID corresponds to one factor q(..) in the
        posterior approximation. Different IDs mean different factors, thus
        they mean independence. The parents must have independent factors.

        Stochastic nodes should return their unique ID. Deterministic nodes
        should return the IDs of their parents. Constant nodes should return
        empty list of IDs.
        """
        return []

    def get_moments(self):
        return self.u

    def set_value(self, x):
        x = np.asanyarray(x)
        shapes = [np.shape(ui) for ui in self.u]
        self.u = self._moments.compute_fixed_moments(x)
        for (i, shape) in enumerate(shapes):
            if np.shape(self.u[i]) != shape:
                raise ValueError("Incorrect shape for the array")

    def lower_bound_contribution(self, gradient=False, **kwargs):
        # Deterministic functions are delta distributions so the lower bound
        # contribution is zero.
        return 0


# bayespy-0.6.2/bayespy/inference/vmp/nodes/poisson.py
################################################################################
# Copyright (C) 2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
Module for the Poisson distribution node.
"""

import numpy as np
from scipy import special

from .expfamily import ExponentialFamily
from .expfamily import ExponentialFamilyDistribution
from .node import Moments
from .gamma import GammaMoments

from bayespy.utils import misc


class PoissonMoments(Moments):
    """
    Class for the moments of Poisson variables
    """

    dims = ( (), )

    def compute_fixed_moments(self, x):
        """
        Compute the moments for a fixed value
        """
        # Make sure the values are integers in valid range
        x = np.asanyarray(x)
        if not misc.isinteger(x):
            raise ValueError("Count not integer")
        # Now, the moments are just the counts
        return [x]

    @classmethod
    def from_values(cls, x):
        """
        Return the shape of the moments for a fixed value.

        The realizations are scalars, thus the shape of the moment is ().
        """
        return cls()


class PoissonDistribution(ExponentialFamilyDistribution):
    """
    Class for the VMP formulas of Poisson variables.
    """

    def compute_message_to_parent(self, parent, index, u, u_lambda):
        """
        Compute the message to a parent node.
        """
        if index == 0:
            m0 = -1
            m1 = np.copy(u[0])
            return [m0, m1]
        else:
            raise ValueError("Index out of bounds")

    def compute_phi_from_parents(self, u_lambda, mask=True):
        """
        Compute the natural parameter vector given parent moments.
        """
        logl = u_lambda[1]
        phi0 = logl
        return [phi0]

    def compute_moments_and_cgf(self, phi, mask=True):
        """
        Compute the moments and :math:`g(\phi)`.
""" u0 = np.exp(phi[0]) u = [u0] g = -u0 return (u, g) def compute_cgf_from_parents(self, u_lambda): """ Compute :math:`\mathrm{E}_{q(p)}[g(p)]` """ l = u_lambda[0] g = -l return g def compute_fixed_moments_and_f(self, x, mask=True): """ Compute the moments and :math:`f(x)` for a fixed value. """ # Check the validity of x x = np.asanyarray(x) if not misc.isinteger(x): raise ValueError("Values must be integers") if np.any(x < 0): raise ValueError("Values must be positive") # Compute moments u0 = np.copy(x) u = [u0] # Compute f(x) f = -special.gammaln(x+1) return (u, f) def random(self, *phi, plates=None): """ Draw a random sample from the distribution. """ return np.random.poisson(np.exp(phi[0]), size=plates) class Poisson(ExponentialFamily): """ Node for Poisson random variables. The node uses Poisson distribution: .. math:: p(x) = \mathrm{Poisson}(x|\lambda) where :math:`\lambda` is the rate parameter. Parameters ---------- l : gamma-like node or scalar or array :math:`\lambda`, rate parameter See also -------- Gamma, Exponential """ dims = ( (), ) _moments = PoissonMoments() _parent_moments = [GammaMoments()] _distribution = PoissonDistribution() def __init__(self, l, **kwargs): """ Create Poisson random variable node """ super().__init__(l, **kwargs) def __str__(self): """ Print the distribution using standard parameterization. """ l = self.u[0] return ("%s ~ Categorical(lambda)\n" " lambda =\n" "%s\n" % (self.name, l)) ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/stochastic.py0000644000175100001770000002654600000000000024060 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2013-2014 Jaakko Luttinen # # This file is licensed under the MIT License. 
################################################################################ import numpy as np from bayespy.utils import misc import h5py from .node import Node class Distribution(): """ A base class for the VMP formulas of variables. Sub-classes implement distribution specific computations. If a sub-class maps the plates differently, it needs to overload the following methods: * compute_weights_to_parent * plates_to_parent * plates_from_parent """ def compute_message_to_parent(self, parent, index, u_self, *u_parents): """ Compute the message to a parent node. """ raise NotImplementedError() def compute_weights_to_parent(self, index, weights): """ Maps the mask to the plates of a parent. """ # Sub-classes may need to overwrite this method return weights def plates_to_parent(self, index, plates): """ Resolves the plate mapping to a parent. Given the plates of the node's moments, this method returns the plates that the message to a parent has for the parent's distribution. """ return plates def plates_from_parent(self, index, plates): """ Resolve the plate mapping from a parent. Given the plates of a parent's moments, this method returns the plates that the moments has for this distribution. """ return plates def random(self, *params, plates=None): """ Draw a random sample from the distribution. """ raise NotImplementedError() def squeeze(self, axis): """Squeeze a plate axis from the distribution The default implementation does no changes to the distribution. Override if needed. """ return self class Stochastic(Node): """ Base class for nodes that are stochastic. u observed Sub-classes must implement: _compute_message_to_parent(parent, index, u_self, *u_parents) _update_distribution_and_lowerbound(self, m, *u) lowerbound(self) _compute_dims initialize_from_prior() If you want to be able to observe the variable: _compute_fixed_moments_and_f Sub-classes may need to re-implement: 1. 
If they manipulate plates: _compute_weights_to_parent(index, weights) _compute_plates_to_parent(self, index, plates) _compute_plates_from_parent(self, index, plates) """ # Sub-classes must over-write this _distribution = None def __init__(self, *args, initialize=True, dims=None, **kwargs): self._id = Node._id_counter Node._id_counter += 1 super().__init__(*args, dims=dims, **kwargs) # Initialize moment array axes = len(self.plates)*(1,) self.u = [misc.nans(axes+dim) for dim in dims] # Not observed self.observed = False self.ndims = [len(dim) for dim in self.dims] if initialize: self.initialize_from_prior() def get_pdf_nodes(self): return (self,) + super().get_pdf_nodes() def _get_pdf_nodes_conditioned_on_parents(self): return (self,) def _get_id_list(self): """ Returns the stochastic ID list. This method is used to check that same stochastic nodes are not direct parents of a node several times. It is only valid if there are intermediate stochastic nodes. To put it another way: each ID corresponds to one factor q(..) in the posterior approximation. Different IDs mean different factors, thus they mean independence. The parents must have independent factors. Stochastic nodes should return their unique ID. Deterministic nodes should return the IDs of their parents. Constant nodes should return empty list of IDs. """ return [self._id] def _compute_plates_to_parent(self, index, plates): return self._distribution.plates_to_parent(index, plates) def _compute_plates_from_parent(self, index, plates): return self._distribution.plates_from_parent(index, plates) def _compute_weights_to_parent(self, index, weights): return self._distribution.compute_weights_to_parent(index, weights) def get_moments(self): # Just for safety, do not return a reference to the moment list of this # node but instead create a copy of the list. 
return [ui for ui in self.u] def _get_message_and_mask_to_parent(self, index, u_parent=None): u_parents = self._message_from_parents(exclude=index) u_parents[index] = u_parent m = self._distribution.compute_message_to_parent(self.parents[index], index, self.u, *u_parents) mask = self._distribution.compute_weights_to_parent(index, self.mask) != 0 return (m, mask) def _set_mask(self, mask): self.mask = np.logical_or(mask, self.observed) def _check_shape(self, u, broadcast=True): if len(u) != len(self.dims): raise ValueError("Incorrect number of arrays") for (dimsi, ui) in zip(self.dims, u): sh_true = self.plates + dimsi sh = np.shape(ui) ndim = len(dimsi) errmsg = ( "Shape of the given array not equal to the shape of the node.\n" "Received shape: {0}\n" "Expected shape: {1}\n" "Check plates." .format(sh, sh_true) ) if not broadcast: if sh != sh_true: raise ValueError(errmsg) else: if ndim == 0: if not misc.is_shape_subset(sh, sh_true): raise ValueError(errmsg) else: plates_ok = misc.is_shape_subset(sh[:-ndim], self.plates) dims_ok = (sh[-ndim:] == dimsi) if not (plates_ok and dims_ok): raise ValueError(errmsg) return def _set_moments(self, u, mask=True, broadcast=True): self._check_shape(u, broadcast=broadcast) # Store the computed moments u but do not change moments for # observations, i.e., utilize the mask. for ind in range(len(u)): # Add axes to the mask for the variable dimensions (mask # contains only axes for the plates). u_mask = misc.add_trailing_axes(mask, self.ndims[ind]) # Enlarge self.u[ind] as necessary so that it can store the # broadcasted result. sh = misc.broadcasted_shape_from_arrays(self.u[ind], u[ind], u_mask) self.u[ind] = misc.repeat_to_shape(self.u[ind], sh) # TODO/FIXME/BUG: The mask of observations is not used, observations # may be overwritten!!! ??? # Hah, this function is used to set the observations! The caller # should be careful what mask he uses! If you want to set only # latent variables, then use such a mask. 
# Use mask to update only unobserved plates and keep the # observed as before np.copyto(self.u[ind], u[ind], where=u_mask) # Make sure u has the correct number of dimensions: shape = self.get_shape(ind) ndim = len(shape) ndim_u = np.ndim(self.u[ind]) if ndim > ndim_u: self.u[ind] = misc.add_leading_axes(u[ind], ndim - ndim_u) elif ndim < ndim_u: # This should not ever happen because we already checked the # shape at the beginning of the function. raise RuntimeError( "This error should not happen. Fix shape checking." "The size of the variable %s's %s-th moment " "array is %s which is larger than it should " "be, that is, %s, based on the plates %s and " "dimension %s. Check that you have provided " "plates properly." % (self.name, ind, np.shape(self.u[ind]), shape, self.plates, self.dims[ind])) def update(self, annealing=1.0): if not np.all(self.observed): u_parents = self._message_from_parents() m_children = self._message_from_children() if annealing != 1.0: m_children = [annealing * m for m in m_children] self._update_distribution_and_lowerbound(m_children, *u_parents) def observe(self, x, mask=True): """ Fix moments, compute f and propagate mask. """ raise NotImplementedError() def unobserve(self): # Update mask self.observed = False self._update_mask() def lowerbound(self): # Sub-class should implement this raise NotImplementedError() def _update_distribution_and_lowerbound(self, m_children, *u_parents): # Sub-classes should implement this raise NotImplementedError() def save(self, filename): # Open HDF5 file h5f = h5py.File(filename, 'w') try: # Write each node nodegroup = h5f.create_group('nodes') if self.name == '': raise ValueError("In order to save nodes, they must have " "(unique) names.") self._save(nodegroup.create_group(self.name)) finally: # Close file h5f.close() def _save(self, group): """ Save the state of the node into a HDF5 file. 
group can be the root """ for i in range(len(self.u)): misc.write_to_hdf5(group, self.u[i], 'u%d' % i) misc.write_to_hdf5(group, self.observed, 'observed') return def load(self, filename): h5f = h5py.File(filename, 'r') try: self._load(h5f['nodes'][self.name]) finally: h5f.close() return def _load(self, group): """ Load the state of the node from a HDF5 file. """ # TODO/FIXME: Check that the shapes are correct! for i in range(len(self.u)): ui = group['u%d' % i][...] self.u[i] = ui old_observed = self.observed self.observed = group['observed'][...] # Update masks if necessary if np.any(old_observed != self.observed): self._update_mask() def random(self): """ Draw a random sample from the distribution. """ raise NotImplementedError() def show(self): """ Print the distribution using standard parameterization. """ print(str(self)) def __str__(self): """ """ raise NotImplementedError("String representation not yet implemented for " "node class %s" % (self.__class__.__name__)) ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/take.py0000644000175100001770000001050400000000000022623 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2015 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ import numpy as np from .deterministic import Deterministic from .node import Moments from bayespy.utils import misc class Take(Deterministic): """ Choose elements/sub-arrays along a plate axis Basically, applies `np.take` on a plate axis. Allows advanced mapping of plates. Parameters ---------- node : Node A node to apply the take operation on. indices : array of integers Plate elements to pick along a plate axis. plate_axis : int (negative) The plate axis to pick elements from (default: -1). 
See also -------- numpy.take Examples -------- >>> from bayespy.nodes import Gamma, Take >>> alpha = Gamma([1, 2, 3], [1, 1, 1]) >>> x = Take(alpha, [1, 1, 2, 2, 1, 0]) >>> x.get_moments()[0] array([2., 2., 3., 3., 2., 1.]) """ def __init__(self, node, indices, plate_axis=-1, **kwargs): self._moments = node._moments self._parent_moments = (node._moments,) self._indices = np.array(indices) self._plate_axis = plate_axis self._original_length = node.plates[plate_axis] # Validate arguments if not misc.is_scalar_integer(plate_axis): raise ValueError("Plate axis must be integer") if plate_axis >= 0: raise ValueError("plate_axis must be negative index") if plate_axis < -len(node.plates): raise ValueError("plate_axis out of bounds") if not issubclass(self._indices.dtype.type, np.integer): raise ValueError("Indices must be integers") if (np.any(self._indices < -self._original_length) or np.any(self._indices >= self._original_length)): raise ValueError("Index out of bounds") super().__init__(node, dims=node.dims, **kwargs) def _compute_moments(self, u_parent): u = [] for (ui, dimi) in zip(u_parent, self.dims): axis = self._plate_axis - len(dimi) # Just in case the taken axis is using broadcasting and has unit # length in u_parent, force it to have the correct length along the # axis in order to avoid errors in np.take. 
broadcaster = np.ones((self._original_length,) + (-axis-1)*(1,)) u.append(np.take(ui*broadcaster, self._indices, axis=axis)) return u def _compute_message_to_parent(self, index, m_child, u_parent): m = [ misc.put_simple( mi, self._indices, axis=self._plate_axis-len(dimi), length=self._original_length, ) for (mi, dimi) in zip(m_child, self.dims) ] return m def _compute_weights_to_parent(self, index, weights): return misc.put_simple( weights, self._indices, axis=self._plate_axis, length=self._original_length, ) def _compute_plates_to_parent(self, index, plates): # Number of axes created by take operation N = np.ndim(self._indices) if self._plate_axis >= 0: raise RuntimeError("Plate axis should be negative") end_before = self._plate_axis - N + 1 start_after = self._plate_axis + 1 if end_before == 0: return plates + (self._original_length,) elif start_after == 0: return plates[:end_before] + (self._original_length,) return (plates[:end_before] + (self._original_length,) + plates[start_after:]) def _compute_plates_from_parent(self, index, parent_plates): plates = parent_plates[:self._plate_axis] + np.shape(self._indices) if self._plate_axis != -1: plates = plates + parent_plates[(self._plate_axis+1):] return plates def _compute_plates_multiplier_from_parent(self, index, parent_multiplier): if any(p != 1 for p in parent_multiplier): raise NotImplementedError( "Take node doesn't yet support plate multipliers {0}" .format(parent_multiplier) ) return parent_multiplier ././@PaxHeader0000000000000000000000000000003300000000000011451 xustar000000000000000027 mtime=1725273981.413372 bayespy-0.6.2/bayespy/inference/vmp/nodes/tests/0000755000175100001770000000000000000000000022467 5ustar00runnerdocker00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/tests/__init__.py0000644000175100001770000000000000000000000024566 
0ustar00runnerdocker00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/tests/test_bernoulli.py0000644000175100001770000000673600000000000026107 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2014 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ """ Unit tests for `bernoulli` module. """ import numpy as np import scipy from bayespy.nodes import (Bernoulli, Beta, Mixture) from bayespy.utils import random from bayespy.utils.misc import TestCase class TestBernoulli(TestCase): """ Unit tests for Bernoulli node """ def test_init(self): """ Test the creation of Bernoulli nodes. """ # Some simple initializations X = Bernoulli(0.5) X = Bernoulli(Beta([2,3])) # Check that plates are correct X = Bernoulli(0.7, plates=(4,3)) self.assertEqual(X.plates, (4,3)) X = Bernoulli(0.7*np.ones((4,3))) self.assertEqual(X.plates, (4,3)) X = Bernoulli(Beta([4,3], plates=(4,3))) self.assertEqual(X.plates, (4,3)) # Invalid probability self.assertRaises(ValueError, Bernoulli, -0.5) self.assertRaises(ValueError, Bernoulli, 1.5) # Inconsistent plates self.assertRaises(ValueError, Bernoulli, 0.5*np.ones(4), plates=(3,)) # Explicit plates too small self.assertRaises(ValueError, Bernoulli, 0.5*np.ones(4), plates=(1,)) pass def test_moments(self): """ Test the moments of Bernoulli nodes. 
""" # Simple test X = Bernoulli(0.7) u = X._message_to_child() self.assertEqual(len(u), 1) self.assertAllClose(u[0], 0.7) # Test plates in p p = np.random.rand(3) X = Bernoulli(p) u = X._message_to_child() self.assertAllClose(u[0], p) # Test with beta prior P = Beta([7, 3]) logp = P._message_to_child()[0] p0 = np.exp(logp[0]) / (np.exp(logp[0]) + np.exp(logp[1])) X = Bernoulli(P) u = X._message_to_child() self.assertAllClose(u[0], p0) # Test with broadcasted plates P = Beta([7, 3], plates=(10,)) X = Bernoulli(P) u = X._message_to_child() self.assertAllClose(u[0] * np.ones(X.get_shape(0)), p0*np.ones(10)) pass def test_mixture(self): """ Test mixture of Bernoulli """ P = Mixture([2,0,0], Bernoulli, [0.1, 0.2, 0.3]) u = P._message_to_child() self.assertEqual(len(u), 1) self.assertAllClose(u[0], [0.3, 0.1, 0.1]) pass def test_observed(self): """ Test observation of Bernoulli node """ Z = Bernoulli(0.3) Z.observe(2 < 3) pass def test_random(self): """ Test random sampling in Bernoulli node """ p = [1.0, 0.0] with np.errstate(divide='ignore'): Z = Bernoulli(p, plates=(3,2)).random() self.assertArrayEqual(Z, np.ones((3,2))*p) ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/tests/test_beta.py0000644000175100001770000000515300000000000025017 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2014 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ """ Unit tests for `beta` module. """ import numpy as np from scipy import special from bayespy.nodes import Beta from bayespy.utils import random from bayespy.utils.misc import TestCase class TestBeta(TestCase): """ Unit tests for Beta node """ def test_init(self): """ Test the creation of beta nodes. 
""" # Some simple initializations p = Beta([1.5, 4.2]) # Check that plates are correct p = Beta([2, 3], plates=(4,3)) self.assertEqual(p.plates, (4,3)) p = Beta(np.ones((4,3,2))) self.assertEqual(p.plates, (4,3)) # Parent not a vector self.assertRaises(ValueError, Beta, 4) # Parent vector has wrong shape self.assertRaises(ValueError, Beta, [4]) self.assertRaises(ValueError, Beta, [4,4,4]) # Parent vector has invalid values self.assertRaises(ValueError, Beta, [-2,3]) # Plates inconsistent self.assertRaises(ValueError, Beta, np.ones((4,2)), plates=(3,)) # Explicit plates too small self.assertRaises(ValueError, Beta, np.ones((4,2)), plates=(1,)) pass def test_moments(self): """ Test the moments of beta nodes. """ p = Beta([2, 3]) u = p._message_to_child() self.assertAllClose(u[0], special.psi([2,3]) - special.psi(2+3)) pass def test_random(self): """ Test random sampling of beta nodes. """ p = Beta([1e20, 3e20]) x = p.random() self.assertAllClose(x, 0.25) p = Beta([[1e20, 3e20], [1e20, 1e20]]) x = p.random() self.assertAllClose(x, [0.25, 0.5]) p = Beta([1e20, 3e20], plates=(3,)) x = p.random() self.assertAllClose(x, [0.25, 0.25, 0.25]) pass ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/tests/test_binomial.py0000644000175100001770000001502700000000000025677 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2014 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ """ Unit tests for `binomial` module. """ import numpy as np import scipy from bayespy.nodes import (Binomial, Beta, Mixture) from bayespy.utils import random from bayespy.utils.misc import TestCase class TestBinomial(TestCase): """ Unit tests for Binomial node """ def test_init(self): """ Test the creation of binomial nodes. 
""" # Some simple initializations X = Binomial(10, 0.5) X = Binomial(10, Beta([2,3])) # Check that plates are correct X = Binomial(10, 0.7, plates=(4,3)) self.assertEqual(X.plates, (4,3)) X = Binomial(10, 0.7*np.ones((4,3))) self.assertEqual(X.plates, (4,3)) n = np.ones((4,3), dtype=np.int64) X = Binomial(n, 0.7) self.assertEqual(X.plates, (4,3)) X = Binomial(10, Beta([4,3], plates=(4,3))) self.assertEqual(X.plates, (4,3)) # Invalid probability self.assertRaises(ValueError, Binomial, 10, -0.5) self.assertRaises(ValueError, Binomial, 10, 1.5) # Invalid number of trials self.assertRaises(ValueError, Binomial, -1, 0.5) self.assertRaises(ValueError, Binomial, 8.5, 0.5) # Inconsistent plates self.assertRaises(ValueError, Binomial, 10, 0.5*np.ones(4), plates=(3,)) # Explicit plates too small self.assertRaises(ValueError, Binomial, 10, 0.5*np.ones(4), plates=(1,)) pass def test_moments(self): """ Test the moments of binomial nodes. """ # Simple test X = Binomial(1, 0.7) u = X._message_to_child() self.assertEqual(len(u), 1) self.assertAllClose(u[0], 0.7) # Test n X = Binomial(10, 0.7) u = X._message_to_child() self.assertAllClose(u[0], 10*0.7) # Test plates in p n = np.random.randint(1, 10) p = np.random.rand(3) X = Binomial(n, p) u = X._message_to_child() self.assertAllClose(u[0], p*n) # Test plates in n n = np.random.randint(1, 10, size=(3,)) p = np.random.rand() X = Binomial(n, p) u = X._message_to_child() self.assertAllClose(u[0], p*n) # Test plates in p and n n = np.random.randint(1, 10, size=(4,1)) p = np.random.rand(3) X = Binomial(n, p) u = X._message_to_child() self.assertAllClose(u[0], p*n) # Test with beta prior P = Beta([7, 3]) logp = P._message_to_child()[0] p0 = np.exp(logp[0]) / (np.exp(logp[0]) + np.exp(logp[1])) X = Binomial(1, P) u = X._message_to_child() self.assertAllClose(u[0], p0) # Test with broadcasted plates P = Beta([7, 3], plates=(10,)) X = Binomial(5, P) u = X._message_to_child() self.assertAllClose(u[0] * np.ones(X.get_shape(0)), 
5*p0*np.ones(10)) pass def test_mixture(self): """ Test binomial mixture """ X = Mixture(2, Binomial, 10, [0.1, 0.2, 0.3, 0.4]) u = X._message_to_child() self.assertAllClose(u[0], 3.0) pass def test_observed(self): """ Test observation of Bernoulli node """ Z = Binomial(10, 0.3) Z.observe(10) u = Z._message_to_child() self.assertAllClose(u[0], 10) Z = Binomial(10, 0.9) Z.observe(2) u = Z._message_to_child() self.assertAllClose(u[0], 2) pass def test_random(self): """ Test random sampling in Binomial node """ N = [ [5], [50] ] p = [1.0, 0.0] with np.errstate(divide='ignore'): Z = Binomial(N, p, plates=(3,2,2)).random() self.assertArrayEqual(Z, np.ones((3,2,2))*N*p) def test_mixture_with_count_array(self): """ Test binomial mixture with varying number of trials """ p0 = 0.6 p1 = 0.9 p2 = 0.3 counts = [[10], [5], [3]] X = Mixture(2, Binomial, counts, [p0, p1, p2]) u = X._message_to_child() self.assertAllClose( u[0], np.array(counts)[:, 0] * np.array(p2) ) # Multi-mixture and count array # Shape(p) = (2, 1, 3) + () p = [ [[0.6, 0.9, 0.8]], [[0.1, 0.2, 0.3]], ] # Shape(Z1) = (1, 3) + (2,) -> () + (2,) Z1 = 1 # Shape(Z2) = (1,) + (3,) -> () + (3,) Z2 = 2 # Shape(counts) = (5, 1) counts = [[10], [5], [3], [2], [4]] # Shape(X) = (5,) + () X = Mixture( Z1, Mixture, Z2, Binomial, counts, p, # NOTE: We mix over axes -3 and -1. But as we first mix over the # default (-1), then the next mixing happens over -2 (because one # axis was already dropped). 
cluster_plate=-2, ) self.assertAllClose( X._message_to_child()[0], np.array(counts)[:, 0] * np.array(p)[Z1, :, Z2] ) # Can't have non-singleton axis in counts over the mixed axis p0 = 0.6 p1 = 0.9 p2 = 0.3 counts = [10, 5, 3] self.assertRaises( ValueError, Mixture, 2, Binomial, counts, [p0, p1, p2], ) return ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/tests/test_categorical.py0000644000175100001770000001564700000000000026372 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2014 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ """ Unit tests for `categorical` module. """ import warnings import numpy as np import scipy from bayespy.nodes import (Categorical, Dirichlet, Mixture, Gamma) from bayespy.utils import random from bayespy.utils import misc from bayespy.utils.misc import TestCase class TestCategorical(TestCase): """ Unit tests for Categorical node """ def test_init(self): """ Test the creation of categorical nodes. 
""" # Some simple initializations X = Categorical([0.1, 0.3, 0.6]) X = Categorical(Dirichlet([5,4,3])) # Check that plates are correct X = Categorical([0.1, 0.3, 0.6], plates=(3,4)) self.assertEqual(X.plates, (3,4)) X = Categorical(0.25*np.ones((2,3,4))) self.assertEqual(X.plates, (2,3)) X = Categorical(Dirichlet([2,1,9], plates=(3,4))) self.assertEqual(X.plates, (3,4)) # Probabilities not a vector self.assertRaises(ValueError, Categorical, 0.5) # Invalid probability self.assertRaises(ValueError, Categorical, [-0.5, 1.5], n=10) self.assertRaises(ValueError, Categorical, [0.5, 1.5], n=10) # Inconsistent plates self.assertRaises(ValueError, Categorical, 0.25*np.ones((2,4)), plates=(3,), n=10) # Explicit plates too small self.assertRaises(ValueError, Categorical, 0.25*np.ones((2,4)), plates=(1,), n=10) pass def test_moments(self): """ Test the moments of categorical nodes. """ # Simple test X = Categorical([0.7,0.2,0.1]) u = X._message_to_child() self.assertEqual(len(u), 1) self.assertAllClose(u[0], [0.7,0.2,0.1]) # Test plates in p p = np.random.dirichlet([1,1], size=3) X = Categorical(p) u = X._message_to_child() self.assertAllClose(u[0], p) # Test with Dirichlet prior P = Dirichlet([7, 3]) logp = P._message_to_child()[0] p0 = np.exp(logp[0]) / (np.exp(logp[0]) + np.exp(logp[1])) p1 = np.exp(logp[1]) / (np.exp(logp[0]) + np.exp(logp[1])) X = Categorical(P) u = X._message_to_child() p = np.array([p0, p1]) self.assertAllClose(u[0], p) # Test with broadcasted plates P = Dirichlet([7, 3], plates=(10,)) X = Categorical(P) u = X._message_to_child() self.assertAllClose(u[0] * np.ones(X.get_shape(0)), p*np.ones((10,1))) pass def test_observed(self): """ Test observed categorical nodes """ # Single observation X = Categorical([0.7,0.2,0.1]) X.observe(2) u = X._message_to_child() self.assertAllClose(u[0], [0,0,1]) # One plate axis X = Categorical([0.7,0.2,0.1], plates=(2,)) X.observe([2,1]) u = X._message_to_child() self.assertAllClose(u[0], [[0,0,1], [0,1,0]]) # Several 
plate axes X = Categorical([0.7,0.1,0.1,0.1], plates=(2,3,)) X.observe([[2,1,1], [0,2,3]]) u = X._message_to_child() self.assertAllClose(u[0], [ [[0,0,1,0], [0,1,0,0], [0,1,0,0]], [[1,0,0,0], [0,0,1,0], [0,0,0,1]] ]) # Check invalid observations X = Categorical([0.7,0.2,0.1]) self.assertRaises(ValueError, X.observe, -1) self.assertRaises(ValueError, X.observe, 3) self.assertRaises(ValueError, X.observe, 1.5) pass def test_constant(self): """ Test constant categorical nodes """ # Basic test Y = Mixture(2, Gamma, [1, 2, 3], [1, 1, 1]) u = Y._message_to_child() self.assertAllClose(u[0], 3/1) # Test with one plate axis alpha = [[1, 2, 3], [4, 5, 6]] Y = Mixture([2, 1], Gamma, alpha, 1) u = Y._message_to_child() self.assertAllClose(u[0], [3, 5]) # Test with two plate axes alpha = [ [[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]] ] Y = Mixture([[2, 1], [0, 2]], Gamma, alpha, 1) u = Y._message_to_child() self.assertAllClose(u[0], [[3, 5], [7, 12]]) pass def test_initialization(self): """ Test initialization of categorical nodes """ # Test initialization from random with warnings.catch_warnings(): warnings.simplefilter("ignore", RuntimeWarning) Z = Categorical([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]) Z.initialize_from_random() u = Z._message_to_child() self.assertAllClose(u[0], [[0, 1, 0], [0, 0, 1]]) pass def test_gradient(self): """ Check the Euclidean gradient of the categorical node """ Z = Categorical([[0.3, 0.5, 0.2], [0.1, 0.6, 0.3]]) Y = Mixture(Z, Gamma, [2, 3, 4], [5, 6, 7]) Y.observe([4.2, 0.2]) def f(x): Z.set_parameters([np.reshape(x, Z.get_shape(0))]) return Z.lower_bound_contribution() + Y.lower_bound_contribution() def df(x): Z.set_parameters([np.reshape(x, Z.get_shape(0))]) g = Z.get_riemannian_gradient() return Z.get_gradient(g)[0] x0 = np.ravel(np.log([[2, 3, 7], [0.1, 3, 1]])) self.assertAllClose( misc.gradient(f, x0), np.ravel(df(x0)) ) pass ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 
bayespy-0.6.2/bayespy/inference/vmp/nodes/tests/test_categorical_markov_chain.py0000644000175100001770000001627100000000000031105 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2014 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ """ Unit tests for bayespy.inference.vmp.nodes.categorical_markov_chain module. """ import warnings import numpy as np from bayespy.utils import misc from bayespy.inference.vmp.nodes import CategoricalMarkovChain, \ Dirichlet class TestCategoricalMarkovChain(misc.TestCase): def test_init(self): """ Test the creation of CategoricalMarkovChain object """ # # Plates # p0 = np.random.dirichlet([1, 1]) P = np.random.dirichlet([1, 1], size=(3,2)) Z = CategoricalMarkovChain(p0, P) self.assertEqual((), Z.plates, msg="Incorrect plates") self.assertEqual(((2,),(3,2,2)), Z.dims, msg="Incorrect dimensions") p0 = np.random.dirichlet([1, 1], size=(4,)) P = np.random.dirichlet([1, 1], size=(3,2)) Z = CategoricalMarkovChain(p0, P) self.assertEqual((4,), Z.plates, msg="Incorrect plates") self.assertEqual(((2,),(3,2,2)), Z.dims, msg="Incorrect dimensions") p0 = np.random.dirichlet([1, 1]) P = np.random.dirichlet([1, 1], size=(4,3,2)) Z = CategoricalMarkovChain(p0, P) self.assertEqual((4,), Z.plates, msg="Incorrect plates") self.assertEqual(((2,),(3,2,2)), Z.dims, msg="Incorrect dimensions") # Test some overflow bugs p0 = np.array([0.5, 0.5]) P = Dirichlet(1e-3*np.ones(2), plates=(2,)) Z = CategoricalMarkovChain(p0, P, states=2000) u = Z._message_to_child() self.assertTrue(np.all(~np.isnan(u[0])), msg="Nans in moments") self.assertTrue(np.all(~np.isnan(u[1])), msg="Nans in moments") pass def test_message_to_child(self): """ Test the message of CategoricalMarkovChain to child """ with warnings.catch_warnings(): warnings.simplefilter("ignore", RuntimeWarning) # Deterministic 
oscillator
        p0 = np.array([1.0, 0.0])
        P = np.array(3*[[[0.0, 1.0], [1.0, 0.0]]])
        Z = CategoricalMarkovChain(p0, P)
        u = Z._message_to_child()
        self.assertAllClose(u[0], [1.0, 0])
        self.assertAllClose(u[1], [[[0.0, 1.0], [0.0, 0.0]],
                                   [[0.0, 0.0], [1.0, 0.0]],
                                   [[0.0, 1.0], [0.0, 0.0]]])

        # Maximum randomness
        p0 = np.array([0.5, 0.5])
        P = np.array(3*[[[0.5, 0.5], [0.5, 0.5]]])
        Z = CategoricalMarkovChain(p0, P)
        u = Z._message_to_child()
        self.assertAllClose(u[0], [0.5, 0.5])
        self.assertAllClose(u[1], [[[0.25, 0.25], [0.25, 0.25]],
                                   [[0.25, 0.25], [0.25, 0.25]],
                                   [[0.25, 0.25], [0.25, 0.25]]])

        # Random init, deterministic dynamics
        p0 = np.array([0.5, 0.5])
        P = np.array(3*[[[0, 1], [1, 0]]])
        Z = CategoricalMarkovChain(p0, P)
        u = Z._message_to_child()
        self.assertAllClose(u[0], [0.5, 0.5])
        self.assertAllClose(u[1], [[[0.0, 0.5], [0.5, 0.0]],
                                   [[0.0, 0.5], [0.5, 0.0]],
                                   [[0.0, 0.5], [0.5, 0.0]]])

        # Test plates
        p0 = np.array([[1.0, 0.0],
                       [0.5, 0.5]])
        P = np.array([[[[0.0, 1.0], [1.0, 0.0]]],
                      [[[0.5, 0.5], [0.5, 0.5]]]])
        Z = CategoricalMarkovChain(p0, P)
        u = Z._message_to_child()
        self.assertAllClose(u[0], [[1.0, 0.0], [0.5, 0.5]])
        self.assertAllClose(u[1], [[[[0.0, 1.0], [0.0, 0.0]]],
                                   [[[0.25, 0.25], [0.25, 0.25]]]])

        # Test broadcasted state transition probabilities
        p0 = np.array([1.0, 0.0])
        P = Dirichlet([1e-10, 1e10], plates=(3,2))
        Z = CategoricalMarkovChain(p0, P)
        u = Z._message_to_child()
        self.assertAllClose(u[0], [1.0, 0])
        self.assertAllClose(u[1], [[[0.0, 1.0], [0.0, 0.0]],
                                   [[0.0, 0.0], [0.0, 1.0]],
                                   [[0.0, 0.0], [0.0, 1.0]]])

        pass

    def test_random(self):
        """
        Test random sampling of categorical Markov chain
        """
        with warnings.catch_warnings():
            warnings.simplefilter("ignore", RuntimeWarning)

            # Simple random sample
            Z = CategoricalMarkovChain([1, 0], [[0, 1], [1, 0]], states=3)
            z = Z.random()
            self.assertAllClose(z, [0, 1, 0])

            # Draw sample with plates
            p0 = [[1, 0],
                  [0, 1]]
            P = [[[[0, 1], [1, 0]]],
                 [[[1, 0], [0, 1]]]]
            Z = CategoricalMarkovChain(p0, P, states=3)
            z = Z.random()
            self.assertAllClose(z, [[0, 1, 0], [1, 1, 1]])

            # Draw sample with plates, parameters broadcasted
            Z = CategoricalMarkovChain([1, 0], [[0, 1], [1, 0]],
                                       states=3, plates=(3,4))
            z = Z.random()
            self.assertAllClose(z, np.ones((3,4,1))*[0, 1, 0])

            # Draw sample with time-varying transition matrix
            p0 = [1, 0]
            P = [[[0, 1], [1, 0]],
                 [[1, 0], [0, 1]],
                 [[1, 0], [1, 0]]]
            Z = CategoricalMarkovChain(p0, P, states=4)
            z = Z.random()
            self.assertAllClose(z, [0, 1, 1, 0])

        pass

# ---- bayespy-0.6.2/bayespy/inference/vmp/nodes/tests/test_concatenate.py ----

################################################################################
# Copyright (C) 2015 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
Unit tests for `concatenate` module.
"""

import warnings

import numpy as np

from bayespy.nodes import (Concatenate,
                           GaussianARD,
                           Gamma)

from bayespy.utils import random
from bayespy.utils.misc import TestCase


class TestConcatenate(TestCase):
    """
    Unit tests for Concatenate node.
    """

    def test_init(self):
        """
        Test the creation of Concatenate node
        """
        # One parent only
        X = GaussianARD(0, 1, plates=(3,), shape=())
        Y = Concatenate(X)
        self.assertEqual(Y.plates, (3,))
        self.assertEqual(Y.dims, ( (), () ))

        X = GaussianARD(0, 1, plates=(3,), shape=(2,4))
        Y = Concatenate(X)
        self.assertEqual(Y.plates, (3,))
        self.assertEqual(Y.dims, ( (2,4), (2,4,2,4) ))

        # Two parents
        X1 = GaussianARD(0, 1, plates=(2,), shape=())
        X2 = GaussianARD(0, 1, plates=(3,), shape=())
        Y = Concatenate(X1, X2)
        self.assertEqual(Y.plates, (5,))
        self.assertEqual(Y.dims, ( (), () ))

        # Two parents with shapes
        X1 = GaussianARD(0, 1, plates=(2,), shape=(4,6))
        X2 = GaussianARD(0, 1, plates=(3,), shape=(4,6))
        Y = Concatenate(X1, X2)
        self.assertEqual(Y.plates, (5,))
        self.assertEqual(Y.dims, ( (4,6), (4,6,4,6) ))

        # Two parents with non-default axis
        X1 = GaussianARD(0, 1, plates=(2,4), shape=())
        X2 = GaussianARD(0, 1, plates=(3,4), shape=())
        Y = Concatenate(X1, X2, axis=-2)
        self.assertEqual(Y.plates, (5,4))
        self.assertEqual(Y.dims, ( (), () ))

        # Three parents
        X1 = GaussianARD(0, 1, plates=(2,), shape=())
        X2 = GaussianARD(0, 1, plates=(3,), shape=())
        X3 = GaussianARD(0, 1, plates=(4,), shape=())
        Y = Concatenate(X1, X2, X3)
        self.assertEqual(Y.plates, (9,))
        self.assertEqual(Y.dims, ( (), () ))

        # Constant parent
        X1 = [7.2, 3.5]
        X2 = GaussianARD(0, 1, plates=(3,), shape=())
        Y = Concatenate(X1, X2)
        self.assertEqual(Y.plates, (5,))
        self.assertEqual(Y.dims, ( (), () ))

        # Different moments
        X1 = GaussianARD(0, 1, plates=(3,))
        X2 = Gamma(1, 1, plates=(4,))
        self.assertRaises(ValueError, Concatenate, X1, X2)

        # Incompatible shapes
        X1 = GaussianARD(0, 1, plates=(3,), shape=(2,))
        X2 = GaussianARD(0, 1, plates=(2,), shape=())
        self.assertRaises(ValueError, Concatenate, X1, X2)

        # Incompatible plates
        X1 = GaussianARD(0, 1, plates=(4,3), shape=())
        X2 = GaussianARD(0, 1, plates=(5,2,), shape=())
        self.assertRaises(ValueError, Concatenate, X1, X2)

        pass

    def test_message_to_child(self):
        """
        Test the message to child of Concatenate node.
        """
        var = lambda plates, shape: GaussianARD(
            np.random.randn(*(plates + shape)),
            np.random.rand(*(plates + shape)),
            plates=plates,
            shape=shape
        )

        # Two parents without shapes
        X1 = var((2,), ())
        X2 = var((3,), ())
        Y = Concatenate(X1, X2)
        u1 = X1.get_moments()
        u2 = X2.get_moments()
        u = Y.get_moments()
        self.assertAllClose((u[0]*np.ones((5,)))[:2], u1[0]*np.ones((2,)))
        self.assertAllClose((u[1]*np.ones((5,)))[:2], u1[1]*np.ones((2,)))
        self.assertAllClose((u[0]*np.ones((5,)))[2:], u2[0]*np.ones((3,)))
        self.assertAllClose((u[1]*np.ones((5,)))[2:], u2[1]*np.ones((3,)))

        # Two parents with shapes
        X1 = var((2,), (4,))
        X2 = var((3,), (4,))
        Y = Concatenate(X1, X2)
        u1 = X1.get_moments()
        u2 = X2.get_moments()
        u = Y.get_moments()
        self.assertAllClose((u[0]*np.ones((5,4)))[:2], u1[0]*np.ones((2,4)))
        self.assertAllClose((u[1]*np.ones((5,4,4)))[:2], u1[1]*np.ones((2,4,4)))
        self.assertAllClose((u[0]*np.ones((5,4)))[2:], u2[0]*np.ones((3,4)))
        self.assertAllClose((u[1]*np.ones((5,4,4)))[2:], u2[1]*np.ones((3,4,4)))

        # Test with non-constant axis
        X1 = GaussianARD(0, 1, plates=(2,4), shape=())
        X2 = GaussianARD(0, 1, plates=(3,4), shape=())
        Y = Concatenate(X1, X2, axis=-2)
        u1 = X1.get_moments()
        u2 = X2.get_moments()
        u = Y.get_moments()
        self.assertAllClose((u[0]*np.ones((5,4)))[:2], u1[0]*np.ones((2,4)))
        self.assertAllClose((u[1]*np.ones((5,4)))[:2], u1[1]*np.ones((2,4)))
        self.assertAllClose((u[0]*np.ones((5,4)))[2:], u2[0]*np.ones((3,4)))
        self.assertAllClose((u[1]*np.ones((5,4)))[2:], u2[1]*np.ones((3,4)))

        # Test with constant parent
        X1 = np.random.randn(2, 4)
        X2 = GaussianARD(0, 1, plates=(3,), shape=(4,))
        Y = Concatenate(X1, X2)
        u1 = Y.parents[0].get_moments()
        u2 = X2.get_moments()
        u = Y.get_moments()
        self.assertAllClose((u[0]*np.ones((5,4)))[:2], u1[0]*np.ones((2,4)))
        self.assertAllClose((u[1]*np.ones((5,4,4)))[:2], u1[1]*np.ones((2,4,4)))
        self.assertAllClose((u[0]*np.ones((5,4)))[2:], u2[0]*np.ones((3,4)))
        self.assertAllClose((u[1]*np.ones((5,4,4)))[2:], u2[1]*np.ones((3,4,4)))

        pass

    def test_message_to_parent(self):
        """
        Test the message to parents of Concatenate node.
        """
        # Two parents without shapes
        X1 = GaussianARD(0, 1, plates=(2,), shape=())
        X2 = GaussianARD(0, 1, plates=(3,), shape=())
        Z = Concatenate(X1, X2)
        Y = GaussianARD(Z, 1)
        Y.observe(np.random.randn(*Y.get_shape(0)))
        m1 = X1._message_from_children()
        m2 = X2._message_from_children()
        m = Z._message_from_children()
        self.assertAllClose((m[0]*np.ones((5,)))[:2], m1[0]*np.ones((2,)))
        self.assertAllClose((m[1]*np.ones((5,)))[:2], m1[1]*np.ones((2,)))
        self.assertAllClose((m[0]*np.ones((5,)))[2:], m2[0]*np.ones((3,)))
        self.assertAllClose((m[1]*np.ones((5,)))[2:], m2[1]*np.ones((3,)))

        # Two parents with shapes
        with warnings.catch_warnings():
            warnings.simplefilter("ignore", FutureWarning)

            X1 = GaussianARD(0, 1, plates=(2,), shape=(4,6))
            X2 = GaussianARD(0, 1, plates=(3,), shape=(4,6))
            Z = Concatenate(X1, X2)
            Y = GaussianARD(Z, 1)
            Y.observe(np.random.randn(*Y.get_shape(0)))
            m1 = X1._message_from_children()
            m2 = X2._message_from_children()
            m = Z._message_from_children()
            self.assertAllClose((m[0]*np.ones((5,4,6)))[:2],
                                m1[0]*np.ones((2,4,6)))
            self.assertAllClose((m[1]*np.ones((5,4,6,4,6)))[:2],
                                m1[1]*np.ones((2,4,6,4,6)))
            self.assertAllClose((m[0]*np.ones((5,4,6)))[2:],
                                m2[0]*np.ones((3,4,6)))
            self.assertAllClose((m[1]*np.ones((5,4,6,4,6)))[2:],
                                m2[1]*np.ones((3,4,6,4,6)))

        # Two parents with non-default concatenation axis
        X1 = GaussianARD(0, 1, plates=(2,4), shape=())
        X2 = GaussianARD(0, 1, plates=(3,4), shape=())
        Z = Concatenate(X1, X2, axis=-2)
        Y = GaussianARD(Z, 1)
        Y.observe(np.random.randn(*Y.get_shape(0)))
        m1 = X1._message_from_children()
        m2 = X2._message_from_children()
        m = Z._message_from_children()
        self.assertAllClose((m[0]*np.ones((5,4)))[:2], m1[0]*np.ones((2,4)))
        self.assertAllClose((m[1]*np.ones((5,4)))[:2], m1[1]*np.ones((2,4)))
        self.assertAllClose((m[0]*np.ones((5,4)))[2:], m2[0]*np.ones((3,4)))
        self.assertAllClose((m[1]*np.ones((5,4)))[2:], m2[1]*np.ones((3,4)))

        # Constant parent
        X1 = np.random.randn(2,4,6)
        X2 = GaussianARD(0, 1, plates=(3,), shape=(4,6))
        Z = Concatenate(X1, X2)
        Y = GaussianARD(Z, 1)
        Y.observe(np.random.randn(*Y.get_shape(0)))
        m1 = Z._message_to_parent(0)
        m2 = X2._message_from_children()
        m = Z._message_from_children()
        self.assertAllClose((m[0]*np.ones((5,4,6)))[:2],
                            m1[0]*np.ones((2,4,6)))
        self.assertAllClose((m[1]*np.ones((5,4,6,4,6)))[:2],
                            m1[1]*np.ones((2,4,6,4,6)))
        self.assertAllClose((m[0]*np.ones((5,4,6)))[2:],
                            m2[0]*np.ones((3,4,6)))
        self.assertAllClose((m[1]*np.ones((5,4,6,4,6)))[2:],
                            m2[1]*np.ones((3,4,6,4,6)))

        pass

    def test_mask_to_parent(self):
        """
        Test the mask handling in Concatenate node
        """
        pass

# ---- bayespy-0.6.2/bayespy/inference/vmp/nodes/tests/test_deterministic.py ----

################################################################################
# Copyright (C) 2013 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
Unit tests for `deterministic` module.
"""


import unittest

import numpy as np
import scipy

from numpy import testing

from ..node import Node, Moments
from ..deterministic import tile
from ..stochastic import Stochastic


class TestTile(unittest.TestCase):

    def check_message_to_children(self, tiles, u_parent, u_tiled,
                                  dims=None, plates=None):

        # Set up the dummy model
        class Dummy(Stochastic):
            _parent_moments = ()
            _moments = Moments()
            def __init__(self, u, dims=dims, plates=plates):
                super().__init__(dims=dims, plates=plates, initialize=False)
                self.u = u

        X = Dummy(u_parent, dims=dims, plates=plates)
        Y = tile(X, tiles)
        u_Y = Y._compute_moments(u_parent)

        for (x, y) in zip(u_Y, u_tiled):
            self.assertEqual(np.shape(x), np.shape(y), msg="Incorrect shape.")
            testing.assert_allclose(x, y, err_msg="Incorrect moments.")

    def test_message_to_children(self):
        """
        Test the moments of Tile node.
        """
        # Define the check function
        check_message_to_children = self.check_message_to_children

        # Check scalar (and broadcasting)
        check_message_to_children(2, (5,), (5,), dims=[()], plates=())
        # Check 1-D
        check_message_to_children(2, ([1,2],), ([1,2,1,2],),
                                  dims=[()], plates=(2,))
        # Check N-D
        check_message_to_children(2,
                                  ([[1,2], [3,4], [5,6]],),
                                  ([[1,2,1,2], [3,4,3,4], [5,6,5,6]],),
                                  dims=[()], plates=(3,2))
        # Check not-last plate
        check_message_to_children([2,1],
                                  ([[1,2], [3,4]],),
                                  ([[1,2], [3,4], [1,2], [3,4]],),
                                  dims=[()], plates=(2,2))
        # Check several plates
        check_message_to_children([2,3],
                                  ([[1,2], [3,4]],),
                                  ([[1,2,1,2,1,2], [3,4,3,4,3,4],
                                    [1,2,1,2,1,2], [3,4,3,4,3,4]],),
                                  dims=[()], plates=(2,2))
        # Check non-zero dimensional variables
        check_message_to_children(2,
                                  ([[1,2], [3,4]],),
                                  ([[1,2], [3,4], [1,2], [3,4]],),
                                  dims=[(2,)], plates=(2,))
        # Check several moments
        check_message_to_children(2,
                                  ([[1,2], [3,4]], [1,2]),
                                  ([[1,2], [3,4], [1,2], [3,4]], [1,2,1,2]),
                                  dims=[(2,),()], plates=(2,))
        # Check broadcasting of tiled plate
        check_message_to_children(2, ([[1,], [2,]],), ([[1,], [2,]],),
                                  dims=[()], plates=(2,2))
        # Check broadcasting of non-tiled plate
        check_message_to_children(2, ([[1,2]],), ([[1,2,1,2]],),
                                  dims=[()], plates=(2,2))
        # Check broadcasting of leading plates that are not in parent
        check_message_to_children([2,1], ([1,2],), ([1,2],),
                                  dims=[()], plates=(2,))

    def check_message_to_parent(self, tiles, m_children, m_true,
                                dims=None, plates_parent=None,
                                plates_children=None):

        # Set up the dummy model
        class Dummy(Stochastic):
            _parent_moments = ()
            _moments = Moments()

        X = Dummy(dims=dims, plates=plates_parent, initialize=False)
        Y = tile(X, tiles)
        m = Y._compute_message_to_parent(0, m_children, None)

        for (x, y) in zip(m, m_true):
            self.assertEqual(np.shape(x), np.shape(y), msg="Incorrect shape.")
            testing.assert_allclose(x, y, err_msg="Incorrect message.")

    def test_message_to_parent(self):
        """
        Test the parent message of Tile node.
        """
        # Define the check function
        check = self.check_message_to_parent

        # Check scalar
        check(2, ([5,5],), (10,),
              dims=[()], plates_parent=(), plates_children=(2,))
        # Check 1-D
        check(2, ([1,2,3,4],), ([4,6],),
              dims=[()], plates_parent=(2,), plates_children=(4,))
        # Check N-D
        check(2,
              ([[1,2,7,8], [3,4,9,0], [5,6,1,2]],),
              ([[8,10], [12,4], [6,8]],),
              dims=[()], plates_parent=(3,2), plates_children=(3,4))
        # Check not-last plate
        check([2,1],
              ([[1,2], [3,4], [5,6], [7,8]],),
              ([[6,8], [10,12]],),
              dims=[()], plates_parent=(2,2), plates_children=(4,2))
        # Check several plates
        check([2,3],
              ([[1,2,1,2,1,2], [3,4,3,4,3,4],
                [1,2,1,2,1,2], [3,4,3,4,3,4]],),
              ([[6,12], [18,24]],),
              dims=[()], plates_parent=(2,2), plates_children=(4,6))
        # Check broadcasting if message has unit axis for tiled plate
        check(2,
              ([[1,], [2,], [3,]],),
              ([[2,], [4,], [6,]],),
              dims=[()], plates_parent=(3,2), plates_children=(3,4))
        # Check broadcasting if message has unit axis for non-tiled plate
        check(2, ([[1,2,3,4]],), ([[4,6]],),
              dims=[()], plates_parent=(3,2), plates_children=(3,4))
        # Check non-zero dimensional variables
        check(2,
              ([[1,2], [3,4], [5,6], [7,8]],),
              ([[6,8], [10,12]],),
              dims=[(2,)], plates_parent=(2,), plates_children=(4,))
        # Check several moments
        check(2,
              ([[1,2], [3,4], [5,6], [7,8]], [1,2,3,4]),
              ([[6,8], [10,12]], [4,6]),
              dims=[(2,),()], plates_parent=(2,), plates_children=(4,))

    def check_mask_to_parent(self, tiles, mask_child, mask_true,
                             plates_parent=None, plates_children=None):

        # Set up the dummy model
        class Dummy(Stochastic):
            _moments = Moments()
            _parent_moments = ()

        X = Dummy(dims=[()], plates=plates_parent, initialize=False)
        Y = tile(X, tiles)
        mask = Y._compute_weights_to_parent(0, mask_child) != 0

        self.assertEqual(np.shape(mask), np.shape(mask_true),
                         msg="Incorrect shape.")
        testing.assert_equal(mask, mask_true, err_msg="Incorrect mask.")

    def test_mask_to_parent(self):
        """
        Test the mask message to parent of Tile node.
        """
        # Define the check function
        check = self.check_mask_to_parent

        # Check scalar parent
        check(2, [True,False], True,
              plates_parent=(), plates_children=(2,))
        check(2, [False,False], False,
              plates_parent=(), plates_children=(2,))
        # Check 1-D
        check(2, [True,False,False,False], [True,False],
              plates_parent=(2,), plates_children=(4,))
        # Check N-D
        check(2,
              [[True,False,True,False], [False,True,False,False]],
              [[True,False], [False,True]],
              plates_parent=(2,2), plates_children=(2,4))
        # Check not-last plate
        check([2,1],
              [[True,False], [False,True], [True,False], [False,False]],
              [[True,False], [False,True]],
              plates_parent=(2,2), plates_children=(4,2))
        # Check several plates
        check([2,3],
              [[False,False,True,False,False,False],
               [False,False,False,False,False,False],
               [False,False,False,False,False,False],
               [False,True,False,False,False,False]],
              [[True,False], [False,True]],
              plates_parent=(2,2), plates_children=(4,6))
        # Check broadcasting if message has unit axis for tiled plate
        check(2,
              [[True,], [False,], [True,]],
              [[True,], [False,], [True,]],
              plates_parent=(3,2), plates_children=(3,4))
        # Check broadcasting if message has unit axis for non-tiled plate
        check(2, [[False,False,False,True]], [[False,True]],
              plates_parent=(3,2), plates_children=(3,4))
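# The parent messages checked in test_message_to_parent above are the adjoint
# of np.tile on a plate axis: the child's message is folded so the tiled
# copies line up on a new axis, which is then summed out. A minimal NumPy
# sketch of that idea for a single 1-D plate axis follows; `untile_sum` is an
# illustrative helper, not part of BayesPy.

```python
import numpy as np

def untile_sum(message, tiles, parent_len):
    # Adjoint of np.tile along a single 1-D plate axis: stack the
    # tiled copies on a leading axis and sum them away.
    return np.reshape(message, (tiles, parent_len)).sum(axis=0)

# Mirrors the 1-D case above: child message [1,2,3,4] with tiles=2
# over a parent plate of length 2 gives the parent message [4, 6].
m_parent = untile_sum(np.array([1, 2, 3, 4]), tiles=2, parent_len=2)
print(m_parent)  # [4 6]
```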
# ---- bayespy-0.6.2/bayespy/inference/vmp/nodes/tests/test_dirichlet.py ----

################################################################################
# Copyright (C) 2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
Unit tests for `dirichlet` module.
"""

import numpy as np

from scipy import special

from bayespy.nodes import Dirichlet

from bayespy.utils import random
from bayespy.utils.misc import TestCase


class TestDirichlet(TestCase):
    """
    Unit tests for Dirichlet node
    """

    def test_init(self):
        """
        Test the creation of Dirichlet nodes.
        """
        # Some simple initializations
        p = Dirichlet([1.5, 4.2, 3.5])

        # Check that plates are correct
        p = Dirichlet([2, 3, 4], plates=(4,3))
        self.assertEqual(p.plates, (4,3))
        p = Dirichlet(np.ones((4,3,5)))
        self.assertEqual(p.plates, (4,3))

        # Parent not a vector
        self.assertRaises(ValueError, Dirichlet, 4)

        # Parent vector has invalid values
        self.assertRaises(ValueError, Dirichlet, [-2,3,1])

        # Plates inconsistent
        self.assertRaises(ValueError, Dirichlet, np.ones((4,3)), plates=(3,))

        # Explicit plates too small
        self.assertRaises(ValueError, Dirichlet, np.ones((4,3)), plates=(1,))

        pass

    def test_moments(self):
        """
        Test the moments of Dirichlet nodes.
        """
        p = Dirichlet([2, 3, 4])
        u = p._message_to_child()
        self.assertAllClose(u[0], special.psi([2,3,4]) - special.psi(2+3+4))

        pass

    def test_constant(self):
        """
        Test the constant moments of Dirichlet nodes.
        """
        p = Dirichlet([1, 1, 1])
        p.initialize_from_value([0.5, 0.4, 0.1])
        u = p._message_to_child()
        self.assertAllClose(u[0], np.log([0.5, 0.4, 0.1]))

        pass

# ---- bayespy-0.6.2/bayespy/inference/vmp/nodes/tests/test_dot.py ----

################################################################################
# Copyright (C) 2013-2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
Unit tests for `dot` module.
"""

import unittest

import numpy as np
import scipy

from numpy import testing

from ..dot import Dot, SumMultiply
from ..gaussian import Gaussian, GaussianARD
from bayespy.nodes import GaussianGamma
from ...vmp import VB

from bayespy.utils import misc
from bayespy.utils import linalg
from bayespy.utils import random
from bayespy.utils.misc import TestCase


class TestSumMultiply(TestCase):

    def test_parent_validity(self):
        """
        Test that the parent nodes are validated properly in SumMultiply
        """
        V = GaussianARD(1, 1)
        X = Gaussian(np.ones(1), np.identity(1))
        Y = Gaussian(np.ones(3), np.identity(3))
        Z = Gaussian(np.ones(5), np.identity(5))

        A = SumMultiply(X, ['i'])
        self.assertEqual(A.dims, ((), ()))
        A = SumMultiply('i', X)
        self.assertEqual(A.dims, ((), ()))

        A = SumMultiply(X, ['i'], ['i'])
        self.assertEqual(A.dims, ((1,), (1,1)))
        A = SumMultiply('i->i', X)
        self.assertEqual(A.dims, ((1,), (1,1)))

        A = SumMultiply(X, ['i'], Y, ['j'], ['i','j'])
        self.assertEqual(A.dims, ((1,3), (1,3,1,3)))
        A = SumMultiply('i,j->ij', X, Y)
        self.assertEqual(A.dims, ((1,3), (1,3,1,3)))

        A = SumMultiply(V, [], X, ['i'], Y, ['i'], [])
        self.assertEqual(A.dims, ((), ()))
        A = SumMultiply(',i,i->', V, X, Y)
        self.assertEqual(A.dims, ((), ()))

        # Gaussian-gamma parents
        C = GaussianGamma(np.ones(3), np.identity(3), 1, 1)
        A = SumMultiply(Y, ['i'], C, ['i'], ['i'])
        self.assertEqual(A.dims, ((3,), (3,3), (), ()))
        A = SumMultiply('i,i->i', Y, C)
        self.assertEqual(A.dims, ((3,), (3,3), (), ()))

        C = GaussianGamma(np.ones(3), np.identity(3), 1, 1)
        A = SumMultiply(Y, ['i'], C, ['i'], [])
        self.assertEqual(A.dims, ((), (), (), ()))
        A = SumMultiply('i,i->', Y, C)
        self.assertEqual(A.dims, ((), (), (), ()))

        # Error: not enough inputs
        self.assertRaises(ValueError, SumMultiply)
        self.assertRaises(ValueError, SumMultiply, X)

        # Error: too many keys
        self.assertRaises(ValueError, SumMultiply, Y, ['i', 'j'])
        self.assertRaises(ValueError, SumMultiply, 'ij', Y)

        # Error: not broadcastable
        self.assertRaises(ValueError, SumMultiply, Y, ['i'], Z, ['i'])
        self.assertRaises(ValueError, SumMultiply, 'i,i', Y, Z)

        # Error: output key not in inputs
        self.assertRaises(ValueError, SumMultiply, X, ['i'], ['j'])
        self.assertRaises(ValueError, SumMultiply, 'i->j', X)

        # Error: non-unique input keys
        self.assertRaises(ValueError, SumMultiply, X, ['i','i'])
        self.assertRaises(ValueError, SumMultiply, 'ii', X)

        # Error: non-unique output keys
        self.assertRaises(ValueError, SumMultiply, X, ['i'], ['i','i'])
        self.assertRaises(ValueError, SumMultiply, 'i->ii', X)

        # String has too many '->'
        self.assertRaises(ValueError, SumMultiply, 'i->i->i', X)

        # String has too many input nodes
        self.assertRaises(ValueError, SumMultiply, 'i,i->i', X)

        # Same parent several times
        self.assertRaises(ValueError, SumMultiply, 'i,i->i', X, X)

        # Same parent several times via deterministic node
        Xh = SumMultiply('i->i', X)
        self.assertRaises(ValueError, SumMultiply, 'i,i->i', X, Xh)

    def test_message_to_child(self):
        """
        Test the message from SumMultiply to its children.
""" def compare_moments(u0, u1, *args): Y = SumMultiply(*args) u_Y = Y.get_moments() self.assertAllClose(u_Y[0], u0) self.assertAllClose(u_Y[1], u1) # Test constant parent y = np.random.randn(2,3,4) compare_moments(y, linalg.outer(y, y, ndim=2), 'ij->ij', y) # Do nothing for 2-D array Y = GaussianARD(np.random.randn(5,2,3), np.random.rand(5,2,3), plates=(5,), shape=(2,3)) y = Y.get_moments() compare_moments(y[0], y[1], 'ij->ij', Y) compare_moments(y[0], y[1], Y, [0,1], [0,1]) # Sum over the rows of a matrix Y = GaussianARD(np.random.randn(5,2,3), np.random.rand(5,2,3), plates=(5,), shape=(2,3)) y = Y.get_moments() mu = np.einsum('...ij->...j', y[0]) cov = np.einsum('...ijkl->...jl', y[1]) compare_moments(mu, cov, 'ij->j', Y) compare_moments(mu, cov, Y, [0,1], [1]) # Inner product of three vectors X1 = GaussianARD(np.random.randn(2), np.random.rand(2), plates=(), shape=(2,)) x1 = X1.get_moments() X2 = GaussianARD(np.random.randn(6,1,2), np.random.rand(6,1,2), plates=(6,1), shape=(2,)) x2 = X2.get_moments() X3 = GaussianARD(np.random.randn(7,6,5,2), np.random.rand(7,6,5,2), plates=(7,6,5), shape=(2,)) x3 = X3.get_moments() mu = np.einsum('...i,...i,...i->...', x1[0], x2[0], x3[0]) cov = np.einsum('...ij,...ij,...ij->...', x1[1], x2[1], x3[1]) compare_moments(mu, cov, 'i,i,i', X1, X2, X3) compare_moments(mu, cov, 'i,i,i->', X1, X2, X3) compare_moments(mu, cov, X1, [9], X2, [9], X3, [9]) compare_moments(mu, cov, X1, [9], X2, [9], X3, [9], []) # Outer product of two vectors X1 = GaussianARD(np.random.randn(2), np.random.rand(2), plates=(5,), shape=(2,)) x1 = X1.get_moments() X2 = GaussianARD(np.random.randn(6,1,2), np.random.rand(6,1,2), plates=(6,1), shape=(2,)) x2 = X2.get_moments() mu = np.einsum('...i,...j->...ij', x1[0], x2[0]) cov = np.einsum('...ik,...jl->...ijkl', x1[1], x2[1]) compare_moments(mu, cov, 'i,j->ij', X1, X2) compare_moments(mu, cov, X1, [9], X2, [7], [9,7]) # Matrix product Y1 = GaussianARD(np.random.randn(3,2), np.random.rand(3,2), plates=(), 
shape=(3,2)) y1 = Y1.get_moments() Y2 = GaussianARD(np.random.randn(5,2,3), np.random.rand(5,2,3), plates=(5,), shape=(2,3)) y2 = Y2.get_moments() mu = np.einsum('...ik,...kj->...ij', y1[0], y2[0]) cov = np.einsum('...ikjl,...kmln->...imjn', y1[1], y2[1]) compare_moments(mu, cov, 'ik,kj->ij', Y1, Y2) compare_moments(mu, cov, Y1, ['i','k'], Y2, ['k','j'], ['i','j']) # Trace of a matrix product Y1 = GaussianARD(np.random.randn(3,2), np.random.rand(3,2), plates=(), shape=(3,2)) y1 = Y1.get_moments() Y2 = GaussianARD(np.random.randn(5,2,3), np.random.rand(5,2,3), plates=(5,), shape=(2,3)) y2 = Y2.get_moments() mu = np.einsum('...ij,...ji->...', y1[0], y2[0]) cov = np.einsum('...ikjl,...kilj->...', y1[1], y2[1]) compare_moments(mu, cov, 'ij,ji', Y1, Y2) compare_moments(mu, cov, 'ij,ji->', Y1, Y2) compare_moments(mu, cov, Y1, ['i','j'], Y2, ['j','i']) compare_moments(mu, cov, Y1, ['i','j'], Y2, ['j','i'], []) # Vector-matrix-vector product X1 = GaussianARD(np.random.randn(3), np.random.rand(3), plates=(), shape=(3,)) x1 = X1.get_moments() X2 = GaussianARD(np.random.randn(6,1,2), np.random.rand(6,1,2), plates=(6,1), shape=(2,)) x2 = X2.get_moments() Y = GaussianARD(np.random.randn(3,2), np.random.rand(3,2), plates=(), shape=(3,2)) y = Y.get_moments() mu = np.einsum('...i,...ij,...j->...', x1[0], y[0], x2[0]) cov = np.einsum('...ia,...ijab,...jb->...', x1[1], y[1], x2[1]) compare_moments(mu, cov, 'i,ij,j', X1, Y, X2) compare_moments(mu, cov, X1, [1], Y, [1,2], X2, [2]) # Complex sum-product of 0-D, 1-D, 2-D and 3-D arrays V = GaussianARD(np.random.randn(7,6,5), np.random.rand(7,6,5), plates=(7,6,5), shape=()) v = V.get_moments() X = GaussianARD(np.random.randn(6,1,2), np.random.rand(6,1,2), plates=(6,1), shape=(2,)) x = X.get_moments() Y = GaussianARD(np.random.randn(3,4), np.random.rand(3,4), plates=(5,), shape=(3,4)) y = Y.get_moments() Z = GaussianARD(np.random.randn(4,2,3), np.random.rand(4,2,3), plates=(6,5), shape=(4,2,3)) z = Z.get_moments() mu = 
np.einsum('...,...i,...kj,...jik->...k', v[0], x[0], y[0], z[0]) cov = np.einsum('...,...ia,...kjcb,...jikbac->...kc', v[1], x[1], y[1], z[1]) compare_moments(mu, cov, ',i,kj,jik->k', V, X, Y, Z) compare_moments(mu, cov, V, [], X, ['i'], Y, ['k','j'], Z, ['j','i','k'], ['k']) # Test with constant nodes N = 10 D = 5 a = np.random.randn(N, D) B = Gaussian( np.random.randn(D), random.covariance(D), ) X = SumMultiply('i,i->', B, a) np.testing.assert_allclose( X.get_moments()[0], np.einsum('ni,i->n', a, B.get_moments()[0]), ) np.testing.assert_allclose( X.get_moments()[1], np.einsum('ni,nj,ij->n', a, a, B.get_moments()[1]), ) # # Gaussian-gamma parents # # Outer product of vectors X1 = GaussianARD(np.random.randn(2), np.random.rand(2), shape=(2,)) x1 = X1.get_moments() X2 = GaussianGamma( np.random.randn(6,1,2), random.covariance(2), np.random.rand(6,1), np.random.rand(6,1), plates=(6,1) ) x2 = X2.get_moments() Y = SumMultiply('i,j->ij', X1, X2) u = Y._message_to_child() y = np.einsum('...i,...j->...ij', x1[0], x2[0]) yy = np.einsum('...ik,...jl->...ijkl', x1[1], x2[1]) self.assertAllClose(u[0], y) self.assertAllClose(u[1], yy) self.assertAllClose(u[2], x2[2]) self.assertAllClose(u[3], x2[3]) # Test with constant nodes N = 10 M = 8 D = 5 a = np.random.randn(N, 1, D) B = GaussianGamma( np.random.randn(M, D), random.covariance(D, size=(M,)), np.random.rand(M), np.random.rand(M), ndim=1, ) X = SumMultiply('i,i->', B, a) np.testing.assert_allclose( X.get_moments()[0], np.einsum('nmi,mi->nm', a, B.get_moments()[0]), ) np.testing.assert_allclose( X.get_moments()[1], np.einsum('nmi,nmj,mij->nm', a, a, B.get_moments()[1]), ) np.testing.assert_allclose( X.get_moments()[2], B.get_moments()[2], ) np.testing.assert_allclose( X.get_moments()[3], B.get_moments()[3], ) pass def test_message_to_parent(self): """ Test the message from SumMultiply node to its parents. 
""" data = 2 tau = 3 def check_message(true_m0, true_m1, parent, *args, F=None): if F is None: A = SumMultiply(*args) B = GaussianARD(A, tau) B.observe(data*np.ones(A.plates + A.dims[0])) else: A = F (A_m0, A_m1) = A._message_to_parent(parent) self.assertAllClose(true_m0, A_m0) self.assertAllClose(true_m1, A_m1) pass # Check: different message to each of multiple parents X1 = GaussianARD(np.random.randn(2), np.random.rand(2), ndim=1) x1 = X1.get_moments() X2 = GaussianARD(np.random.randn(2), np.random.rand(2), ndim=1) x2 = X2.get_moments() m0 = tau * data * x2[0] m1 = -0.5 * tau * x2[1] * np.identity(2) check_message(m0, m1, 0, 'i,i->i', X1, X2) check_message(m0, m1, 0, X1, [9], X2, [9], [9]) m0 = tau * data * x1[0] m1 = -0.5 * tau * x1[1] * np.identity(2) check_message(m0, m1, 1, 'i,i->i', X1, X2) check_message(m0, m1, 1, X1, [9], X2, [9], [9]) # Check: key not in output X1 = GaussianARD(np.random.randn(2), np.random.rand(2), ndim=1) x1 = X1.get_moments() m0 = tau * data * np.ones(2) m1 = -0.5 * tau * np.ones((2,2)) check_message(m0, m1, 0, 'i', X1) check_message(m0, m1, 0, 'i->', X1) check_message(m0, m1, 0, X1, [9]) check_message(m0, m1, 0, X1, [9], []) # Check: key not in some input X1 = GaussianARD(np.random.randn(), np.random.rand()) x1 = X1.get_moments() X2 = GaussianARD(np.random.randn(2), np.random.rand(2), ndim=1) x2 = X2.get_moments() m0 = tau * data * np.sum(x2[0], axis=-1) m1 = -0.5 * tau * np.sum(x2[1] * np.identity(2), axis=(-1,-2)) check_message(m0, m1, 0, ',i->i', X1, X2) check_message(m0, m1, 0, X1, [], X2, [9], [9]) m0 = tau * data * x1[0] * np.ones(2) m1 = -0.5 * tau * x1[1] * np.identity(2) check_message(m0, m1, 1, ',i->i', X1, X2) check_message(m0, m1, 1, X1, [], X2, [9], [9]) # Check: keys in different order Y1 = GaussianARD(np.random.randn(3,2), np.random.rand(3,2), ndim=2) y1 = Y1.get_moments() Y2 = GaussianARD(np.random.randn(2,3), np.random.rand(2,3), ndim=2) y2 = Y2.get_moments() m0 = tau * data * y2[0].T m1 = -0.5 * tau * 
np.einsum('ijlk->jikl', y2[1] * misc.identity(2,3)) check_message(m0, m1, 0, 'ij,ji->ij', Y1, Y2) check_message(m0, m1, 0, Y1, ['i','j'], Y2, ['j','i'], ['i','j']) m0 = tau * data * y1[0].T m1 = -0.5 * tau * np.einsum('ijlk->jikl', y1[1] * misc.identity(3,2)) check_message(m0, m1, 1, 'ij,ji->ij', Y1, Y2) check_message(m0, m1, 1, Y1, ['i','j'], Y2, ['j','i'], ['i','j']) # Check: plates when different dimensionality X1 = GaussianARD(np.random.randn(5), np.random.rand(5), shape=(), plates=(5,)) x1 = X1.get_moments() X2 = GaussianARD(np.random.randn(5,3), np.random.rand(5,3), shape=(3,), plates=(5,)) x2 = X2.get_moments() m0 = tau * data * np.sum(np.ones((5,3)) * x2[0], axis=-1) m1 = -0.5 * tau * np.sum(x2[1] * misc.identity(3), axis=(-1,-2)) check_message(m0, m1, 0, ',i->i', X1, X2) check_message(m0, m1, 0, X1, [], X2, ['i'], ['i']) m0 = tau * data * x1[0][:,np.newaxis] * np.ones((5,3)) m1 = -0.5 * tau * x1[1][:,np.newaxis,np.newaxis] * misc.identity(3) check_message(m0, m1, 1, ',i->i', X1, X2) check_message(m0, m1, 1, X1, [], X2, ['i'], ['i']) # Check: other parent's moments broadcasts over plates when node has the # same plates X1 = GaussianARD(np.random.randn(5,4,3), np.random.rand(5,4,3), shape=(3,), plates=(5,4)) x1 = X1.get_moments() X2 = GaussianARD(np.random.randn(3), np.random.rand(3), shape=(3,), plates=(5,4)) x2 = X2.get_moments() m0 = tau * data * np.ones((5,4,3)) * x2[0] m1 = -0.5 * tau * x2[1] * misc.identity(3) check_message(m0, m1, 0, 'i,i->i', X1, X2) check_message(m0, m1, 0, X1, ['i'], X2, ['i'], ['i']) # Check: other parent's moments broadcasts over plates when node does # not have that plate X1 = GaussianARD(np.random.randn(3), np.random.rand(3), shape=(3,), plates=()) x1 = X1.get_moments() X2 = GaussianARD(np.random.randn(3), np.random.rand(3), shape=(3,), plates=(5,4)) x2 = X2.get_moments() m0 = tau * data * np.sum(np.ones((5,4,3)) * x2[0], axis=(0,1)) m1 = -0.5 * tau * np.sum(np.ones((5,4,1,1)) * misc.identity(3) * x2[1], axis=(0,1)) 
check_message(m0, m1, 0, 'i,i->i', X1, X2) check_message(m0, m1, 0, X1, ['i'], X2, ['i'], ['i']) # Check: other parent's moments broadcasts over plates when the node # only broadcasts that plate X1 = GaussianARD(np.random.randn(3), np.random.rand(3), shape=(3,), plates=(1,1)) x1 = X1.get_moments() X2 = GaussianARD(np.random.randn(3), np.random.rand(3), shape=(3,), plates=(5,4)) x2 = X2.get_moments() m0 = tau * data * np.sum(np.ones((5,4,3)) * x2[0], axis=(0,1), keepdims=True) m1 = -0.5 * tau * np.sum(np.ones((5,4,1,1)) * misc.identity(3) * x2[1], axis=(0,1), keepdims=True) check_message(m0, m1, 0, 'i,i->i', X1, X2) check_message(m0, m1, 0, X1, ['i'], X2, ['i'], ['i']) # Check: broadcasted dimensions X1 = GaussianARD(np.random.randn(1,1), np.random.rand(1,1), ndim=2) x1 = X1.get_moments() X2 = GaussianARD(np.random.randn(3,2), np.random.rand(3,2), ndim=2) x2 = X2.get_moments() m0 = tau * data * np.sum(np.ones((3,2)) * x2[0], keepdims=True) m1 = -0.5 * tau * np.sum(misc.identity(3,2) * x2[1], keepdims=True) check_message(m0, m1, 0, 'ij,ij->ij', X1, X2) check_message(m0, m1, 0, X1, [0,1], X2, [0,1], [0,1]) m0 = tau * data * np.ones((3,2)) * x1[0] m1 = -0.5 * tau * misc.identity(3,2) * x1[1] check_message(m0, m1, 1, 'ij,ij->ij', X1, X2) check_message(m0, m1, 1, X1, [0,1], X2, [0,1], [0,1]) # Check: non-ARD observations X1 = GaussianARD(np.random.randn(2), np.random.rand(2), ndim=1) x1 = X1.get_moments() Lambda = np.array([[2, 1.5], [1.5, 2]]) F = SumMultiply('i->i', X1) Y = Gaussian(F, Lambda) y = np.random.randn(2) Y.observe(y) m0 = np.dot(Lambda, y) m1 = -0.5 * Lambda check_message(m0, m1, 0, 'i->i', X1, F=F) check_message(m0, m1, 0, X1, ['i'], ['i'], F=F) # Check: mask with same shape X1 = GaussianARD(np.random.randn(3,2), np.random.rand(3,2), shape=(2,), plates=(3,)) x1 = X1.get_moments() mask = np.array([True, False, True]) F = SumMultiply('i->i', X1) Y = GaussianARD(F, tau, ndim=1) Y.observe(data*np.ones((3,2)), mask=mask) m0 = tau * data * mask[:,np.newaxis] * 
np.ones(2) m1 = -0.5 * tau * mask[:,np.newaxis,np.newaxis] * np.identity(2) check_message(m0, m1, 0, 'i->i', X1, F=F) check_message(m0, m1, 0, X1, ['i'], ['i'], F=F) # Check: mask larger X1 = GaussianARD(np.random.randn(2), np.random.rand(2), shape=(2,), plates=()) x1 = X1.get_moments() X2 = GaussianARD(np.random.randn(3,2), np.random.rand(3,2), shape=(2,), plates=(3,)) x2 = X2.get_moments() mask = np.array([True, False, True]) F = SumMultiply('i,i->i', X1, X2) Y = GaussianARD(F, tau, plates=(3,), ndim=1) Y.observe(data*np.ones((3,2)), mask=mask) m0 = tau * data * np.sum(mask[:,np.newaxis] * x2[0], axis=0) m1 = -0.5 * tau * np.sum(mask[:,np.newaxis,np.newaxis] * x2[1] * np.identity(2), axis=0) check_message(m0, m1, 0, 'i,i->i', X1, X2, F=F) check_message(m0, m1, 0, X1, ['i'], X2, ['i'], ['i'], F=F) # Check: mask for broadcasted plate X1 = GaussianARD(np.random.randn(2), np.random.rand(2), ndim=1, plates=(1,)) x1 = X1.get_moments() X2 = GaussianARD(np.random.randn(2), np.random.rand(2), ndim=1, plates=(3,)) x2 = X2.get_moments() mask = np.array([True, False, True]) F = SumMultiply('i,i->i', X1, X2) Y = GaussianARD(F, tau, plates=(3,), ndim=1) Y.observe(data*np.ones((3,2)), mask=mask) m0 = tau * data * np.sum(mask[:,np.newaxis] * x2[0], axis=0, keepdims=True) m1 = -0.5 * tau * np.sum(mask[:,np.newaxis,np.newaxis] * x2[1] * np.identity(2), axis=0, keepdims=True) check_message(m0, m1, 0, 'i->i', X1, F=F) check_message(m0, m1, 0, X1, ['i'], ['i'], F=F) # Test with constant nodes N = 10 M = 8 D = 5 K = 3 a = np.random.randn(N, D) B = Gaussian( np.random.randn(D), random.covariance(D), ) C = GaussianARD( np.random.randn(M, 1, D, K), np.random.rand(M, 1, D, K), ndim=2 ) F = SumMultiply('i,i,ij->', a, B, C) tau = np.random.rand(M, N) Y = GaussianARD(F, tau, plates=(M,N)) y = np.random.randn(M, N) Y.observe(y) (m0, m1) = F._message_to_parent(1) np.testing.assert_allclose( m0, np.einsum('mn,ni,mnik->i', tau*y, a, C.get_moments()[0]), ) np.testing.assert_allclose( m1, 
np.einsum('mn,ni,nj,mnikjl->ij', -0.5*tau, a, a, C.get_moments()[1]),
        )

        # Check: Gaussian-gamma parents
        X1 = GaussianGamma(
            np.random.randn(2),
            random.covariance(2),
            np.random.rand(),
            np.random.rand()
        )
        x1 = X1.get_moments()
        X2 = GaussianGamma(
            np.random.randn(2),
            random.covariance(2),
            np.random.rand(),
            np.random.rand()
        )
        x2 = X2.get_moments()
        F = SumMultiply('i,i->i', X1, X2)
        V = random.covariance(2)
        y = np.random.randn(2)
        Y = Gaussian(F, V)
        Y.observe(y)
        m0 = np.dot(V, y) * x2[0]
        m1 = -0.5 * V * x2[1]
        m2 = -0.5 * np.einsum('i,ij,j', y, V, y) * x2[2]  # linalg.inner(V, x2[2], ndim=2)
        m3 = 0.5 * 2  # linalg.chol_logdet(linalg.chol(V)) + 2*x2[3]
        m = F._message_to_parent(0)
        self.assertAllClose(m[0], m0)
        self.assertAllClose(m[1], m1)
        self.assertAllClose(m[2], m2)
        self.assertAllClose(m[3], m3)

        # Delta moments
        N = 10
        M = 8
        D = 5
        a = np.random.randn(N, D)
        B = GaussianGamma(
            np.random.randn(D),
            random.covariance(D),
            np.random.rand(),
            np.random.rand(),
            ndim=1
        )
        F = SumMultiply('i,i->', a, B)
        tau = np.random.rand(M, N)
        Y = GaussianARD(F, tau, plates=(M,N))
        y = np.random.randn(M, N)
        Y.observe(y)
        (m0, m1, m2, m3) = F._message_to_parent(1)
        np.testing.assert_allclose(
            m0,
            np.einsum('mn,ni->i', tau*y, a),
        )
        np.testing.assert_allclose(
            m1,
            np.einsum('mn,ni,nj->ij', -0.5*tau, a, a),
        )
        np.testing.assert_allclose(
            m2,
            np.einsum('mn->', -0.5*tau*y**2),
        )
        np.testing.assert_allclose(
            m3,
            np.einsum('mn->', 0.5*np.ones(np.shape(tau))),
        )

        pass

    def test_compute_moments(self):
        return


def check_performance(scale=1e2):
    """
    Tests that the implementation of SumMultiply is efficient.

    This is not a unit test (it is not run automatically) but rather a
    performance test that you may run to check the efficiency of the node. A
    naive implementation of SumMultiply will run out of memory in some cases,
    and this method checks that the implementation is not naive.
""" # Check: Broadcasted plates are computed efficiently # (bad implementation will take a long time to run) s = scale X1 = GaussianARD(np.random.randn(s,s), np.random.rand(s,s), shape=(s,), plates=(s,)) X2 = GaussianARD(np.random.randn(s,1,s), np.random.rand(s,1,s), shape=(s,), plates=(s,1)) F = SumMultiply('i,i', X1, X2) Y = GaussianARD(F, 1) Y.observe(np.ones((s,s))) try: F._message_to_parent(1) except e: print(e) print('SOMETHING BAD HAPPENED') # Check: Broadcasted dimensions are computed efficiently # (bad implementation will run out of memory) pass ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/tests/test_gamma.py0000644000175100001770000001536100000000000025170 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2015 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ """ Unit tests for `gamma` module. """ import numpy as np from scipy import special from numpy import testing from .. 
import gaussian from bayespy.nodes import (Gaussian, GaussianARD, GaussianGamma, Gamma, Wishart) from ...vmp import VB from bayespy.utils import misc from bayespy.utils import linalg from bayespy.utils import random from bayespy.utils.misc import TestCase class TestGamma(TestCase): def test_lower_bound_contribution(self): a = 15 b = 21 y = 4 x = Gamma(a, b) x.observe(y) testing.assert_allclose( x.lower_bound_contribution(), ( a * np.log(b) + (a - 1) * np.log(y) - b * y - special.gammaln(a) ) ) # Just one latent node so we'll get exact marginal likelihood # # p(Y) = p(Y,X)/p(X|Y) = p(Y|X) * p(X) / p(X|Y) a = 2.3 b = 4.1 x = 1.9 y = 4.8 tau = Gamma(a, b) Y = GaussianARD(x, tau) Y.observe(y) mu = x nu = 2 * a s2 = b / a a_post = a + 0.5 b_post = b + 0.5*(y - x)**2 tau.update() testing.assert_allclose( [-b_post, a_post], tau.phi ) testing.assert_allclose( Y.lower_bound_contribution() + tau.lower_bound_contribution(), # + tau.g, ( special.gammaln((nu+1)/2) - special.gammaln(nu/2) - 0.5 * np.log(nu) - 0.5 * np.log(np.pi) - 0.5 * np.log(s2) - 0.5 * (nu + 1) * np.log( 1 + (y - mu)**2 / (nu * s2) ) ) ) return class TestGammaGradient(TestCase): """Numerically check Riemannian gradient of several nodes. Using VB-EM update equations will take a unit length step to the Riemannian gradient direction. Thus, the change caused by a VB-EM update and the Riemannian gradient should be equal. 
""" def test_riemannian_gradient(self): """Test Riemannian gradient of a Gamma node.""" # # Without observations # # Construct model a = np.random.rand() b = np.random.rand() tau = Gamma(a, b) # Random initialization tau.initialize_from_parameters(np.random.rand(), np.random.rand()) # Initial parameters phi0 = tau.phi # Gradient g = tau.get_riemannian_gradient() # Parameters after VB-EM update tau.update() phi1 = tau.phi # Check self.assertAllClose(g[0], phi1[0] - phi0[0]) self.assertAllClose(g[1], phi1[1] - phi0[1]) # # With observations # # Construct model a = np.random.rand() b = np.random.rand() tau = Gamma(a, b) mu = np.random.randn() Y = GaussianARD(mu, tau) Y.observe(np.random.randn()) # Random initialization tau.initialize_from_parameters(np.random.rand(), np.random.rand()) # Initial parameters phi0 = tau.phi # Gradient g = tau.get_riemannian_gradient() # Parameters after VB-EM update tau.update() phi1 = tau.phi # Check self.assertAllClose(g[0], phi1[0] - phi0[0]) self.assertAllClose(g[1], phi1[1] - phi0[1]) pass def test_gradient(self): """Test standard gradient of a Gamma node.""" D = 3 np.random.seed(42) # # Without observations # # Construct model a = np.random.rand(D) b = np.random.rand(D) tau = Gamma(a, b) Q = VB(tau) # Random initialization tau.initialize_from_parameters(np.random.rand(D), np.random.rand(D)) # Initial parameters phi0 = tau.phi # Gradient rg = tau.get_riemannian_gradient() g = tau.get_gradient(rg) # Numerical gradient eps = 1e-8 p0 = tau.get_parameters() l0 = Q.compute_lowerbound(ignore_masked=False) g_num = [np.zeros(D), np.zeros(D)] for i in range(D): e = np.zeros(D) e[i] = eps p1 = p0[0] + e tau.set_parameters([p1, p0[1]]) l1 = Q.compute_lowerbound(ignore_masked=False) g_num[0][i] = (l1 - l0) / eps for i in range(D): e = np.zeros(D) e[i] = eps p1 = p0[1] + e tau.set_parameters([p0[0], p1]) l1 = Q.compute_lowerbound(ignore_masked=False) g_num[1][i] = (l1 - l0) / eps # Check self.assertAllClose(g[0], g_num[0]) 
        self.assertAllClose(g[1], g_num[1])

        #
        # With observations
        #

        # Construct model
        a = np.random.rand(D)
        b = np.random.rand(D)
        tau = Gamma(a, b)
        mu = np.random.randn(D)
        Y = GaussianARD(mu, tau)
        Y.observe(np.random.randn(D))
        Q = VB(Y, tau)

        # Random initialization
        tau.initialize_from_parameters(np.random.rand(D), np.random.rand(D))

        # Initial parameters
        phi0 = tau.phi

        # Gradient
        rg = tau.get_riemannian_gradient()
        g = tau.get_gradient(rg)

        # Numerical gradient
        eps = 1e-8
        p0 = tau.get_parameters()
        l0 = Q.compute_lowerbound(ignore_masked=False)
        g_num = [np.zeros(D), np.zeros(D)]
        for i in range(D):
            e = np.zeros(D)
            e[i] = eps
            p1 = p0[0] + e
            tau.set_parameters([p1, p0[1]])
            l1 = Q.compute_lowerbound(ignore_masked=False)
            g_num[0][i] = (l1 - l0) / eps
        for i in range(D):
            e = np.zeros(D)
            e[i] = eps
            p1 = p0[1] + e
            tau.set_parameters([p0[0], p1])
            l1 = Q.compute_lowerbound(ignore_masked=False)
            g_num[1][i] = (l1 - l0) / eps

        # Check
        self.assertAllClose(g[0], g_num[0])
        self.assertAllClose(g[1], g_num[1])

        pass


bayespy-0.6.2/bayespy/inference/vmp/nodes/tests/test_gate.py

################################################################################
# Copyright (C) 2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################


"""
Unit tests for `gate` module.
"""

import numpy as np

from bayespy.nodes import (Gate,
                           GaussianARD,
                           Gamma,
                           Categorical,
                           Bernoulli,
                           Multinomial)
from bayespy.inference.vmp.nodes.gaussian import GaussianMoments

from bayespy.utils import random
from bayespy.utils import linalg
from bayespy.utils.misc import TestCase


class TestGate(TestCase):
    """
    Unit tests for Gate node.
""" def test_init(self): """ Test the creation of Gate node """ # Gating scalar node Z = Categorical(np.ones(3)/3) X = GaussianARD(0, 1, shape=(), plates=(3,)) Y = Gate(Z, X) self.assertEqual(Y.plates, ()) self.assertEqual(Y.dims, ( (), () )) # Gating non-scalar node Z = Categorical(np.ones(3)/3) X = GaussianARD(0, 1, shape=(2,), plates=(3,)) Y = Gate(Z, X) self.assertEqual(Y.plates, ()) self.assertEqual(Y.dims, ( (2,), (2,2) )) # Plates from Z Z = Categorical(np.ones(3)/3, plates=(4,)) X = GaussianARD(0, 1, shape=(2,), plates=(3,)) Y = Gate(Z, X) self.assertEqual(Y.plates, (4,)) self.assertEqual(Y.dims, ( (2,), (2,2) )) # Plates from X Z = Categorical(np.ones(3)/3) X = GaussianARD(0, 1, shape=(2,), plates=(4,3)) Y = Gate(Z, X) self.assertEqual(Y.plates, (4,)) self.assertEqual(Y.dims, ( (2,), (2,2) )) # Plates from Z and X Z = Categorical(np.ones(3)/3, plates=(5,)) X = GaussianARD(0, 1, shape=(2,), plates=(4,1,3)) Y = Gate(Z, X) self.assertEqual(Y.plates, (4,5)) self.assertEqual(Y.dims, ( (2,), (2,2) )) # Gating non-default plate Z = Categorical(np.ones(3)/3) X = GaussianARD(0, 1, shape=(), plates=(3,4)) Y = Gate(Z, X, gated_plate=-2) self.assertEqual(Y.plates, (4,)) self.assertEqual(Y.dims, ( (), () )) # Fixed gating Z = 2 X = GaussianARD(0, 1, shape=(2,), plates=(3,)) Y = Gate(Z, X) self.assertEqual(Y.plates, ()) self.assertEqual(Y.dims, ( (2,), (2,2) )) # Fixed X Z = Categorical(np.ones(3)/3) X = [1, 2, 3] Y = Gate(Z, X, moments=GaussianMoments(())) self.assertEqual(Y.plates, ()) self.assertEqual(Y.dims, ( (), () )) # Do not accept non-negative cluster plates Z = Categorical(np.ones(3)/3) X = GaussianARD(0, 1, plates=(3,)) self.assertRaises(ValueError, Gate, Z, X, gated_plate=0) # None of the parents have the cluster plate axis Z = Categorical(np.ones(3)/3) X = GaussianARD(0, 1) self.assertRaises(ValueError, Gate, Z, X) # Inconsistent cluster plate Z = Categorical(np.ones(3)/3) X = GaussianARD(0, 1, plates=(2,)) self.assertRaises(ValueError, Gate, Z, X) pass def 
test_message_to_child(self): """ Test the message to child of Gate node. """ # Gating scalar node Z = 2 X = GaussianARD([1,2,3], 1, shape=(), plates=(3,)) Y = Gate(Z, X) u = Y._message_to_child() self.assertEqual(len(u), 2) self.assertAllClose(u[0], 3) self.assertAllClose(u[1], 3**2+1) # Fixed X Z = 2 X = [1, 2, 3] Y = Gate(Z, X, moments=GaussianMoments(())) u = Y._message_to_child() self.assertEqual(len(u), 2) self.assertAllClose(u[0], 3) self.assertAllClose(u[1], 3**2) # Uncertain gating Z = Categorical([0.2,0.3,0.5]) X = GaussianARD([1,2,3], 1, shape=(), plates=(3,)) Y = Gate(Z, X) u = Y._message_to_child() self.assertAllClose(u[0], 0.2*1 + 0.3*2 + 0.5*3) self.assertAllClose(u[1], 0.2*2 + 0.3*5 + 0.5*10) # Plates in Z Z = [2, 0] X = GaussianARD([1,2,3], 1, shape=(), plates=(3,)) Y = Gate(Z, X) u = Y._message_to_child() self.assertAllClose(u[0], [3, 1]) self.assertAllClose(u[1], [10, 2]) # Plates in X Z = 2 X = GaussianARD([1,2,3], 1, shape=(), plates=(4,3,)) Y = Gate(Z, X) u = Y._message_to_child() self.assertAllClose(np.ones(4)*u[0], np.ones(4)*3) self.assertAllClose(np.ones(4)*u[1], np.ones(4)*10) # Gating non-default plate Z = 2 X = GaussianARD([[1],[2],[3]], 1, shape=(), plates=(3,4)) Y = Gate(Z, X, gated_plate=-2) u = Y._message_to_child() self.assertAllClose(np.ones(4)*u[0], np.ones(4)*3) self.assertAllClose(np.ones(4)*u[1], np.ones(4)*10) # Gating non-scalar node Z = 2 X = GaussianARD([1*np.ones(4), 2*np.ones(4), 3*np.ones(4)], 1, shape=(4,), plates=(3,)) Y = Gate(Z, X) u = Y._message_to_child() self.assertAllClose(u[0], 3*np.ones(4)) self.assertAllClose(u[1], 9*np.ones((4,4)) + 1*np.identity(4)) # Broadcasting the moments on the cluster axis Z = 2 X = GaussianARD(1, 1, shape=(), plates=(3,)) Y = Gate(Z, X) u = Y._message_to_child() self.assertEqual(len(u), 2) self.assertAllClose(u[0], 1) self.assertAllClose(u[1], 1**2+1) pass def test_message_to_parent(self): """ Test the message to parents of Gate node. 
""" # Unobserved and broadcasting Z = 2 X = GaussianARD(0, 1, shape=(), plates=(3,)) F = Gate(Z, X) Y = GaussianARD(F, 1) m = F._message_to_parent(0) self.assertEqual(len(m), 1) self.assertAllClose(m[0], 0*np.ones(3)) m = F._message_to_parent(1) self.assertEqual(len(m), 2) self.assertAllClose(m[0]*np.ones(3), [0, 0, 0]) self.assertAllClose(m[1]*np.ones(3), [0, 0, 0]) # Gating scalar node Z = 2 X = GaussianARD([1,2,3], 1, shape=(), plates=(3,)) F = Gate(Z, X) Y = GaussianARD(F, 1) Y.observe(10) m = F._message_to_parent(0) self.assertAllClose(m[0], [10*1-0.5*2, 10*2-0.5*5, 10*3-0.5*10]) m = F._message_to_parent(1) self.assertAllClose(m[0], [0, 0, 10]) self.assertAllClose(m[1], [0, 0, -0.5]) # Fixed X Z = 2 X = [1,2,3] F = Gate(Z, X, moments=GaussianMoments(())) Y = GaussianARD(F, 1) Y.observe(10) m = F._message_to_parent(0) self.assertAllClose(m[0], [10*1-0.5*1, 10*2-0.5*4, 10*3-0.5*9]) m = F._message_to_parent(1) self.assertAllClose(m[0], [0, 0, 10]) self.assertAllClose(m[1], [0, 0, -0.5]) # Uncertain gating Z = Categorical([0.2, 0.3, 0.5]) X = GaussianARD([1,2,3], 1, shape=(), plates=(3,)) F = Gate(Z, X) Y = GaussianARD(F, 1) Y.observe(10) m = F._message_to_parent(0) self.assertAllClose(m[0], [10*1-0.5*2, 10*2-0.5*5, 10*3-0.5*10]) m = F._message_to_parent(1) self.assertAllClose(m[0], [0.2*10, 0.3*10, 0.5*10]) self.assertAllClose(m[1], [-0.5*0.2, -0.5*0.3, -0.5*0.5]) # Plates in Z Z = [2, 0] X = GaussianARD([1,2,3], 1, shape=(), plates=(3,)) F = Gate(Z, X) Y = GaussianARD(F, 1) Y.observe([10, 20]) m = F._message_to_parent(0) self.assertAllClose(m[0], [[10*1-0.5*2, 10*2-0.5*5, 10*3-0.5*10], [20*1-0.5*2, 20*2-0.5*5, 20*3-0.5*10]]) m = F._message_to_parent(1) self.assertAllClose(m[0], [20, 0, 10]) self.assertAllClose(m[1], [-0.5, 0, -0.5]) # Plates in X Z = 2 X = GaussianARD([[1,2,3], [4,5,6]], 1, shape=(), plates=(2,3,)) F = Gate(Z, X) Y = GaussianARD(F, 1) Y.observe([10, 20]) m = F._message_to_parent(0) self.assertAllClose(m[0], [10*1-0.5*2 + 20*4-0.5*17, 10*2-0.5*5 
+ 20*5-0.5*26,
                                   10*3-0.5*10 + 20*6-0.5*37])
        m = F._message_to_parent(1)
        self.assertAllClose(m[0], [[0, 0, 10],
                                   [0, 0, 20]])
        self.assertAllClose(m[1]*np.ones((2,3)), [[0, 0, -0.5],
                                                  [0, 0, -0.5]])

        # Gating non-default plate
        Z = 2
        X = GaussianARD([[1],[2],[3]], 1, shape=(), plates=(3,1))
        F = Gate(Z, X, gated_plate=-2)
        Y = GaussianARD(F, 1)
        Y.observe([10])
        m = F._message_to_parent(0)
        self.assertAllClose(m[0], [10*1-0.5*2, 10*2-0.5*5, 10*3-0.5*10])
        m = F._message_to_parent(1)
        self.assertAllClose(m[0], [[0], [0], [10]])
        self.assertAllClose(m[1], [[0], [0], [-0.5]])

        # Gating non-scalar node
        Z = 2
        X = GaussianARD([[1,4],[2,5],[3,6]], 1, shape=(2,), plates=(3,))
        F = Gate(Z, X)
        Y = GaussianARD(F, 1)
        Y.observe([10,20])
        m = F._message_to_parent(0)
        self.assertAllClose(m[0], [10*1-0.5*2 + 20*4-0.5*17,
                                   10*2-0.5*5 + 20*5-0.5*26,
                                   10*3-0.5*10 + 20*6-0.5*37])
        m = F._message_to_parent(1)
        I = np.identity(2)
        self.assertAllClose(m[0], [[0,0], [0,0], [10,20]])
        self.assertAllClose(m[1], [0*I, 0*I, -0.5*I])

        # Broadcasting the moments on the cluster axis
        Z = 2
        X = GaussianARD(2, 1, shape=(), plates=(3,))
        F = Gate(Z, X)
        Y = GaussianARD(F, 1)
        Y.observe(10)
        m = F._message_to_parent(0)
        self.assertAllClose(m[0], [10*2-0.5*5, 10*2-0.5*5, 10*2-0.5*5])
        m = F._message_to_parent(1)
        self.assertAllClose(m[0], [0, 0, 10])
        self.assertAllClose(m[1], [0, 0, -0.5])

        pass

    def test_mask_to_parent(self):
        """
        Test the mask handling in Gate node
        """
        X = GaussianARD(2, 1, shape=(4, 5), plates=(3, 2))
        F = Gate([0, 0, 1], X)
        self.assertAllClose(
            F._compute_weights_to_parent(0, [True, False, False]),
            [True, False, False]
        )
        self.assertAllClose(
            F._compute_weights_to_parent(1, [True, False, False]),
            [[True], [False], [False]]
        )
        pass


bayespy-0.6.2/bayespy/inference/vmp/nodes/tests/test_gaussian.py

################################################################################
# Copyright (C) 2013-2015 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################


"""
Unit tests for `gaussian` module.
"""

import numpy as np

from scipy import special
from numpy import testing

from .. import gaussian
from bayespy.nodes import (Gaussian,
                           GaussianARD,
                           GaussianGamma,
                           Gamma,
                           Wishart,
                           ConcatGaussian)
from ..wishart import WishartMoments

from ...vmp import VB

from bayespy.utils import misc
from bayespy.utils import linalg
from bayespy.utils import random

from bayespy.utils.misc import TestCase


class TestGaussianFunctions(TestCase):

    def test_rotate_covariance(self):
        """
        Test the Gaussian array covariance rotation.
        """
        # Check matrix
        R = np.random.randn(2,2)
        Cov = np.random.randn(2,2)
        self.assertAllClose(gaussian.rotate_covariance(Cov, R),
                            np.einsum('ik,kl,lj', R, Cov, R.T))

        # Check matrix with plates
        R = np.random.randn(2,2)
        Cov = np.random.randn(4,3,2,2)
        self.assertAllClose(gaussian.rotate_covariance(Cov, R),
                            np.einsum('...ik,...kl,...lj', R, Cov, R.T))

        # Check array, first axis
        R = np.random.randn(2,2)
        Cov = np.random.randn(2,3,3,2,3,3)
        self.assertAllClose(gaussian.rotate_covariance(Cov, R,
                                                       ndim=3,
                                                       axis=-3),
                            np.einsum('...ik,...kablcd,...lj->...iabjcd',
                                      R, Cov, R.T))
        self.assertAllClose(gaussian.rotate_covariance(Cov, R,
                                                       ndim=3,
                                                       axis=0),
                            np.einsum('...ik,...kablcd,...lj->...iabjcd',
                                      R, Cov, R.T))

        # Check array, middle axis
        R = np.random.randn(2,2)
        Cov = np.random.randn(3,2,3,3,2,3)
        self.assertAllClose(gaussian.rotate_covariance(Cov, R,
                                                       ndim=3,
                                                       axis=-2),
                            np.einsum('...ik,...akbcld,...lj->...aibcjd',
                                      R, Cov, R.T))
        self.assertAllClose(gaussian.rotate_covariance(Cov, R,
                                                       ndim=3,
                                                       axis=1),
                            np.einsum('...ik,...akbcld,...lj->...aibcjd',
                                      R, Cov, R.T))

        # Check array, last axis
        R = np.random.randn(2,2)
        Cov = np.random.randn(3,3,2,3,3,2)
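These checks compare `gaussian.rotate_covariance` against explicit einsum contractions. The underlying matrix identity, Cov' = R Cov R^T, can be sketched in standalone NumPy (independent of bayespy; `rng`, `A`, and the shapes are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
R = rng.standard_normal((2, 2))

# A symmetric positive definite "covariance" matrix
A = rng.standard_normal((2, 2))
Cov = A @ A.T + 2 * np.eye(2)

# Rotating a covariance matrix: Cov' = R Cov R^T
Cov_rot = np.einsum('ik,kl,jl->ij', R, Cov, R)

# The rotation preserves symmetry and positive semi-definiteness
assert np.allclose(Cov_rot, R @ Cov @ R.T)
assert np.allclose(Cov_rot, Cov_rot.T)
assert np.all(np.linalg.eigvalsh(Cov_rot) > -1e-8)
```

The einsum subscripts `'ik,kl,jl->ij'` contract the second index of each `R` factor, which is the same contraction the tests spell out axis by axis for higher-dimensional arrays.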
self.assertAllClose(gaussian.rotate_covariance(Cov, R, ndim=3, axis=-1), np.einsum('...ik,...abkcdl,...lj->...abicdj', R, Cov, R.T)) self.assertAllClose(gaussian.rotate_covariance(Cov, R, ndim=3, axis=2), np.einsum('...ik,...abkcdl,...lj->...abicdj', R, Cov, R.T)) # Check array, middle axis with plates R = np.random.randn(2,2) Cov = np.random.randn(4,4,3,2,3,3,2,3) self.assertAllClose(gaussian.rotate_covariance(Cov, R, ndim=3, axis=-2), np.einsum('...ik,...akbcld,...lj->...aibcjd', R, Cov, R.T)) self.assertAllClose(gaussian.rotate_covariance(Cov, R, ndim=3, axis=1), np.einsum('...ik,...akbcld,...lj->...aibcjd', R, Cov, R.T)) pass class TestGaussianARD(TestCase): def test_init(self): """ Test the constructor of GaussianARD """ def check_init(true_plates, true_shape, mu, alpha, **kwargs): X = GaussianARD(mu, alpha, **kwargs) self.assertEqual(X.dims, (true_shape, true_shape+true_shape), msg="Constructed incorrect dimensionality") self.assertEqual(X.plates, true_plates, msg="Constructed incorrect plates") # # Create from constant parents # # Use ndim=0 for constant mu check_init((), (), 0, 1) check_init((3,2), (), np.zeros((3,2,)), np.ones((2,))) check_init((4,2,2,3), (), np.zeros((2,1,3,)), np.ones((4,1,2,3))) # Use ndim check_init((4,2), (2,3), np.zeros((2,1,3,)), np.ones((4,1,2,3)), ndim=2) # Use shape check_init((4,2), (2,3), np.zeros((2,1,3,)), np.ones((4,1,2,3)), shape=(2,3)) # Use ndim and shape check_init((4,2), (2,3), np.zeros((2,1,3,)), np.ones((4,1,2,3)), ndim=2, shape=(2,3)) # # Create from node parents # # ndim=0 by default check_init((3,), (), GaussianARD(0, 1, plates=(3,)), Gamma(1, 1, plates=(3,))) check_init((4,2,2,3), (), GaussianARD(np.zeros((2,1,3)), np.ones((2,1,3)), ndim=3), Gamma(np.ones((4,1,2,3)), np.ones((4,1,2,3)))) # Use ndim check_init((4,), (2,2,3), GaussianARD(np.zeros((4,1,2,3)), np.ones((4,1,2,3)), ndim=2), Gamma(np.ones((4,2,1,3)), np.ones((4,2,1,3))), ndim=3) # Use shape check_init((4,), (2,2,3), GaussianARD(np.zeros((4,1,2,3)), 
np.ones((4,1,2,3)), ndim=2), Gamma(np.ones((4,2,1,3)), np.ones((4,2,1,3))), shape=(2,2,3)) # Use ndim and shape check_init((4,2), (2,3), GaussianARD(np.zeros((2,1,3)), np.ones((2,1,3)), ndim=2), Gamma(np.ones((4,1,2,3)), np.ones((4,1,2,3))), ndim=2, shape=(2,3)) # Test for a found bug check_init((), (3,), np.ones(3), 1, ndim=1) # Parent mu has more axes check_init( (2,), (3,), GaussianARD(np.zeros((2,3)), np.ones((2,3)), ndim=2), np.ones((2,3)), ndim=1 ) # DO NOT add axes if necessary self.assertRaises( ValueError, GaussianARD, GaussianARD(np.zeros((2,3)), np.ones((2,3)), ndim=2), 1, ndim=3 ) # # Errors # # Inconsistent shapes self.assertRaises(ValueError, GaussianARD, GaussianARD(np.zeros((2,3)), np.ones((2,3)), ndim=1), np.ones((4,3)), ndim=2) # Inconsistent dims of mu and alpha self.assertRaises(ValueError, GaussianARD, np.zeros((2,3)), np.ones((2,))) # Inconsistent plates of mu and alpha self.assertRaises(ValueError, GaussianARD, GaussianARD(np.zeros((3,2,3)), np.ones((3,2,3)), ndim=2), np.ones((3,4,2,3)), ndim=3) # Inconsistent ndim and shape self.assertRaises(ValueError, GaussianARD, np.zeros((2,3)), np.ones((2,)), shape=(2,3), ndim=1) # Incorrect shape self.assertRaises(ValueError, GaussianARD, GaussianARD(np.zeros((2,3)), np.ones((2,3)), ndim=2), np.ones((2,3)), shape=(2,2)) pass def test_message_to_child(self): """ Test moments of GaussianARD. 
""" # Check that moments have full shape when broadcasting X = GaussianARD(np.zeros((2,)), np.ones((3,2)), shape=(4,3,2)) (u0, u1) = X._message_to_child() self.assertEqual(np.shape(u0), (4,3,2)) self.assertEqual(np.shape(u1), (4,3,2,4,3,2)) # Check the formula X = GaussianARD(2, 3) (u0, u1) = X._message_to_child() self.assertAllClose(u0, 2) self.assertAllClose(u1, 2**2 + 1/3) # Check the formula for multidimensional arrays X = GaussianARD(2*np.ones((2,1,4)), 3*np.ones((2,3,1)), ndim=3) (u0, u1) = X._message_to_child() self.assertAllClose(u0, 2*np.ones((2,3,4))) self.assertAllClose(u1, 2**2 * np.ones((2,3,4,2,3,4)) + 1/3 * misc.identity(2,3,4)) # Check the formula for dim-broadcasted mu X = GaussianARD(2*np.ones((3,1)), 3*np.ones((2,3,4)), ndim=3) (u0, u1) = X._message_to_child() self.assertAllClose(u0, 2*np.ones((2,3,4))) self.assertAllClose(u1, 2**2 * np.ones((2,3,4,2,3,4)) + 1/3 * misc.identity(2,3,4)) # Check the formula for dim-broadcasted alpha X = GaussianARD(2*np.ones((2,3,4)), 3*np.ones((3,1)), ndim=3) (u0, u1) = X._message_to_child() self.assertAllClose(u0, 2*np.ones((2,3,4))) self.assertAllClose(u1, 2**2 * np.ones((2,3,4,2,3,4)) + 1/3 * misc.identity(2,3,4)) # Check the formula for dim-broadcasted mu and alpha X = GaussianARD(2*np.ones((3,1)), 3*np.ones((3,1)), shape=(2,3,4)) (u0, u1) = X._message_to_child() self.assertAllClose(u0, 2*np.ones((2,3,4))) self.assertAllClose(u1, 2**2 * np.ones((2,3,4,2,3,4)) + 1/3 * misc.identity(2,3,4)) # Check the formula for dim-broadcasted mu with plates mu = GaussianARD(2*np.ones((5,1,3,4)), np.ones((5,1,3,4)), shape=(3,4), plates=(5,1)) X = GaussianARD(mu, 3*np.ones((5,2,3,4)), shape=(2,3,4), plates=(5,)) (u0, u1) = X._message_to_child() self.assertAllClose(u0, 2*np.ones((5,2,3,4))) self.assertAllClose(u1, 2**2 * np.ones((5,2,3,4,2,3,4)) + 1/3 * misc.identity(2,3,4)) # Check posterior X = GaussianARD(2, 3) Y = GaussianARD(X, 1) Y.observe(10) X.update() (u0, u1) = X._message_to_child() self.assertAllClose(u0, 1/(3+1) * 
(3*2 + 1*10)) self.assertAllClose(u1, (1/(3+1) * (3*2 + 1*10))**2 + 1/(3+1)) pass def test_message_to_parent_mu(self): """ Test that GaussianARD computes the message to the 1st parent correctly. """ # Check formula with uncertain parent alpha mu = GaussianARD(0, 1) alpha = Gamma(2,1) X = GaussianARD(mu, alpha) X.observe(3) (m0, m1) = mu._message_from_children() #(m0, m1) = X._message_to_parent(0) self.assertAllClose(m0, 2*3) self.assertAllClose(m1, -0.5*2) # Check formula with uncertain node mu = GaussianARD(1, 1e10) X = GaussianARD(mu, 2) Y = GaussianARD(X, 1) Y.observe(5) X.update() (m0, m1) = mu._message_from_children() self.assertAllClose(m0, 2 * 1/(2+1)*(2*1+1*5)) self.assertAllClose(m1, -0.5*2) # Check alpha larger than mu mu = GaussianARD(np.zeros((2,3)), 1e10, shape=(2,3)) X = GaussianARD(mu, 2*np.ones((3,2,3))) X.observe(3*np.ones((3,2,3))) (m0, m1) = mu._message_from_children() self.assertAllClose(m0, 2*3 * 3 * np.ones((2,3))) self.assertAllClose(m1, -0.5 * 3 * 2*misc.identity(2,3)) # Check mu larger than alpha mu = GaussianARD(np.zeros((3,2,3)), 1e10, shape=(3,2,3)) X = GaussianARD(mu, 2*np.ones((2,3))) X.observe(3*np.ones((3,2,3))) (m0, m1) = mu._message_from_children() self.assertAllClose(m0, 2 * 3 * np.ones((3,2,3))) self.assertAllClose(m1, -0.5 * 2*misc.identity(3,2,3)) # Check node larger than mu and alpha mu = GaussianARD(np.zeros((2,3)), 1e10, shape=(2,3)) X = GaussianARD(mu, 2*np.ones((3,)), shape=(3,2,3)) X.observe(3*np.ones((3,2,3))) (m0, m1) = mu._message_from_children() self.assertAllClose(m0, 2*3 * 3*np.ones((2,3))) self.assertAllClose(m1, -0.5 * 2 * 3*misc.identity(2,3)) # Check broadcasting of dimensions mu = GaussianARD(np.zeros((2,1)), 1e10, shape=(2,1)) X = GaussianARD(mu, 2*np.ones((2,3)), shape=(2,3)) X.observe(3*np.ones((2,3))) (m0, m1) = mu._message_from_children() self.assertAllClose(m0, 2*3 * 3*np.ones((2,1))) self.assertAllClose(m1, -0.5 * 2 * 3*misc.identity(2,1)) # Check plates for smaller mu than node mu = GaussianARD(0,1, 
shape=(3,), plates=(4,1,1)) X = GaussianARD(mu, 2*np.ones((3,)), shape=(2,3), plates=(4,5)) X.observe(3*np.ones((4,5,2,3))) (m0, m1) = mu._message_from_children() self.assertAllClose(m0 * np.ones((4,1,1,3)), 2*3 * 5*2*np.ones((4,1,1,3))) self.assertAllClose(m1 * np.ones((4,1,1,3,3)), -0.5*2 * 5*2*misc.identity(3) * np.ones((4,1,1,3,3))) # Check mask mu = GaussianARD(np.zeros((2,1,3)), 1e10, shape=(3,)) X = GaussianARD(mu, 2*np.ones((2,4,3)), shape=(3,), plates=(2,4,)) X.observe(3*np.ones((2,4,3)), mask=[[True, True, True, False], [False, True, False, True]]) (m0, m1) = mu._message_from_children() self.assertAllClose(m0, (2*3 * np.ones((2,1,3)) * np.array([[[3]], [[2]]]))) self.assertAllClose(m1, (-0.5*2 * misc.identity(3) * np.ones((2,1,1,1)) * np.array([[[[3]]], [[[2]]]]))) # Check mask with different shapes mu = GaussianARD(np.zeros((2,1,3)), 1e10, shape=()) X = GaussianARD(mu, 2*np.ones((2,4,3)), shape=(3,), plates=(2,4,)) mask = np.array([[True, True, True, False], [False, True, False, True]]) X.observe(3*np.ones((2,4,3)), mask=mask) (m0, m1) = mu._message_from_children() self.assertAllClose(m0, 2*3 * np.sum(np.ones((2,4,3))*mask[...,None], axis=-2, keepdims=True)) self.assertAllClose(m1, (-0.5*2 * np.sum(np.ones((2,4,3))*mask[...,None], axis=-2, keepdims=True))) # Check non-ARD Gaussian child mu = np.array([1,2]) Mu = GaussianARD(mu, 1e10, shape=(2,)) alpha = np.array([3,4]) Lambda = np.array([[1, 0.5], [0.5, 1]]) X = GaussianARD(Mu, alpha, ndim=1) Y = Gaussian(X, Lambda) y = np.array([5,6]) Y.observe(y) X.update() (m0, m1) = Mu._message_from_children() mean = np.dot(np.linalg.inv(np.diag(alpha)+Lambda), np.dot(np.diag(alpha), mu) + np.dot(Lambda, y)) self.assertAllClose(m0, np.dot(np.diag(alpha), mean)) self.assertAllClose(m1, -0.5*np.diag(alpha)) # Check broadcasted variable axes mu = GaussianARD(np.zeros(1), 1e10, shape=(1,)) X = GaussianARD(mu, 2, shape=(3,)) X.observe(3*np.ones(3)) (m0, m1) = mu._message_from_children() self.assertAllClose(m0, 2*3 * 
np.sum(np.ones(3), axis=-1, keepdims=True)) self.assertAllClose(m1, -0.5*2 * np.sum(np.identity(3), axis=(-1,-2), keepdims=True)) pass def test_message_to_parent_alpha(self): """ Test the message from GaussianARD the 2nd parent (alpha). """ # Check formula with uncertain parent mu mu = GaussianARD(1,1) tau = Gamma(0.5*1e10, 1e10) X = GaussianARD(mu, tau) X.observe(3) (m0, m1) = tau._message_from_children() self.assertAllClose(m0, -0.5*(3**2 - 2*3*1 + 1**2+1)) self.assertAllClose(m1, 0.5) # Check formula with uncertain node tau = Gamma(1e10, 1e10) X = GaussianARD(2, tau) Y = GaussianARD(X, 1) Y.observe(5) X.update() (m0, m1) = tau._message_from_children() self.assertAllClose(m0, -0.5*(1/(1+1)+3.5**2 - 2*3.5*2 + 2**2)) self.assertAllClose(m1, 0.5) # Check alpha larger than mu alpha = Gamma(np.ones((3,2,3))*1e10, 1e10) X = GaussianARD(np.ones((2,3)), alpha, ndim=3) X.observe(2*np.ones((3,2,3))) (m0, m1) = alpha._message_from_children() self.assertAllClose(m0 * np.ones((3,2,3)), -0.5*(2**2 - 2*2*1 + 1**2) * np.ones((3,2,3))) self.assertAllClose(m1*np.ones((3,2,3)), 0.5*np.ones((3,2,3))) # Check mu larger than alpha tau = Gamma(np.ones((2,3))*1e10, 1e10) X = GaussianARD(np.ones((3,2,3)), tau, ndim=3) X.observe(2*np.ones((3,2,3))) (m0, m1) = tau._message_from_children() self.assertAllClose(m0, -0.5*(2**2 - 2*2*1 + 1**2) * 3 * np.ones((2,3))) self.assertAllClose(m1 * np.ones((2,3)), 0.5 * 3 * np.ones((2,3))) # Check node larger than mu and alpha tau = Gamma(np.ones((3,))*1e10, 1e10) X = GaussianARD(np.ones((2,3)), tau, shape=(3,2,3)) X.observe(2*np.ones((3,2,3))) (m0, m1) = tau._message_from_children() self.assertAllClose(m0 * np.ones(3), -0.5*(2**2 - 2*2*1 + 1**2) * 6 * np.ones((3,))) self.assertAllClose(m1 * np.ones(3), 0.5 * 6 * np.ones(3)) # Check plates for smaller mu than node tau = Gamma(np.ones((4,1,2,3))*1e10, 1e10) X = GaussianARD(GaussianARD(1, 1, shape=(3,), plates=(4,1,1)), tau, shape=(2,3), plates=(4,5)) X.observe(2*np.ones((4,5,2,3))) (m0, m1) = 
tau._message_from_children() self.assertAllClose(m0 * np.ones((4,1,2,3)), (-0.5 * (2**2 - 2*2*1 + 1**2+1) * 5*np.ones((4,1,2,3)))) self.assertAllClose(m1 * np.ones((4,1,2,3)), 5*0.5 * np.ones((4,1,2,3))) # Check mask tau = Gamma(np.ones((4,3))*1e10, 1e10) X = GaussianARD(np.ones(3), tau, shape=(3,), plates=(2,4,)) X.observe(2*np.ones((2,4,3)), mask=[[True, False, True, False], [False, True, True, False]]) (m0, m1) = tau._message_from_children() self.assertAllClose(m0 * np.ones((4,3)), (-0.5 * (2**2 - 2*2*1 + 1**2) * np.ones((4,3)) * np.array([[1], [1], [2], [0]]))) self.assertAllClose(m1 * np.ones((4,3)), 0.5 * np.array([[1], [1], [2], [0]]) * np.ones((4,3))) # Check non-ARD Gaussian child mu = np.array([1,2]) alpha = np.array([3,4]) Alpha = Gamma(alpha*1e10, 1e10) Lambda = np.array([[1, 0.5], [0.5, 1]]) X = GaussianARD(mu, Alpha, ndim=1) Y = Gaussian(X, Lambda) y = np.array([5,6]) Y.observe(y) X.update() (m0, m1) = Alpha._message_from_children() Cov = np.linalg.inv(np.diag(alpha)+Lambda) mean = np.dot(Cov, np.dot(np.diag(alpha), mu) + np.dot(Lambda, y)) self.assertAllClose(m0 * np.ones(2), -0.5 * np.diag( np.outer(mean, mean) + Cov - np.outer(mean, mu) - np.outer(mu, mean) + np.outer(mu, mu))) self.assertAllClose(m1 * np.ones(2), 0.5 * np.ones(2)) pass def test_message_to_parents(self): """ Check gradient passed to inputs parent node """ D = 3 X = Gaussian(np.random.randn(D), random.covariance(D)) a = Gamma(np.random.rand(D), np.random.rand(D)) Y = GaussianARD(X, a) Y.observe(np.random.randn(D)) self.assert_message_to_parent(Y, X) self.assert_message_to_parent(Y, a) pass def test_lowerbound(self): """ Test the variational Bayesian lower bound term for GaussianARD. 
""" # Test vector formula with full noise covariance m = np.random.randn(2) alpha = np.random.rand(2) y = np.random.randn(2) X = GaussianARD(m, alpha, ndim=1) V = np.array([[3,1],[1,3]]) Y = Gaussian(X, V) Y.observe(y) X.update() Cov = np.linalg.inv(np.diag(alpha) + V) mu = np.dot(Cov, np.dot(V, y) + alpha*m) x2 = np.outer(mu, mu) + Cov logH_X = (+ 2*0.5*(1+np.log(2*np.pi)) + 0.5*np.log(np.linalg.det(Cov))) logp_X = (- 2*0.5*np.log(2*np.pi) + 0.5*np.log(np.linalg.det(np.diag(alpha))) - 0.5*np.sum(np.diag(alpha) * (x2 - np.outer(mu,m) - np.outer(m,mu) + np.outer(m,m)))) self.assertAllClose(logp_X + logH_X, X.lower_bound_contribution()) def check_lower_bound(shape_mu, shape_alpha, plates_mu=(), **kwargs): M = GaussianARD(np.ones(plates_mu + shape_mu), np.ones(plates_mu + shape_mu), shape=shape_mu, plates=plates_mu) if not ('ndim' in kwargs or 'shape' in kwargs): kwargs['ndim'] = len(shape_mu) X = GaussianARD(M, 2*np.ones(shape_alpha), **kwargs) Y = GaussianARD(X, 3*np.ones(X.get_shape(0)), **kwargs) Y.observe(4*np.ones(Y.get_shape(0))) X.update() Cov = 1/(2+3) mu = Cov * (2*1 + 3*4) x2 = mu**2 + Cov logH_X = (+ 0.5*(1+np.log(2*np.pi)) + 0.5*np.log(Cov)) logp_X = (- 0.5*np.log(2*np.pi) + 0.5*np.log(2) - 0.5*2*(x2 - 2*mu*1 + 1**2+1)) r = np.prod(X.get_shape(0)) self.assertAllClose(r * (logp_X + logH_X), X.lower_bound_contribution()) # Test scalar formula check_lower_bound((), ()) # Test array formula check_lower_bound((2,3), (2,3)) # Test dim-broadcasting of mu check_lower_bound((3,1), (2,3,4)) # Test dim-broadcasting of alpha check_lower_bound((2,3,4), (3,1)) # Test dim-broadcasting of mu and alpha check_lower_bound((3,1), (3,1), shape=(2,3,4)) # Test dim-broadcasting of mu with plates check_lower_bound((), (), plates_mu=(), shape=(), plates=(5,)) # BUG: Scalar parents for array variable caused einsum error check_lower_bound((), (), shape=(3,)) # BUG: Log-det was summed over plates check_lower_bound((), (), shape=(3,), plates=(4,)) pass def test_rotate(self): """ Test 
the rotation of Gaussian ARD arrays. """ def check(shape, plates, einsum_x, einsum_xx, axis=-1): # TODO/FIXME: Improve by having non-diagonal precision/covariance # parameter for the Gaussian X D = shape[axis] X = GaussianARD(np.random.randn(*(plates+shape)), np.random.rand(*(plates+shape)), shape=shape, plates=plates) (x, xx) = X.get_moments() R = np.random.randn(D,D) X.rotate(R, axis=axis) (rx, rxxr) = X.get_moments() self.assertAllClose(rx, np.einsum(einsum_x, R, x)) self.assertAllClose(rxxr, np.einsum(einsum_xx, R, xx, R)) pass # Rotate vector check((3,), (), '...jk,...k->...j', '...mk,...kl,...nl->...mn') check((3,), (2,4), '...jk,...k->...j', '...mk,...kl,...nl->...mn') # Rotate array check((2,3,4), (), '...jc,...abc->...abj', '...mc,...abcdef,...nf->...abmden', axis=-1) check((2,3,4), (5,6), '...jc,...abc->...abj', '...mc,...abcdef,...nf->...abmden', axis=-1) check((2,3,4), (), '...jb,...abc->...ajc', '...mb,...abcdef,...ne->...amcdnf', axis=-2) check((2,3,4), (5,6), '...jb,...abc->...ajc', '...mb,...abcdef,...ne->...amcdnf', axis=-2) check((2,3,4), (), '...ja,...abc->...jbc', '...ma,...abcdef,...nd->...mbcnef', axis=-3) check((2,3,4), (5,6), '...ja,...abc->...jbc', '...ma,...abcdef,...nd->...mbcnef', axis=-3) pass def test_rotate_plates(self): # Basic test for Gaussian vectors X = GaussianARD(np.random.randn(3,2), np.random.rand(3,2), shape=(2,), plates=(3,)) (u0, u1) = X.get_moments() Cov = u1 - linalg.outer(u0, u0, ndim=1) Q = np.random.randn(3,3) Qu0 = np.einsum('ik,kj->ij', Q, u0) QCov = np.einsum('k,kij->kij', np.sum(Q, axis=0)**2, Cov) Qu1 = QCov + linalg.outer(Qu0, Qu0, ndim=1) X.rotate_plates(Q, plate_axis=-1) (u0, u1) = X.get_moments() self.assertAllClose(u0, Qu0) self.assertAllClose(u1, Qu1) # Test full covariance, that is, with observations X = GaussianARD(np.random.randn(3,2), np.random.rand(3,2), shape=(2,), plates=(3,)) Y = Gaussian(X, [[2.0, 1.5], [1.5, 3.0]], plates=(3,)) Y.observe(np.random.randn(3,2)) X.update() (u0, u1) = X.get_moments() 
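The plate-rotation checks here and the axis-rotation checks in test_rotate rest on the identities <Rx> = R<x> and <Rx(Rx)^T> = R<xx^T>R^T. A minimal self-contained NumPy sketch of those identities for a single Gaussian vector (all names are illustrative, not BayesPy API):

```python
import numpy as np

# Moments of a Gaussian q(x) = N(mu, Cov)
rng = np.random.default_rng(0)
mu = rng.standard_normal(3)
Cov = np.diag(rng.random(3) + 0.1)

x = mu                           # first moment <x>
xx = np.outer(mu, mu) + Cov      # second moment <xx^T>

# Rotate by an arbitrary matrix R and transform the moments
R = rng.standard_normal((3, 3))
rx = np.einsum('jk,k->j', R, x)             # <Rx> = R <x>
rxxr = np.einsum('mk,kl,nl->mn', R, xx, R)  # <Rx(Rx)^T> = R <xx^T> R^T
```

The einsum strings match the vector case used in test_rotate ('...jk,...k->...j' and '...mk,...kl,...nl->...mn' without the plate ellipses).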
Cov = u1 - linalg.outer(u0, u0, ndim=1) Q = np.random.randn(3,3) Qu0 = np.einsum('ik,kj->ij', Q, u0) QCov = np.einsum('k,kij->kij', np.sum(Q, axis=0)**2, Cov) Qu1 = QCov + linalg.outer(Qu0, Qu0, ndim=1) X.rotate_plates(Q, plate_axis=-1) (u0, u1) = X.get_moments() self.assertAllClose(u0, Qu0) self.assertAllClose(u1, Qu1) pass def test_initialization(self): """ Test initialization methods of GaussianARD """ X = GaussianARD(1, 2, shape=(2,), plates=(3,)) # Prior initialization mu = 1 * np.ones((3, 2)) alpha = 2 * np.ones((3, 2)) X.initialize_from_prior() u = X._message_to_child() self.assertAllClose(u[0]*np.ones((3,2)), mu) self.assertAllClose(u[1]*np.ones((3,2,2)), linalg.outer(mu, mu, ndim=1) + misc.diag(1/alpha, ndim=1)) # Parameter initialization mu = np.random.randn(3, 2) alpha = np.random.rand(3, 2) X.initialize_from_parameters(mu, alpha) u = X._message_to_child() self.assertAllClose(u[0], mu) self.assertAllClose(u[1], linalg.outer(mu, mu, ndim=1) + misc.diag(1/alpha, ndim=1)) # Value initialization x = np.random.randn(3, 2) X.initialize_from_value(x) u = X._message_to_child() self.assertAllClose(u[0], x) self.assertAllClose(u[1], linalg.outer(x, x, ndim=1)) # Random initialization X.initialize_from_random() pass class TestGaussianGamma(TestCase): """ Unit tests for GaussianGamma node. 
""" def test_init(self): """ Test the creation of GaussianGamma node """ # Test 0-ndim Gaussian-Gamma X_alpha = GaussianGamma([1,2], [0.1, 0.2], [0.02, 0.03], [0.03, 0.04], ndim=0) # Simple construction X_alpha = GaussianGamma([1,2,3], np.identity(3), 2, 10) self.assertEqual(X_alpha.plates, ()) self.assertEqual(X_alpha.dims, ( (3,), (3,3), (), () )) # Plates X_alpha = GaussianGamma([1,2,3], np.identity(3), 2, 10, plates=(4,)) self.assertEqual(X_alpha.plates, (4,)) self.assertEqual(X_alpha.dims, ( (3,), (3,3), (), () )) # Plates in mu X_alpha = GaussianGamma(np.ones((4,3)), np.identity(3), 2, 10) self.assertEqual(X_alpha.plates, (4,)) self.assertEqual(X_alpha.dims, ( (3,), (3,3), (), () )) # Plates in Lambda X_alpha = GaussianGamma(np.ones(3), np.ones((4,3,3))*np.identity(3), 2, 10) self.assertEqual(X_alpha.plates, (4,)) self.assertEqual(X_alpha.dims, ( (3,), (3,3), (), () )) # Plates in a X_alpha = GaussianGamma(np.ones(3), np.identity(3), np.ones(4), 10) self.assertEqual(X_alpha.plates, (4,)) self.assertEqual(X_alpha.dims, ( (3,), (3,3), (), () )) # Plates in Lambda X_alpha = GaussianGamma(np.ones(3), np.identity(3), 2, np.ones(4)) self.assertEqual(X_alpha.plates, (4,)) self.assertEqual(X_alpha.dims, ( (3,), (3,3), (), () )) # Inconsistent plates self.assertRaises(ValueError, GaussianGamma, np.ones((4,3)), np.identity(3), 2, 10, plates=()) # Inconsistent plates self.assertRaises(ValueError, GaussianGamma, np.ones((4,3)), np.identity(3), 2, 10, plates=(5,)) # Unknown parameters mu = Gaussian(np.zeros(3), np.identity(3)) Lambda = Wishart(10, np.identity(3)) b = Gamma(1, 1) X_alpha = GaussianGamma(mu, Lambda, 2, b) self.assertEqual(X_alpha.plates, ()) self.assertEqual(X_alpha.dims, ( (3,), (3,3), (), () )) # mu is Gaussian-gamma mu_tau = GaussianGamma(np.ones(3), np.identity(3), 5, 5) X_alpha = GaussianGamma(mu_tau, np.identity(3), 5, 5) self.assertEqual(X_alpha.plates, ()) self.assertEqual(X_alpha.dims, ( (3,), (3,3), (), () )) pass def test_message_to_child(self): 
""" Test the message to child of GaussianGamma node. """ # Simple test mu = np.array([1,2,3]) Lambda = np.identity(3) a = 2 b = 10 X_alpha = GaussianGamma(mu, Lambda, a, b) u = X_alpha._message_to_child() self.assertEqual(len(u), 4) tau = np.array(a/b) self.assertAllClose(u[0], tau[...,None] * mu) self.assertAllClose(u[1], (linalg.inv(Lambda) + tau[...,None,None] * linalg.outer(mu, mu))) self.assertAllClose(u[2], tau) self.assertAllClose(u[3], -np.log(b) + special.psi(a)) # Test with unknown parents mu = Gaussian(np.arange(3), 10*np.identity(3)) Lambda = Wishart(10, np.identity(3)) a = 2 b = Gamma(3, 15) X_alpha = GaussianGamma(mu, Lambda, a, b) u = X_alpha._message_to_child() (mu, mumu) = mu._message_to_child() Cov_mu = mumu - linalg.outer(mu, mu) (Lambda, _) = Lambda._message_to_child() (b, _) = b._message_to_child() (tau, logtau) = Gamma(a, b + 0.5*np.sum(Lambda*Cov_mu))._message_to_child() self.assertAllClose(u[0], tau[...,None] * mu) self.assertAllClose(u[1], (linalg.inv(Lambda) + tau[...,None,None] * linalg.outer(mu, mu))) self.assertAllClose(u[2], tau) self.assertAllClose(u[3], logtau) # Test with plates mu = Gaussian(np.reshape(np.arange(3*4), (4,3)), 10*np.identity(3), plates=(4,)) Lambda = Wishart(10, np.identity(3)) a = 2 b = Gamma(3, 15) X_alpha = GaussianGamma(mu, Lambda, a, b, plates=(4,)) u = X_alpha._message_to_child() (mu, mumu) = mu._message_to_child() Cov_mu = mumu - linalg.outer(mu, mu) (Lambda, _) = Lambda._message_to_child() (b, _) = b._message_to_child() (tau, logtau) = Gamma(a, b + 0.5*np.sum(Lambda*Cov_mu, axis=(-1,-2)))._message_to_child() self.assertAllClose(u[0] * np.ones((4,1)), np.ones((4,1)) * tau[...,None] * mu) self.assertAllClose(u[1] * np.ones((4,1,1)), np.ones((4,1,1)) * (linalg.inv(Lambda) + tau[...,None,None] * linalg.outer(mu, mu))) self.assertAllClose(u[2] * np.ones(4), np.ones(4) * tau) self.assertAllClose(u[3] * np.ones(4), np.ones(4) * logtau) pass def test_mask_to_parent(self): """ Test the mask handling in GaussianGamma 
node """ pass def test_messages(self): D = 2 M = 3 np.random.seed(42) def check(mu, Lambda, alpha, beta, ndim): X = GaussianGamma( mu, ( Lambda if isinstance(Lambda._moments, WishartMoments) else Lambda.as_wishart(ndim=ndim) ), alpha, beta, ndim=ndim ) self.assert_moments( X, postprocess=lambda u: [ u[0], u[1] + linalg.transpose(u[1], ndim=ndim), u[2], u[3] ], rtol=1e-5, atol=1e-6, eps=1e-8 ) X.observe( ( np.random.randn(*(X.plates + X.dims[0])), np.random.rand(*X.plates) ) ) self.assert_message_to_parent(X, mu) self.assert_message_to_parent( X, Lambda, postprocess=lambda m: [ m[0] + linalg.transpose(m[0], ndim=ndim), m[1], ] ) self.assert_message_to_parent(X, beta) check( Gaussian(np.random.randn(M, D), random.covariance(D), plates=(M,)), Wishart(D + np.random.rand(M), random.covariance(D), plates=(M,)), np.random.rand(M), Gamma(np.random.rand(M), np.random.rand(M), plates=(M,)), ndim=1 ) check( GaussianARD(np.random.randn(M, D), np.random.rand(M, D), ndim=0), Gamma(np.random.rand(M, D), np.random.rand(M, D)), np.random.rand(M, D), Gamma(np.random.rand(M, D), np.random.rand(M, D)), ndim=0 ) pass class TestGaussian(TestCase): def test_message_to_parents(self): """ Check gradient passed to inputs parent node """ D = 3 X = Gaussian(np.random.randn(D), random.covariance(D)) V = Wishart(D + np.random.rand(), random.covariance(D)) Y = Gaussian(X, V) self.assert_moments( Y, lambda u: [u[0], u[1] + u[1].T], rtol=1e-3, ) Y.observe(np.random.randn(D)) self.assert_message_to_parent(Y, X) #self.assert_message_to_parent(Y, V) pass class TestGaussianGradient(TestCase): """Numerically check Riemannian gradient of several nodes. Using VB-EM update equations will take a unit length step to the Riemannian gradient direction. Thus, the change caused by a VB-EM update and the Riemannian gradient should be equal. 
""" def test_riemannian_gradient(self): """Test Riemannian gradient of a Gaussian node.""" D = 3 # # Without observations # # Construct model mu = np.random.randn(D) Lambda = random.covariance(D) X = Gaussian(mu, Lambda) # Random initialization mu0 = np.random.randn(D) Lambda0 = random.covariance(D) X.initialize_from_parameters(mu0, Lambda0) # Initial parameters phi0 = X.phi # Gradient g = X.get_riemannian_gradient() # Parameters after VB-EM update X.update() phi1 = X.phi # Check self.assertAllClose(g[0], phi1[0] - phi0[0]) self.assertAllClose(g[1], phi1[1] - phi0[1]) # TODO/FIXME: Actually, gradient should be zero because cost function # is zero without observations! Use the mask! # # With observations # # Construct model mu = np.random.randn(D) Lambda = random.covariance(D) X = Gaussian(mu, Lambda) V = random.covariance(D) Y = Gaussian(X, V) Y.observe(np.random.randn(D)) # Random initialization mu0 = np.random.randn(D) Lambda0 = random.covariance(D) X.initialize_from_parameters(mu0, Lambda0) # Initial parameters phi0 = X.phi # Gradient g = X.get_riemannian_gradient() # Parameters after VB-EM update X.update() phi1 = X.phi # Check self.assertAllClose(g[0], phi1[0] - phi0[0]) self.assertAllClose(g[1], phi1[1] - phi0[1]) pass def test_gradient(self): """Test standard gradient of a Gaussian node.""" D = 3 np.random.seed(42) # # Without observations # # Construct model mu = np.random.randn(D) Lambda = random.covariance(D) X = Gaussian(mu, Lambda) # Random initialization mu0 = np.random.randn(D) Lambda0 = random.covariance(D) X.initialize_from_parameters(mu0, Lambda0) Q = VB(X) # Initial parameters phi0 = X.phi # Gradient rg = X.get_riemannian_gradient() g = X.get_gradient(rg) # Numerical gradient eps = 1e-6 p0 = X.get_parameters() l0 = Q.compute_lowerbound(ignore_masked=False) g_num = [np.zeros(D), np.zeros((D,D))] for i in range(D): e = np.zeros(D) e[i] = eps p1 = p0[0] + e X.set_parameters([p1, p0[1]]) l1 = Q.compute_lowerbound(ignore_masked=False) g_num[0][i] = (l1 
- l0) / eps for i in range(D): for j in range(i+1): e = np.zeros((D,D)) e[i,j] += eps e[j,i] += eps p1 = p0[1] + e X.set_parameters([p0[0], p1]) l1 = Q.compute_lowerbound(ignore_masked=False) g_num[1][i,j] = (l1 - l0) / (2*eps) g_num[1][j,i] = (l1 - l0) / (2*eps) # Check self.assertAllClose(g[0], g_num[0]) self.assertAllClose(g[1], g_num[1]) # # With observations # # Construct model mu = np.random.randn(D) Lambda = random.covariance(D) X = Gaussian(mu, Lambda) # Random initialization mu0 = np.random.randn(D) Lambda0 = random.covariance(D) X.initialize_from_parameters(mu0, Lambda0) V = random.covariance(D) Y = Gaussian(X, V) Y.observe(np.random.randn(D)) Q = VB(Y, X) # Initial parameters phi0 = X.phi # Gradient rg = X.get_riemannian_gradient() g = X.get_gradient(rg) # Numerical gradient eps = 1e-6 p0 = X.get_parameters() l0 = Q.compute_lowerbound() g_num = [np.zeros(D), np.zeros((D,D))] for i in range(D): e = np.zeros(D) e[i] = eps p1 = p0[0] + e X.set_parameters([p1, p0[1]]) l1 = Q.compute_lowerbound() g_num[0][i] = (l1 - l0) / eps for i in range(D): for j in range(i+1): e = np.zeros((D,D)) e[i,j] += eps e[j,i] += eps p1 = p0[1] + e X.set_parameters([p0[0], p1]) l1 = Q.compute_lowerbound() g_num[1][i,j] = (l1 - l0) / (2*eps) g_num[1][j,i] = (l1 - l0) / (2*eps) # Check self.assertAllClose(g[0], g_num[0]) self.assertAllClose(g[1], g_num[1]) # # With plates # # Construct model K = D+1 mu = np.random.randn(D) Lambda = random.covariance(D) X = Gaussian(mu, Lambda, plates=(K,)) V = random.covariance(D, size=(K,)) Y = Gaussian(X, V) Y.observe(np.random.randn(K,D)) Q = VB(Y, X) # Random initialization mu0 = np.random.randn(*(X.get_shape(0))) Lambda0 = random.covariance(D, size=X.plates) X.initialize_from_parameters(mu0, Lambda0) # Initial parameters phi0 = X.phi # Gradient rg = X.get_riemannian_gradient() g = X.get_gradient(rg) # Numerical gradient eps = 1e-6 p0 = X.get_parameters() l0 = Q.compute_lowerbound() g_num = [np.zeros(X.get_shape(0)), np.zeros(X.get_shape(1))] 
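The numerical gradients in these tests use one-sided forward differences of the lower bound: perturb one parameter at a time by eps and divide the change in the objective by eps. The same idea as an isolated generic helper (the function and the quadratic test objective are illustrative, not part of BayesPy):

```python
import numpy as np

def numerical_gradient(f, p0, eps=1e-6):
    """Forward-difference approximation of the gradient of f at p0."""
    g = np.zeros_like(p0)
    f0 = f(p0)
    for i in range(p0.size):
        p1 = p0.copy()
        p1.flat[i] += eps          # perturb one coordinate
        g.flat[i] = (f(p1) - f0) / eps
    return g

# Example: f(p) = -0.5 p^T p has exact gradient -p
p0 = np.array([1.0, -2.0, 0.5])
g = numerical_gradient(lambda p: -0.5 * np.dot(p, p), p0)
```

The forward-difference error is O(eps), which is why the tests compare with loose tolerances via assertAllClose.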
        for k in range(K):
            for i in range(D):
                e = np.zeros(X.get_shape(0))
                e[k,i] = eps
                p1 = p0[0] + e
                X.set_parameters([p1, p0[1]])
                l1 = Q.compute_lowerbound()
                g_num[0][k,i] = (l1 - l0) / eps
            for i in range(D):
                for j in range(i+1):
                    e = np.zeros(X.get_shape(1))
                    e[k,i,j] += eps
                    e[k,j,i] += eps
                    p1 = p0[1] + e
                    X.set_parameters([p0[0], p1])
                    l1 = Q.compute_lowerbound()
                    g_num[1][k,i,j] = (l1 - l0) / (2*eps)
                    g_num[1][k,j,i] = (l1 - l0) / (2*eps)

        # Check
        self.assertAllClose(g[0], g_num[0])
        self.assertAllClose(g[1], g_num[1])

        pass


class TestConcatGaussian(TestCase):

    def test_message_to_parents(self):
        np.random.seed(42)
        N = 5
        D1 = 3
        D2 = 4
        D3 = 2
        X1 = Gaussian(np.random.randn(N, D1), random.covariance(D1))
        X2 = Gaussian(np.random.randn(N, D2), random.covariance(D2))
        X3 = np.random.randn(N, D3)
        Z = ConcatGaussian(X1, X2, X3)
        Y = Gaussian(Z, random.covariance(D1 + D2 + D3))
        Y.observe(np.random.randn(*(Y.plates + Y.dims[0])))
        self.assert_message_to_parent(Y, X1, eps=1e-7, rtol=1e-5, atol=1e-5)
        self.assert_message_to_parent(Y, X2, eps=1e-7, rtol=1e-5, atol=1e-5)
        pass

    def test_moments(self):
        np.random.seed(42)
        N = 4
        D1 = 2
        D2 = 3
        X1 = Gaussian(np.random.randn(N, D1), random.covariance(D1))
        X2 = Gaussian(np.random.randn(N, D2), random.covariance(D2))
        Z = ConcatGaussian(X1, X2)
        u = Z._message_to_child()
        # First moment
        self.assertAllClose(u[0][...,:D1], X1.u[0])
        self.assertAllClose(u[0][...,D1:], X2.u[0])
        # Second moment
        self.assertAllClose(u[1][...,:D1,:D1], X1.u[1])
        self.assertAllClose(u[1][...,D1:,D1:], X2.u[1])
        self.assertAllClose(u[1][...,:D1,D1:],
                            X1.u[0][...,:,None] * X2.u[0][...,None,:])
        self.assertAllClose(u[1][...,D1:,:D1],
                            X2.u[0][...,:,None] * X1.u[0][...,None,:])
        pass
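test_moments above checks the block structure of the concatenated node's second moment: diagonal blocks are the parents' own second moments, and, because the parents are independent, off-diagonal blocks are outer products of the means. The expected structure built directly in NumPy (a sketch with illustrative names, diagonal covariances for brevity):

```python
import numpy as np

# Two independent Gaussians x1 (dim 2) and x2 (dim 3)
rng = np.random.default_rng(0)
m1 = rng.standard_normal(2)
m2 = rng.standard_normal(3)
C1 = np.diag(rng.random(2) + 0.1)
C2 = np.diag(rng.random(3) + 0.1)

# Moments of the concatenation z = [x1; x2]
u0 = np.concatenate([m1, m2])               # <z>
u1 = np.zeros((5, 5))                       # <zz^T>
u1[:2, :2] = np.outer(m1, m1) + C1          # <x1 x1^T>
u1[2:, 2:] = np.outer(m2, m2) + C2          # <x2 x2^T>
u1[:2, 2:] = np.outer(m1, m2)               # independence: <x1><x2>^T
u1[2:, :2] = np.outer(m2, m1)
```

The cross blocks carry no covariance term precisely because the parents are independent under the mean-field posterior.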
################################################################################
# Copyright (C) 2013-2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
Unit tests for gaussian_markov_chain module.
"""

import numpy as np

from ..gaussian_markov_chain import GaussianMarkovChain
from ..gaussian_markov_chain import VaryingGaussianMarkovChain
from ..gaussian import Gaussian, GaussianMoments
from ..gaussian import GaussianARD
from ..gaussian import GaussianGamma
from ..wishart import Wishart, WishartMoments
from ..gamma import Gamma, GammaMoments
from bayespy.utils import random
from bayespy.utils import linalg
from bayespy.utils import misc
from bayespy.utils.misc import TestCase


def kalman_filter(y, U, A, V, mu0, Cov0, out=None):
    """
    Perform Kalman filtering to obtain filtered mean and covariance.

    The parameters of the process may vary in time, thus they are given as
    iterators instead of fixed values.

    Parameters
    ----------
    y : (N,D) array
        "Normalized" noisy observations of the states, that is, the
        observations multiplied by the precision matrix U (and possibly other
        transformation matrices).
    U : (N,D,D) array or N-list of (D,D) arrays
        Precision matrix (i.e., inverse covariance matrix) of the observation
        noise for each time instance.
    A : (N-1,D,D) array or (N-1)-list of (D,D) arrays
        Dynamic matrix for each time instance.
    V : (N-1,D,D) array or (N-1)-list of (D,D) arrays
        Covariance matrix of the innovation noise for each time instance.
    mu0 : (D,) array
        Mean of the initial state.
    Cov0 : (D,D) array
        Covariance of the initial state.

    Returns
    -------
    mu : array
        Filtered mean of the states.
    Cov : array
        Filtered covariance of the states.
    See also
    --------
    rts_smoother
    """
    mu = mu0
    Cov = Cov0

    # Allocate memory for the results
    (N,D) = np.shape(y)
    X = np.empty((N,D))
    CovX = np.empty((N,D,D))

    # Update step for t=0
    M = np.dot(np.dot(Cov, U[0]), Cov) + Cov
    L = linalg.chol(M)
    mu = np.dot(Cov, linalg.chol_solve(L, np.dot(Cov,y[0]) + mu))
    Cov = np.dot(Cov, linalg.chol_solve(L, Cov))
    X[0,:] = mu
    CovX[0,:,:] = Cov

    for n in range(len(y)-1):
        # Prediction step
        mu = np.dot(A[n], mu)
        Cov = np.dot(np.dot(A[n], Cov), A[n].T) + V[n]
        # Update step
        M = np.dot(np.dot(Cov, U[n+1]), Cov) + Cov
        L = linalg.chol(M)
        mu = np.dot(Cov, linalg.chol_solve(L, np.dot(Cov,y[n+1]) + mu))
        Cov = np.dot(Cov, linalg.chol_solve(L, Cov))
        # Force symmetric covariance (for numeric inaccuracy)
        Cov = 0.5*Cov + 0.5*Cov.T
        # Store results
        X[n+1,:] = mu
        CovX[n+1,:,:] = Cov

    return (X, CovX)


def rts_smoother(mu, Cov, A, V, removethis=None):
    """
    Perform Rauch-Tung-Striebel smoothing to obtain the posterior.

    The function returns the posterior mean and covariance of each state.

    The parameters of the process may vary in time, thus they are given as
    iterators instead of fixed values.

    Parameters
    ----------
    mu : (N,D) array
        Mean of the states from Kalman filter.
    Cov : (N,D,D) array
        Covariance of the states from Kalman filter.
    A : (N-1,D,D) array or (N-1)-list of (D,D) arrays
        Dynamic matrix for each time instance.
    V : (N-1,D,D) array or (N-1)-list of (D,D) arrays
        Covariance matrix of the innovation noise for each time instance.

    Returns
    -------
    mu : array
        Posterior mean of the states.
    Cov : array
        Posterior covariance of the states.
    See also
    --------
    kalman_filter
    """
    N = len(mu)

    # Start from the last time instance and smoothen backwards
    x = mu[-1,:]
    Covx = Cov[-1,:,:]

    for n in reversed(range(N-1)):
        # The predicted value of n
        x_p = np.dot(A[n], mu[n,:])
        Cov_p = np.dot(np.dot(A[n], Cov[n,:,:]), A[n].T) + V[n]
        # Temporary variable
        S = np.linalg.solve(Cov_p, np.dot(A[n], Cov[n,:,:]))
        # Smoothed value of n
        x = mu[n,:] + np.dot(S.T, x-x_p)
        Covx = Cov[n,:,:] + np.dot(np.dot(S.T, Covx-Cov_p), S)
        # Force symmetric covariance (for numeric inaccuracy)
        Covx = 0.5*Covx + 0.5*Covx.T
        # Store results
        mu[n,:] = x
        Cov[n,:,:] = Covx

    return (mu, Cov)


class TestGaussianMarkovChain(TestCase):

    def create_model(self, N, D):
        # Construct the model
        Mu = Gaussian(np.random.randn(D), np.identity(D))
        Lambda = Wishart(D, random.covariance(D))
        A = Gaussian(np.random.randn(D,D), np.identity(D))
        V = Gamma(D, np.random.rand(D))
        X = GaussianMarkovChain(Mu, Lambda, A, V, n=N)
        Y = Gaussian(X, np.identity(D))
        return (Y, X, Mu, Lambda, A, V)

    def test_plates(self):
        """
        Test that plates are handled correctly.
""" def test_message_to_mu0(self): pass def test_message_to_Lambda0(self): pass def test_message_to_A(self): pass def test_message_to_v(self): pass def test_message_to_parents(self): """ Check gradient passed to inputs parent node """ N = 3 D = 2 Mu = Gaussian(np.random.randn(D), random.covariance(D)) Lambda = Wishart(D, random.covariance(D)) A = Gaussian(np.random.randn(D,D), random.covariance(D)) V = Gamma(D, np.random.rand(D)) X = GaussianMarkovChain(Mu, Lambda, A, V, n=N+1) Y = Gaussian(X, random.covariance(D)) self.assert_moments( X, postprocess=lambda u: [ u[0], u[1] + linalg.transpose(u[1], ndim=1), u[2] ], rtol=1e-3, atol=1e-6, ) Y.observe(np.random.randn(N+1, D)) self.assert_message_to_parent(X, Mu, eps=1e-8) self.assert_message_to_parent( X, Lambda, eps=1e-8, postprocess=lambda u: [ u[0] + linalg.transpose(u[0], ndim=1), u[1], ] ) self.assert_message_to_parent(X, A) self.assert_message_to_parent(X, V, eps=1e-10, atol=1e-5) pass def test_message_to_parents_with_inputs(self): """ Check gradient passed to inputs parent node """ def check(Mu, Lambda, A, V, U): X = GaussianMarkovChain(Mu, Lambda, A, V, inputs=U) Y = Gaussian(X, random.covariance(D)) # Check moments self.assert_moments( X, postprocess=lambda u: [ u[0], u[1] + linalg.transpose(u[1], ndim=1), u[2] ] ) Y.observe(np.random.randn(N+1, D)) X.update() # Check gradient messages to parents self.assert_message_to_parent(X, Mu) self.assert_message_to_parent( X, Lambda, postprocess=lambda phi: [ phi[0] + linalg.transpose(phi[0], ndim=1), phi[1] ] ) self.assert_message_to_parent( X, A, postprocess=lambda phi: [ phi[0], phi[1] + linalg.transpose(phi[1], ndim=1), ] ) self.assert_message_to_parent(X, V) self.assert_message_to_parent(X, U) N = 4 D = 2 K = 3 check( Gaussian( np.random.randn(D), random.covariance(D) ), Wishart( D, random.covariance(D) ), Gaussian( np.random.randn(D,D+K), random.covariance(D+K) ), Gamma( D, np.random.rand(D) ), Gaussian( np.random.randn(N,K), random.covariance(K) ) ) check( 
Gaussian( np.random.randn(D), random.covariance(D) ), Wishart( D, random.covariance(D) ), GaussianGamma( np.random.randn(D,D+K), random.covariance(D+K), D, np.random.rand(D), ndim=1 ), Gamma( D, np.random.rand(D) ), Gaussian( np.random.randn(N,K), random.covariance(K) ) ) pass def test_message_to_child(self): """ Test the updating of GaussianMarkovChain. Check that the moments and the lower bound contribution are computed correctly. """ # TODO: Add plates and missing values! # Dimensionalities D = 3 N = 5 (Y, X, Mu, Lambda, A, V) = self.create_model(N, D) # Inference with arbitrary observations y = np.random.randn(N,D) Y.observe(y) X.update() (x_vb, xnxn_vb, xpxn_vb) = X.get_moments() # Get parameter moments (mu0, mumu0) = Mu.get_moments() (icov0, logdet0) = Lambda.get_moments() (a, aa) = A.get_moments() (icov_x, logdetx) = V.get_moments() icov_x = np.diag(icov_x) # Prior precision Z = np.einsum('...kij,...kk->...ij', aa, icov_x) U_diag = [icov0+Z] + (N-2)*[icov_x+Z] + [icov_x] U_super = (N-1) * [-np.dot(a.T, icov_x)] U = misc.block_banded(U_diag, U_super) # Prior mean mu_prior = np.zeros(D*N) mu_prior[:D] = np.dot(icov0,mu0) # Data Cov = np.linalg.inv(U + np.identity(D*N)) mu = np.dot(Cov, mu_prior + y.flatten()) # Moments xx = mu[:,np.newaxis]*mu[np.newaxis,:] + Cov mu = np.reshape(mu, (N,D)) xx = np.reshape(xx, (N,D,N,D)) # Check results self.assertAllClose(x_vb, mu, msg="Incorrect mean") for n in range(N): self.assertAllClose(xnxn_vb[n,:,:], xx[n,:,n,:], msg="Incorrect second moment") for n in range(N-1): self.assertAllClose(xpxn_vb[n,:,:], xx[n,:,n+1,:], msg="Incorrect lagged second moment") # Compute the entropy H(X) ldet = linalg.logdet_cov(Cov) H = random.gaussian_entropy(-ldet, N*D) # Compute xx = np.reshape(xx, (N*D, N*D)) mu = np.reshape(mu, (N*D,)) ldet = -logdet0 - np.sum(np.ones((N-1,D))*logdetx) P = random.gaussian_logpdf(np.einsum('...ij,...ij', xx, U), np.einsum('...i,...i', mu, mu_prior), np.einsum('...ij,...ij', mumu0, icov0), -ldet, N*D) # The 
VB bound from the net l = X.lower_bound_contribution() self.assertAllClose(l, H+P) # Compute the true bound + H(X) # # Simple tests # def check(N, D, plates=None, mu=None, Lambda=None, A=None, V=None): if mu is None: mu = np.random.randn(D) if Lambda is None: Lambda = random.covariance(D) if A is None: A = np.random.randn(D,D) if V is None: V = np.random.rand(D) X = GaussianMarkovChain(mu, Lambda, A, V, plates=plates, n=N) (u0, u1, u2) = X._message_to_child() (mu, mumu) = Gaussian._ensure_moments(mu, GaussianMoments, ndim=1).get_moments() (Lambda, _) = Wishart._ensure_moments(Lambda, WishartMoments, ndim=1).get_moments() (a, aa) = Gaussian._ensure_moments(A, GaussianMoments, ndim=1).get_moments() a = a * np.ones((N-1,D,D)) # explicit broadcasting for simplicity aa = aa * np.ones((N-1,D,D,D)) # explicit broadcasting for simplicity (v, _) = Gamma._ensure_moments(V, GammaMoments).get_moments() v = v * np.ones((N-1,D)) plates_C = X.plates plates_mu = X.plates C = np.zeros(plates_C + (N,D,N,D)) plates_mu = np.shape(mu)[:-1] m = np.zeros(plates_mu + (N,D)) m[...,0,:] = np.einsum('...ij,...j->...i', Lambda, mu) C[...,0,:,0,:] = Lambda + np.einsum('...dij,...d->...ij', aa[...,0,:,:,:], v[...,0,:]) for n in range(1,N-1): C[...,n,:,n,:] = (np.einsum('...dij,...d->...ij', aa[...,n,:,:,:], v[...,n,:]) + v[...,n,:,None] * np.identity(D)) for n in range(N-1): C[...,n,:,n+1,:] = -np.einsum('...di,...d->...id', a[...,n,:,:], v[...,n,:]) C[...,n+1,:,n,:] = -np.einsum('...di,...d->...di', a[...,n,:,:], v[...,n,:]) C[...,-1,:,-1,:] = v[...,-1,:,None]*np.identity(D) C = np.reshape(C, plates_C+(N*D,N*D)) Cov = np.linalg.inv(C) Cov = np.reshape(Cov, plates_C+(N,D,N,D)) m0 = np.einsum('...minj,...nj->...mi', Cov, m) m1 = np.zeros(plates_C+(N,D,D)) m2 = np.zeros(plates_C+(N-1,D,D)) for n in range(N): m1[...,n,:,:] = Cov[...,n,:,n,:] + np.einsum('...i,...j->...ij', m0[...,n,:], m0[...,n,:]) for n in range(N-1): m2[...,n,:,:] = Cov[...,n,:,n+1,:] + np.einsum('...i,...j->...ij', m0[...,n,:], 
m0[...,n+1,:]) self.assertAllClose(m0, u0*np.ones(np.shape(m0))) self.assertAllClose(m1, u1*np.ones(np.shape(m1))) self.assertAllClose(m2, u2*np.ones(np.shape(m2))) pass check(4,1) check(4,3) # # Test mu # # Simple check(4,3, mu=Gaussian(np.random.randn(3), random.covariance(3))) # Plates check(4,3, mu=Gaussian(np.random.randn(5,6,3), random.covariance(3), plates=(5,6))) # Plates with moments broadcasted over plates check(4,3, mu=Gaussian(np.random.randn(3), random.covariance(3), plates=(5,))) check(4,3, mu=Gaussian(np.random.randn(1,3), random.covariance(3), plates=(5,))) # Plates broadcasting check(4,3, plates=(5,), mu=Gaussian(np.random.randn(3), random.covariance(3), plates=())) check(4,3, plates=(5,), mu=Gaussian(np.random.randn(1,3), random.covariance(3), plates=(1,))) # # Test Lambda # # Simple check(4,3, Lambda=Wishart(10+np.random.rand(), random.covariance(3))) # Plates check(4,3, Lambda=Wishart(10+np.random.rand(), random.covariance(3), plates=(5,6))) # Plates with moments broadcasted over plates check(4,3, Lambda=Wishart(10+np.random.rand(), random.covariance(3), plates=(5,))) check(4,3, Lambda=Wishart(10+np.random.rand(1), random.covariance(3), plates=(5,))) # Plates broadcasting check(4,3, plates=(5,), Lambda=Wishart(10+np.random.rand(), random.covariance(3), plates=())) check(4,3, plates=(5,), Lambda=Wishart(10+np.random.rand(), random.covariance(3), plates=(1,))) # # Test A # # Simple check(4,3, A=GaussianARD(np.random.randn(3,3), np.random.rand(3,3), shape=(3,), plates=(3,))) # Plates on time axis check(5,3, A=GaussianARD(np.random.randn(4,3,3), np.random.rand(4,3,3), shape=(3,), plates=(4,3))) # Plates on time axis with broadcasted moments check(5,3, A=GaussianARD(np.random.randn(1,3,3), np.random.rand(1,3,3), shape=(3,), plates=(4,3))) check(5,3, A=GaussianARD(np.random.randn(3,3), np.random.rand(3,3), shape=(3,), plates=(4,3))) # Plates check(4,3, A=GaussianARD(np.random.randn(5,6,1,3,3), np.random.rand(5,6,1,3,3), shape=(3,), plates=(5,6,1,3))) 
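The check helper above assembles the joint prior precision of the chain as a block-tridiagonal matrix C. For a scalar chain of length 3 (D=1) with initial precision l, dynamics a, and innovation precision v, the same matrix can be written out explicitly (values illustrative; this mirrors the hand-built C in the VaryingGaussianMarkovChain test below with b*s replaced by a):

```python
import numpy as np

# Block-tridiagonal prior precision of x_0, x_1, x_2 for
#   x_0 ~ N(m, 1/l),  x_{n+1} | x_n ~ N(a x_n, 1/v)
l, a, v = 4.0, 0.5, 5.0
U = np.array([[l + a**2 * v, -a * v,        0.0   ],
              [-a * v,       v + a**2 * v,  -a * v],
              [0.0,          -a * v,        v     ]])

# The joint prior covariance is its inverse (dense, unlike U)
Cov = np.linalg.inv(U)
```

Each diagonal entry combines the node's own precision with the contribution a^2 v from predicting the next state, and the superdiagonal carries the coupling -a v; this is exactly the structure the tests invert to get the reference moments.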
# Plates with moments broadcasted over plates check(4,3, A=GaussianARD(np.random.randn(3,3), np.random.rand(3,3), shape=(3,), plates=(5,1,3))) check(4,3, A=GaussianARD(np.random.randn(1,1,3,3), np.random.rand(1,1,3,3), shape=(3,), plates=(5,1,3))) # Plates broadcasting check(4,3, plates=(5,), A=GaussianARD(np.random.randn(3,3), np.random.rand(3,3), shape=(3,), plates=(3,))) check(4,3, plates=(5,), A=GaussianARD(np.random.randn(3,3), np.random.rand(3,3), shape=(3,), plates=(1,1,3))) # # Test v # # Simple check(4,3, V=Gamma(np.random.rand(1,3), np.random.rand(1,3), plates=(1,3))) check(4,3, V=Gamma(np.random.rand(3), np.random.rand(3), plates=(3,))) # Plates check(4,3, V=Gamma(np.random.rand(5,6,1,3), np.random.rand(5,6,1,3), plates=(5,6,1,3))) # Plates with moments broadcasted over plates check(4,3, V=Gamma(np.random.rand(1,3), np.random.rand(1,3), plates=(5,1,3))) check(4,3, V=Gamma(np.random.rand(1,1,3), np.random.rand(1,1,3), plates=(5,1,3))) # Plates broadcasting check(4,3, plates=(5,), V=Gamma(np.random.rand(1,3), np.random.rand(1,3), plates=(1,3))) check(4,3, plates=(5,), V=Gamma(np.random.rand(1,1,3), np.random.rand(1,1,3), plates=(1,1,3))) # # Check with input signals # mu = 2 Lambda = 3 A = 4 B = 5 v = 6 inputs = [[-2], [3]] X = GaussianMarkovChain([mu], [[Lambda]], [[A,B]], [v], inputs=inputs) V = (np.array([[v*A**2, -v*A, 0], [-v*A, v*A**2, -v*A], [0, -v*A, 0]]) + np.array([[Lambda, 0, 0], [0, v, 0], [0, 0, v]])) m = (np.array([Lambda*mu, 0, 0]) + np.array([0, v*B*inputs[0][0], v*B*inputs[1][0]]) - np.array([v*A*B*inputs[0][0], v*A*B*inputs[1][0], 0])) Cov = np.linalg.inv(V) mean = np.dot(Cov, m) X.update() u = X.get_moments() self.assertAllClose(u[0], mean[:,None]) self.assertAllClose(u[1] - u[0][...,None,:]*u[0][...,:,None], Cov[(0,1,2),(0,1,2),None,None]) self.assertAllClose(u[2] - u[0][...,:-1,:,None]*u[0][...,1:,None,:], Cov[(0,1),(1,2),None,None]) pass def test_smoothing(self): """ Test the posterior estimation of GaussianMarkovChain. 
Create time-variant dynamics and compare the results of BayesPy VB inference and standard Kalman filtering & smoothing. This is not that useful anymore, because the moments are checked much better in another test method. """ # # Set up an artificial system # # Dimensions N = 500 D = 2 # Dynamics (time varying) A0 = np.array([[.9, -.4], [.4, .9]]) A1 = np.array([[.98, -.1], [.1, .98]]) l = np.linspace(0, 1, N-1).reshape((-1,1,1)) A = (1-l)*A0 + l*A1 # Innovation covariance matrix (time varying) v = np.random.rand(D) V = np.diag(v) # Observation noise covariance matrix C = np.identity(D) # # Simulate data # X = np.empty((N,D)) Y = np.empty((N,D)) x = np.array([0.5, -0.5]) X[0,:] = x Y[0,:] = x + np.random.multivariate_normal(np.zeros(D), C) for n in range(N-1): x = np.dot(A[n,:,:],x) + np.random.multivariate_normal(np.zeros(D), V) X[n+1,:] = x Y[n+1,:] = x + np.random.multivariate_normal(np.zeros(D), C) # # BayesPy inference # # Construct VB model Xh = GaussianMarkovChain(np.zeros(D), np.identity(D), A, 1/v, n=N) Yh = Gaussian(Xh, np.identity(D), plates=(N,)) # Put data Yh.observe(Y) # Run inference Xh.update() # Store results Xh_vb = Xh.u[0] CovXh_vb = Xh.u[1] - Xh_vb[...,np.newaxis,:] * Xh_vb[...,:,np.newaxis] # # "The ground truth" using standard Kalman filter and RTS smoother # V = N*(V,) UY = Y U = N*(C,) (Xh, CovXh) = kalman_filter(UY, U, A, V, np.zeros(D), np.identity(D)) (Xh, CovXh) = rts_smoother(Xh, CovXh, A, V) # # Check results # self.assertTrue(np.allclose(Xh_vb, Xh)) self.assertTrue(np.allclose(CovXh_vb, CovXh)) class TestVaryingGaussianMarkovChain(TestCase): def test_plates_from_parents(self): """ Test that VaryingGaussianMarkovChain deduces plates correctly """ def check(plates_X, plates_mu=(), plates_Lambda=(), plates_B=(), plates_S=(), plates_v=()): D = 3 K = 2 N = 4 np.random.seed(42) mu = Gaussian(np.random.randn(*(plates_mu+(D,))), random.covariance(D)) Lambda = Wishart(D+np.ones(plates_Lambda), random.covariance(D)) B = 
GaussianARD(np.random.randn(*(plates_B+(D,D,K))), 1+np.random.rand(*(plates_B+(D,D,K))), shape=(D,K), plates=plates_B+(D,)) S = GaussianARD(np.random.randn(*(plates_S+(N,K))), 1+np.random.rand(*(plates_S+(N,K))), shape=(K,), plates=plates_S+(N,)) v = Gamma(1+np.random.rand(*(plates_v+(1,D))), 1+np.random.rand(*(plates_v+(1,D)))) X = VaryingGaussianMarkovChain(mu, Lambda, B, S, v, name="X") self.assertEqual(plates_X, X.plates, msg="Incorrect plates deduced") pass check(()) check((2,3), plates_mu=(2,3)) check((6,7), plates_Lambda=(6,7)) check((2,3), plates_B=(2,3)) check((2,3), plates_S=(2,3)) check((2,3), plates_v=(2,3)) pass def test_message_to_child(self): # A very simple check before the more complex ones: # 1-D process, k=1, fixed constant parameters m = 1.0 l = 4.0 b = 2.0 s = [3.0, 8.0] v = 5.0 X = VaryingGaussianMarkovChain([m], [[l]], [[[b]]], [[s[0]],[s[1]]], [v]) (u0, u1, u2) = X._message_to_child() C = np.array([[l+b**2*s[0]**2*v, -b*s[0]*v, 0], [ -b*s[0]*v, v+b**2*s[1]**2*v, -b*s[1]*v], [ 0, -b*s[1]*v, v]]) Cov = np.linalg.inv(C) m0 = np.dot(Cov, [[l*m], [0], [0]]) m1 = np.diag(Cov)[:,None,None] + m0[:,:,None]**2 m2 = np.diag(Cov, k=1)[:,None,None] + m0[1:,:,None]*m0[:-1,:,None] self.assertAllClose(m0, u0) self.assertAllClose(m1, u1) self.assertAllClose(m2, u2) def check(N, D, K, plates=None, mu=None, Lambda=None, B=None, S=None, V=None): if mu is None: mu = np.random.randn(D) if Lambda is None: Lambda = random.covariance(D) if B is None: B = np.random.randn(D,D,K) if S is None: S = np.random.randn(N-1,K) if V is None: V = np.random.rand(D) X = VaryingGaussianMarkovChain(mu, Lambda, B, S, V, plates=plates, n=N) (u0, u1, u2) = X._message_to_child() (mu, mumu) = X.parents[0].get_moments() (Lambda, _) = X.parents[1].get_moments() (b, bb) = X.parents[2].get_moments() (s, ss) = X.parents[3].get_moments() (v, _) = X.parents[4].get_moments() v = v * np.ones((N-1,D)) #V = np.atleast_3d(v)[...,-1,:,None]*np.identity(D) plates_C = X.plates plates_mu = X.plates C = 
np.zeros(plates_C + (N,D,N,D)) plates_mu = np.shape(mu)[:-1] m = np.zeros(plates_mu + (N,D)) m[...,0,:] = np.einsum('...ij,...j->...i', Lambda, mu) #m = np.reshape(m, plates_mu + (N*D,)) A = np.einsum('...dik,...nk->...ndi', b, s) AA = np.einsum('...dikjl,...nkl->...ndij', bb, ss) C[...,0,:,0,:] = Lambda + np.einsum('...dij,...d->...ij', AA[...,0,:,:,:], v[...,0,:]) for n in range(1,N-1): C[...,n,:,n,:] = (np.einsum('...dij,...d->...ij', AA[...,n,:,:,:], v[...,n,:]) + v[...,n,:,None] * np.identity(D)) for n in range(N-1): C[...,n,:,n+1,:] = -np.einsum('...di,...d->...id', A[...,n,:,:], v[...,n,:]) C[...,n+1,:,n,:] = -np.einsum('...di,...d->...di', A[...,n,:,:], v[...,n,:]) C[...,-1,:,-1,:] = v[...,-1,:,None]*np.identity(D) C = np.reshape(C, plates_C+(N*D,N*D)) Cov = np.linalg.inv(C) Cov = np.reshape(Cov, plates_C+(N,D,N,D)) m0 = np.einsum('...minj,...nj->...mi', Cov, m) m1 = np.zeros(plates_C+(N,D,D)) m2 = np.zeros(plates_C+(N-1,D,D)) for n in range(N): m1[...,n,:,:] = Cov[...,n,:,n,:] + np.einsum('...i,...j->...ij', m0[...,n,:], m0[...,n,:]) for n in range(N-1): m2[...,n,:,:] = Cov[...,n,:,n+1,:] + np.einsum('...i,...j->...ij', m0[...,n,:], m0[...,n+1,:]) self.assertAllClose(m0, u0*np.ones(np.shape(m0))) self.assertAllClose(m1, u1*np.ones(np.shape(m1))) self.assertAllClose(m2, u2*np.ones(np.shape(m2))) pass check(2,1,1) check(2,3,1) check(2,1,3) check(4,3,2) # # Test mu # # Simple check(4,3,2, mu=Gaussian(np.random.randn(3), random.covariance(3))) # Plates check(4,3,2, mu=Gaussian(np.random.randn(5,6,3), random.covariance(3), plates=(5,6))) # Plates with moments broadcasted over plates check(4,3,2, mu=Gaussian(np.random.randn(3), random.covariance(3), plates=(5,))) check(4,3,2, mu=Gaussian(np.random.randn(1,3), random.covariance(3), plates=(5,))) # Plates broadcasting check(4,3,2, plates=(5,), mu=Gaussian(np.random.randn(3), random.covariance(3), plates=())) check(4,3,2, plates=(5,), mu=Gaussian(np.random.randn(1,3), random.covariance(3), plates=(1,))) # # Test 
Lambda # # Simple check(4,3,2, Lambda=Wishart(10+np.random.rand(), random.covariance(3))) # Plates check(4,3,2, Lambda=Wishart(10+np.random.rand(), random.covariance(3), plates=(5,6))) # Plates with moments broadcasted over plates check(4,3,2, Lambda=Wishart(10+np.random.rand(), random.covariance(3), plates=(5,))) check(4,3,2, Lambda=Wishart(10+np.random.rand(1), random.covariance(3), plates=(5,))) # Plates broadcasting check(4,3,2, plates=(5,), Lambda=Wishart(10+np.random.rand(), random.covariance(3), plates=())) check(4,3,2, plates=(5,), Lambda=Wishart(10+np.random.rand(), random.covariance(3), plates=(1,))) # # Test B # # Simple check(4,3,2, B=GaussianARD(np.random.randn(3,3,2), np.random.rand(3,3,2), shape=(3,2), plates=(3,))) # Plates check(4,3,2, B=GaussianARD(np.random.randn(5,6,3,3,2), np.random.rand(5,6,3,3,2), shape=(3,2), plates=(5,6,3))) # Plates with moments broadcasted over plates check(4,3,2, B=GaussianARD(np.random.randn(3,3,2), np.random.rand(3,3,2), shape=(3,2), plates=(5,3))) check(4,3,2, B=GaussianARD(np.random.randn(1,3,3,2), np.random.rand(1,3,3,2), shape=(3,2), plates=(5,3))) # Plates broadcasting check(4,3,2, plates=(5,), B=GaussianARD(np.random.randn(3,3,2), np.random.rand(3,3,2), shape=(3,2), plates=(3,))) check(4,3,2, plates=(5,), B=GaussianARD(np.random.randn(3,3,2), np.random.rand(3,3,2), shape=(3,2), plates=(1,3))) # # Test S # # Simple check(4,3,2, S=GaussianARD(np.random.randn(4-1,2), np.random.rand(4-1,2), shape=(2,), plates=(4-1,))) # Plates check(4,3,2, S=GaussianARD(np.random.randn(5,6,4-1,2), np.random.rand(5,6,4-1,2), shape=(2,), plates=(5,6,4-1,))) # Plates with moments broadcasted over plates check(4,3,2, S=GaussianARD(np.random.randn(4-1,2), np.random.rand(4-1,2), shape=(2,), plates=(5,4-1,))) check(4,3,2, S=GaussianARD(np.random.randn(1,4-1,2), np.random.rand(1,4-1,2), shape=(2,), plates=(5,4-1,))) # Plates broadcasting check(4,3,2, plates=(5,), S=GaussianARD(np.random.randn(4-1,2), np.random.rand(4-1,2), shape=(2,), 
            plates=(4-1,)))
        check(4,3,2,
              plates=(5,),
              S=GaussianARD(np.random.randn(1,4-1,2),
                            np.random.rand(1,4-1,2),
                            shape=(2,),
                            plates=(1,4-1,)))

        #
        # Test v
        #

        # Simple
        check(4,3,2, V=Gamma(np.random.rand(1,3), np.random.rand(1,3), plates=(1,3)))
        check(4,3,2, V=Gamma(np.random.rand(3), np.random.rand(3), plates=(3,)))
        # Plates
        check(4,3,2, V=Gamma(np.random.rand(5,6,1,3), np.random.rand(5,6,1,3), plates=(5,6,1,3)))
        # Plates with moments broadcasted over plates
        check(4,3,2, V=Gamma(np.random.rand(1,3), np.random.rand(1,3), plates=(5,1,3)))
        check(4,3,2, V=Gamma(np.random.rand(1,1,3), np.random.rand(1,1,3), plates=(5,1,3)))
        # Plates broadcasting
        check(4,3,2, plates=(5,), V=Gamma(np.random.rand(1,3), np.random.rand(1,3), plates=(1,3)))
        check(4,3,2, plates=(5,), V=Gamma(np.random.rand(1,1,3), np.random.rand(1,1,3), plates=(1,1,3)))

        #
        # Uncertainty in both B and S
        #
        check(4,3,2,
              B=GaussianARD(np.random.randn(3,3,2), np.random.rand(3,3,2), shape=(3,2), plates=(3,)),
              S=GaussianARD(np.random.randn(4-1,2), np.random.rand(4-1,2), shape=(2,), plates=(4-1,)))

        pass

    def test_message_to_mu(self):
        # TODO
        pass

    def test_message_to_Lambda(self):
        # TODO
        pass

    def test_message_to_B(self):
        # TODO
        pass

    def test_message_to_S(self):
        # TODO
        pass

    def test_message_to_v(self):
        # TODO
        pass


# bayespy-0.6.2/bayespy/inference/vmp/nodes/tests/test_mixture.py
################################################################################
# Copyright (C) 2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
Unit tests for mixture module.
""" import warnings import numpy as np from bayespy.nodes import (GaussianARD, Gamma, Mixture, Categorical, Bernoulli, Multinomial, Beta, Gate, Dirichlet) from bayespy.utils import random from bayespy.utils import linalg from bayespy.utils.misc import TestCase class TestMixture(TestCase): def test_init(self): """ Test the creation of Mixture node """ # Do not accept non-negative cluster plates z = Categorical(np.random.dirichlet([1,1])) self.assertRaises(ValueError, Mixture, z, GaussianARD, GaussianARD(0, 1, plates=(2,)), Gamma(1, 1, plates=(2,)), cluster_plate=0) # Try constructing a mixture without any of the parents having the # cluster plate axis z = Categorical(np.random.dirichlet([1,1])) self.assertRaises(ValueError, Mixture, z, GaussianARD, GaussianARD(0, 1, plates=()), Gamma(1, 1, plates=())) def test_message_to_child(self): """ Test the message to child of Mixture node. """ K = 3 # # Estimate moments from parents only # # Simple case mu = GaussianARD([0,2,4], 1, ndim=0, plates=(K,)) alpha = Gamma(1, 1, plates=(K,)) z = Categorical(np.ones(K)/K) X = Mixture(z, GaussianARD, mu, alpha) self.assertEqual(X.plates, ()) self.assertEqual(X.dims, ( (), () )) u = X._message_to_child() self.assertAllClose(u[0], 2) self.assertAllClose(u[1], 2**2+1) # Broadcasting the moments on the cluster axis mu = GaussianARD(2, 1, ndim=0, plates=(K,)) alpha = Gamma(1, 1, plates=(K,)) z = Categorical(np.ones(K)/K) X = Mixture(z, GaussianARD, mu, alpha) self.assertEqual(X.plates, ()) self.assertEqual(X.dims, ( (), () )) u = X._message_to_child() self.assertAllClose(u[0], 2) self.assertAllClose(u[1], 2**2+1) # # Estimate moments with observed children # pass def test_message_to_parent(self): """ Test the message to parents of Mixture node. 
""" K = 3 # Broadcasting the moments on the cluster axis Mu = GaussianARD(2, 1, ndim=0, plates=(K,)) (mu, mumu) = Mu._message_to_child() Alpha = Gamma(3, 1, plates=(K,)) (alpha, logalpha) = Alpha._message_to_child() z = Categorical(np.ones(K)/K) X = Mixture(z, GaussianARD, Mu, Alpha) tau = 4 Y = GaussianARD(X, tau) y = 5 Y.observe(y) (x, xx) = X._message_to_child() m = z._message_from_children() self.assertAllClose(m[0] * np.ones(K), random.gaussian_logpdf(xx*alpha, x*alpha*mu, mumu*alpha, logalpha, 0) * np.ones(K)) m = Mu._message_from_children() self.assertAllClose(m[0], 1/K * (alpha*x) * np.ones(3)) self.assertAllClose(m[1], -0.5 * 1/K * alpha * np.ones(3)) # Some parameters do not have cluster plate axis Mu = GaussianARD(2, 1, ndim=0, plates=(K,)) (mu, mumu) = Mu._message_to_child() Alpha = Gamma(3, 1) # Note: no cluster plate axis! (alpha, logalpha) = Alpha._message_to_child() z = Categorical(np.ones(K)/K) X = Mixture(z, GaussianARD, Mu, Alpha) tau = 4 Y = GaussianARD(X, tau) y = 5 Y.observe(y) (x, xx) = X._message_to_child() m = z._message_from_children() self.assertAllClose(m[0] * np.ones(K), random.gaussian_logpdf(xx*alpha, x*alpha*mu, mumu*alpha, logalpha, 0) * np.ones(K)) m = Mu._message_from_children() self.assertAllClose(m[0], 1/K * (alpha*x) * np.ones(3)) self.assertAllClose(m[1], -0.5 * 1/K * alpha * np.ones(3)) # Cluster assignments do not have as many plate axes as parameters. 
M = 2 Mu = GaussianARD(2, 1, ndim=0, plates=(K,M)) (mu, mumu) = Mu._message_to_child() Alpha = Gamma(3, 1, plates=(K,M)) (alpha, logalpha) = Alpha._message_to_child() z = Categorical(np.ones(K)/K) X = Mixture(z, GaussianARD, Mu, Alpha, cluster_plate=-2) tau = 4 Y = GaussianARD(X, tau) y = 5 * np.ones(M) Y.observe(y) (x, xx) = X._message_to_child() m = z._message_from_children() self.assertAllClose(m[0]*np.ones(K), np.sum(random.gaussian_logpdf(xx*alpha, x*alpha*mu, mumu*alpha, logalpha, 0) * np.ones((K,M)), axis=-1)) m = Mu._message_from_children() self.assertAllClose(m[0] * np.ones((K,M)), 1/K * (alpha*x) * np.ones((K,M))) self.assertAllClose(m[1] * np.ones((K,M)), -0.5 * 1/K * alpha * np.ones((K,M))) # Mixed distribution broadcasts g # This tests for a found bug. The bug caused an error. Z = Categorical([0.3, 0.5, 0.2]) X = Mixture(Z, Categorical, [[0.2,0.8], [0.1,0.9], [0.3,0.7]]) m = Z._message_from_children() # # Test nested mixtures # t1 = [1, 1, 0, 3, 3] t2 = [2] p = Dirichlet([1, 1], plates=(4, 3)) X = Mixture(t1, Mixture, t2, Categorical, p) X.observe([1, 1, 0, 0, 0]) p.update() self.assertAllClose( p.phi[0], [ [[1, 1], [1, 1], [2, 1]], [[1, 1], [1, 1], [1, 3]], [[1, 1], [1, 1], [1, 1]], [[1, 1], [1, 1], [3, 1]], ] ) # Test sample plates in nested mixtures t1 = Categorical([0.3, 0.7], plates=(5,)) t2 = [[1], [1], [0], [3], [3]] t3 = 2 p = Dirichlet([1, 1], plates=(2, 4, 3)) X = Mixture(t1, Mixture, t2, Mixture, t3, Categorical, p) X.observe([1, 1, 0, 0, 0]) p.update() self.assertAllClose( p.phi[0], [ [ [[1, 1], [1, 1], [1.3, 1]], [[1, 1], [1, 1], [1, 1.6]], [[1, 1], [1, 1], [1, 1]], [[1, 1], [1, 1], [1.6, 1]], ], [ [[1, 1], [1, 1], [1.7, 1]], [[1, 1], [1, 1], [1, 2.4]], [[1, 1], [1, 1], [1, 1]], [[1, 1], [1, 1], [2.4, 1]], ] ] ) # Check that Gate and nested Mixture are equal t1 = Categorical([0.3, 0.7], plates=(5,)) t2 = Categorical([0.1, 0.3, 0.6], plates=(5, 1)) p = Dirichlet([1, 2, 3, 4], plates=(2, 3)) X = Mixture(t1, Mixture, t2, Categorical, p) 
X.observe([3, 3, 1, 2, 2]) t1_msg = t1._message_from_children() t2_msg = t2._message_from_children() p_msg = p._message_from_children() t1 = Categorical([0.3, 0.7], plates=(5,)) t2 = Categorical([0.1, 0.3, 0.6], plates=(5, 1)) p = Dirichlet([1, 2, 3, 4], plates=(2, 3)) X = Categorical(Gate(t1, Gate(t2, p))) X.observe([3, 3, 1, 2, 2]) t1_msg2 = t1._message_from_children() t2_msg2 = t2._message_from_children() p_msg2 = p._message_from_children() self.assertAllClose(t1_msg[0], t1_msg2[0]) self.assertAllClose(t2_msg[0], t2_msg2[0]) self.assertAllClose(p_msg[0], p_msg2[0]) pass def test_lowerbound(self): """ Test log likelihood lower bound for Mixture node """ # Mixed distribution broadcasts g # This tests for a found bug. The bug caused an error. Z = Categorical([0.3, 0.5, 0.2]) X = Mixture(Z, Categorical, [[0.2,0.8], [0.1,0.9], [0.3,0.7]]) X.lower_bound_contribution() pass def test_mask_to_parent(self): """ Test the mask handling in Mixture node """ K = 3 Z = Categorical(np.ones(K)/K, plates=(4,5,1)) Mu = GaussianARD(0, 1, shape=(2,), plates=(4,K,5)) Alpha = Gamma(1, 1, plates=(4,K,5,2)) X = Mixture(Z, GaussianARD, Mu, Alpha, cluster_plate=-3) Y = GaussianARD(X, 1, ndim=1) mask = np.reshape((np.mod(np.arange(4*5), 2) == 0), (4,5)) Y.observe(np.ones((4,5,2)), mask=mask) self.assertArrayEqual(Z.mask, mask[:,:,None]) self.assertArrayEqual(Mu.mask, mask[:,None,:]) self.assertArrayEqual(Alpha.mask, mask[:,None,:,None]) pass def test_nans(self): """ Test multinomial mixture """ # The probabilities p1 cause problems p0 = [0.1, 0.9] p1 = [1.0-1e-50, 1e-50] Z = Categorical([1-1e-10, 1e-10]) X = Mixture(Z, Multinomial, 10, [p0, p1]) u = X._message_to_child() self.assertAllClose(u[0], [1, 9]) p0 = [0.1, 0.9] p1 = [1.0-1e-10, 1e-10] Z = Categorical([1-1e-50, 1e-50]) X = Mixture(Z, Multinomial, 10, [p0, p1]) u = X._message_to_child() self.assertAllClose(u[0], [1, 9]) with warnings.catch_warnings(): warnings.simplefilter("ignore", RuntimeWarning) warnings.simplefilter("ignore", 
                                  UserWarning)

            p0 = [0.1, 0.9]
            p1 = [1.0, 0.0]
            X = Mixture(0, Multinomial, 10, [p0, p1])
            u = X._message_to_child()
            self.assertAllClose(u[0], [1, 9])

        pass

    def test_random(self):
        """
        Test random sampling of mixture node
        """
        o = 1e-20
        X = Mixture([1, 0, 2],
                    Categorical,
                    [ [o, o, o, 1],
                      [o, o, 1, o],
                      [1, o, o, o] ])
        x = X.random()
        self.assertAllClose(x, [2, 3, 0])
        pass

    def test_deterministic_mappings(self):
        x = Categorical([0.8, 0.2])
        y = Mixture(
            x,
            Categorical,
            [
                [0.10, 0.90],
                [0.00, 1.00],
            ]
        )

        y.observe(0)
        x.update()
        self.assertAllClose(x.u[0], [1, 0])

        y.observe(1)
        x.update()
        p = np.array([0.8*0.9, 0.2*1.0])
        self.assertAllClose(x.u[0], p / np.sum(p))

        pass


# bayespy-0.6.2/bayespy/inference/vmp/nodes/tests/test_multinomial.py
################################################################################
# Copyright (C) 2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
Unit tests for `multinomial` module.
"""

import numpy as np
import scipy

from bayespy.nodes import (Multinomial,
                           Dirichlet,
                           Mixture)

from bayespy.utils import random
from bayespy.utils.misc import TestCase


class TestMultinomial(TestCase):
    """
    Unit tests for Multinomial node
    """

    def test_init(self):
        """
        Test the creation of multinomial nodes.
""" # Some simple initializations X = Multinomial(10, [0.1, 0.3, 0.6]) X = Multinomial(10, Dirichlet([5,4,3])) # Check that plates are correct X = Multinomial(10, [0.1, 0.3, 0.6], plates=(3,4)) self.assertEqual(X.plates, (3,4)) X = Multinomial(10, 0.25*np.ones((2,3,4))) self.assertEqual(X.plates, (2,3)) n = 10 * np.ones((3,4), dtype=np.int64) X = Multinomial(n, [0.1, 0.3, 0.6]) self.assertEqual(X.plates, (3,4)) X = Multinomial(n, Dirichlet([2,1,9], plates=(3,4))) self.assertEqual(X.plates, (3,4)) # Probabilities not a vector self.assertRaises(ValueError, Multinomial, 10, 0.5) # Invalid probability self.assertRaises(ValueError, Multinomial, 10, [-0.5, 1.5]) self.assertRaises(ValueError, Multinomial, 10, [0.5, 1.5]) # Invalid number of trials self.assertRaises(ValueError, Multinomial, -1, [0.5, 0.5]) self.assertRaises(ValueError, Multinomial, 8.5, [0.5, 0.5]) # Inconsistent plates self.assertRaises(ValueError, Multinomial, 10, 0.25*np.ones((2,4)), plates=(3,)) # Explicit plates too small self.assertRaises(ValueError, Multinomial, 10, 0.25*np.ones((2,4)), plates=(1,)) pass def test_moments(self): """ Test the moments of multinomial nodes. 
""" # Simple test X = Multinomial(1, [0.7,0.2,0.1]) u = X._message_to_child() self.assertEqual(len(u), 1) self.assertAllClose(u[0], [0.7,0.2,0.1]) # Test n X = Multinomial(10, [0.7,0.2,0.1]) u = X._message_to_child() self.assertAllClose(u[0], [7,2,1]) # Test plates in p n = np.random.randint(1, 10) p = np.random.dirichlet([1,1], size=3) X = Multinomial(n, p) u = X._message_to_child() self.assertAllClose(u[0], p*n) # Test plates in n n = np.random.randint(1, 10, size=(3,)) p = np.random.dirichlet([1,1,1,1]) X = Multinomial(n, p) u = X._message_to_child() self.assertAllClose(u[0], p*n[:,None]) # Test plates in p and n n = np.random.randint(1, 10, size=(4,1)) p = np.random.dirichlet([1,1], size=3) X = Multinomial(n, p) u = X._message_to_child() self.assertAllClose(u[0], p*n[...,None]) # Test with Dirichlet prior P = Dirichlet([7, 3]) logp = P._message_to_child()[0] p0 = np.exp(logp[0]) / (np.exp(logp[0]) + np.exp(logp[1])) p1 = np.exp(logp[1]) / (np.exp(logp[0]) + np.exp(logp[1])) X = Multinomial(1, P) u = X._message_to_child() p = np.array([p0, p1]) self.assertAllClose(u[0], p) # Test with broadcasted plates P = Dirichlet([7, 3], plates=(10,)) X = Multinomial(5, P) u = X._message_to_child() self.assertAllClose(u[0] * np.ones(X.get_shape(0)), 5*p*np.ones((10,1))) pass def test_lower_bound(self): """ Test lower bound for multinomial node. 
""" # Test for a bug found in multinomial X = Multinomial(10, [0.3, 0.5, 0.2]) l = X.lower_bound_contribution() self.assertAllClose(l, 0.0) pass def test_mixture(self): """ Test multinomial mixture """ p0 = [0.1, 0.5, 0.2, 0.2] p1 = [0.5, 0.1, 0.1, 0.3] p2 = [0.3, 0.2, 0.1, 0.4] X = Mixture(2, Multinomial, 10, [p0, p1, p2]) u = X._message_to_child() self.assertAllClose(u[0], 10*np.array(p2)) pass def test_mixture_with_count_array(self): """ Test multinomial mixture """ p0 = [0.1, 0.5, 0.2, 0.2] p1 = [0.5, 0.1, 0.1, 0.3] p2 = [0.3, 0.2, 0.1, 0.4] counts = [[10], [5], [3]] X = Mixture(2, Multinomial, counts, [p0, p1, p2]) u = X._message_to_child() self.assertAllClose( u[0], np.array(counts)*np.array(p2) ) # Multi-mixture and count array # Shape(p) = (2, 1, 3) + (4,) p = [ [[ [0.1, 0.5, 0.2, 0.2], [0.5, 0.1, 0.1, 0.3], [0.3, 0.2, 0.1, 0.4], ]], [[ [0.3, 0.2, 0.1, 0.4], [0.5, 0.1, 0.2, 0.2], [0.4, 0.1, 0.2, 0.3], ]], ] # Shape(Z1) = (1, 3) + (2,) -> () + (2,) Z1 = 1 # Shape(Z2) = (1,) + (3,) -> () + (3,) Z2 = 2 # Shape(counts) = (5, 1) counts = [[10], [5], [3], [2], [4]] # Shape(X) = (5,) + (4,) X = Mixture( Z1, Mixture, Z2, Multinomial, counts, p, # NOTE: We mix over axes -3 and -1. But as we first mix over the # default (-1), then the next mixing happens over -2 (because one # axis was already dropped). 
            cluster_plate=-2,
        )
        self.assertAllClose(
            X._message_to_child()[0],
            np.array(counts)[:,0,None] * np.array(p)[Z1,:,Z2]
        )

        # Can't have non-singleton axis in counts over the mixed axis
        p0 = [0.1, 0.5, 0.2, 0.2]
        p1 = [0.5, 0.1, 0.1, 0.3]
        p2 = [0.3, 0.2, 0.1, 0.4]
        counts = [10, 5, 3]
        self.assertRaises(
            ValueError,
            Mixture,
            2,
            Multinomial,
            counts,
            [p0, p1, p2],
        )

        return


# bayespy-0.6.2/bayespy/inference/vmp/nodes/tests/test_node.py
################################################################################
# Copyright (C) 2013-2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
Unit tests for `node` module.
"""

import unittest

import numpy as np
import scipy

from numpy import testing

from ..node import Node, Moments
from ...vmp import VB

from bayespy.utils import misc


class TestMoments(unittest.TestCase):

    def test_converter(self):
        """
        Tests complex conversions for moment classes
        """

        # Simple one step conversions
        class A(Moments):
            pass
        class B(Moments):
            _converters = {A: lambda x: x+1}

        f = B().get_converter(B)
        self.assertEqual(f(3), 3)
        f = B().get_converter(Moments)
        self.assertEqual(f(3), 3)
        f = B().get_converter(A)
        self.assertEqual(f(3), 4)
        f = A().get_converter(A)
        self.assertEqual(f(3), 3)
        f = A().get_converter(Moments)
        self.assertEqual(f(3), 3)
        self.assertRaises(Moments.NoConverterError,
                          A().get_converter,
                          B)

        # Convert via parent
        class C(B):
            pass
        f = C().get_converter(A)
        self.assertEqual(f(3), 4)

        # Convert via grand parent
        class D(C):
            pass
        class E(D):
            pass
        f = E().get_converter(A)
        self.assertEqual(f(3), 4)

        # Can't convert to child
        self.assertRaises(Moments.NoConverterError,
                          Moments().get_converter,
                          A)

        # Convert to grand child
        class F(Moments):
            _converters = {E: lambda x: 2*x}
        f =
F().get_converter(B) self.assertEqual(f(3), 6) # Use two conversions f = F().get_converter(A) self.assertEqual(f(3), 2*3+1) # Can't use child's converter class H(Moments): pass class I(Moments): _converters = {A: lambda x: x+1} self.assertRaises(Moments.NoConverterError, H().get_converter, A) # Conversion to parent is not success class J(A): pass self.assertRaises(Moments.NoConverterError, I().get_converter, J) # Infinite loop class X(Moments): pass class Y(Moments): pass X.add_converter(Y, lambda x: x+1) Y.add_converter(X, lambda x: x+1) self.assertRaises(Moments.NoConverterError, X().get_converter, A) # Test that add_converter function does not change the converters of # parent classes class Z(Moments): pass class W(Z): pass W.add_converter(Y, lambda x: x) self.assertRaises(Moments.NoConverterError, Z().get_converter, Y) # Test that after using add_converter for a child class and then for the # parent class, the child class is still able to use the parent's # converters class X(Moments): pass class Y(Moments): pass class A(Moments): pass class B(A): pass B.add_converter(Y, lambda x: x+1) A.add_converter(X, lambda x: 2*x) f = B().get_converter(X) self.assertEqual(f(3), 6) pass class TestNode(misc.TestCase): def check_message_to_parent(self, plates_child, plates_message, plates_mask, plates_parent, dims=(2,)): # Dummy message msg = np.random.randn(*(plates_message+dims)) # Mask with every other True and every other False mask = np.mod(np.arange(np.prod(plates_mask)).reshape(plates_mask), 2) == 0 # Set up the dummy model class Dummy(Node): _moments = Moments() def __init__(self, *args, **kwargs): self._parent_moments = len(args)*(Moments(),) super().__init__(*args, **kwargs) def _get_message_and_mask_to_parent(self, index, u_parent=None): return ([msg], mask) def _get_id_list(self): return [] parent = Dummy(dims=[dims], plates=plates_parent) child = Dummy(parent, dims=[dims], plates=plates_child) m = child._message_to_parent(0)[0] * np.ones(plates_parent+dims) # 
Brute-force computation of the message without too much checking m_true = msg * misc.squeeze(mask[...,np.newaxis]) * np.ones(plates_child+dims) for ind in range(len(plates_child)): axis = -ind - 2 if ind >= len(plates_parent): m_true = np.sum(m_true, axis=axis, keepdims=False) elif plates_parent[-ind-1] == 1: m_true = np.sum(m_true, axis=axis, keepdims=True) testing.assert_allclose(m, m_true, err_msg="Incorrect message.") def test_message_to_parent(self): """ Test plate handling in _message_to_parent. """ # Test empty plates with scalar messages self.check_message_to_parent((), (), (), (), dims=()) # Test singular plates self.check_message_to_parent((2,3,4), (2,3,4), (2,3,4), (2,3,4)) self.check_message_to_parent((2,3,4), (2,1,4), (2,3,4), (2,3,4)) self.check_message_to_parent((2,3,4), (2,3,4), (2,1,4), (2,3,4)) self.check_message_to_parent((2,3,4), (2,3,4), (2,3,4), (2,1,4)) self.check_message_to_parent((2,3,4), (2,1,4), (2,1,4), (2,3,4)) self.check_message_to_parent((2,3,4), (2,3,4), (2,1,4), (2,1,4)) self.check_message_to_parent((2,3,4), (2,1,4), (2,3,4), (2,1,4)) self.check_message_to_parent((2,3,4), (2,1,4), (2,1,4), (2,1,4)) # Test missing plates self.check_message_to_parent((4,3), (4,3), (4,3), (4,3)) self.check_message_to_parent((4,3), ( 3,), (4,3), (4,3)) self.check_message_to_parent((4,3), (4,3), ( 3,), (4,3)) self.check_message_to_parent((4,3), (4,3), (4,3), ( 3,)) self.check_message_to_parent((4,3), ( 3,), ( 3,), (4,3)) self.check_message_to_parent((4,3), ( 3,), (4,3), ( 3,)) self.check_message_to_parent((4,3), (4,3), ( 3,), ( 3,)) self.check_message_to_parent((4,3), ( 3,), ( 3,), ( 3,)) # A complex test self.check_message_to_parent((7,6,5,4,3), ( 6,1,4,3), (1,1,5,4,1), ( 6,5,1,3)) # Test errors for inconsistent shapes self.assertRaises(ValueError, self.check_message_to_parent, (3,), (1,3,), (3,), (3,)) ## self.assertRaises(ValueError, ## self.check_message_to_parent, ## (3,), ## (3,), ## (1,3,), ## (3,)) self.assertRaises(ValueError, 
self.check_message_to_parent, (3,), (1,3,), (1,3,), (3,)) self.assertRaises(ValueError, self.check_message_to_parent, (3,), (4,), (3,), (3,)) self.assertRaises(ValueError, self.check_message_to_parent, (3,), (3,), (4,), (3,)) self.assertRaises(ValueError, self.check_message_to_parent, (3,), (4,), (4,), (3,)) self.assertRaises(ValueError, self.check_message_to_parent, (3,), (4,), (3,), (1,)) self.assertRaises(ValueError, self.check_message_to_parent, (3,), (3,), (4,), (1,)) self.assertRaises(ValueError, self.check_message_to_parent, (3,), (4,), (4,), (1,)) self.assertRaises(ValueError, self.check_message_to_parent, (1,), (4,), (3,), (1,)) self.assertRaises(ValueError, self.check_message_to_parent, (1,), (3,), (4,), (1,)) self.assertRaises(ValueError, self.check_message_to_parent, (1,), (4,), (4,), (1,)) def test_compute_message(self): """ Test the general sum-multiply function for message computations """ self.assertAllClose(Node._compute_message(3, plates_from=(), plates_to=(), ndim=0), 3) # Sum over one array self.assertAllClose(Node._compute_message([1, 2, 3], plates_from=(3,), plates_to=(), ndim=0), 6) # Sum plates self.assertAllClose(Node._compute_message([1, 2, 3], [4, 4, 4], [5, 5, 5], plates_from=(3,), plates_to=(), ndim=0), 20+40+60) # Do not sum plates self.assertAllClose(Node._compute_message([1, 2, 3], [4, 4, 4], [5, 5, 5], plates_from=(3,), plates_to=(3,), ndim=0), [20, 40, 60]) # Give ndim self.assertAllClose(Node._compute_message([1, 2, 3], [4, 4, 4], [5, 5, 5], plates_from=(), plates_to=(), ndim=1), [20, 40, 60]) # Broadcast plates_from self.assertAllClose(Node._compute_message(3, 4, 5, plates_from=(3,), plates_to=(), ndim=0), 3 * (3*4*5)) # Broadcast plates_to self.assertAllClose(Node._compute_message(3, 4, 5, plates_from=(3,), plates_to=(3,), ndim=0), 3*4*5) # Different ndims self.assertAllClose(Node._compute_message([1, 2, 3], [4], 5, plates_from=(3,), plates_to=(3,), ndim=0), [1*4*5, 2*4*5, 3*4*5]) # Broadcasting dims for some arrays 
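The `_compute_message` cases asserted above boil down to "multiply with broadcasting, then sum out plates the receiver does not have". A NumPy-only sketch of that rule (hedged: this ignores `ndim` and the size-1 plate collapsing that the real `Node._compute_message` also handles; `compute_message` here is an illustrative stand-in, not BayesPy's implementation):

```python
import numpy as np

def compute_message(*arrays, plates_from, plates_to):
    # Multiply all inputs with broadcasting over the full plate shape.
    # Starting from ones(plates_from) means a scalar input is counted
    # once per plate, which produces the broadcast factor (e.g. the
    # `3 * (3*4*5)` case above).
    prod = np.ones(plates_from)
    for a in arrays:
        prod = prod * np.asarray(a)
    # Sum out leading plate axes that the receiving node lacks.
    for _ in range(len(plates_from) - len(plates_to)):
        prod = np.sum(prod, axis=0)
    return prod
```

For instance, `compute_message([1, 2, 3], [4, 4, 4], [5, 5, 5], plates_from=(3,), plates_to=())` gives `20 + 40 + 60`, matching the asserted value.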
self.assertAllClose(Node._compute_message([1, 2, 3], [4], 5, plates_from=(), plates_to=(), ndim=1), [1*4*5, 2*4*5, 3*4*5]) # Bugfix: Check that plate keys are mapped correctly self.assertAllClose( Node._compute_message( [[1], [2], [3]], plates_from=(3,2), plates_to=(1,2), ndim=0 ), [[6]] ) # Bugfix: Check plate key mapping when plates_to is shorter than shape # of the array self.assertAllClose( Node._compute_message( [[1, 2, 3], [4, 5, 6]], plates_from=(2,3), plates_to=(3,), ndim=0 ), [5, 7, 9] ) # Complex example x1 = np.random.randn(5,4,1,2,1) x2 = np.random.randn( 1,2,1) x3 = np.random.randn(5,1,1,1,1) self.assertAllClose(Node._compute_message(x1, x2, x3, plates_from=(6,5,4,3), plates_to=(5,1,1), ndim=2), 3*6*np.sum(x1*x2*x3, axis=(-4,-3), keepdims=True)) pass class TestSlice(misc.TestCase): def test_init(self): """ Test the constructor of the X[..] node operator. """ class MyNode(Node): _moments = Moments() _parent_moments = () def _get_id_list(self): return [] # Integer index X = MyNode(plates=(3,4), dims=((),)) Y = X[2] self.assertEqual(Y.plates, (4,)) X = MyNode(plates=(3,4), dims=((),)) Y = X[(2,)] self.assertEqual(Y.plates, (4,)) X = MyNode(plates=(3,4), dims=((),)) Y = X[2,-4] self.assertEqual(Y.plates, ()) X = MyNode(plates=(3,4,5), dims=((),)) Y = X[2,1] self.assertEqual(Y.plates, (5,)) # Full slices X = MyNode(plates=(3,4,5), dims=((),)) Y = X[:,1,:] self.assertEqual(Y.plates, (3,5,)) X = MyNode(plates=(3,4,5), dims=((),)) Y = X[1,:,:] self.assertEqual(Y.plates, (4,5,)) X = MyNode(plates=(3,4,5), dims=((),)) Y = X[:,:,1] self.assertEqual(Y.plates, (3,4,)) # Slice with step X = MyNode(plates=(9,), dims=((),)) Y = X[::3] self.assertEqual(Y.plates, (3,)) X = MyNode(plates=(10,), dims=((),)) Y = X[::3] self.assertEqual(Y.plates, (4,)) X = MyNode(plates=(11,), dims=((),)) Y = X[::3] self.assertEqual(Y.plates, (4,)) # Slice with a start value X = MyNode(plates=(10,), dims=((),)) Y = X[3:] self.assertEqual(Y.plates, (7,)) # Slice with an end value X = 
MyNode(plates=(10,), dims=((),)) Y = X[:7] self.assertEqual(Y.plates, (7,)) # Slice with only one element X = MyNode(plates=(10,), dims=((),)) Y = X[6:7] self.assertEqual(Y.plates, (1,)) # Slice starts out of range X = MyNode(plates=(10,), dims=((),)) Y = X[-20:] self.assertEqual(Y.plates, (10,)) # Slice ends out of range X = MyNode(plates=(10,), dims=((),)) Y = X[:20] self.assertEqual(Y.plates, (10,)) # Counter-intuitive: This slice is not empty X = MyNode(plates=(3,), dims=((),)) Y = X[-4::4] self.assertEqual(Y.plates, (1,)) # One ellipsis X = MyNode(plates=(3,4,5,6), dims=((),)) Y = X[...,2,1] self.assertEqual(Y.plates, (3,4)) X = MyNode(plates=(3,4,5,6), dims=((),)) Y = X[2,...,1] self.assertEqual(Y.plates, (4,5)) X = MyNode(plates=(3,4,5,6), dims=((),)) Y = X[2,1,...] self.assertEqual(Y.plates, (5,6)) # Multiple ellipsis X = MyNode(plates=(3,4,5), dims=((),)) Y = X[...,2,...] self.assertEqual(Y.plates, (3,5)) X = MyNode(plates=(3,4,5), dims=((),)) Y = X[...,2,...,...] self.assertEqual(Y.plates, (4,5)) X = MyNode(plates=(3,4,5), dims=((),)) Y = X[...,...,...,...] 
self.assertEqual(Y.plates, (3,4,5)) # New axis X = MyNode(plates=(3,), dims=((),)) Y = X[None] self.assertEqual(Y.plates, (1,3)) X = MyNode(plates=(3,), dims=((),)) Y = X[:,None] self.assertEqual(Y.plates, (3,1)) X = MyNode(plates=(3,4), dims=((),)) Y = X[None,:,None,:] self.assertEqual(Y.plates, (1,3,1,4)) # # Test errors # class Z: def __getitem__(self, obj): return obj # Invalid argument self.assertRaises(TypeError, MyNode(plates=(3,), dims=((),)).__getitem__, Z()['a']) self.assertRaises(TypeError, MyNode(plates=(3,), dims=((),)).__getitem__, Z()[[2,1]]) # Too many indices self.assertRaises(IndexError, MyNode(plates=(3,), dims=((),)).__getitem__, Z()[:,:]) self.assertRaises(IndexError, MyNode(plates=(3,), dims=((),)).__getitem__, Z()[...,...,...]) # Index out of range self.assertRaises(IndexError, MyNode(plates=(3,), dims=((),)).__getitem__, Z()[3]) self.assertRaises(IndexError, MyNode(plates=(3,), dims=((),)).__getitem__, Z()[-4]) # Empty slice self.assertRaises(IndexError, MyNode(plates=(3,), dims=((),)).__getitem__, Z()[3:]) self.assertRaises(IndexError, MyNode(plates=(3,), dims=((),)).__getitem__, Z()[:-3]) pass def test_message_to_child(self): """ Test message to child of X[..] node operator. 
""" class DummyNode(Node): _moments = Moments() _parent_moments = (Moments(),) def __init__(self, u, **kwargs): self.u = u super().__init__(**kwargs) def _message_to_child(self): return self.u def _get_id_list(self): return [] # Message not a reference to X.u but a copy of it X = DummyNode([np.random.randn(3)], plates=(3,), dims=((),)) Y = X[2] self.assertTrue(Y._message_to_child() is not X.u, msg="Slice node operator sends a reference to the " "node's moment list as a message instead of a copy " "of the list.") # Integer indices X = DummyNode([np.random.randn(3,4)], plates=(3,4), dims=((),)) Y = X[2,1] self.assertMessageToChild(Y, [X.u[0][2,1]]) # Too few integer indices X = DummyNode([np.random.randn(3,4)], plates=(3,4), dims=((),)) Y = X[2] self.assertMessageToChild(Y, [X.u[0][2]]) # Integer for broadcasted moment X = DummyNode([np.random.randn(4)], plates=(3,4), dims=((),)) Y = X[2,1] self.assertMessageToChild(Y, [X.u[0][1]]) X = DummyNode([np.random.randn(4,1)], plates=(3,4), dims=((),)) Y = X[2,1] self.assertMessageToChild(Y, [X.u[0][2,0]]) # Ignore leading new axes X = DummyNode([np.random.randn(3)], plates=(3,), dims=((),)) Y = X[None,None,2] self.assertMessageToChild(Y, [X.u[0][2]]) # Ignore new axes before missing+broadcasted plate axes X = DummyNode([np.random.randn(3)], plates=(4,3,), dims=((),)) Y = X[1,None,None,2] self.assertMessageToChild(Y, [X.u[0][2]]) # New axes X = DummyNode([np.random.randn(3,4)], plates=(3,4), dims=((),)) Y = X[2,None,None,1] self.assertMessageToChild(Y, [X.u[0][2,None,None,1]]) # New axes for broadcasted axes X = DummyNode([np.random.randn(4)], plates=(3,4), dims=((),)) Y = X[2,1,None,None] self.assertMessageToChild(Y, [X.u[0][1,None,None]]) # Full slice X = DummyNode([np.random.randn(3,4)], plates=(3,4), dims=((),)) Y = X[:,2] self.assertMessageToChild(Y, [X.u[0][:,2]]) # Slice with start X = DummyNode([np.random.randn(3,4)], plates=(3,4), dims=((),)) Y = X[1:,2] self.assertMessageToChild(Y, [X.u[0][1:,2]]) # Slice with end 
X = DummyNode([np.random.randn(3,4)], plates=(3,4), dims=((),)) Y = X[:2,2] self.assertMessageToChild(Y, [X.u[0][:2,2]]) # Slice with step X = DummyNode([np.random.randn(3,4)], plates=(3,4), dims=((),)) Y = X[::2,2] self.assertMessageToChild(Y, [X.u[0][::2,2]]) # Slice for broadcasted axes X = DummyNode([np.random.randn(4)], plates=(3,4), dims=((),)) Y = X[0:2:2,2] self.assertMessageToChild(Y, [X.u[0][2]]) X = DummyNode([np.random.randn(1,4)], plates=(3,4), dims=((),)) Y = X[0:2:2,2] self.assertMessageToChild(Y, [X.u[0][0:1,2]]) # One ellipsis X = DummyNode([np.random.randn(3,4)], plates=(3,4), dims=((),)) Y = X[...,2] self.assertMessageToChild(Y, [X.u[0][...,2]]) # Ellipsis over broadcasted axes X = DummyNode([np.random.randn(5,6)], plates=(3,4,5,6), dims=((),)) Y = X[1,...,2] self.assertMessageToChild(Y, [X.u[0][:,2]]) X = DummyNode([np.random.randn(3,1,5,6)], plates=(3,4,5,6), dims=((),)) Y = X[1,...,2] self.assertMessageToChild(Y, [X.u[0][1,:,:,2]]) # Multiple ellipsis X = DummyNode([np.random.randn(2,3,4,5)], plates=(2,3,4,5), dims=((),)) Y = X[...,2,...] self.assertMessageToChild(Y, [X.u[0][:,:,2,:]]) # Ellipsis when dimensions X = DummyNode([np.random.randn(2,3,4)], plates=(2,3), dims=((4,),)) Y = X[...,2] self.assertMessageToChild(Y, [X.u[0][:,2,:]]) # Indexing for multiple moments X = DummyNode([np.random.randn(2,3,4), np.random.randn(2,3)], plates=(2,3), dims=((4,),())) Y = X[1,1] self.assertMessageToChild(Y, [X.u[0][1,1], X.u[1][1,1]]) pass def test_message_to_parent(self): """ Test message to parent of X[..] node operator. 
""" class ParentNode(Node): _moments = Moments() _parent_moments = () def _get_id_list(self): return [] class ChildNode(Node): _moments = Moments() _parent_moments = (Moments(),) def __init__(self, X, m, mask, **kwargs): super().__init__(X, **kwargs) self.m = m self.mask2 = mask def _message_to_parent(self, index, u_parent=None): return self.m def _mask_to_parent(self, index): return self.mask2 def _get_id_list(self): return [] # General broadcasting V = ParentNode(plates=(3,3,3), dims=((),)) X = V[...] m = [ np.random.randn(3,1) ] msg = [ np.zeros((1,3,1)) ] msg[0][:,:,:] = m[0] Y = ChildNode(X, m, True, dims=((),)) X._update_mask() self.assertMessage(X._message_to_parent(0), msg) # Integer indices V = ParentNode(plates=(3,4), dims=((),)) X = V[2,1] m = [np.random.randn()] msg = [ np.zeros((3,4)) ] msg[0][2,1] = m[0] Y = ChildNode(X, m, True, dims=((),)) X._update_mask() self.assertMessage(X._message_to_parent(0), msg) # Integer indices with broadcasting V = ParentNode(plates=(3,3), dims=((),)) X = V[2,2] m = [ np.random.randn(1) ] msg = [ np.zeros((3,3)) ] msg[0][2,2] = m[0] Y = ChildNode(X, m, True, dims=((),)) X._update_mask() self.assertMessage(X._message_to_parent(0), msg) # Slice indices V = ParentNode(plates=(2,3,4,5), dims=((),)) X = V[:,:2,1:,::2] m = [np.random.randn(2,2,3,3)] msg = [ np.zeros((2,3,4,5)) ] msg[0][:,:2,1:,::2] = m[0] Y = ChildNode(X, m, True, dims=((),)) X._update_mask() self.assertMessage(X._message_to_parent(0), msg) # Full slice with broadcasting V = ParentNode(plates=(2,3), dims=((),)) X = V[:,:] m = [np.random.randn(1)] msg = [ np.zeros((1,1)) ] msg[0][:] = m[0] Y = ChildNode(X, m, True, dims=((),)) X._update_mask() self.assertMessage(X._message_to_parent(0), msg) # Start slice with broadcasting V = ParentNode(plates=(3,3,3,3), dims=((),)) X = V[0:,1:,-2:,-3:] m = [np.random.randn(1,1)] msg = [ np.zeros((1,3,3,1)) ] msg[0][:,1:,-2:,:] = m[0] Y = ChildNode(X, m, True, dims=((),)) X._update_mask() 
self.assertMessage(X._message_to_parent(0), msg) # End slice with broadcasting V = ParentNode(plates=(3,3,3,3), dims=((),)) X = V[:2,:3,:4,:-1] m = [np.random.randn(1,1)] msg = [ np.zeros((3,1,1,3)) ] msg[0][:2,:,:,:-1] = m[0] Y = ChildNode(X, m, True, dims=((),)) X._update_mask() self.assertMessage(X._message_to_parent(0), msg) # Step slice with broadcasting V = ParentNode(plates=(3,3,1), dims=((),)) X = V[::1,::2,::2] m = [np.random.randn(1)] msg = [ np.zeros((1,3,1)) ] msg[0][:,::2,:] = m[0] Y = ChildNode(X, m, True, dims=((),)) X._update_mask() self.assertMessage(X._message_to_parent(0), msg) # Ellipsis V = ParentNode(plates=(3,3,3), dims=((),)) X = V[...,0] m = [np.random.randn(3,3)] msg = [ np.zeros((3,3,3)) ] msg[0][:,:,0] = m[0] Y = ChildNode(X, m, True, dims=((),)) X._update_mask() self.assertMessage(X._message_to_parent(0), msg) # New axes V = ParentNode(plates=(3,3), dims=((),)) X = V[None,:,None,None,:] m = [np.random.randn(1,3,1,1,3)] msg = [ np.zeros((3,3)) ] msg[0][:,:] = m[0][0,:,0,0,:] Y = ChildNode(X, m, True, dims=((),)) X._update_mask() self.assertMessage(X._message_to_parent(0), msg) # New axes with broadcasting V = ParentNode(plates=(3,3), dims=((),)) X = V[None,:,None,:,None] m = [np.random.randn(1,3,1)] msg = [ np.zeros((1,3)) ] msg[0][:,:] = m[0][0,:,0] Y = ChildNode(X, m, True, dims=((),)) X._update_mask() self.assertMessage(X._message_to_parent(0), msg) # Multiple messages V = ParentNode(plates=(3,), dims=((),())) X = V[:] m = [np.random.randn(3), np.random.randn(3)] msg = [ np.zeros((3)), np.zeros((3)) ] msg[0][:] = m[0][:] msg[1][:] = m[1][:] Y = ChildNode(X, m, True, dims=((),())) X._update_mask() self.assertMessage(X._message_to_parent(0), msg) # Non-scalar variables V = ParentNode(plates=(2,3), dims=((4,),)) X = V[...] 
m = [np.random.randn(2,3,4)] msg = [ np.zeros((2,3,4)) ] msg[0][:,:,:] = m[0][:,:,:] Y = ChildNode(X, m, True, dims=((4,),)) X._update_mask() self.assertMessage(X._message_to_parent(0), msg) # Missing values V = ParentNode(plates=(3,3,3), dims=((3,),)) X = V[:,0,::2,None] m = [np.random.randn(3,2,1,3)] # mask shape: (3, 2, 1) mask = np.array([ [[True], [False]], [[False], [False]], [[False], [True]] ]) msg = [ np.zeros((3,3,3,3)) ] msg[0][:,0,::2,:] = (m[0] * mask[...,None])[:,:,0,:] Y = ChildNode(X, m, mask, dims=((3,),)) X._update_mask() self.assertMessage(X._message_to_parent(0), msg) # Found bug: int index after slice V = ParentNode(plates=(3,3), dims=((),)) X = V[:,0] m = [np.random.randn(3)] msg = [ np.zeros((3,3)) ] msg[0][:,0] = m[0] Y = ChildNode(X, m, True, dims=((),)) X._update_mask() self.assertMessage(X._message_to_parent(0), msg) # Found bug: message requires reshaping after reverse indexing pass ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/tests/test_poisson.py0000644000175100001770000000470400000000000025577 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2014 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ """ Unit tests for `poisson` module. """ import numpy as np import scipy from bayespy.nodes import Poisson from bayespy.nodes import Gamma from bayespy.utils import random from bayespy.utils.misc import TestCase class TestPoisson(TestCase): """ Unit tests for Poisson node """ def test_init(self): """ Test the creation of Poisson nodes. 
""" # Some simple initializations X = Poisson(12.8) X = Poisson(Gamma(43, 24)) # Check that plates are correct X = Poisson(np.ones((2,3))) self.assertEqual(X.plates, (2,3)) X = Poisson(Gamma(1, 1, plates=(2,3))) self.assertEqual(X.plates, (2,3)) # Invalid rate self.assertRaises(ValueError, Poisson, -0.1) # Inconsistent plates self.assertRaises(ValueError, Poisson, np.ones(3), plates=(2,)) # Explicit plates too small self.assertRaises(ValueError, Poisson, np.ones(3), plates=(1,)) pass def test_moments(self): """ Test the moments of Poisson nodes. """ # Simple test X = Poisson(12.8) u = X._message_to_child() self.assertEqual(len(u), 1) self.assertAllClose(u[0], 12.8) # Test plates in rate X = Poisson(12.8*np.ones((2,3))) u = X._message_to_child() self.assertAllClose(u[0], 12.8*np.ones((2,3))) # Test with gamma prior alpha = Gamma(5, 2) r = np.exp(alpha._message_to_child()[1]) X = Poisson(alpha) u = X._message_to_child() self.assertAllClose(u[0], r) # Test with broadcasted plates in parents X = Poisson(Gamma(5, 2, plates=(2,3))) u = X._message_to_child() self.assertAllClose(u[0]*np.ones((2,3)), r*np.ones((2,3))) pass ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/tests/test_take.py0000644000175100001770000002757300000000000025042 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2015 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ """ Unit tests for `take` module. 
""" import numpy as np from bayespy.nodes import GaussianARD from bayespy.nodes import Take from bayespy.inference import VB from bayespy.utils.misc import TestCase class TestTake(TestCase): def test_parent_validity(self): """ Test that the parent nodes are validated properly """ # Test scalar index, no shape X = GaussianARD(1, 1, plates=(2,), shape=()) Y = Take(X, 1) self.assertEqual( Y.plates, (), ) self.assertEqual( Y.dims, ( (), () ) ) # Test vector indices, no shape X = GaussianARD(1, 1, plates=(2,), shape=()) Y = Take(X, [1, 1, 0, 1]) self.assertEqual( Y.plates, (4,), ) self.assertEqual( Y.dims, ( (), () ) ) # Test matrix indices, no shape X = GaussianARD(1, 1, plates=(2,), shape=()) Y = Take(X, [[1, 1, 0], [1, 0, 1]]) self.assertEqual( Y.plates, (2, 3), ) self.assertEqual( Y.dims, ( (), () ) ) # Test scalar index, with shape X = GaussianARD(1, 1, plates=(3,), shape=(2,)) Y = Take(X, 2) self.assertEqual( Y.plates, (), ) self.assertEqual( Y.dims, ( (2,), (2, 2) ) ) # Test vector indices, with shape X = GaussianARD(1, 1, plates=(3,), shape=(2,)) Y = Take(X, [1, 1, 0, 2]) self.assertEqual( Y.plates, (4,), ) self.assertEqual( Y.dims, ( (2,), (2, 2) ) ) # Test matrix indices, no shape X = GaussianARD(1, 1, plates=(3,), shape=(2,)) Y = Take(X, np.ones((4, 5), dtype=np.int64)) self.assertEqual( Y.plates, (4, 5), ) self.assertEqual( Y.dims, ( (2,), (2, 2) ) ) # Test scalar indices with more plate axes X = GaussianARD(1, 1, plates=(4, 2), shape=()) Y = Take(X, 1) self.assertEqual( Y.plates, (4,), ) self.assertEqual( Y.dims, ( (), () ) ) # Test vector indices with more plate axes X = GaussianARD(1, 1, plates=(4, 2), shape=()) Y = Take(X, np.ones(3, dtype=np.int64)) self.assertEqual( Y.plates, (4, 3), ) self.assertEqual( Y.dims, ( (), () ) ) # Test take on other plate axis X = GaussianARD(1, 1, plates=(4, 2), shape=()) Y = Take(X, np.ones(3, dtype=np.int64), plate_axis=-2) self.assertEqual( Y.plates, (3, 2), ) self.assertEqual( Y.dims, ( (), () ) ) # Test positive plate 
axis X = GaussianARD(1, 1, plates=(4, 2), shape=()) self.assertRaises( ValueError, Take, X, np.ones(3, dtype=np.int64), plate_axis=0, ) # Test indices out of bounds X = GaussianARD(1, 1, plates=(2,), shape=()) self.assertRaises( ValueError, Take, X, [0, -3], ) X = GaussianARD(1, 1, plates=(2,), shape=()) self.assertRaises( ValueError, Take, X, [0, 2], ) # Test non-integer indices X = GaussianARD(1, 1, plates=(2,), shape=()) self.assertRaises( ValueError, Take, X, [0, 1.5], ) pass def test_moments(self): """ Test moments computation in Take node """ # Test scalar index, no shape X = GaussianARD([1, 2], [1, 0.5], shape=()) Y = Take(X, 1) self.assertAllClose( Y.get_moments()[0], 2, ) self.assertAllClose( Y.get_moments()[1], 6, ) # Test vector indices, no shape X = GaussianARD([1, 2], [1, 0.5], shape=()) Y = Take(X, [1, 1, 0, 1]) self.assertAllClose( Y.get_moments()[0], [2, 2, 1, 2], ) self.assertAllClose( Y.get_moments()[1], [6, 6, 2, 6], ) # Test matrix indices, no shape X = GaussianARD([1, 2], [1, 0.5], shape=()) Y = Take(X, [[1, 1, 0], [1, 0, 1]]) self.assertAllClose( Y.get_moments()[0], [[2, 2, 1], [2, 1, 2]], ) self.assertAllClose( Y.get_moments()[1], [[6, 6, 2], [6, 2, 6]], ) # Test scalar index, with shape X = GaussianARD([[1, 2], [3, 4], [5, 6]], [[1, 1/2], [1/3, 1/4], [1/5, 1/6]], shape=(2,)) Y = Take(X, 2) self.assertAllClose( Y.get_moments()[0], [5, 6], ) self.assertAllClose( Y.get_moments()[1], [[25+5, 30], [30, 36+6]], ) # Test vector indices, with shape X = GaussianARD([[1, 2], [3, 4], [5, 6]], [[1, 1/2], [1/3, 1/4], [1/5, 1/6]], shape=(2,)) Y = Take(X, [1, 1, 0, 2]) self.assertAllClose( Y.get_moments()[0], [[3, 4], [3, 4], [1, 2], [5, 6]], ) self.assertAllClose( Y.get_moments()[1], [ [[9+3, 12], [12, 16+4]], [[9+3, 12], [12, 16+4]], [[1+1, 2], [2, 4+2]], [[25+5, 30], [30, 36+6]] ], ) # Test matrix indices, no shape X = GaussianARD([[1, 2], [3, 4], [5, 6]], [[1, 1/2], [1/3, 1/4], [1/5, 1/6]], shape=(2,)) Y = Take(X, [[1, 1], [0, 2]]) self.assertAllClose( 
Y.get_moments()[0], [[[3, 4], [3, 4]], [[1, 2], [5, 6]]], ) self.assertAllClose( Y.get_moments()[1], [ [[[9+3, 12], [12, 16+4]], [[9+3, 12], [12, 16+4]]], [[[1+1, 2], [2, 4+2]], [[25+5, 30], [30, 36+6]]], ], ) # Test with more plate axes X = GaussianARD([[1, 2], [3, 4], [5, 6]], [[1, 1/2], [1/3, 1/4], [1/5, 1/6]], shape=()) Y = Take(X, [1, 0, 1]) self.assertAllClose( Y.get_moments()[0], [[2, 1, 2], [4, 3, 4], [6, 5, 6]], ) self.assertAllClose( Y.get_moments()[1], [[4+2, 1+1, 4+2], [16+4, 9+3, 16+4], [36+6, 25+5, 36+6]], ) # Test take on other plate axis X = GaussianARD([[1, 2], [3, 4], [5, 6]], [[1, 1/2], [1/3, 1/4], [1/5, 1/6]], shape=()) Y = Take(X, [2, 0], plate_axis=-2) self.assertAllClose( Y.get_moments()[0], [[5, 6], [1, 2]], ) self.assertAllClose( Y.get_moments()[1], [[25+5, 36+6], [1+1, 4+2]], ) # Test parent broadcasting X = GaussianARD([1, 2], [1, 1/2], plates=(3,), shape=(2,)) Y = Take(X, [1, 1, 0, 1]) self.assertAllClose( Y.get_moments()[0], [[1, 2], [1, 2], [1, 2], [1, 2]], ) self.assertAllClose( Y.get_moments()[1], [ [[1+1, 2], [2, 4+2]], [[1+1, 2], [2, 4+2]], [[1+1, 2], [2, 4+2]], [[1+1, 2], [2, 4+2]], ] ) pass def test_message_to_parent(self): """ Test parent message computation in Take node """ def check(indices, plates, shape, axis=-1, use_mask=False): mu = np.random.rand(*(plates+shape)) alpha = np.random.rand(*(plates+shape)) X = GaussianARD(mu, alpha, shape=shape, plates=plates) Y = Take(X, indices, plate_axis=axis) Z = GaussianARD(Y, 1, shape=shape) z = np.random.randn(*(Z.get_shape(0))) if use_mask: mask = np.mod(np.reshape(np.arange(np.prod(Z.plates)), Z.plates), 2) != 0 else: mask = True Z.observe(z, mask=mask) X.update() (x0, x1) = X.get_moments() # For comparison, build the same model brute force X = GaussianARD(mu, alpha, shape=shape, plates=plates) # Number of trailing plate axes before the take axis N = len(X.plates) + axis # Reshape the take axes into a single axis z_shape = X.plates[:axis] + (-1,) if axis < -1: z_shape = z_shape + 
X.plates[(axis+1):] z_shape = z_shape + shape z = np.reshape(z, z_shape) # Reshape the take axes into a single axis if use_mask: mask_shape = X.plates[:axis] + (-1,) if axis < -1: mask_shape = mask_shape + X.plates[(axis+1):] mask = np.reshape(mask, mask_shape) for (j, i) in enumerate(range(np.size(indices))): ind = np.array(indices).flatten()[i] index_x = N*(slice(None),) + (ind,) index_z = N*(slice(None),) + (j,) # print(index) Xi = X[index_x] zi = z[index_z] Zi = GaussianARD(Xi, 1, ndim=len(shape)) if use_mask: maski = mask[index_z] else: maski = True Zi.observe(zi, mask=maski) X.update() self.assertAllClose( x0, X.get_moments()[0], ) self.assertAllClose( x1, X.get_moments()[1], ) return # Test scalar index check(1, (2,), ()) check(1, (2, 3), ()) check(1, (2, 3), (4,)) check(1, (2, 3), (), axis=-2) check(1, (2, 3), (4,), axis=-2) check(1, (2,), (), use_mask=True) check(1, (2, 3), (), use_mask=True) check(1, (2, 3), (4,), use_mask=True) check(1, (2, 3), (), axis=-2, use_mask=True) check(1, (2, 3), (4,), axis=-2, use_mask=True) # Test vector index check([1, 1, 0, 1], (2,), ()) check([1, 1, 0, 1], (2, 3), ()) check([1, 1, 0, 1], (2, 3), (4,)) check([1, 1, 0, 1], (2, 3), (), axis=-2) check([1, 1, 0, 1], (2, 3), (4,), axis=-2) check([1, 1, 0, 1], (2,), (), use_mask=True) check([1, 1, 0, 1], (2, 3), (), use_mask=True) check([1, 1, 0, 1], (2, 3), (4,), use_mask=True) check([1, 1, 0, 1], (2, 3), (), axis=-2, use_mask=True) check([1, 1, 0, 1], (2, 3), (4,), axis=-2, use_mask=True) # Test matrix index check([[1, 1, 2], [0, 2, 1]], (4,), ()) check([[1, 1, 2], [0, 2, 1]], (4, 5), ()) check([[1, 1, 2], [0, 2, 1]], (4, 5), (6,)) check([[1, 1, 2], [0, 2, 1]], (4, 5), (), axis=-2) check([[1, 1, 2], [0, 2, 1]], (4, 5), (6,), axis=-2) check([[1, 1, 2], [0, 2, 1]], (4,), (), use_mask=True) check([[1, 1, 2], [0, 2, 1]], (4, 5), (), use_mask=True) check([[1, 1, 2], [0, 2, 1]], (4, 5), (6,), use_mask=True) check([[1, 1, 2], [0, 2, 1]], (4, 5), (), axis=-2, use_mask=True) check([[1, 
1, 2], [0, 2, 1]], (4, 5), (6,), axis=-2, use_mask=True) pass def test_plates_multiplier_from_parent(self): X = GaussianARD(np.random.randn(3, 2), 1, ndim=1) Y = Take(X, [0, 1, 2, 1, 1]) self.assertEqual(Y._plates_multiplier_from_parent(0), ()) pass ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/nodes/tests/test_wishart.py0000644000175100001770000000730500000000000025566 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2015 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ """ Unit tests for `wishart` module. """ import numpy as np from scipy import special from .. import gaussian from bayespy.nodes import (Gaussian, Wishart) from ...vmp import VB from bayespy.utils import misc from bayespy.utils import linalg from bayespy.utils import random from bayespy.utils.misc import TestCase def _student_logpdf(y, mu, Cov, nu): D = np.shape(y)[-1] return (special.gammaln((nu+D)/2) - special.gammaln(nu/2) - 0.5 * D * np.log(nu) - 0.5 * D * np.log(np.pi) - 0.5 * np.linalg.slogdet(Cov)[1] - 0.5 * (nu+D) * np.log(1+1/nu*np.einsum('...i,...ij,...j->...', y-mu, np.linalg.inv(Cov), y-mu))) class TestWishart(TestCase): def test_lower_bound(self): """ Test the Wishart VB lower bound """ # # By having the Wishart node as the only latent node, VB will give exact # results, thus the VB lower bound is the true marginal log likelihood. # Thus, check that they are equal. The true marginal likelihood is the # multivariate Student-t distribution. 
        #
        np.random.seed(42)

        D = 3

        n = (D-1) + np.random.uniform(0.1, 0.5)
        V = random.covariance(D)
        Lambda = Wishart(n, V)

        mu = np.random.randn(D)
        Y = Gaussian(mu, Lambda)

        y = np.random.randn(D)
        Y.observe(y)
        Lambda.update()

        L = Y.lower_bound_contribution() + Lambda.lower_bound_contribution()

        nu = n + 1 - D
        Cov = V / nu
        self.assertAllClose(L,
                            _student_logpdf(y, mu, Cov, nu))

        pass

    def test_moments(self):
        """
        Test the moments of Wishart node
        """
        np.random.seed(42)

        # Test prior moments
        D = 3
        n = (D-1) + np.random.uniform(0.1, 2)
        V = random.covariance(D)
        Lambda = Wishart(n, V)
        Lambda.update()
        u = Lambda.get_moments()
        self.assertAllClose(u[0],
                            n*np.linalg.inv(V),
                            msg='Mean incorrect')
        self.assertAllClose(u[1],
                            (np.sum(special.digamma((n - np.arange(D))/2))
                             + D*np.log(2)
                             - np.linalg.slogdet(V)[1]),
                            msg='Log determinant incorrect')

        # Test posterior moments
        D = 3
        n = (D-1) + np.random.uniform(0.1, 2)
        V = random.covariance(D)
        Lambda = Wishart(n, V)
        mu = np.random.randn(D)
        Y = Gaussian(mu, Lambda)
        y = np.random.randn(D)
        Y.observe(y)
        Lambda.update()
        u = Lambda.get_moments()
        n = n + 1
        V = V + np.outer(y-mu, y-mu)
        self.assertAllClose(u[0],
                            n*np.linalg.inv(V),
                            msg='Mean incorrect')
        self.assertAllClose(u[1],
                            (np.sum(special.digamma((n - np.arange(D))/2))
                             + D*np.log(2)
                             - np.linalg.slogdet(V)[1]),
                            msg='Log determinant incorrect')

        pass


# File: bayespy-0.6.2/bayespy/inference/vmp/nodes/wishart.py

################################################################################
# Copyright (C) 2011-2012,2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################ import numpy as np import scipy.special as special from bayespy.utils import misc, linalg from .expfamily import ExponentialFamily from .expfamily import ExponentialFamilyDistribution from .expfamily import useconstructor from .constant import Constant from .deterministic import Deterministic from .gamma import GammaMoments from .node import Moments, Node class WishartPriorMoments(Moments): def __init__(self, k): self.k = k self.dims = ( (), () ) return def compute_fixed_moments(self, n): """ Compute moments for fixed x. """ u0 = np.asanyarray(n) u1 = special.multigammaln(0.5*u0, self.k) return [u0, u1] @classmethod def from_values(cls, x, d): """ Compute the dimensions of phi or u. """ return cls(d) class WishartMoments(Moments): def __init__(self, shape): self.shape = shape self.ndim = len(shape) self.dims = ( 2 * shape, () ) return def compute_fixed_moments(self, Lambda, gradient=None): """ Compute moments for fixed x. """ Lambda = np.asanyarray(Lambda) L = linalg.chol(Lambda, ndim=self.ndim) ldet = linalg.chol_logdet(L, ndim=self.ndim) u = [Lambda, ldet] if gradient is None: return u du0 = gradient[0] du1 = ( misc.add_trailing_axes(gradient[1], 2*self.ndim) * linalg.chol_inv(L, ndim=self.ndim) ) du = du0 + du1 return (u, du) def plates_from_shape(self, shape): if self.ndim == 0: return shape else: return shape[:-2*self.ndim] def shape_from_plates(self, plates): return plates + self.shape + self.shape def get_instance_conversion_kwargs(self): return dict(ndim=self.ndim) def get_instance_converter(self, ndim): if ndim != self.ndim: raise NotImplementedError( "No conversion between different ndim implemented for " "WishartMoments yet" ) return None @classmethod def from_values(cls, x, ndim): """ Compute the dimensions of phi and u. 
""" if np.ndim(x) < 2 * ndim: raise ValueError("Values for Wishart distribution must be at least " "2-D arrays.") if ndim > 0 and (np.shape(x)[-ndim:] != np.shape(x)[-2*ndim:-ndim]): raise ValueError("Values for Wishart distribution must be square " "matrices, thus the two last axes must have equal " "length.") shape = ( np.shape(x)[-ndim:] if ndim > 0 else () ) return cls(shape) class WishartDistribution(ExponentialFamilyDistribution): """ Sub-classes implement distribution specific computations. Distribution for :math:`k \times k` symmetric positive definite matrix. .. math:: \Lambda \sim \mathcal{W}(n, V) Note: :math:`V` is inverse scale matrix. .. math:: p(\Lambda | n, V) = .. """ def compute_message_to_parent(self, parent, index, u_self, u_n, u_V): if index == 0: raise NotImplementedError("Message from Wishart to degrees of " "freedom parameter (first parent) " "not yet implemented") elif index == 1: Lambda = u_self[0] n = u_n[0] return [-0.5 * Lambda, 0.5 * n] else: raise ValueError("Invalid parent index {0}".format(index)) def compute_phi_from_parents(self, u_n, u_V, mask=True): r""" Compute natural parameters .. math:: \phi(n, V) = \begin{bmatrix} -\frac{1}{2} V \\ \frac{1}{2} n \end{bmatrix} """ return [-0.5 * u_V[0], 0.5 * u_n[0]] def compute_moments_and_cgf(self, phi, mask=True): r""" Return moments and cgf for given natural parameters .. math:: \langle u \rangle = \begin{bmatrix} \phi_2 (-\phi_1)^{-1} \\ -\log|-\phi_1| + \psi_k(\phi_2) \end{bmatrix} \\ g(\phi) = \phi_2 \log|-\phi_1| - \log \Gamma_k(\phi_2) """ U = linalg.chol(-phi[0]) k = np.shape(phi[0])[-1] #k = self.dims[0][0] logdet_phi0 = linalg.chol_logdet(U) u0 = phi[1][...,np.newaxis,np.newaxis] * linalg.chol_inv(U) u1 = -logdet_phi0 + misc.multidigamma(phi[1], k) u = [u0, u1] g = phi[1] * logdet_phi0 - special.multigammaln(phi[1], k) return (u, g) def compute_cgf_from_parents(self, u_n, u_V): r""" CGF from parents .. 
math:: g(n, V) = \frac{n}{2} \log|V| - \frac{nk}{2} \log 2 - \log \Gamma_k(\frac{n}{2}) """ n = u_n[0] gammaln_n = u_n[1] V = u_V[0] logdet_V = u_V[1] k = np.shape(V)[-1] g = 0.5*n*logdet_V - 0.5*k*n*np.log(2) - gammaln_n return g def compute_fixed_moments_and_f(self, Lambda, mask=True): r""" Compute u(x) and f(x) for given x. .. math: u(\Lambda) = \begin{bmatrix} \Lambda \\ \log |\Lambda| \end{bmatrix} """ k = np.shape(Lambda)[-1] ldet = linalg.chol_logdet(linalg.chol(Lambda)) u = [Lambda, ldet] f = -(k+1)/2 * ldet return (u, f) class Wishart(ExponentialFamily): r""" Node for Wishart random variables. The random variable :math:`\mathbf{\Lambda}` is a :math:`D\times{}D` positive-definite symmetric matrix. .. math:: p(\mathbf{\Lambda}) = \mathrm{Wishart}(\mathbf{\Lambda} | N, \mathbf{V}) Parameters ---------- n : scalar or array :math:`N`, degrees of freedom, :math:`N>D-1`. V : Wishart-like node or (...,D,D)-array :math:`\mathbf{V}`, scale matrix. """ _distribution = WishartDistribution() def __init__(self, n, V, **kwargs): """ Create Wishart node. """ super().__init__(n, V, **kwargs) @classmethod def _constructor(cls, n, V, **kwargs): """ Constructs distribution and moments objects. 
""" # Make V a proper parent node and get the dimensionality of the matrix V = cls._ensure_moments(V, WishartMoments, ndim=1) D = V.dims[0][-1] n = cls._ensure_moments(n, WishartPriorMoments, d=D) moments = WishartMoments((D,)) # Parent node message types parent_moments = (n._moments, V._moments) parents = [n, V] return (parents, kwargs, moments.dims, cls._total_plates(kwargs.get('plates'), cls._distribution.plates_from_parent(0, n.plates), cls._distribution.plates_from_parent(1, V.plates)), cls._distribution, moments, parent_moments) def scale(self, scalar, **kwargs): return _ScaledWishart(self, scalar, **kwargs) def __str__(self): n = 2*self.phi[1] A = 0.5 * self.u[0] / self.phi[1][...,np.newaxis,np.newaxis] return ("%s ~ Wishart(n, A)\n" " n =\n" "%s\n" " A =\n" "%s\n" % (self.name, n, A)) class _ScaledWishart(Deterministic): def __init__(self, Lambda, alpha, ndim=None, **kwargs): if ndim is None: try: ndim = Lambda._moments.ndim except AttributeError: raise ValueError("Give explicit ndim argument. 
(ndim=1 for normal matrix)") Lambda = self._ensure_moments(Lambda, WishartMoments, ndim=ndim) alpha = self._ensure_moments(alpha, GammaMoments) dims = Lambda.dims self._moments = Lambda._moments self._parent_moments = (Lambda._moments, alpha._moments) return super().__init__(Lambda, alpha, dims=dims, **kwargs) def _compute_moments(self, u_Lambda, u_alpha): Lambda = u_Lambda[0] logdet_Lambda = u_Lambda[1] alpha = misc.add_trailing_axes(u_alpha[0], 2*self._moments.ndim) logalpha = u_alpha[1] u0 = Lambda * alpha u1 = logdet_Lambda + np.prod(self._moments.shape) * logalpha return [u0, u1] def _compute_message_to_parent(self, index, m, u_Lambda, u_alpha): if index == 0: alpha = misc.add_trailing_axes(u_alpha[0], 2*self._moments.ndim) logalpha = u_alpha[1] m0 = m[0] * alpha m1 = m[1] return [m0, m1] if index == 1: Lambda = u_Lambda[0] logdet_Lambda = u_Lambda[1] m0 = linalg.inner(m[0], Lambda, ndim=2*self._moments.ndim) m1 = m[1] * np.prod(self._moments.shape) return [m0, m1] raise IndexError() ././@PaxHeader0000000000000000000000000000003300000000000011451 xustar000000000000000027 mtime=1725273981.413372 bayespy-0.6.2/bayespy/inference/vmp/tests/0000755000175100001770000000000000000000000021357 5ustar00runnerdocker00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/tests/__init__.py0000644000175100001770000000000000000000000023456 0ustar00runnerdocker00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/tests/test_annealing.py0000644000175100001770000000621300000000000024726 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2015 Jaakko Luttinen # # This file is licensed under the MIT License. 
################################################################################ """ Unit tests for `vmp` module. """ import numpy as np from scipy import special from numpy import testing from bayespy.nodes import (Gaussian, GaussianARD, GaussianGamma, Gamma, Wishart) from ..vmp import VB from bayespy.utils import misc from bayespy.utils import linalg from bayespy.utils import random from bayespy.utils.misc import TestCase class TestVB(TestCase): def test_annealing(self): X = GaussianARD(3, 4) X.initialize_from_parameters(-1, 6) Q = VB(X) Q.set_annealing(0.1) # # Check that the gradient is correct # # Initial parameters phi0 = X.phi # Gradient rg = X.get_riemannian_gradient() g = X.get_gradient(rg) # Numerical gradient of the first parameter eps = 1e-6 p0 = X.get_parameters() l0 = Q.compute_lowerbound(ignore_masked=False) g_num = [(), ()] e = eps p1 = p0[0] + e X.set_parameters([p1, p0[1]]) l1 = Q.compute_lowerbound(ignore_masked=False) g_num[0] = (l1 - l0) / eps # Numerical gradient of the second parameter p1 = p0[1] + e X.set_parameters([p0[0], p1]) l1 = Q.compute_lowerbound(ignore_masked=False) g_num[1] = (l1 - l0) / (eps) # Check self.assertAllClose(g[0], g_num[0]) self.assertAllClose(g[1], g_num[1]) # # Gradient should be zero after updating # X.update() # Initial parameters phi0 = X.phi # Numerical gradient of the first parameter eps = 1e-8 p0 = X.get_parameters() l0 = Q.compute_lowerbound(ignore_masked=False) g_num = [(), ()] e = eps p1 = p0[0] + e X.set_parameters([p1, p0[1]]) l1 = Q.compute_lowerbound(ignore_masked=False) g_num[0] = (l1 - l0) / eps # Numerical gradient of the second parameter p1 = p0[1] + e X.set_parameters([p0[0], p1]) l1 = Q.compute_lowerbound(ignore_masked=False) g_num[1] = (l1 - l0) / (eps) # Check self.assertAllClose(0, g_num[0], atol=1e-5) self.assertAllClose(0, g_num[1], atol=1e-5) # Not at the optimum X.initialize_from_parameters(-1, 6) # Initial parameters phi0 = X.phi # Gradient g = X.get_riemannian_gradient() # Parameters after 
VB-EM update X.update() phi1 = X.phi # Check self.assertAllClose(g[0], phi1[0] - phi0[0]) self.assertAllClose(g[1], phi1[1] - phi0[1]) pass ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/tests/test_transformations.py0000644000175100001770000012225200000000000026225 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2014 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ """ Unit tests for `transformations` module. """ import numpy as np from bayespy.inference.vmp.nodes.gaussian import GaussianARD from bayespy.inference.vmp.nodes.gaussian import Gaussian from bayespy.inference.vmp.nodes.gamma import Gamma from bayespy.inference.vmp.nodes.wishart import Wishart from bayespy.inference.vmp.nodes.dot import SumMultiply from bayespy.inference.vmp.nodes.gaussian_markov_chain import GaussianMarkovChain from bayespy.utils import linalg from bayespy.utils import random from bayespy.utils import optimize from ..transformations import RotateGaussianARD from ..transformations import RotateGaussianMarkovChain from ..transformations import RotateVaryingMarkovChain from bayespy.utils.misc import TestCase class TestRotateGaussianARD(TestCase): def test_cost_function(self): """ Test the speed-up rotation of Gaussian ARD arrays. 
""" # Use seed for deterministic testing np.random.seed(42) def test(shape, plates, axis=-1, alpha_plates=None, plate_axis=None, mu=3): if plate_axis is not None: precomputes = [False, True] else: precomputes = [False] for precompute in precomputes: # Construct the model D = shape[axis] if alpha_plates is not None: alpha = Gamma(2, 2, plates=alpha_plates) alpha.initialize_from_random() else: alpha = 2 X = GaussianARD(mu, alpha, shape=shape, plates=plates) # Some initial learning and rotator constructing X.initialize_from_random() Y = GaussianARD(X, 1) Y.observe(np.random.randn(*(Y.get_shape(0)))) X.update() if alpha_plates is not None: alpha.update() true_cost0_alpha = alpha.lower_bound_contribution() rotX = RotateGaussianARD(X, alpha, axis=axis, precompute=precompute) else: rotX = RotateGaussianARD(X, axis=axis, precompute=precompute) true_cost0_X = X.lower_bound_contribution() # Rotation matrices I = np.identity(D) R = np.random.randn(D, D) if plate_axis is not None: C = plates[plate_axis] Q = np.random.randn(C, C) Ic = np.identity(C) else: Q = None Ic = None # Compute bound terms rotX.setup(plate_axis=plate_axis) rot_cost0 = rotX.get_bound_terms(I, Q=Ic) rot_cost1 = rotX.get_bound_terms(R, Q=Q) self.assertAllClose(sum(rot_cost0.values()), rotX.bound(I, Q=Ic)[0], msg="Bound terms and total bound differ") self.assertAllClose(sum(rot_cost1.values()), rotX.bound(R, Q=Q)[0], msg="Bound terms and total bound differ") # Perform rotation rotX.rotate(R, Q=Q) # Check bound terms true_cost1_X = X.lower_bound_contribution() self.assertAllClose(true_cost1_X - true_cost0_X, rot_cost1[X] - rot_cost0[X], msg="Incorrect rotation cost for X") if alpha_plates is not None: true_cost1_alpha = alpha.lower_bound_contribution() self.assertAllClose(true_cost1_alpha - true_cost0_alpha, rot_cost1[alpha] - rot_cost0[alpha], msg="Incorrect rotation cost for alpha") return # Rotating a vector (zero mu) test( (3,), (), axis=-1, mu=0) test( (3,), (), axis=-1, alpha_plates=(1,), mu=0) test( 
(3,), (), axis=-1, alpha_plates=(3,), mu=0) test( (3,), (2,4), axis=-1, mu=0) test( (3,), (2,4), axis=-1, alpha_plates=(1,), mu=0) test( (3,), (2,4), axis=-1, alpha_plates=(3,), mu=0) test( (3,), (2,4), axis=-1, alpha_plates=(2,4,3), mu=0) test( (3,), (2,4), axis=-1, alpha_plates=(1,4,3), mu=0) # Rotating a vector (full mu) test( (3,), (), axis=-1, mu=3*np.ones((3,))) test( (3,), (), axis=-1, alpha_plates=(), mu=3*np.ones((3,))) test( (3,), (), axis=-1, alpha_plates=(1,), mu=3*np.ones((3,))) test( (3,), (), axis=-1, alpha_plates=(3,), mu=3*np.ones((3,))) test( (3,), (2,4), axis=-1, mu=3*np.ones((2,4,3))) test( (3,), (2,4), axis=-1, alpha_plates=(1,), mu=3*np.ones((2,4,3))) test( (3,), (2,4), axis=-1, alpha_plates=(3,), mu=3*np.ones((2,4,3))) test( (3,), (2,4), axis=-1, alpha_plates=(2,4,3), mu=3*np.ones((2,4,3))) test( (3,), (2,4), axis=-1, alpha_plates=(1,4,3), mu=3*np.ones((2,4,3))) # Rotating a vector (broadcast mu) test( (3,), (), axis=-1, mu=3*np.ones((1,))) test( (3,), (), axis=-1, alpha_plates=(1,), mu=3*np.ones((1,))) test( (3,), (), axis=-1, alpha_plates=(3,), mu=3*np.ones((1,))) test( (3,), (2,4,5), axis=-1, mu=3*np.ones((4,1,1))) test( (3,), (2,4,5), axis=-1, alpha_plates=(1,), mu=3*np.ones((4,1,1))) test( (3,), (2,4,5), axis=-1, alpha_plates=(3,), mu=3*np.ones((4,1,1))) test( (3,), (2,4,5), axis=-1, alpha_plates=(2,4,5,3), mu=3*np.ones((4,1,1))) #!! 
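# The battery of test() calls above all exercise the same invariant: applying
# an invertible transformation R to a Gaussian posterior changes the entropy
# term of its lower bound by exactly log|det R|. A minimal standalone sketch
# of that fact (plain NumPy, not BayesPy API; all names here are illustrative):

```python
import numpy as np

def gaussian_entropy(Cov):
    # Differential entropy of N(mu, Cov); the mean does not matter
    D = Cov.shape[0]
    return 0.5 * (D * np.log(2 * np.pi * np.e) + np.linalg.slogdet(Cov)[1])

rng = np.random.default_rng(42)
D = 3
L = rng.standard_normal((D, D))
Cov = L @ L.T + D * np.eye(D)     # a valid covariance matrix
R = rng.standard_normal((D, D))   # any invertible "rotation"

# Rotating q(x) = N(mu, Cov) to q(Rx) = N(R mu, R Cov R') shifts the entropy
# by log|det R|, which is why rotation bound terms can be updated cheaply.
dH = gaussian_entropy(R @ Cov @ R.T) - gaussian_entropy(Cov)
logdetR = np.linalg.slogdet(R)[1]
assert np.allclose(dH, logdetR)
```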
test( (3,), (2,4,5), axis=-1, alpha_plates=(1,4,5,3), mu=3*np.ones((4,1,1))) # Rotating an array test( (2,3,4), (), axis=-1) test( (2,3,4), (), axis=-2) test( (2,3,4), (), axis=-3) test( (2,3,4), (5,6), axis=-1) test( (2,3,4), (5,6), axis=-2) test( (2,3,4), (5,6), axis=-3) test( (2,3,4), (5,6), axis=-1, alpha_plates=(3,1)) test( (2,3,4), (5,6), axis=-2, alpha_plates=(3,1)) test( (2,3,4), (5,6), axis=-3, alpha_plates=(3,1)) test( (2,3), (4,5,6), axis=-1, alpha_plates=(5,1,2,1)) test( (2,3), (4,5,6), axis=-2, alpha_plates=(5,1,2,1)) # Test mu array broadcasting test( (2,3,4), (5,6,7), axis=-2, mu=GaussianARD(3, 1, shape=(3,1), plates=(6,1,1))) test( (2,3,4), (5,6,7), axis=-3, mu=GaussianARD(3, 1, shape=(3,1), plates=(6,1,1))) test( (2,3,4), (5,6,7), axis=-2, alpha_plates=(5,1,7,2,1,1), mu=GaussianARD(3, 1, shape=(3,1), plates=(6,1,1))) # Plate rotation test( (3,), (5,), axis=-1, plate_axis=-1) test( (3,), (4,5,6), axis=-1, plate_axis=-2) test( (2,3), (4,5,6), axis=-2, plate_axis=-2) test( (2,3,4), (5,6,7), axis=-2, plate_axis=-3) # Plate rotation with alpha test( (2,3,4), (5,6,7), axis=-2, alpha_plates=(3,1), plate_axis=-2) test( (2,3,4), (5,6,7), axis=-2, alpha_plates=(6,1,2,1,4), plate_axis=-3) # Plate rotation with mu test( (2,3,4), (5,6,7), axis=-2, plate_axis=-2, mu=GaussianARD(3, 1, shape=(3,1), plates=(6,1,1))) test( (2,3,4), (5,6,7), axis=-3, plate_axis=-2, mu=GaussianARD(3, 1, shape=(3,1), plates=(6,1,1))) test( (2,3,4), (5,6,7), axis=-2, alpha_plates=(5,1,7,2,1,1), plate_axis=-2, mu=GaussianARD(3, 1, shape=(3,1), plates=(6,1,1))) # # Plate rotation with mu and alpha # # Basic, matching sizes test( (3,), (4,), axis=-1, plate_axis=-1, alpha_plates=(4,3), mu=GaussianARD(3, 1, shape=(3,), plates=(4,))) # Broadcast for mu test( (3,), (4,), axis=-1, plate_axis=-1, alpha_plates=(4,3), mu=GaussianARD(3, 1, shape=(1,), plates=(4,))) test( (3,), (4,), axis=-1, plate_axis=-1, alpha_plates=(4,3), mu=GaussianARD(3, 1, shape=(), plates=(1,))) test( (3,), (4,), axis=-1, 
plate_axis=-1, alpha_plates=(4,3), mu=GaussianARD(3, 1, shape=(3,), plates=(1,))) # Broadcast for alpha test( (3,), (4,), axis=-1, plate_axis=-1, alpha_plates=(4,1), mu=GaussianARD(3, 1, shape=(3,), plates=(4,))) test( (3,), (4,), axis=-1, plate_axis=-1, alpha_plates=(3,), mu=GaussianARD(3, 1, shape=(3,), plates=(4,))) # Several variable dimensions test( (3,4,5), (2,), axis=-2, plate_axis=-1, alpha_plates=(2,3,4,5), mu=GaussianARD(3, 1, shape=(3,4,5), plates=(2,))) test( (3,4,5), (2,), axis=-2, plate_axis=-1, alpha_plates=(2,3,1,5), mu=GaussianARD(3, 1, shape=(4,1), plates=(2,1))) # Several plate dimensions test( (5,), (2,3,4), axis=-1, plate_axis=-2, alpha_plates=(2,3,4,5), mu=GaussianARD(3, 1, shape=(5,), plates=(2,3,4))) # Several plate dimensions, rotated plate broadcasted in alpha test( (5,), (2,3,4), axis=-1, plate_axis=-2, alpha_plates=(2,1,4,5), mu=GaussianARD(3, 1, shape=(5,), plates=(2,3,4))) test( (5,), (2,3,4), axis=-1, plate_axis=-2, alpha_plates=(4,5), mu=GaussianARD(3, 1, shape=(5,), plates=(2,3,4))) # Several plate dimensions, rotated plate broadcasted in mu test( (5,), (2,3,4), axis=-1, plate_axis=-2, alpha_plates=(2,3,4,5), mu=GaussianARD(3, 1, shape=(5,), plates=(2,1,4))) test( (5,), (2,3,4), axis=-1, plate_axis=-2, alpha_plates=(2,3,4,5), mu=GaussianARD(3, 1, shape=(5,), plates=(4,))) # Several plate dimensions, rotated plate broadcasted in mu and alpha test( (5,), (2,3,4), axis=-1, plate_axis=-2, alpha_plates=(2,1,4,5), mu=GaussianARD(3, 1, shape=(5,), plates=(2,1,4))) test( (5,), (2,3,4), axis=-1, plate_axis=-2, alpha_plates=(4,5), mu=GaussianARD(3, 1, shape=(5,), plates=(4,))) # TODO: Missing values pass def test_cost_gradient(self): """ Test gradient of the rotation cost function for Gaussian ARD arrays. 
""" # Use seed for deterministic testing np.random.seed(42) def test(shape, plates, axis=-1, alpha_plates=None, plate_axis=None, mu=3): if plate_axis is not None: precomputes = [False, True] else: precomputes = [False] for precompute in precomputes: # Construct the model D = shape[axis] if alpha_plates is not None: alpha = Gamma(3, 5, plates=alpha_plates) alpha.initialize_from_random() else: alpha = 2 X = GaussianARD(mu, alpha, shape=shape, plates=plates) # Some initial learning and rotator constructing X.initialize_from_random() Y = GaussianARD(X, 1) Y.observe(np.random.randn(*(Y.get_shape(0)))) X.update() if alpha_plates is not None: alpha.update() rotX = RotateGaussianARD(X, alpha, axis=axis, precompute=precompute) else: rotX = RotateGaussianARD(X, axis=axis, precompute=precompute) try: mu.update() except: pass # Rotation matrices R = np.random.randn(D, D) if plate_axis is not None: C = plates[plate_axis] Q = np.random.randn(C, C) else: Q = None # Compute bound terms rotX.setup(plate_axis=plate_axis) if plate_axis is None: def f_r(r): (b, dr) = rotX.bound(np.reshape(r, np.shape(R))) return (b, np.ravel(dr)) else: def f_r(r): (b, dr, dq) = rotX.bound(np.reshape(r, np.shape(R)), Q=Q) return (b, np.ravel(dr)) def f_q(q): (b, dr, dq) = rotX.bound(R, Q=np.reshape(q, np.shape(Q))) return (b, np.ravel(dq)) # Check gradient with respect to R err = optimize.check_gradient(f_r, np.ravel(R), verbose=False)[1] self.assertAllClose(err, 0, atol=1e-4, msg="Gradient incorrect for R") # Check gradient with respect to Q if plate_axis is not None: err = optimize.check_gradient(f_q, np.ravel(Q), verbose=False)[1] self.assertAllClose(err, 0, atol=1e-4, msg="Gradient incorrect for Q") return # # Basic rotation # test((3,), (), axis=-1) test((2,3,4), (), axis=-1) test((2,3,4), (), axis=-2) test((2,3,4), (), axis=-3) test((2,3,4), (5,6), axis=-2) # # Rotation with mu # # Simple test((1,), (), axis=-1, mu=GaussianARD(2, 4, shape=(1,), plates=())) test((3,), (), axis=-1, 
mu=GaussianARD(2, 4, shape=(3,), plates=())) # Broadcast mu over rotated dim test((3,), (), axis=-1, mu=GaussianARD(2, 4, shape=(1,), plates=())) test((3,), (), axis=-1, mu=GaussianARD(2, 4, shape=(), plates=())) # Broadcast mu over dim when multiple dims test((2,3), (), axis=-1, mu=GaussianARD(2, 4, shape=(1,3), plates=())) test((2,3), (), axis=-1, mu=GaussianARD(2, 4, shape=(3,), plates=())) # Broadcast mu over rotated dim when multiple dims test((2,3), (), axis=-2, mu=GaussianARD(2, 4, shape=(1,3), plates=())) test((2,3), (), axis=-2, mu=GaussianARD(2, 4, shape=(3,), plates=())) # Broadcast mu over plates test((3,), (4,5), axis=-1, mu=GaussianARD(2, 4, shape=(3,), plates=(4,1))) test((3,), (4,5), axis=-1, mu=GaussianARD(2, 4, shape=(3,), plates=(5,))) # # Rotation with alpha # # Simple test((1,), (), axis=-1, alpha_plates=()) test((3,), (), axis=-1, alpha_plates=(3,)) # Broadcast alpha over rotated dim test((3,), (), axis=-1, alpha_plates=()) test((3,), (), axis=-1, alpha_plates=(1,)) # Broadcast alpha over dim when multiple dims test((2,3), (), axis=-1, alpha_plates=(1,3)) test((2,3), (), axis=-1, alpha_plates=(3,)) # Broadcast alpha over rotated dim when multiple dims test((2,3), (), axis=-2, alpha_plates=(1,3)) test((2,3), (), axis=-2, alpha_plates=(3,)) # Broadcast alpha over plates test((3,), (4,5), axis=-1, alpha_plates=(4,1,3)) test((3,), (4,5), axis=-1, alpha_plates=(5,3)) # # Rotation with alpha and mu # # Simple test((1,), (), axis=-1, alpha_plates=(1,), mu=GaussianARD(2, 4, shape=(1,), plates=())) test((3,), (), axis=-1, alpha_plates=(3,), mu=GaussianARD(2, 4, shape=(3,), plates=())) # Broadcast mu over rotated dim test((3,), (), axis=-1, alpha_plates=(3,), mu=GaussianARD(2, 4, shape=(1,), plates=())) test((3,), (), axis=-1, alpha_plates=(3,), mu=GaussianARD(2, 4, shape=(), plates=())) # Broadcast alpha over rotated dim test((3,), (), axis=-1, alpha_plates=(1,), mu=GaussianARD(2, 4, shape=(3,), plates=())) test((3,), (), axis=-1, alpha_plates=(), 
mu=GaussianARD(2, 4, shape=(3,), plates=())) # Broadcast both mu and alpha over rotated dim test((3,), (), axis=-1, alpha_plates=(1,), mu=GaussianARD(2, 4, shape=(1,), plates=())) test((3,), (), axis=-1, alpha_plates=(), mu=GaussianARD(2, 4, shape=(), plates=())) # Broadcast mu over plates test((3,), (4,5), axis=-1, alpha_plates=(4,5,3), mu=GaussianARD(2, 4, shape=(3,), plates=(4,1))) test((3,), (4,5), axis=-1, alpha_plates=(4,5,3), mu=GaussianARD(2, 4, shape=(3,), plates=(5,))) # Broadcast alpha over plates test((3,), (4,5), axis=-1, alpha_plates=(4,1,3), mu=GaussianARD(2, 4, shape=(3,), plates=(4,5))) test((3,), (4,5), axis=-1, alpha_plates=(5,3), mu=GaussianARD(2, 4, shape=(3,), plates=(4,5))) # Broadcast both mu and alpha over plates test((3,), (4,5), axis=-1, alpha_plates=(4,1,3), mu=GaussianARD(2, 4, shape=(3,), plates=(4,1))) test((3,), (4,5), axis=-1, alpha_plates=(5,3), mu=GaussianARD(2, 4, shape=(3,), plates=(5,))) # Broadcast both mu and alpha over plates but different plates test((3,), (4,5), axis=-1, alpha_plates=(4,1,3), mu=GaussianARD(2, 4, shape=(3,), plates=(5,))) test((3,), (4,5), axis=-1, alpha_plates=(5,3), mu=GaussianARD(2, 4, shape=(3,), plates=(4,1))) # # Rotation with missing values # # TODO # # Plate rotation # # Simple test((2,), (3,), axis=-1, plate_axis=-1) test((2,), (3,4,5), axis=-1, plate_axis=-1) test((2,), (3,4,5), axis=-1, plate_axis=-2) test((2,), (3,4,5), axis=-1, plate_axis=-3) test((2,3), (4,5), axis=-2, plate_axis=-2) # With mu test((2,), (3,), axis=-1, plate_axis=-1, mu=GaussianARD(3, 4, shape=(2,), plates=(3,))) # With mu broadcasted test((2,), (3,), axis=-1, plate_axis=-1, mu=GaussianARD(3, 4, shape=(2,), plates=(1,))) test((2,), (3,), axis=-1, plate_axis=-1, mu=GaussianARD(3, 4, shape=(2,), plates=())) # With mu multiple plates test((2,), (3,4,5), axis=-1, plate_axis=-2, mu=GaussianARD(3, 4, shape=(2,), plates=(3,4,5))) # With mu multiple dims test((2,3,4), (5,), axis=-2, plate_axis=-1, mu=GaussianARD(3, 4, shape=(2,3,4), 
plates=(5,))) # # With alpha # print("Test: Plate rotation with alpha. Scalars.") test((1,), (1,), axis=-1, plate_axis=-1, alpha_plates=(1,1), mu=0) print("Test: Plate rotation with alpha. Plates.") test((1,), (3,), axis=-1, plate_axis=-1, alpha_plates=(3,1), mu=0) print("Test: Plate rotation with alpha. Dims.") test((3,), (1,), axis=-1, plate_axis=-1, alpha_plates=(1,3), mu=0) print("Test: Plate rotation with alpha. Broadcast alpha over rotated plates.") test((1,), (3,), axis=-1, plate_axis=-1, alpha_plates=(1,1), mu=0) test((1,), (3,), axis=-1, plate_axis=-1, alpha_plates=(1,), mu=0) print("Test: Plate rotation with alpha. Broadcast alpha over dims.") test((3,), (1,), axis=-1, plate_axis=-1, alpha_plates=(1,1), mu=0) test((3,), (1,), axis=-1, plate_axis=-1, alpha_plates=(), mu=0) print("Test: Plate rotation with alpha. Multiple dims.") test((2,3,4,5), (6,), axis=-2, plate_axis=-1, alpha_plates=(6,2,3,4,5), mu=0) print("Test: Plate rotation with alpha. Multiple plates.") test((2,), (3,4,5), axis=-1, plate_axis=-1, alpha_plates=(3,4,5,2), mu=0) test((2,), (3,4,5), axis=-1, plate_axis=-2, alpha_plates=(3,4,5,2), mu=0) test((2,), (3,4,5), axis=-1, plate_axis=-3, alpha_plates=(3,4,5,2), mu=0) # # With alpha and mu # print("Test: Plate rotation with alpha and mu. Scalars.") test((1,), (1,), axis=-1, plate_axis=-1, alpha_plates=(1,1), mu=GaussianARD(2, 3, shape=(1,), plates=(1,))) print("Test: Plate rotation with alpha and mu. Plates.") test((1,), (3,), axis=-1, plate_axis=-1, alpha_plates=(3,1), mu=GaussianARD(2, 3, shape=(1,), plates=(3,))) print("Test: Plate rotation with alpha and mu. Dims.") test((3,), (1,), axis=-1, plate_axis=-1, alpha_plates=(1,3), mu=GaussianARD(2, 3, shape=(3,), plates=(1,))) print("Test: Plate rotation with alpha and mu. 
Broadcast over rotated " "plates.") test((1,), (3,), axis=-1, plate_axis=-1, alpha_plates=(1,1), mu=GaussianARD(2, 3, shape=(1,), plates=(1,))) test((1,), (3,), axis=-1, plate_axis=-1, alpha_plates=(1,), mu=GaussianARD(2, 3, shape=(1,), plates=())) print("Test: Plate rotation with alpha and mu. Broadcast over dims.") test((3,), (1,), axis=-1, plate_axis=-1, alpha_plates=(1,1), mu=GaussianARD(2, 3, shape=(1,), plates=(1,))) test((3,), (1,), axis=-1, plate_axis=-1, alpha_plates=(), mu=GaussianARD(2, 3, shape=(), plates=(1,))) print("Test: Plate rotation with alpha and mu. Multiple dims.") test((2,3,4,5), (6,), axis=-2, plate_axis=-1, alpha_plates=(6,2,3,4,5), mu=GaussianARD(2, 3, shape=(2,3,4,5), plates=(6,))) print("Test: Plate rotation with alpha and mu. Multiple plates.") test((2,), (3,4,5), axis=-1, plate_axis=-1, alpha_plates=(3,4,5,2), mu=GaussianARD(2, 3, shape=(2,), plates=(3,4,5,))) test((2,), (3,4,5), axis=-1, plate_axis=-2, alpha_plates=(3,4,5,2), mu=GaussianARD(2, 3, shape=(2,), plates=(3,4,5,))) test((2,), (3,4,5), axis=-1, plate_axis=-3, alpha_plates=(3,4,5,2), mu=GaussianARD(2, 3, shape=(2,), plates=(3,4,5,))) # TODO: With missing values pass class TestRotateGaussianMarkovChain(TestCase): def test_cost_function(self): """ Test the cost function of the speed-up rotation for Markov chain """ np.random.seed(42) def check(D, N, mu=None, Lambda=None, rho=None, A=None): if mu is None: mu = np.zeros(D) if Lambda is None: Lambda = np.identity(D) if rho is None: rho = np.ones(D) if A is None: A = GaussianARD(3, 5, shape=(D,), plates=(D,)) V = np.identity(D) + np.ones((D,D)) # Construct model X = GaussianMarkovChain(mu, Lambda, A, rho, n=N+1, initialize=False) Y = Gaussian(X, V, initialize=False) # Posterior estimation Y.observe(np.random.randn(*(Y.get_shape(0)))) X.update() try: A.update() except: pass try: mu.update() except: pass try: Lambda.update() except: pass try: rho.update() except: pass # Construct rotator rotA = RotateGaussianARD(A, axis=-1) rotX = 
RotateGaussianMarkovChain(X, rotA) # Rotation true_cost0 = X.lower_bound_contribution() rotX.setup() I = np.identity(D) R = np.random.randn(D, D) rot_cost0 = rotX.get_bound_terms(I) rot_cost1 = rotX.get_bound_terms(R) self.assertAllClose(sum(rot_cost0.values()), rotX.bound(I)[0], msg="Bound terms and total bound differ") self.assertAllClose(sum(rot_cost1.values()), rotX.bound(R)[0], msg="Bound terms and total bound differ") rotX.rotate(R) true_cost1 = X.lower_bound_contribution() self.assertAllClose(true_cost1 - true_cost0, rot_cost1[X] - rot_cost0[X], msg="Incorrect rotation cost for X") return self._run_checks(check) pass def test_cost_gradient(self): """ Test the gradient of the speed-up rotation for Markov chain """ # Use seed for deterministic testing np.random.seed(42) def check(D, N, mu=None, Lambda=None, rho=None, A=None): if mu is None: mu = np.zeros(D) if Lambda is None: Lambda = np.identity(D) if rho is None: rho = np.ones(D) if A is None: A = GaussianARD(3, 5, shape=(D,), plates=(D,)) V = np.identity(D) + np.ones((D,D)) # Construct model X = GaussianMarkovChain(mu, Lambda, A, rho, n=N+1, initialize=False) Y = Gaussian(X, V, initialize=False) # Posterior estimation Y.observe(np.random.randn(*(Y.get_shape(0)))) X.update() try: A.update() except: pass try: mu.update() except: pass try: Lambda.update() except: pass try: rho.update() except: pass # Construct rotator rotA = RotateGaussianARD(A, axis=-1) rotX = RotateGaussianMarkovChain(X, rotA) rotX.setup() # Check gradient with respect to R R = np.random.randn(D, D) def cost(r): (b, dr) = rotX.bound(np.reshape(r, np.shape(R))) return (b, np.ravel(dr)) err = optimize.check_gradient(cost, np.ravel(R), verbose=False)[1] self.assertAllClose(err, 0, atol=1e-5, msg="Gradient incorrect") return self._run_checks(check) pass def _run_checks(self, check): # Basic test check(2, 3) # Test mu check(2, 3, mu=GaussianARD(2, 4, shape=(2,), plates=())) check(2, 3, mu=GaussianARD(2, 4, shape=(2,), plates=(5,))) # Test Lambda 
check(2, 3, Lambda=Wishart(3, random.covariance(2))) check(2, 3, Lambda=Wishart(3, random.covariance(2), plates=(5,))) # Test A check(2, 3, A=GaussianARD(2, 4, shape=(2,), plates=(2,))) check(2, 3, A=GaussianARD(2, 4, shape=(2,), plates=(3,2))) check(2, 3, A=GaussianARD(2, 4, shape=(2,), plates=(5,3,2))) # Test Lambda and mu check(2, 3, mu=GaussianARD(2, 4, shape=(2,), plates=()), Lambda=Wishart(2, random.covariance(2))) check(2, 3, mu=GaussianARD(2, 4, shape=(2,), plates=(5,)), Lambda=Wishart(2, random.covariance(2), plates=(5,))) # Test mu and A check(2, 3, mu=GaussianARD(2, 4, shape=(2,), plates=()), A=GaussianARD(2, 4, shape=(2,), plates=(2,))) check(2, 3, mu=GaussianARD(2, 4, shape=(2,), plates=(5,)), A=GaussianARD(2, 4, shape=(2,), plates=(5,1,2,))) # Test Lambda and A check(2, 3, Lambda=Wishart(2, random.covariance(2)), A=GaussianARD(2, 4, shape=(2,), plates=(2,))) check(2, 3, Lambda=Wishart(2, random.covariance(2), plates=(5,)), A=GaussianARD(2, 4, shape=(2,), plates=(5,1,2,))) # Test mu, Lambda and A check(2, 3, mu=GaussianARD(2, 4, shape=(2,), plates=()), Lambda=Wishart(2, random.covariance(2)), A=GaussianARD(2, 4, shape=(2,), plates=(2,))) check(2, 3, mu=GaussianARD(2, 4, shape=(2,), plates=(5,)), Lambda=Wishart(2, random.covariance(2), plates=(5,)), A=GaussianARD(2, 4, shape=(2,), plates=(5,1,2,))) pass class TestRotateVaryingMarkovChain(TestCase): def test_cost_function(self): """ Test the speed-up rotation of Markov chain with time-varying dynamics """ # Use seed for deterministic testing np.random.seed(42) def check(D, N, K, mu=None, Lambda=None, rho=None): if mu is None: mu = np.zeros(D) if Lambda is None: Lambda = np.identity(D) if rho is None: rho = np.ones(D) V = np.identity(D) + np.ones((D,D)) # Construct model B = GaussianARD(3, 5, shape=(D,K), plates=(1,D)) S = GaussianARD(2, 4, shape=(K,), plates=(N,1)) A = SumMultiply('dk,k->d', B, S) X = GaussianMarkovChain(mu, Lambda, A, rho, n=N+1, initialize=False) Y = Gaussian(X, V, initialize=False) # 
Posterior estimation Y.observe(np.random.randn(N+1,D)) X.update() B.update() S.update() try: mu.update() except: pass try: Lambda.update() except: pass try: rho.update() except: pass # Construct rotator rotB = RotateGaussianARD(B, axis=-2) rotX = RotateVaryingMarkovChain(X, B, S, rotB) # Rotation true_cost0 = X.lower_bound_contribution() rotX.setup() I = np.identity(D) R = np.random.randn(D, D) rot_cost0 = rotX.get_bound_terms(I) rot_cost1 = rotX.get_bound_terms(R) self.assertAllClose(sum(rot_cost0.values()), rotX.bound(I)[0], msg="Bound terms and total bound differ") self.assertAllClose(sum(rot_cost1.values()), rotX.bound(R)[0], msg="Bound terms and total bound differ") rotX.rotate(R) true_cost1 = X.lower_bound_contribution() self.assertAllClose(true_cost1 - true_cost0, rot_cost1[X] - rot_cost0[X], msg="Incorrect rotation cost for X") return self._run_checks(check) pass def test_cost_gradient(self): """ Test the gradient of the rotation for MC with time-varying dynamics """ # Use seed for deterministic testing np.random.seed(42) def check(D, N, K, mu=None, Lambda=None, rho=None): if mu is None: mu = np.zeros(D) if Lambda is None: Lambda = np.identity(D) if rho is None: rho = np.ones(D) V = np.identity(D) + np.ones((D,D)) # Construct model B = GaussianARD(3, 5, shape=(D,K), plates=(1,D)) S = GaussianARD(2, 4, shape=(K,), plates=(N,1)) A = SumMultiply('dk,k->d', B, S) X = GaussianMarkovChain(mu, Lambda, A, rho, n=N+1, initialize=False) Y = Gaussian(X, V, initialize=False) # Posterior estimation Y.observe(np.random.randn(N+1,D)) X.update() B.update() S.update() try: mu.update() except: pass try: Lambda.update() except: pass try: rho.update() except: pass # Construct rotator rotB = RotateGaussianARD(B, axis=-2) rotX = RotateVaryingMarkovChain(X, B, S, rotB) rotX.setup() # Check gradient with respect to R R = np.random.randn(D, D) def cost(r): (b, dr) = rotX.bound(np.reshape(r, np.shape(R))) return (b, np.ravel(dr)) err = optimize.check_gradient(cost, np.ravel(R), 
verbose=False)[1]
            self.assertAllClose(err, 0,
                                atol=1e-6,
                                msg="Gradient incorrect")
            return

        self._run_checks(check)

        pass

    def _run_checks(self, check):
        # Basic test
        check(1, 1, 1)
        check(2, 1, 1)
        check(1, 2, 1)
        check(1, 1, 2)
        check(3, 4, 2)
        # Test mu
        check(2, 3, 4,
              mu=GaussianARD(2, 4, shape=(2,), plates=()))
        # Test Lambda
        check(2, 3, 4,
              Lambda=Wishart(3, random.covariance(2)))
        # Test Lambda and mu
        check(2, 3, 4,
              mu=GaussianARD(2, 4, shape=(2,), plates=()),
              Lambda=Wishart(2, random.covariance(2)))
        # TODO: Test plates
        pass


# File: bayespy-0.6.2/bayespy/inference/vmp/transformations.py

################################################################################
# Copyright (C) 2013-2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

import numpy as np
import warnings
import scipy

from bayespy.utils import optimize
from bayespy.utils import random
from bayespy.utils import linalg
from bayespy.utils import misc
from bayespy.utils.linalg import dot, tracedot

from .nodes import gaussian
from .nodes.categorical import CategoricalMoments


class RotationOptimizer():
    r"""
    Optimizer for rotation parameter expansion in state-space models

    Rotates one model block with :math:`\mathbf{R}` and one model block with
    :math:`\mathbf{R}^{-1}`.
Parameters ---------- block1 : rotator object The first rotation parameter expansion object block2 : rotator object The second rotation parameter expansion object D : int Dimensionality of the latent space References ---------- :cite:`Luttinen:2010`, :cite:`Luttinen:2013` """ def __init__(self, block1, block2, D): self.block1 = block1 self.block2 = block2 self.D = D def rotate(self, maxiter=10, check_gradient=False, verbose=False, check_bound=False): """ Optimize the rotation of two separate model blocks jointly. If some variable is the dot product of two Gaussians, rotating the two Gaussians optimally can make the inference algorithm orders of magnitude faster. First block is rotated with :math:`\mathbf{R}` and the second with :math:`\mathbf{R}^{-T}`. Blocks must have methods: `bound(U,s,V)` and `rotate(R)`. """ I = np.identity(self.D) piv = np.arange(self.D) def cost(r): # Make vector-r into matrix-R R = np.reshape(r, (self.D,self.D)) # Compute SVD invR = np.linalg.inv(R) logdetR = np.linalg.slogdet(R)[1] # Compute lower bound terms (b1,db1) = self.block1.bound(R, logdet=logdetR, inv=invR) (b2,db2) = self.block2.bound(invR.T, logdet=-logdetR, inv=R.T) # Apply chain rule for the second gradient: # d b(invR.T) # = tr(db.T * d(invR.T)) # = tr(db * d(invR)) # = -tr(db * invR * (dR) * invR) # = -tr(invR * db * invR * dR) db2 = -dot(invR.T, db2.T, invR.T) # Compute the cost function c = -(b1+b2) dc = -(db1+db2) return (c, np.ravel(dc)) def get_bound_terms(r, gradient=False): """ Returns a dictionary of bound terms for the nodes. """ # Gradient not yet implemented.. 
if gradient: raise NotImplementedError() # Make vector-r into matrix-R R = np.reshape(r, (self.D,self.D)) # Compute SVD invR = np.linalg.inv(R) logdetR = np.linalg.slogdet(R)[1] # Compute lower bound terms dict1 = self.block1.get_bound_terms(R, logdet=logdetR, inv=invR) dict2 = self.block2.get_bound_terms(invR.T, logdet=-logdetR, inv=R.T) if not gradient: dict1.update(dict2) return dict1 else: terms = dict1[0].copy() terms = terms.update(dict2[0]) grad = dict1[1].copy() grad = grad.update(dict2[1]) return (terms, grad) def get_true_bound_terms(): nodes = set(self.block1.nodes()) | set(self.block2.nodes()) D = {} # TODO/FIXME: Also compute bound for child nodes as they could be # affected in practice although they shouldn't. Just checking that. for node in nodes: L = node.lower_bound_contribution() D[node] = L return D self.block1.setup() self.block2.setup() if check_gradient: R = np.random.randn(self.D, self.D) err = optimize.check_gradient(cost, np.ravel(R), verbose=verbose)[1] if err > 1e-5: warnings.warn("Rotation gradient has relative error %g" % err) # Initial rotation is identity matrix r0 = np.ravel(np.identity(self.D)) (cost_begin, _) = cost(r0) if check_bound: bound_terms_begin = get_bound_terms(r0) true_bound_terms_begin = get_true_bound_terms() # Run optimization r = optimize.minimize(cost, r0, maxiter=maxiter, verbose=verbose) (cost_end, _) = cost(r) if check_bound: bound_terms_end = get_bound_terms(r) # Apply the optimal rotation R = np.reshape(r, (self.D,self.D)) invR = np.linalg.inv(R) logdetR = np.linalg.slogdet(R)[1] self.block1.rotate(R, inv=invR, logdet=logdetR) self.block2.rotate(invR.T, inv=R.T, logdet=-logdetR) # Check that the cost function and the true lower bound changed equally cost_change = cost_end - cost_begin # Check that we really have improved the bound. if cost_change > 0: warnings.warn("Rotation optimization made the cost function worse " "by %g. Probably a bug in the gradient of the " "rotation functions." 
% (cost_change,))

        if check_bound:
            true_bound_terms_end = get_true_bound_terms()
            bound_change = 0
            for node in bound_terms_begin.keys():
                node_bound_change = (bound_terms_end[node]
                                     - bound_terms_begin[node])
                bound_change += node_bound_change
                true_node_bound_change = 0
                try:
                    true_node_bound_change += (true_bound_terms_end[node]
                                               - true_bound_terms_begin[node])
                except KeyError:
                    raise Exception("The node %s is part of the "
                                    "transformation but not part of the "
                                    "model. Check your VB construction."
                                    % node.name)
                if not np.allclose(node_bound_change, true_node_bound_change):
                    warnings.warn("Rotation cost function is not consistent "
                                  "with the true lower bound for node %s. "
                                  "Bound changed %g but optimized function "
                                  "changed %g."
                                  % (node.name,
                                     true_node_bound_change,
                                     node_bound_change))

            # Check that we really have improved the bound.
            # TODO/FIXME: Also compute bound for child nodes as they could be
            # affected in practice although they shouldn't. Just checking that.
            if bound_change < 0:
                warnings.warn("Rotation made the true lower bound worse by %g. "
                              "Probably a bug in the rotation functions."
                              % (bound_change,))


class RotateGaussian():
    r"""
    Rotation parameter expansion for :class:`bayespy.nodes.Gaussian`
    """

    def __init__(self, X):
        self.X = X

    def rotate(self, R, inv=None, logdet=None):
        self.X.rotate(R, inv=inv, logdet=logdet)

    def setup(self):
        """
        This method should be called just before optimization.
        """
        mask = self.X.mask[...,np.newaxis,np.newaxis]

        # Number of plates
        self.N = self.X.plates[0] #np.sum(mask)

        # Compute the sum over plates
        self.XX = misc.sum_multiply(self.X.get_moments()[1],
                                    mask,
                                    axis=(-1,-2),
                                    sumaxis=False,
                                    keepdims=False)

        # Parent's moments
        self.Lambda = self.X.parents[1].get_moments()[0]

    def _compute_bound(self, R, logdet=None, inv=None, gradient=False,
                       terms=False):
        """
        Rotate q(X) as X->RX: q(X)=N(R*mu, R*Cov*R')

        Assume:

        :math:`p(\mathbf{X}) = \prod^M_{m=1} N(\mathbf{x}_m|0,
        \mathbf{\Lambda})`
        """
        # TODO/FIXME: X and alpha should NOT contain observed values!! Check
        # that.

        # TODO/FIXME: Allow non-zero prior mean!

        # Assume constant mean and precision matrix over plates..

        # Compute rotated moments
        XX_R = dot(R, self.XX, R.T)

        inv_R = inv
        logdet_R = logdet

        # Compute entropy H(X)
        logH_X = random.gaussian_entropy(-2*self.N*logdet_R, 0)

        # Compute <log p(X)>
        logp_X = random.gaussian_logpdf(np.vdot(XX_R, self.Lambda),
                                        0,
                                        0,
                                        0,
                                        0)

        # Compute the bound
        if terms:
            bound = {self.X: logp_X + logH_X}
        else:
            bound = logp_X + logH_X

        if not gradient:
            return bound

        # Compute dH(X)
        dlogH_X = random.gaussian_entropy(-2*self.N*inv_R.T, 0)

        # Compute d<log p(X)>
        dXX = 2*dot(self.Lambda, R, self.XX)
        dlogp_X = random.gaussian_logpdf(dXX,
                                         0,
                                         0,
                                         0,
                                         0)

        if terms:
            d_bound = {self.X: dlogp_X + dlogH_X}
        else:
            d_bound = dlogp_X + dlogH_X

        return (bound, d_bound)

    def bound(self, R, logdet=None, inv=None):
        return self._compute_bound(R,
                                   logdet=logdet,
                                   inv=inv,
                                   gradient=True)

    def get_bound_terms(self, R, logdet=None, inv=None):
        return self._compute_bound(R,
                                   logdet=logdet,
                                   inv=inv,
                                   gradient=False,
                                   terms=True)

    def nodes(self):
        return [self.X]


def covariance_to_variance(C, ndim=1, covariance_axis=None):
    # Force None to empty list
    if covariance_axis is None:
        covariance_axis = []
    # Force a list from integer
    if isinstance(covariance_axis, int):
        covariance_axis = [covariance_axis]
    # Force positive axis indices
    covariance_axis = [axis + ndim if axis < 0 else axis
                       for axis in covariance_axis]
    # Make a set of the axes
    covariance_axis = set(covariance_axis)

    keys = [i+ndim if i in covariance_axis else i
            for i in range(ndim)]
    keys += [i+2*ndim if i in covariance_axis else i
             for i in range(ndim)]
    out_keys = sorted(list(set(keys)))
    return np.einsum(C, [Ellipsis]+keys, [Ellipsis]+out_keys)


def sum_to_plates(V, plates_to, plates_from=None, ndim=0):
    if ndim == 0:
        if plates_from is not None:
            r = gaussian.Gaussian.broadcasting_multiplier(plates_from,
                                                          np.shape(V))
        else:
            r = 1
        return r * misc.sum_to_shape(V, plates_to)
    else:
        dims_V = np.shape(V)[-ndim:]
        plates_V = np.shape(V)[:-ndim]
        shape_to = tuple(plates_to) + 
dims_V if plates_from is not None: r = gaussian.Gaussian.broadcasting_multiplier(plates_from, plates_V) else: r = 1 return r * misc.sum_to_shape(V, shape_to) class RotateGaussianARD(): """ Rotation parameter expansion for :class:`bayespy.nodes.GaussianARD` The model: alpha ~ N(a, b) X ~ N(mu, alpha) X can be an array (e.g., GaussianARD). Transform q(X) and q(alpha) by rotating X. Requirements: * X and alpha do not contain any observed values """ def __init__(self, X, *alpha, axis=-1, precompute=False, subset=None): """ Precompute tells whether to compute some moments once in the setup function instead of every time in the bound function. However, they are computed a bit differently in the bound function so it can be useful too. Precomputation is probably beneficial only when there are large axes that are not rotated (by R nor Q) and they are not contained in the plates of alpha, and the dimensions for R and Q are quite small. """ self.precompute = precompute self.node_parent = X.parents[0] if len(alpha) == 0: self.update_alpha = False elif len(alpha) == 1: self.node_alpha = alpha[0] self.update_alpha = True else: raise ValueError("Too many arguments") self.node_X = X #self.node_mu = X.parents[0] self.ndim = len(X.dims[0]) # Force negative rotation axis indexing if not isinstance(axis, int): raise ValueError("Axis must be integer") if axis >= 0: axis -= self.ndim if axis < -self.ndim or axis >= 0: raise ValueError("Axis out of bounds") self.axis = axis # Allow rotation of only subset of elements/slices self.D = X.dims[0][axis] if subset is None: #self.subset = np.ones(self.D, dtype=bool) self.subset = None #tuple(range(self.D)) else: #self.subset = tuple(range(self.D)) self.subset = subset #self.subset[subset] if axis != -1: raise NotImplementedError("Subset indexing for non-last " "axis not yet implemented") ## self.subset = np.zeros(self.D, dtype=bool) ## self.subset[list(subset)] = True def nodes(self): if self.update_alpha: return [self.node_X, self.node_alpha] 
else: return [self.node_X] def _full_rotation_matrix(self, R): if self.subset is not None: R_full = np.identity(self.D) indices = np.ix_(self.subset, self.subset) R_full[indices] = R return R_full else: return R def rotate(self, R, inv=None, logdet=None, Q=None): ## R = self._full_rotation_matrix(R) ## if inv is not None: ## inv = self._full_rotation_matrix(inv) self.node_X.rotate(R, inv=inv, logdet=logdet, subset=self.subset, axis=self.axis) if self.plate_axis is not None: self.node_X.rotate_plates(Q, plate_axis=self.plate_axis) if self.update_alpha: self.node_alpha.update() def setup(self, plate_axis=None): """ This method should be called just before optimization. For efficiency, sum over axes that are not in mu, alpha nor rotation. If using Q, set rotate_plates to True. """ # Store the original plate_axis parameter for later use in other methods self.plate_axis = plate_axis # Manipulate the plate_axis parameter to suit the needs of this method if plate_axis is not None: if not isinstance(plate_axis, int): raise ValueError("Plate axis must be integer") if plate_axis >= 0: plate_axis -= len(self.node_X.plates) if plate_axis < -len(self.node_X.plates) or plate_axis >= 0: raise ValueError("Axis out of bounds") plate_axis -= self.ndim - 1 # Why -1? Because one axis is preserved! # Get the mean parameter. It will not be rotated. This assumes that mu # and alpha are really independent. 
(alpha_mu, alpha_mu2, alpha, _) = self.node_parent.get_moments() (X, XX) = self.node_X.get_moments() # mu = alpha_mu / alpha mu2 = alpha_mu2 / alpha # For simplicity, force mu to have the same shape as X mu = mu * np.ones(self.node_X.dims[0]) mu2 = mu2 * np.ones(self.node_X.dims[0]) ## (mu, mumu) = gaussian.reshape_gaussian_array(self.node_mu.dims[0], ## self.node_X.dims[0], ## mu, ## mumu) # Take diagonal of covariances to variances for axes that are not in R # (and move those axes to be the last) XX = covariance_to_variance(XX, ndim=self.ndim, covariance_axis=self.axis) ## mumu = covariance_to_variance(mumu, ## ndim=self.ndim, ## covariance_axis=self.axis) # Move axes of X and mu and compute their outer product X = misc.moveaxis(X, self.axis, -1) mu = misc.moveaxis(mu, self.axis, -1) mu2 = misc.moveaxis(mu2, self.axis, -1) Xmu = linalg.outer(X, mu, ndim=1) D = np.shape(X)[-1] # Move axes of alpha related variables def safe_move_axis(x): if np.ndim(x) >= -self.axis: return misc.moveaxis(x, self.axis, -1) else: return x[...,np.newaxis] if self.update_alpha: a = safe_move_axis(self.node_alpha.phi[1]) a0 = safe_move_axis(self.node_alpha.parents[0].get_moments()[0]) b0 = safe_move_axis(self.node_alpha.parents[1].get_moments()[0]) plates_alpha = list(self.node_alpha.plates) else: alpha = safe_move_axis(self.node_parent.get_moments()[2]) plates_alpha = list(self.node_parent.get_shape(2)) # Move plates of alpha for R if len(plates_alpha) >= -self.axis: plate = plates_alpha.pop(self.axis) plates_alpha.append(plate) else: plates_alpha.append(1) plates_X = list(self.node_X.get_shape(0)) plates_X.pop(self.axis) def sum_to_alpha(V, ndim=2): # TODO/FIXME: This could be improved so that it is not required to # explicitly repeat to alpha plates. Multiplying by ones was just a # simple bug fix. 
return sum_to_plates(V * np.ones(plates_alpha[:-1]+ndim*[1]), plates_alpha[:-1], ndim=ndim, plates_from=plates_X) if plate_axis is not None: # Move plate axis just before the rotated dimensions (which are # last) def safe_move_plate_axis(x, ndim): if np.ndim(x)-ndim >= -plate_axis: return misc.moveaxis(x, plate_axis-ndim, -ndim-1) else: inds = (Ellipsis,None) + ndim*(slice(None),) return x[inds] X = safe_move_plate_axis(X, 1) mu = safe_move_plate_axis(mu, 1) XX = safe_move_plate_axis(XX, 2) mu2 = safe_move_plate_axis(mu2, 1) if self.update_alpha: a = safe_move_plate_axis(a, 1) a0 = safe_move_plate_axis(a0, 1) b0 = safe_move_plate_axis(b0, 1) else: alpha = safe_move_plate_axis(alpha, 1) # Move plates of X and alpha plate = plates_X.pop(plate_axis) plates_X.append(plate) if len(plates_alpha) >= -plate_axis+1: plate = plates_alpha.pop(plate_axis-1) else: plate = 1 plates_alpha = plates_alpha[:-1] + [plate] + plates_alpha[-1:] CovX = XX - linalg.outer(X, X) self.CovX = sum_to_plates(CovX, plates_alpha[:-2], ndim=3, plates_from=plates_X[:-1]) # Broadcast mumu to ensure shape #mumu = np.ones(np.shape(XX)[-3:]) * mumu mu2 = mu2 * np.ones(np.shape(X)[-2:]) self.mu2 = sum_to_alpha(mu2, ndim=1) if self.precompute: # Precompute some stuff for the gradient of plate rotation # # NOTE: These terms may require a lot of memory if alpha has the # same or almost the same plates as X. 
self.X_X = sum_to_plates(X[...,:,:,None,None] * X[...,None,None,:,:], plates_alpha[:-2], ndim=4, plates_from=plates_X[:-1]) self.X_mu = sum_to_plates(X[...,:,:,None,None] * mu[...,None,None,:,:], plates_alpha[:-2], ndim=4, plates_from=plates_X[:-1]) else: self.X = X self.mu = mu else: # Sum axes that are not in the plates of alpha self.XX = sum_to_alpha(XX) self.mu2 = sum_to_alpha(mu2, ndim=1) self.Xmu = sum_to_alpha(Xmu) if self.update_alpha: self.a = a self.a0 = a0 self.b0 = b0 else: self.alpha = alpha self.plates_X = plates_X self.plates_alpha = plates_alpha # Take only a subset of the matrix for rotation if self.subset is not None: if self.precompute: raise NotImplementedError("Precomputation not implemented when " "using a subset") # from X self.X = self.X[...,self.subset] self.mu2 = self.mu2[...,self.subset] if plate_axis is not None: # from CovX inds = [] for i in range(np.ndim(self.CovX)-2): inds.append(range(np.shape(self.CovX)[i])) inds.append(self.subset) inds.append(self.subset) indices = np.ix_(*inds) self.CovX = self.CovX[indices] # from mu self.mu = self.mu[...,self.subset] else: # from XX inds = [] for i in range(np.ndim(self.XX)-2): inds.append(range(np.shape(self.XX)[i])) inds.append(self.subset) inds.append(self.subset) indices = np.ix_(*inds) self.XX = self.XX[indices] # from Xmu self.Xmu = self.Xmu[...,self.subset] # from alpha if self.update_alpha: if np.shape(self.a)[-1] > 1: self.a = self.a[...,self.subset] if np.shape(self.a0)[-1] > 1: self.a0 = self.a0[...,self.subset] if np.shape(self.b0)[-1] > 1: self.b0 = self.b0[...,self.subset] else: if np.shape(self.alpha)[-1] > 1: self.alpha = self.alpha[...,self.subset] self.plates_alpha[-1] = min(self.plates_alpha[-1], len(self.subset)) ## # from mu ## # from alpha ## alpha_mu = alpha_mu[...,self.subset] ## alpha_mu2 = alpha_mu2[...,self.subset] ## alpha = alpha[...,self.subset] ## dims = list(self.node_X.dims[0]) ## dims[-1] = len(self.subset) ## else: ## dims = list(self.node_X.dims[0]) def 
_compute_bound(self, R, logdet=None, inv=None, Q=None, gradient=False, terms=False): """ Rotate q(X) and q(alpha). Assume: p(X|alpha) = prod_m N(x_m|0,diag(alpha)) p(alpha) = prod_d G(a_d,b_d) """ ## R = self._full_rotation_matrix(R) ## if inv is not None: ## inv = self._full_rotation_matrix(inv) # # Transform the distributions and moments # plates_alpha = self.plates_alpha plates_X = self.plates_X # Compute rotated second moment if self.plate_axis is not None: # The plate axis has been moved to be the last plate axis if Q is None: raise ValueError("Plates should be rotated but no Q give") # Transform covariance sumQ = np.sum(Q, axis=0) QCovQ = sumQ[:,None,None]**2 * self.CovX # Rotate plates if self.precompute: QX_QX = np.einsum('...kalb,...ik,...il->...iab', self.X_X, Q, Q) XX = QX_QX + QCovQ XX = sum_to_plates(XX, plates_alpha[:-1], ndim=2) Xmu = np.einsum('...kaib,...ik->...iab', self.X_mu, Q) Xmu = sum_to_plates(Xmu, plates_alpha[:-1], ndim=2) else: X = self.X mu = self.mu QX = np.einsum('...ik,...kj->...ij', Q, X) XX = (sum_to_plates(QCovQ, plates_alpha[:-1], ndim=2) + sum_to_plates(linalg.outer(QX, QX), plates_alpha[:-1], ndim=2, plates_from=plates_X)) Xmu = sum_to_plates(linalg.outer(QX, self.mu), plates_alpha[:-1], ndim=2, plates_from=plates_X) mu2 = self.mu2 D = np.shape(XX)[-1] logdet_Q = D * np.log(np.abs(sumQ)) else: XX = self.XX mu2 = self.mu2 Xmu = self.Xmu logdet_Q = 0 # Compute transformed moments #mu2 = np.einsum('...ii->...i', mu2) RXmu = np.einsum('...ik,...ki->...i', R, Xmu) RXX = np.einsum('...ik,...kj->...ij', R, XX) RXXR = np.einsum('...ik,...ik->...i', RXX, R) # <(X-mu) * (X-mu)'>_R XmuXmu = (RXXR - 2*RXmu + mu2) D = np.shape(R)[0] # Compute q(alpha) if self.update_alpha: # Parameters a0 = self.a0 b0 = self.b0 a = self.a b = b0 + 0.5*sum_to_plates(XmuXmu, plates_alpha, plates_from=None, ndim=0) # Some expectations alpha = a / b logb = np.log(b) logalpha = -logb # + const b0_alpha = b0 * alpha a0_logalpha = a0 * logalpha else: alpha = 
self.alpha logalpha = 0 # # Compute the cost # def sum_plates(V, *plates): full_plates = misc.broadcasted_shape(*plates) r = self.node_X.broadcasting_multiplier(full_plates, np.shape(V)) return r * np.sum(V) XmuXmu_alpha = XmuXmu * alpha if logdet is None: logdet_R = np.linalg.slogdet(R)[1] inv_R = np.linalg.inv(R) else: logdet_R = logdet inv_R = inv # Compute entropy H(X) logH_X = random.gaussian_entropy(-2*sum_plates(logdet_R + logdet_Q, plates_X), 0) # Compute logp_X = random.gaussian_logpdf(sum_plates(XmuXmu_alpha, plates_alpha[:-1] + [D]), 0, 0, sum_plates(logalpha, plates_X + [D]), 0) if self.update_alpha: # Compute entropy H(alpha) # This cancels out with the log(alpha) term in log(p(alpha)) logH_alpha = 0 # Compute logp_alpha = random.gamma_logpdf(sum_plates(b0_alpha, plates_alpha), 0, sum_plates(a0_logalpha, plates_alpha), 0, 0) else: logH_alpha = 0 logp_alpha = 0 # Compute the bound if terms: bound = {self.node_X: logp_X + logH_X} if self.update_alpha: bound.update({self.node_alpha: logp_alpha + logH_alpha}) else: bound = (0 + logp_X + logp_alpha + logH_X + logH_alpha ) if not gradient: return bound # # Compute the gradient with respect R # broadcasting_multiplier = self.node_X.broadcasting_multiplier def sum_plates(V, plates): ones = np.ones(np.shape(R)) r = broadcasting_multiplier(plates, np.shape(V)[:-2]) return r * misc.sum_multiply(V, ones, axis=(-1,-2), sumaxis=False, keepdims=False) D_XmuXmu = 2*RXX - 2*gaussian.transpose_covariance(Xmu) DXmuXmu_alpha = np.einsum('...i,...ij->...ij', alpha, D_XmuXmu) if self.update_alpha: D_b = 0.5 * D_XmuXmu XmuXmu_Dalpha = np.einsum('...i,...i,...i,...ij->...ij', sum_to_plates(XmuXmu, plates_alpha, plates_from=None, ndim=0), alpha, -1/b, D_b) D_b0_alpha = np.einsum('...i,...i,...i,...ij->...ij', b0, alpha, -1/b, D_b) D_logb = np.einsum('...i,...ij->...ij', 1/b, D_b) D_logalpha = -D_logb D_a0_logalpha = a0 * D_logalpha else: XmuXmu_Dalpha = 0 D_logalpha = 0 D_XmuXmu_alpha = DXmuXmu_alpha + XmuXmu_Dalpha D_logR = 
inv_R.T # Compute dH(X) dlogH_X = random.gaussian_entropy(-2*sum_plates(D_logR, plates_X), 0) # Compute d dlogp_X = random.gaussian_logpdf(sum_plates(D_XmuXmu_alpha, plates_alpha[:-1]), 0, 0, (sum_plates(D_logalpha, plates_X) * broadcasting_multiplier((D,), plates_alpha[-1:])), 0) if self.update_alpha: # Compute dH(alpha) # This cancels out with the log(alpha) term in log(p(alpha)) dlogH_alpha = 0 # Compute d dlogp_alpha = random.gamma_logpdf(sum_plates(D_b0_alpha, plates_alpha[:-1]), 0, sum_plates(D_a0_logalpha, plates_alpha[:-1]), 0, 0) else: dlogH_alpha = 0 dlogp_alpha = 0 if terms: raise NotImplementedError() dR_bound = {self.node_X: dlogp_X + dlogH_X} if self.update_alpha: dR_bound.update({self.node_alpha: dlogp_alpha + dlogH_alpha}) else: dR_bound = (0*dlogp_X + dlogp_X + dlogp_alpha + dlogH_X + dlogH_alpha ) if self.subset: indices = np.ix_(self.subset, self.subset) dR_bound = dR_bound[indices] if self.plate_axis is None: return (bound, dR_bound) # # Compute the gradient with respect to Q (if Q given) # # Some pre-computations Q_RCovR = np.einsum('...ik,...kl,...il,...->...i', R, self.CovX, R, sumQ) if self.precompute: Xr_rX = np.einsum('...abcd,...jb,...jd->...jac', self.X_X, R, R) QXr_rX = np.einsum('...akj,...ik->...aij', Xr_rX, Q) RX_mu = np.einsum('...jk,...akbj->...jab', R, self.X_mu) else: RX = np.einsum('...ik,...k->...i', R, X) QXR = np.einsum('...ik,...kj->...ij', Q, RX) QXr_rX = np.einsum('...ik,...jk->...kij', QXR, RX) RX_mu = np.einsum('...ik,...jk->...kij', RX, mu) QXr_rX = sum_to_plates(QXr_rX, plates_alpha[:-2], ndim=3, plates_from=plates_X[:-1]) RX_mu = sum_to_plates(RX_mu, plates_alpha[:-2], ndim=3, plates_from=plates_X[:-1]) def psi(v): """ Compute: d/dQ 1/2*trace(diag(v)*<(X-mu)*(X-mu)>) = Q*'*R'*diag(v)*R* + ones * Q diag( tr(R'*diag(v)*R*Cov) ) + mu*diag(v)*R* """ # Precompute all terms to plates_alpha because v has shape # plates_alpha. 
# Gradient of 0.5*v** v_QXrrX = np.einsum('...kij,...ik->...ij', QXr_rX, v) # Gradient of 0.5*v*Cov Q_tr_R_v_R_Cov = np.einsum('...k,...k->...', Q_RCovR, v)[...,None,:] # Gradient of mu*v*x mu_v_R_X = np.einsum('...ik,...kji->...ij', v, RX_mu) return v_QXrrX + Q_tr_R_v_R_Cov - mu_v_R_X def sum_plates(V, plates): ones = np.ones(np.shape(Q)) r = self.node_X.broadcasting_multiplier(plates, np.shape(V)[:-2]) return r * misc.sum_multiply(V, ones, axis=(-1,-2), sumaxis=False, keepdims=False) if self.update_alpha: D_logb = psi(1/b) XX_Dalpha = -psi(alpha/b * sum_to_plates(XmuXmu, plates_alpha)) D_logalpha = -D_logb else: XX_Dalpha = 0 D_logalpha = 0 DXX_alpha = 2*psi(alpha) D_XX_alpha = DXX_alpha + XX_Dalpha D_logdetQ = D / sumQ N = np.shape(Q)[-1] # Compute dH(X) dQ_logHX = random.gaussian_entropy(-2*sum_plates(D_logdetQ, plates_X[:-1]), 0) # Compute d dQ_logpX = random.gaussian_logpdf(sum_plates(D_XX_alpha, plates_alpha[:-2]), 0, 0, (sum_plates(D_logalpha, plates_X[:-1]) * broadcasting_multiplier((N,D), plates_alpha[-2:])), 0) if self.update_alpha: D_alpha = -psi(alpha/b) D_b0_alpha = b0 * D_alpha D_a0_logalpha = a0 * D_logalpha # Compute dH(alpha) # This cancels out with the log(alpha) term in log(p(alpha)) dQ_logHalpha = 0 # Compute d dQ_logpalpha = random.gamma_logpdf(sum_plates(D_b0_alpha, plates_alpha[:-2]), 0, sum_plates(D_a0_logalpha, plates_alpha[:-2]), 0, 0) else: dQ_logHalpha = 0 dQ_logpalpha = 0 if terms: raise NotImplementedError() dQ_bound = {self.node_X: dQ_logpX + dQ_logHX} if self.update_alpha: dQ_bound.update({self.node_alpha: dQ_logpalpha + dQ_logHalpha}) else: dQ_bound = (0*dQ_logpX + dQ_logpX + dQ_logpalpha + dQ_logHX + dQ_logHalpha ) return (bound, dR_bound, dQ_bound) def bound(self, R, logdet=None, inv=None, Q=None): return self._compute_bound(R, logdet=logdet, inv=inv, Q=Q, gradient=True) def get_bound_terms(self, R, logdet=None, inv=None, Q=None): return self._compute_bound(R, logdet=logdet, inv=inv, Q=Q, gradient=False, terms=True) class 
RotateGaussianMarkovChain(): r""" Rotation parameter expansion for :class:`bayespy.nodes.GaussianMarkovChain` Assume the following model. Constant, unit isotropic innovation noise. Unit variance only? Maybe: Assume innovation noise with unit variance? Would it help make this function more general with respect to A. TODO: Allow constant A or not rotating A. .. math:: R x_n = R A R^{-1} R x_{n-1} + R B u_{n-1} + noise \\\ R x_n = R [A, B] [R^{-1}, 0; 0, I] [R, 0; 0, I] [x_{n-1}; u_{n-1}] :math:`A` may vary in time. Shape of A: (N,D,D) Shape of AA: (N,D,D,D) No plates for X. """ def __init__(self, X, *args): self.X_node = X # FIXME: Currently, GaussianMarkovChain wraps initial state mean and # precision into one node and dynamics plus innovation noise into # another node. This transformation doesn't yet support GaussianGamma # dynamics, so we'll do some ugly checking here: # Dynamics node from bayespy.inference.vmp.nodes.gaussian import ( WrapToGaussianGamma, GaussianToGaussianGamma, GaussianMoments, ) dynamics_innovation = X.parents[1] assert(isinstance(dynamics_innovation, WrapToGaussianGamma)) dynamics_gaussiangamma = dynamics_innovation.parents[0] assert(isinstance(dynamics_gaussiangamma, GaussianToGaussianGamma)) dynamics = dynamics_gaussiangamma.parents[0] assert(isinstance(dynamics._moments, GaussianMoments)) self.A_node = dynamics if len(args) == 0: raise NotImplementedError() elif len(args) == 1: self.A_rotator = args[0] else: raise ValueError("Wrong number of arguments") self.N = X.dims[0][0] def nodes(self): return [self.X_node] + self.A_rotator.nodes() def rotate(self, R, inv=None, logdet=None): if inv is None: inv = np.linalg.inv(R) if logdet is None: logdet = np.linalg.slogdet(R)[1] self.X_node.rotate(R, inv=inv, logdet=logdet) from scipy.linalg import block_diag if len(self.X_node.parents) >= 3: input_shape = self.X_node.parents[2].dims[0] input_len = input_shape[-1] I = np.identity(input_len) else: I = np.identity(0) 
self.A_rotator.rotate(block_diag(inv.T, I), inv=block_diag(R.T, I), logdet=-logdet, Q=R) def _computations_for_A_and_X(self, XpXn, XpXp): # Get moments of the state dynamics matrix (A, AA) = self.A_node.get_moments() # Make sure time axis is in the arrays A = misc.atleast_nd(A, 3) AA = misc.atleast_nd(AA, 4) CovA = AA - A[...,:,np.newaxis]*A[...,np.newaxis,:] # # Expectations with respect to A and X # # TODO: In case A does not depend on time, use a bit more efficient # formulas # Compute: \sum_n A_XpXn = np.einsum('...nik,...nkj->...ij', A, XpXn) A_XpXn = sum_to_plates(A_XpXn, (), ndim=2, plates_from=self.X_node.plates) # Compute: \sum_n ^T A_XpXp = np.einsum('...nik,...nkj->...nij', A, XpXp) A_XpXp_A = np.einsum('...nik,...njk->...ij', A_XpXp, A) A_XpXp_A = sum_to_plates(A_XpXp_A, (), ndim=2, plates_from=self.X_node.plates) # Compute: \sum_n tr(CovA_n ) CovA_XpXp = np.einsum('...ndij,...nij->...d', CovA, XpXp) CovA_XpXp = sum_to_plates(CovA_XpXp, (), ndim=1, plates_from=self.X_node.plates) return (A_XpXn, A_XpXp_A, CovA_XpXp) def setup(self): """ This method should be called just before optimization. 
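The identity behind this rotation, :math:`R x_n = (R A R^{-1}) (R x_{n-1}) + \text{noise}` from the class docstring, can be checked numerically. The following is a standalone NumPy sketch for illustration only (it is not part of the class, and the names are made up for the example):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 3
A = rng.standard_normal((D, D))   # state dynamics matrix
x = rng.standard_normal(D)        # previous state
R = rng.standard_normal((D, D))   # an invertible transformation

# Conjugate the dynamics when rotating the state: A -> R A R^{-1}
A_rot = R @ A @ np.linalg.inv(R)

# The rotated dynamics propagate the rotated state to the rotated next state
lhs = A_rot @ (R @ x)
rhs = R @ (A @ x)
assert np.allclose(lhs, rhs)
```

Because the transition mean is unchanged, rotating q(X) this way leaves the model fit intact while re-parameterizing the posterior, which is what the bound computed below exploits.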
""" # Get moments of X (X, XnXn, XpXn) = self.X_node.get_moments() # TODO/FIXME: Sum to plates of A/CovA XpXp = XnXn[...,:-1,:,:] # Add input signals if len(self.X_node.parents) >= 3: (U, UU) = self.X_node.parents[2].get_moments() UXn = linalg.outer(U, X[...,1:,:]) UXp = linalg.outer(U, X[...,:-1,:]) XpXn = np.concatenate([XpXn, UXn], axis=-2) XpXp = np.concatenate( [ np.concatenate([XpXp, linalg.transpose(UXp)], axis=-1), np.concatenate([UXp, UU], axis=-1) ], axis=-2 ) # # Expectations with respect to X # self.X0 = X[...,0,:] self.X0X0 = XnXn[...,0,:,:] #self.XnXn = np.sum(XnXn[...,1:,:,:], axis=-3) self.XnXn = sum_to_plates(XnXn[...,1:,:,:], (), plates_from=self.X_node.plates + (self.N-1,), ndim=2) # Get moments of the fixed parameter nodes Lambda_mu = self.X_node.parents[0].get_moments()[0] self.Lambda = self.X_node.parents[0].get_moments()[2] #self.Lambda = self.X_node.parents[1].get_moments()[0] self.Lambda_mu_X0 = linalg.outer(Lambda_mu, self.X0) self.Lambda_mu_X0 = sum_to_plates(self.Lambda_mu_X0, (), plates_from=self.X_node.plates, ndim=2) # # Prepare the rotation for A # (self.A_XpXn, self.A_XpXp_A, self.CovA_XpXp) = self._computations_for_A_and_X(XpXn, XpXp) self.A_rotator.setup(plate_axis=-1) # Innovation noise is assumed to be I #self.v = self.X_node.parents[3].get_moments()[0] def _compute_bound(self, R, logdet=None, inv=None, gradient=False, terms=False): """ Rotate q(X) as X->RX: q(X)=N(R*mu, R*Cov*R') Assume: :math:`p(\mathbf{X}) = \prod^M_{m=1} N(\mathbf{x}_m|0, \mathbf{\Lambda})` Assume unit innovation noise covariance. """ # TODO/FIXME: X and alpha should NOT contain observed values!! Check # that. # Assume constant mean and precision matrix over plates.. 
if inv is None: invR = np.linalg.inv(R) else: invR = inv if logdet is None: logdetR = np.linalg.slogdet(R)[1] else: logdetR = logdet # Transform moments of X and A: Lambda_R_X0X0 = sum_to_plates(dot(self.Lambda, R, self.X0X0), (), plates_from=self.X_node.plates, ndim=2) R_XnXn = dot(R, self.XnXn) RA_XpXp_A = dot(R, self.A_XpXp_A) sumr = np.sum(R, axis=0) R_CovA_XpXp = sumr * self.CovA_XpXp # Compute entropy H(X) M = self.N*np.prod(self.X_node.plates) # total number of rotated vectors logH_X = random.gaussian_entropy(-2 * M * logdetR, 0) # Compute yy = tracedot(R_XnXn, R.T) + tracedot(Lambda_R_X0X0, R.T) yz = tracedot(dot(R,self.A_XpXn),R.T) + tracedot(self.Lambda_mu_X0, R.T) zz = tracedot(RA_XpXp_A, R.T) + np.einsum('...k,...k->...', R_CovA_XpXp, sumr) logp_X = random.gaussian_logpdf(yy, yz, zz, 0, 0) # Compute the bound if terms: bound = {self.X_node: logp_X + logH_X} else: bound = logp_X + logH_X if not gradient: return bound # Compute dH(X) dlogH_X = random.gaussian_entropy(-2 * M * invR.T, 0) # Compute d dyy = 2 * (R_XnXn + Lambda_R_X0X0) dyz = dot(R, self.A_XpXn + self.A_XpXn.T) + self.Lambda_mu_X0 dzz = 2 * (RA_XpXp_A + R_CovA_XpXp[None,:]) dlogp_X = random.gaussian_logpdf(dyy, dyz, dzz, 0, 0) if terms: d_bound = {self.X_node: dlogp_X + dlogH_X} else: d_bound = ( + dlogp_X + dlogH_X ) return (bound, d_bound) def bound(self, R, logdet=None, inv=None): if inv is None: inv = np.linalg.inv(R) if logdet is None: logdet = np.linalg.slogdet(R)[1] (bound_X, d_bound_X) = self._compute_bound(R, logdet=logdet, inv=inv, gradient=True) # Compute cost and gradient from A # Handle possible input signals from scipy.linalg import block_diag if len(self.X_node.parents) >= 3: input_shape = self.X_node.parents[2].dims[0] input_len = input_shape[-1] I = np.identity(input_len) else: I = np.identity(0) (bound_A, dR_bound_A, dQ_bound_A) = self.A_rotator.bound(block_diag(inv.T, I), inv=block_diag(R.T, I), logdet=-logdet, Q=R) # Ignore input signals gradients D = 
self.X_node.dims[0][-1] dR_bound_A = dR_bound_A[...,:D,:D] dR_bound_A = -dot(inv.T, dR_bound_A.T, inv.T) # Compute the bound bound = bound_X + bound_A d_bound = d_bound_X + dR_bound_A + dQ_bound_A return (bound, d_bound) def get_bound_terms(self, R, logdet=None, inv=None): if inv is None: inv = np.linalg.inv(R) if logdet is None: logdet = np.linalg.slogdet(R)[1] # Handle possible input signals from scipy.linalg import block_diag if len(self.X_node.parents) >= 3: input_shape = self.X_node.parents[2].dims[0] input_len = input_shape[-1] I = np.identity(input_len) else: I = np.identity(0) terms_A = self.A_rotator.get_bound_terms(block_diag(inv.T, I), inv=block_diag(R.T, I), logdet=-logdet, Q=R) terms_X = self._compute_bound(R, logdet=logdet, inv=inv, gradient=False, terms=True) terms_X.update(terms_A) return terms_X class RotateVaryingMarkovChain(RotateGaussianMarkovChain): r""" Rotation for :class:`bayespy.nodes.SwitchingGaussianMarkovChain` Assume the following model. Constant, unit isotropic innovation noise. :math:`A_n = \sum_k B_k s_{kn}` Gaussian B: (1,D) x (D,K) Gaussian S: (N,1) x (K) MC X: () x (N+1,D) No plates for X. 
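The mixing of dynamics matrices :math:`A_n = \sum_k B_k s_{kn}` can be written as a single einsum. A standalone NumPy sketch, ignoring plates and using illustrative shapes (not part of the class):

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, D = 5, 3, 2
B = rng.standard_normal((D, D, K))   # basis dynamics matrices B_k
S = rng.standard_normal((N, K))      # time-varying mixing weights s_{kn}

# A_n = \sum_k B_k s_{kn}
A = np.einsum('ijk,nk->nij', B, S)
assert A.shape == (N, D, D)
assert np.allclose(A[0], sum(S[0, k] * B[:, :, k] for k in range(K)))
```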
""" def __init__(self, X, B, S, B_rotator): self.X_node = X self.B_node = B self.S_node = S self.B_rotator = B_rotator if len(S.plates) > 0 and S.plates[-1] > 1: raise ValueError("The length of the last plate of S must be 1.") if len(B.plates) > 1 and B.plates[-2] > 1: raise ValueError("The length of the last plate of B must be 1.") if len(S.dims[0]) != 1: raise ValueError("S should have exactly one variable axis") if len(B.dims[0]) != 2: raise ValueError("B should have exactly two variable axes") super().__init__(X, B_rotator) def _computations_for_A_and_X(self, XpXn, XpXp): # Get moments of B and S (B, BB) = self.B_node.get_moments() CovB = BB - B[...,:,:,None,None]*B[...,None,None,:,:] u_S = self.S_node.get_moments() S = u_S[0] SS = u_S[1] # # Expectations with respect to A and X # # TODO/FIXME: If S and B have overlapping plates, then these will give # wrong results, because those plates of S are summed before multiplying # by the plates of B. There should be some "smart einsum" function which # would compute sum-multiplys intelligently given a number of inputs. 
        # Compute: \sum_n <A_n> <x_{n-1} x_n^T>
        # Axes: (N, D, D, D, K)
        S_XpXn = misc.sum_multiply(S[...,None,None,:],
                                   XpXn[...,:,None,:,:,None],
                                   axis=(-3,-2,-1),
                                   sumaxis=False)
        A_XpXn = misc.sum_multiply(B[...,:,:,None,:],
                                   S_XpXn[...,:,:,:],
                                   axis=(-4,-2),
                                   sumaxis=False)

        # Compute: \sum_n <A_n> <x_{n-1} x_{n-1}^T> <A_n>^T
        # Axes: (N, D, D, D, K, D, K)
        SS_XpXp = misc.sum_multiply(SS[...,None,:,None,:],
                                    XpXp[...,None,:,None,:,None],
                                    axis=(-4,-3,-2,-1),
                                    sumaxis=False)
        B_SS_XpXp = misc.sum_multiply(B[...,:,:,:,None,None],
                                      SS_XpXp[...,:,:,:,:],
                                      axis=(-4,-3),
                                      sumaxis=True)
        A_XpXp_A = misc.sum_multiply(B_SS_XpXp[...,:,None,:,:],
                                     B[...,None,:,:,:],
                                     axis=(-4,-3),
                                     sumaxis=False)

        # Compute: \sum_n tr(CovA_n <x_{n-1} x_{n-1}^T>)
        # Axes: (D, D, K, D, K)
        CovA_XpXp = misc.sum_multiply(CovB,
                                      SS_XpXp,
                                      axis=(-5,),
                                      sumaxis=False)

        return (A_XpXn, A_XpXp_A, CovA_XpXp)


class RotateSwitchingMarkovChain(RotateGaussianMarkovChain):
    """
    Rotation for :class:`bayespy.nodes.SwitchingGaussianMarkovChain`

    Assume the following model with constant, unit isotropic innovation noise:

    :math:`A_n = B_{z_n}`

    Gaussian B: (..., K, D) x (D)
    Categorical Z: (..., N-1) x (K)
    GaussianMarkovChain X: (...) x (N,D)

    No plates for X.
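With soft responsibilities from a categorical posterior, the expected dynamics are a Z-weighted mixture of the :math:`B_k`; with hard (one-hot) assignments this reduces to plain indexing. A standalone NumPy sketch for illustration only, ignoring plates:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, D = 5, 3, 2
B = rng.standard_normal((K, D, D))   # one dynamics matrix per discrete state k
z = rng.integers(0, K, size=N)       # discrete state sequence z_n
Z = np.eye(K)[z]                     # one-hot responsibilities, shape (N, K)

# A_n = B_{z_n}: with one-hot Z the weighted sum just selects B[z[n]]
A = np.einsum('nk,kij->nij', Z, B)
assert np.allclose(A, B[z])
```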
""" def __init__(self, X, B, Z, B_rotator): self.X_node = X self.B_node = B self.Z_node = Z._moments.get_converter(CategoricalMoments)(Z) self.B_rotator = B_rotator (N,D) = self.X_node.dims[0] K = self.Z_node.dims[0][0] if len(self.Z_node.plates) == 0 and self.Z_node.plates[-1] != N-1: raise ValueError("Incorrect plate length in Z") if self.B_node.plates[-2:] != (K,D): raise ValueError("Incorrect plates in B") if len(self.Z_node.dims[0]) != 1: raise ValueError("Z should have exactly one variable axis") if len(self.B_node.dims[0]) != 1: raise ValueError("B should have exactly one variable axes") super().__init__(X, B_rotator) def _computations_for_A_and_X(self, XpXn, XpXp): # Get moments of B and Z (B, BB) = self.B_node.get_moments() CovB = BB - B[...,:,None]*B[...,None,:] u_Z = self.Z_node.get_moments() Z = u_Z[0] # # Expectations with respect to A and X # # Compute: \sum_n Z_XpXn = np.einsum('...nij,...nk->...kij', XpXn, Z) A_XpXn = np.einsum('...kil,...klj->...ij', B, Z_XpXn) A_XpXn = sum_to_plates(A_XpXn, (), ndim=2, plates_from=self.X_node.plates) # Compute: \sum_n ^T Z_XpXp = np.einsum('...nij,...nk->...kij', XpXp, Z) B_Z_XpXp = np.einsum('...kil,...klj->...kij', B, Z_XpXp) A_XpXp_A = np.einsum('...kil,...kjl->...ij', B_Z_XpXp, B) A_XpXp_A = sum_to_plates(A_XpXp_A, (), ndim=2, plates_from=self.X_node.plates) # Compute: \sum_n tr(CovA_n ) CovA_XpXp = np.einsum('...kij,...kdij->...d', Z_XpXp, CovB) CovA_XpXp = sum_to_plates(CovA_XpXp, (), ndim=1, plates_from=self.X_node.plates) return (A_XpXn, A_XpXp_A, CovA_XpXp) class RotateMultiple(): r""" Identical parameter expansion for several nodes simultaneously Performs the same rotation for multiple nodes and combines the cost effect. 
""" def __init__(self, *rotators): self.rotators = rotators def nodes(self): nodes = [] for rotator in self.rotators: nodes += rotator.nodes() return nodes def rotate(self, R, inv=None, logdet=None): for rotator in self.rotators: rotator.rotate(R, inv=inv, logdet=logdet) def setup(self): for rotator in self.rotators: rotator.setup() def bound(self, R, logdet=None, inv=None): bound = 0 dbound = 0 for rotator in self.rotators: (b, db) = rotator.bound(R, logdet=logdet, inv=inv) bound = bound + b dbound = dbound + db return (bound, dbound) def get_bound_terms(self, *args, **kwargs): d = dict() for rotator in self.rotators: d.update(rotator.get_bound_terms(*args, **kwargs)) return d ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/inference/vmp/vmp.py0000644000175100001770000005636700000000000021412 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2011-2015 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ import numpy as np import warnings import time import h5py import datetime import tempfile import scipy import logging from bayespy.utils import misc from bayespy.inference.vmp.nodes.node import Node class VB(): r""" Variational Bayesian (VB) inference engine Parameters ---------- nodes : nodes Nodes that form the model. Must include all at least all stochastic nodes of the model. tol : double, optional Convergence criterion. Tolerance for the relative change in the VB lower bound. 
    autosave_filename : string, optional
        Filename for automatic saving

    autosave_iterations : int, optional
        Iteration interval between each automatic saving

    callback : callable, optional
        Function which is called after each update iteration step
    """

    def __init__(self, *nodes, tol=1e-5, autosave_filename=None,
                 autosave_iterations=0, use_logging=False, user_data=None,
                 callback=None):

        self.user_data = user_data

        for (ind, node) in enumerate(nodes):
            if not isinstance(node, Node):
                raise ValueError("Argument number %d is not a node" % (ind+1))

        if use_logging:
            logger = logging.getLogger(__name__)
            self.print = logger.info
        else:
            # By default, don't use logging, just print stuff
            self.print = print

        # Remove duplicate nodes
        self.model = misc.unique(nodes)

        self.ignore_bound_checks = False

        self._figures = {}

        self.iter = 0
        self.annealing_changed = False
        self.converged = False
        self.L = np.array(())
        self.cputime = np.array(())
        self.l = dict(zip(self.model, len(self.model)*[np.array([])]))

        self.autosave_iterations = autosave_iterations
        self.autosave_nodes = None
        if not autosave_filename:
            date = datetime.datetime.today().strftime('%Y%m%d%H%M%S')
            prefix = 'vb_autosave_%s_' % date
            tmpfile = tempfile.NamedTemporaryFile(prefix=prefix,
                                                  suffix='.hdf5')
            self.autosave_filename = tmpfile.name
            self.filename = None
        else:
            self.autosave_filename = autosave_filename
            self.filename = autosave_filename

        # Check uniqueness of the node names
        names = [node.name for node in self.model]
        if len(names) != len(self.model):
            raise Exception("Use unique names for nodes.")

        self.callback = callback
        self.callback_output = None
        self.tol = tol

    def use_logging(self, use):
        if use:
            logger = logging.getLogger(__name__)
            self.print = logger.info
        else:
            # By default, don't use logging, just print stuff
            self.print = print
        return

    def set_autosave(self, filename, iterations=None, nodes=None):
        self.autosave_filename = filename
        self.filename = filename
        self.autosave_nodes = nodes
        if iterations is not None:
self.autosave_iterations = iterations def set_callback(self, callback): self.callback = callback def update(self, *nodes, repeat=1, plot=False, tol=None, verbose=True, tqdm=None): # TODO/FIXME: # # If no nodes are given and thus everything is updated, the update order # should be from down to bottom. Or something similar.. # By default, update all nodes if len(nodes) == 0: nodes = self.model if plot is True: plot_nodes = self.model elif plot is False: plot_nodes = [] else: plot_nodes = [self[x] for x in plot] converged = False if tqdm is not None: tqdm = tqdm(total=repeat) i = 0 while repeat is None or i < repeat: t = time.time() # Update nodes for node in nodes: X = self[node] if hasattr(X, 'update') and callable(X.update): X.update() if X in plot_nodes: self.plot(X) cputime = time.time() - t i += 1 if tqdm is not None: tqdm.update() if self._end_iteration_step(None, cputime, tol=tol, verbose=verbose): return def has_converged(self, tol=None): return self.converged def compute_lowerbound(self, ignore_masked=True): L = 0 for node in self.model: L += node.lower_bound_contribution(ignore_masked=ignore_masked) return L def compute_lowerbound_terms(self, *nodes): if len(nodes) == 0: nodes = self.model return {node: node.lower_bound_contribution() for node in nodes} def loglikelihood_lowerbound(self): L = 0 for node in self.model: lp = node.lower_bound_contribution() L += lp self.l[node][self.iter] = lp return L def plot_iteration_by_nodes(self, axes=None, diff=False): """ Plot the cost function per node during the iteration. Handy tool for debugging. 
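A minimal sketch of the computation this method plots, using hypothetical node names and hand-made lower-bound traces (the real traces live in ``self.l``, keyed by node):

```python
import numpy as np

# Per-node lower-bound traces over iterations (illustrative values)
l = {
    "X": np.array([-120.0, -80.0, -70.0, -69.0]),
    "tau": np.array([-15.0, -12.0, -11.5, -11.4]),
}

# With diff=True the method plots the change per iteration instead of
# the raw contributions, which makes small late-stage changes visible
diffs = {name: np.diff(trace) for (name, trace) in l.items()}
assert np.allclose(diffs["X"], [40.0, 10.0, 1.0])
```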
""" if axes is None: import matplotlib.pyplot as plt axes = plt.gca() D = len(self.l) N = self.iter + 1 if diff: L = np.empty((N-1,D)) x = np.arange(N-1) + 2 else: L = np.empty((N,D)) x = np.arange(N) + 1 legends = [] for (d, node) in enumerate(self.l): if diff: L[:,d] = np.diff(self.l[node][:N]) else: L[:,d] = self.l[node][:N] legends += [node.name] axes.plot(x, L) axes.legend(legends, loc='lower right') axes.set_title('Lower bound contributions by nodes') axes.set_xlabel('Iteration') def get_iteration_by_nodes(self): return self.l def save(self, *nodes, filename=None): if len(nodes) == 0: nodes = self.model else: nodes = [self[node] for node in nodes if node is not None] if self.iter == 0: # Check HDF5 version. if h5py.version.hdf5_version_tuple < (1,8,7): warnings.warn("WARNING! Your HDF5 version is %s. HDF5 versions " "<1.8.7 are not able to save empty arrays, thus " "you may experience problems if you for instance " "try to save before running any iteration steps." % str(h5py.version.hdf5_version_tuple)) # By default, use the same file as for auto-saving if not filename: if self.autosave_filename: filename = self.autosave_filename else: raise Exception("Filename must be given.") # Open HDF5 file h5f = h5py.File(filename, 'w') try: # Write each node nodegroup = h5f.create_group('nodes') for node in nodes: if node.name == '': raise Exception("In order to save nodes, they must have " "(unique) names.") if hasattr(node, '_save') and callable(node._save): node._save(nodegroup.create_group(node.name)) # Write iteration statistics misc.write_to_hdf5(h5f, self.L, 'L') misc.write_to_hdf5(h5f, self.cputime, 'cputime') misc.write_to_hdf5(h5f, self.iter, 'iter') misc.write_to_hdf5(h5f, self.converged, 'converged') if self.callback_output is not None: misc.write_to_hdf5(h5f, self.callback_output, 'callback_output') boundgroup = h5f.create_group('boundterms') for node in nodes: misc.write_to_hdf5(boundgroup, self.l[node], node.name) # Write user data if self.user_data is 
not None: user_data_group = h5f.create_group('user_data') for (key, value) in self.user_data.items(): user_data_group[key] = value finally: # Close file h5f.close() @staticmethod def load_user_data(filename): f = h5py.File(filename, 'r') user_data = {} try: group = f['user_data'] for (key, value) in group.items(): user_data[key] = value[...] finally: f.close() return user_data def load(self, *nodes, filename=None, nodes_only=False): # By default, use the same file as for auto-saving if not filename: if self.autosave_filename: filename = self.autosave_filename else: raise Exception("Filename must be given.") # Open HDF5 file h5f = h5py.File(filename, 'r') try: # Get nodes to load if len(nodes) == 0: nodes = self.model else: nodes = [self[node] for node in nodes if node is not None] # Read each node for node_id in nodes: node = self[node_id] if node.name == '': h5f.close() raise Exception("In order to load nodes, they must have " "(unique) names.") if hasattr(node, '_load') and callable(node._load): try: node._load(h5f['nodes'][node.name]) except KeyError: h5f.close() raise Exception("File does not contain variable %s" % node.name) # Read iteration statistics if not nodes_only: self.L = h5f['L'][...] self.cputime = h5f['cputime'][...] self.iter = h5f['iter'][...] self.converged = h5f['converged'][...] for node in nodes: self.l[node] = h5f['boundterms'][node.name][...] try: self.callback_output = h5f['callback_output'][...] 
except KeyError: pass finally: # Close file h5f.close() def __getitem__(self, name): if name in self.model: return name else: # Dictionary for mapping node names to nodes dictionary = {node.name: node for node in self.model} return dictionary[name] def plot(self, *nodes, **kwargs): """ Plot the distribution of the given nodes (or all nodes) """ if len(nodes) == 0: nodes = self.model for node in nodes: node = self[node] if node.has_plotter(): import matplotlib.pyplot as plt try: fignum = self._figures[node] except KeyError: fig = plt.figure() self._figures[node] = fig.number else: fig = plt.figure(num=fignum) fig.clf() node.plot(fig=fig, **kwargs) fig.canvas.draw() @property def ignore_bound_checks(self): return self.__ignore_bound_checks @ignore_bound_checks.setter def ignore_bound_checks(self, ignore): self.__ignore_bound_checks = ignore def get_gradients(self, *nodes, euclidian=False): """ Computes gradients (both Riemannian and normal) """ rg = [self[node].get_riemannian_gradient() for node in nodes] if euclidian: g = [self[node].get_gradient(rg_x) for (node, rg_x) in zip(nodes, rg)] return (rg, g) else: return rg def get_parameters(self, *nodes): """ Get parameters of the nodes """ return [self[node].get_parameters() for node in nodes] def set_parameters(self, x, *nodes): """ Set parameters of the nodes """ for (node, xi) in zip(nodes, x): self[node].set_parameters(xi) return def gradient_step(self, *nodes, scale=1.0): """ Update nodes by taking a gradient ascent step """ p = self.add(self.get_parameters(*nodes), self.get_gradients(*nodes), scale=scale) self.set_parameters(p, *nodes) return def dot(self, x1, x2): """ Computes dot products of given vectors (in parameter format) """ v = 0 # Loop over nodes for (y1, y2) in zip(x1, x2): # Loop over parameters for (z1, z2) in zip(y1, y2): v += np.dot(np.ravel(z1), np.ravel(z2)) return v def add(self, x1, x2, scale=1): """ Add two vectors (in parameter format) """ v = [] # Loop over nodes for (y1, y2) in zip(x1, x2): 
v.append([]) # Loop over parameters for (z1, z2) in zip(y1, y2): v[-1].append(z1 + scale*z2) return v def optimize(self, *nodes, maxiter=10, verbose=True, method='fletcher-reeves', riemannian=True, collapsed=None, tol=None): """ Optimize nodes using Riemannian conjugate gradient """ method = method.lower() if collapsed is None: collapsed = [] scale = 1.0 p = self.get_parameters(*nodes) dd_prev = 0 for i in range(maxiter): t = time.time() # Get gradients if riemannian and method == 'gradient': rg = self.get_gradients(*nodes, euclidian=False) g1 = rg g2 = rg else: (rg, g) = self.get_gradients(*nodes, euclidian=True) if riemannian: g1 = g g2 = rg else: g1 = g g2 = g if method == 'gradient': b = 0 elif method == 'fletcher-reeves': dd_curr = self.dot(g1, g2) if dd_prev == 0: b = 0 else: b = dd_curr / dd_prev dd_prev = dd_curr else: raise Exception("Unknown optimization method: %s" % (method)) if b: s = self.add(g2, s, scale=b) else: s = g2 success = False while not success: p_new = self.add(p, s, scale=scale) try: self.set_parameters(p_new, *nodes) except: if verbose: self.print("CG update was unsuccessful, using gradient and resetting CG") if s is g2: scale = scale / 2 dd_prev = 0 s = g2 continue # Update collapsed variables collapsed_params = self.get_parameters(*collapsed) try: for node in collapsed: self[node].update() except: self.set_parameters(collapsed_params, *collapsed) if verbose: self.print("Collapsed node update failed, reset CG") if s is g2: scale = scale / 2 dd_prev = 0 s = g2 continue L = self.compute_lowerbound() bound_decreased = ( self.iter > 0 and L < self.L[self.iter-1] and not np.allclose(L, self.L[self.iter-1], rtol=1e-8) ) if np.isnan(L) or bound_decreased: # Restore the state of the collapsed nodes to what it was # before updating them self.set_parameters(collapsed_params, *collapsed) if s is g2: scale = scale / 2 if verbose: self.print( "Gradient ascent decreased lower bound from {0} to {1}, halving step length" .format( 
self.L[self.iter-1], L, ) ) else: if scale < 2 ** (-10): if verbose: self.print( "CG decreased lower bound from {0} to {1}, reset CG." .format( self.L[self.iter-1], L, ) ) dd_prev = 0 s = g2 else: scale = scale / 2 if verbose: self.print( "CG decreased lower bound from {0} to {1}, halving step length" .format( self.L[self.iter-1], L, ) ) continue success = True scale = scale * np.sqrt(2) p = p_new cputime = time.time() - t if self._end_iteration_step('OPT', cputime, tol=tol, verbose=verbose): break def pattern_search(self, *nodes, collapsed=None, maxiter=3): """Perform simple pattern search :cite:`Honkela:2003`. Some of the variables can be collapsed. """ if collapsed is None: collapsed = [] t = time.time() # Update all nodes for x in nodes: self[x].update() for x in collapsed: self[x].update() # Current parameter values p0 = self.get_parameters(*nodes) # Update optimized nodes for x in nodes: self[x].update() # New parameter values p1 = self.get_parameters(*nodes) # Search direction dp = self.add(p1, p0, scale=-1) # Cost function for pattern search def cost(alpha): p_new = self.add(p1, dp, scale=alpha) try: self.set_parameters(p_new, *nodes) except: return np.inf # Update collapsed nodes for x in collapsed: self[x].update() return -self.compute_lowerbound() # Optimize step length res = scipy.optimize.minimize_scalar(cost, bracket=[0, 3], options={'maxiter':maxiter}) # Set found parameter values p_new = self.add(p1, dp, scale=res.x) self.set_parameters(p_new, *nodes) # Update collapsed nodes for x in collapsed: self[x].update() cputime = time.time() - t self._end_iteration_step('PS', cputime) def set_annealing(self, annealing): """ Set deterministic annealing from range (0, 1]. With 1, no annealing, standard updates. With smaller values, entropy has more weight and model probability equations less. With 0, one would obtain improper uniform distributions. 
""" for node in self.model: node.annealing = annealing self.annealing_changed = True self.converged = False return def _append_iterations(self, iters): """ Append some arrays for more iterations """ self.L = np.append(self.L, misc.nans(iters)) self.cputime = np.append(self.cputime, misc.nans(iters)) for (node, l) in self.l.items(): self.l[node] = np.append(l, misc.nans(iters)) return def _end_iteration_step(self, method, cputime, tol=None, verbose=True, bound_cpu_time=True): """ Do some routines after each iteration step """ if self.iter >= len(self.L): self._append_iterations(100) # Call the custom function provided by the user if callable(self.callback): z = self.callback() if z is not None: z = np.array(z)[...,np.newaxis] if self.callback_output is None: self.callback_output = z else: self.callback_output = np.concatenate((self.callback_output,z), axis=-1) t = time.time() L = self.loglikelihood_lowerbound() if bound_cpu_time: cputime += time.time() - t self.cputime[self.iter] = cputime self.L[self.iter] = L if verbose: if method: self.print("Iteration %d (%s): loglike=%e (%.3f seconds)" % (self.iter+1, method, L, cputime)) else: self.print("Iteration %d: loglike=%e (%.3f seconds)" % (self.iter+1, L, cputime)) # Check the progress of the iteration self.converged = False if not self.ignore_bound_checks and not self.annealing_changed and self.iter > 0: # Check for errors if self.L[self.iter-1] - L > 1e-6: L_diff = (self.L[self.iter-1] - L) warnings.warn("Lower bound decreased %e! Bug somewhere or " "numerical inaccuracy?" % L_diff) # Check for convergence L0 = self.L[self.iter-1] L1 = self.L[self.iter] if tol is None: tol = self.tol div = 0.5 * (abs(L0) + abs(L1)) if (L1 - L0) / div < tol: #if (L1 - L0) / div < tol or L1 - L0 <= 0: if verbose: self.print("Converged at iteration %d." 
% (self.iter+1)) self.converged = True # Auto-save, if requested if (self.autosave_iterations > 0 and np.mod(self.iter+1, self.autosave_iterations) == 0): if self.autosave_nodes is not None: self.save(*self.autosave_nodes, filename=self.autosave_filename) else: self.save(filename=self.autosave_filename) if verbose: self.print('Auto-saved to %s' % self.autosave_filename) self.annealing_changed = False self.iter += 1 return self.converged bayespy-0.6.2/bayespy/nodes/__init__.py ################################################################################ # Copyright (C) 2013 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ """ Package for nodes used to construct the model. Stochastic nodes ================ .. currentmodule:: bayespy.nodes Nodes for Gaussian variables: .. autosummary:: :toctree: generated/ Gaussian GaussianARD Nodes for precision and scale variables: .. autosummary:: :toctree: generated/ Gamma Wishart Exponential Nodes for modelling Gaussian and precision variables jointly (useful as prior for Gaussian nodes): .. autosummary:: :toctree: generated/ GaussianGamma GaussianWishart Nodes for discrete count variables: .. autosummary:: :toctree: generated/ Bernoulli Binomial Categorical Multinomial Poisson Nodes for probabilities: .. autosummary:: :toctree: generated/ Beta Dirichlet Nodes for dynamic variables: .. 
autosummary:: :toctree: generated/ CategoricalMarkovChain GaussianMarkovChain SwitchingGaussianMarkovChain VaryingGaussianMarkovChain Other stochastic nodes: .. autosummary:: :toctree: generated/ Mixture Point-estimation nodes: .. autosummary:: MaximumLikelihood Concentration GammaShape Deterministic nodes =================== .. autosummary:: :toctree: generated/ Dot SumMultiply Add Gate Take Function ConcatGaussian Choose """ # Currently, model construction and the inference network are not separated so # the model is constructed using variational message passing nodes. from bayespy.inference.vmp.nodes import * bayespy-0.6.2/bayespy/plot.py ################################################################################ # Copyright (C) 2011-2013 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ """ Functions for plotting nodes. Functions ========= .. currentmodule:: bayespy.plot .. autosummary:: :toctree: generated/ pdf contour plot hinton gaussian_mixture_2d ellipse_from_cov ellipse_from_precision Plotters ======== .. autosummary:: :toctree: generated/ Plotter PDFPlotter ContourPlotter HintonPlotter FunctionPlotter GaussianTimeseriesPlotter CategoricalMarkovChainPlotter """ import os, sys ############################################################################ # A STUPID WORKAROUND FOR A MATPLOTLIB 1.4.0 BUG RELATED TO INTERACTIVE MODE # See: https://github.com/matplotlib/matplotlib/issues/3505 import __main__ if hasattr(__main__, '__file__'): sys.ps1 = ('WORKAROUND FOR A BUG #3505 IN MATPLOTLIB.\n' 'IF YOU SEE THIS MESSAGE, TRY MATPLOTLIB!=1.4.0.') # This workaround does not work on Python shell, only on stand-alone scripts # and IPython. A better solution: require MPL!=1.4.0. 
############################################################################# import numpy as np import scipy.sparse as sp import scipy from scipy import special import matplotlib.pyplot as plt from matplotlib import animation from matplotlib import colors from matplotlib import patches #from matplotlib.pyplot import * from bayespy.inference.vmp.nodes.categorical import CategoricalMoments from bayespy.inference.vmp.nodes.gaussian import (GaussianMoments, GaussianWishartMoments) from bayespy.inference.vmp.nodes.beta import BetaMoments from bayespy.inference.vmp.nodes.beta import DirichletMoments from bayespy.inference.vmp.nodes.bernoulli import BernoulliMoments from bayespy.inference.vmp.nodes.categorical import CategoricalMoments from bayespy.inference.vmp.nodes.gamma import GammaMoments from bayespy.inference.vmp.nodes.node import Node, Moments from bayespy.utils import (misc, random, linalg) # Users can use pyplot via this module import matplotlib mpl = matplotlib pyplot = plt def interactive(function): """A decorator for forcing functions to use the interactive mode. Parameters ---------- function : callable The function to be decorated """ def new_function(*args, **kwargs): if mpl.is_interactive(): was_interactive = True else: was_interactive = False mpl.interactive(True) retval = function(*args, **kwargs) if not was_interactive: mpl.interactive(False) return retval return new_function def _subplots(plotfunc, *args, fig=None, kwargs=None): """Create a collection of subplots Each subplot is created with the same plotting function. Inputs are given as pairs: (x, 3), (y, 2), ... where x,y,... are the input arrays and 3,2,... are the ndim parameters. The last ndim axes of each array are interpreted as a single element to the plotting function. All high-level plotting functions should wrap low-level plotting functions with this function in order to generate subplots for plates. 
""" if kwargs is None: kwargs = {} if fig is None: fig = plt.gcf() # Parse shape and plates of each input array shapes = [np.shape(x)[-n:] if n > 0 else () for (x,n) in args] plates = [np.shape(x)[:-n] if n > 0 else np.shape(x) for (x,n) in args] # Get the full grid shape of the subplots broadcasted_plates = misc.broadcasted_shape(*plates) # Subplot indexing layout M = int(np.prod(broadcasted_plates[-2::-2])) N = int(np.prod(broadcasted_plates[-1::-2])) strides_subplot = [np.prod(broadcasted_plates[(j+2)::2]) * N if ((len(broadcasted_plates)-j) % 2) == 0 else np.prod(broadcasted_plates[(j+2)::2]) for j in range(len(broadcasted_plates))] # Plot each subplot for ind in misc.nested_iterator(broadcasted_plates): # Get the list of inputs for this subplot broadcasted_args = [] for n in range(len(args)): i = misc.safe_indices(ind, plates[n]) broadcasted_args.append(args[n][0][i]) # Plot the subplot using the given function ind_subplot = int(np.einsum('i,i', ind, strides_subplot)) axes = fig.add_subplot(M, N, ind_subplot+1) plotfunc(*broadcasted_args, axes=axes, **kwargs) def pdf(Z, x, *args, name=None, axes=None, fig=None, **kwargs): """ Plot probability density function of a scalar variable. Parameters ---------- Z : node or function Stochastic node or log pdf function x : array Grid points """ # TODO: Make it possible to plot a plated variable using _subplots function. if axes is None and fig is None: axes = plt.gca() else: if fig is None: fig = plt.gcf() axes = fig.add_subplot(111) try: lpdf = Z.logpdf(x) except AttributeError: lpdf = Z(x) p = np.exp(lpdf) retval = axes.plot(x, p, *args, **kwargs) if name is None: try: name = Z.name except AttributeError: pass if name: axes.set_title(r'$q(%s)$' % (name)) axes.set_xlabel(r'$%s$' % (name)) return retval def contour(Z, x, y, n=None, axes=None, fig=None, **kwargs): """ Plot 2-D probability density function of a 2-D variable. 
Parameters ---------- Z : node or function Stochastic node or log pdf function x : array Grid points on x axis y : array Grid points on y axis """ # TODO: Make it possible to plot a plated variable using _subplots function. if axes is None and fig is None: axes = plt.gca() else: if fig is None: fig = plt.gcf() axes = fig.add_subplot(111) XY = misc.grid(x, y) try: lpdf = Z.logpdf(XY) except AttributeError: lpdf = Z(XY) p = np.exp(lpdf) shape = (np.size(x), np.size(y)) X = np.reshape(XY[:,0], shape) Y = np.reshape(XY[:,1], shape) P = np.reshape(p, shape) if n is not None: levels = np.linspace(0, np.amax(P), num=n+2)[1:-1] return axes.contour(X, Y, P, levels, **kwargs) else: return axes.contour(X, Y, P, **kwargs) def plot_gaussian_mc(X, scale=2, **kwargs): """ Plot Gaussian Markov chain as a 1-D function Parameters ---------- X : node Node with Gaussian Markov chain moments. """ timeseries_gaussian(X, axis=-2, scale=scale, **kwargs) def plot_bernoulli(X, axis=-1, scale=2, **kwargs): """ Plot Bernoulli node as a 1-D function """ X = X._ensure_moments(X, BernoulliMoments) u_X = X.get_moments() z = u_X[0] return _timeseries_mean_and_error(z, None, axis=axis, **kwargs) def plot_gaussian(X, axis=-1, scale=2, **kwargs): """ Plot Gaussian node as a 1-D function Parameters ---------- X : node Node with Gaussian moments. axis : int The index of the time axis. 
""" X = X._ensure_moments(X, GaussianMoments, ndim=0) u_X = X.get_moments() x = u_X[0] xx = misc.get_diag(u_X[1], ndim=len(X.dims[0])) std = scale * np.sqrt(xx - x**2) #std = scale * np.sqrt(np.einsum('...ii->...i', xx) - x**2) return _timeseries_mean_and_error(x, std, axis=axis, **kwargs) def plot(Y, axis=-1, scale=2, center=False, **kwargs): """ Plot a variable or an array as 1-D function with errorbars """ if misc.is_numeric(Y): return _timeseries_mean_and_error(Y, None, axis=axis, center=center, **kwargs) if isinstance(Y, Node): # Try Bernoulli plotting try: Y = Y._ensure_moments(Y, BernoulliMoments) except BernoulliMoments.NoConverterError: pass else: return plot_bernoulli(Y, axis=axis, scale=scale, center=center, **kwargs) # Try Gaussian plotting try: Y = Y._ensure_moments(Y, GaussianMoments, ndim=0) except GaussianMoments.NoConverterError: pass else: return plot_gaussian(Y, axis=axis, scale=scale, center=center, **kwargs) (mu, var) = Y.get_mean_and_variance() std = np.sqrt(var) return _timeseries_mean_and_error(mu, std, axis=axis, scale=scale, center=center, **kwargs) # Some backward compatibility def timeseries_gaussian_mc(*args, center=True, **kwargs): return plot_gaussian_mc(*args, center=center, **kwargs) def timeseries_gaussian(*args, center=True, **kwargs): return plot_gaussian(*args, center=center, **kwargs) timeseries_normal = timeseries_gaussian def timeseries(*args, center=True, **kwargs): return plot(*args, center=center, **kwargs) def _timeseries_mean_and_error(y, std, *args, axis=-1, center=True, fig=None, axes=None, **kwargs): # TODO/FIXME: You must multiply by ones(plates) in order to plot # broadcasted plates properly if fig is None: fig = plt.gcf() y = np.atleast_1d(y) shape = list(np.shape(y)) # Get and remove the length of the time axis T = shape.pop(axis) # Move time axis to first y = np.rollaxis(y, axis) if std is not None: std = np.rollaxis(std, axis) y = np.reshape(y, (T, -1)) if std is not None: std = np.reshape(std, (T, -1)) # Remove 
1s shape = [s for s in shape if s > 1] # Calculate number of rows and columns shape = misc.multiply_shapes(shape, (1,1)) if len(shape) > 2: raise Exception("Can plot only in 2 dimensions (rows and columns)") (M, N) = shape # Prefer plotting to rows if M == 1: M = N N = 1 # Plot each timeseries if axes is None: ax0 = fig.add_subplot(M, N, 1) for i in range(M*N): if axes is None: if i > 0: # Share x axis between all subplots ax = fig.add_subplot(M, N, i+1, sharex=ax0) else: ax = ax0 # Autoscale the axes to data and use tight y and x axes ax.autoscale(enable=True, tight=True) ax.set_ylim(auto=True) if i < (M-1)*N: # Remove x tick labels from other than the last row plt.setp(ax.get_xticklabels(), visible=False) else: ax = axes[i] if std is None: errorplot(y=y[:,i], axes=ax, **kwargs) else: if len(args) > 0: raise Exception("Can't handle extra arguments") errorplot(y=y[:,i], error=std[:,i], axes=ax, **kwargs) if center: # Center the zero level on y-axis ylim = ax.get_ylim() vmax = np.max(np.abs(ylim)) ax.set_ylim([-vmax, vmax]) if axes is None: # Remove height space between subplots fig.subplots_adjust(hspace=0) def _blob(axes, x, y, area, colour): """ Draws a square-shaped blob with the given area (< 1) at the given coordinates. 
""" hs = np.sqrt(area) / 2 xcorners = np.array([x - hs, x + hs, x + hs, x - hs]) ycorners = np.array([y - hs, y - hs, y + hs, y + hs]) axes.fill(xcorners, ycorners, colour, edgecolor=colour) def _rectangle(axes, x, y, width, height, **kwargs): _x = x - width/2 _y = y - height/2 rectangle = plt.Rectangle((_x, _y), width, height, **kwargs) axes.add_patch(rectangle) return def gaussian_mixture_2d(X, alpha=None, scale=2, fill=False, axes=None, **kwargs): """ Plot Gaussian mixture as ellipses in 2-D Parameters ---------- X : Mixture node alpha : Dirichlet-like node (optional) Probabilities for the clusters scale : float (optional) Scale for the covariance ellipses (by default, 2) """ if axes is None: axes = plt.gca() mu_Lambda = X._ensure_moments(X.parents[1], GaussianWishartMoments) (mu, _, Lambda, _) = mu_Lambda.get_moments() mu = np.linalg.solve(Lambda, mu) if len(mu_Lambda.plates) != 1: raise NotImplementedError("Not yet implemented for more plates") K = mu_Lambda.plates[0] width = np.zeros(K) height = np.zeros(K) angle = np.zeros(K) for k in range(K): m = mu[k] L = Lambda[k] (u, W) = scipy.linalg.eigh(L) u[0] = np.sqrt(1/u[0]) u[1] = np.sqrt(1/u[1]) width[k] = 2*u[0] height[k] = 2*u[1] angle[k] = np.arctan(W[0,1] / W[0,0]) angle = 180 * angle / np.pi mode_height = 1 / (width * height) # Use cluster probabilities to adjust alpha channel if alpha is not None: # Compute the normalized probabilities in a numerically stable way logsum_p = misc.logsumexp(alpha.u[0], axis=-1, keepdims=True) logp = alpha.u[0] - logsum_p p = np.exp(logp) # Visibility is based on cluster mode peak height visibility = mode_height * p visibility /= np.amax(visibility) else: visibility = np.ones(K) for k in range(K): ell = mpl.patches.Ellipse(mu[k], scale*width[k], scale*height[k], angle=(180+angle[k]), fill=fill, alpha=visibility[k], **kwargs) axes.add_artist(ell) plt.axis('equal') # If observed, plot the data too if np.any(X.observed): mask = np.array(X.observed) * np.ones(X.plates, 
dtype=bool) y = X.u[0][mask] plt.plot(y[:,0], y[:,1], 'r.') return def _hinton(W, error=None, vmax=None, square=False, axes=None): """ Draws a Hinton diagram for visualizing a weight matrix. Temporarily disables matplotlib interactive mode if it is on, otherwise this takes forever. Originally copied from http://wiki.scipy.org/Cookbook/Matplotlib/HintonDiagrams """ if axes is None: axes = plt.gca() W = misc.atleast_nd(W, 2) (height, width) = W.shape if not vmax: #vmax = 2**np.ceil(np.log(np.max(np.abs(W)))/np.log(2)) if error is not None: vmax = np.max(np.abs(W) + error) else: vmax = np.max(np.abs(W)) axes.fill(0.5+np.array([0,width,width,0]), 0.5+np.array([0,0,height,height]), 'gray') if square: axes.set_aspect('equal') axes.set_ylim(0.5, height+0.5) axes.set_xlim(0.5, width+0.5) axes.set_xticks([]) axes.set_yticks([]) axes.invert_yaxis() for x in range(width): for y in range(height): _x = x+1 _y = y+1 w = W[y,x] _w = np.abs(w) if w > 0: _c = 'white' else: _c = 'black' if error is not None: e = error[y,x] if e < 0: print(e, _w, vmax) raise Exception("BUG? Negative error") if _w + e > vmax: print(e, _w, vmax) raise Exception("BUG? 
Value+error greater than max") _rectangle(axes, _x, _y, min(1, np.sqrt((_w+e)/vmax)), min(1, np.sqrt((_w+e)/vmax)), edgecolor=_c, fill=False) _blob(axes, _x, _y, min(1, _w/vmax), _c) def matrix(A, axes=None): if axes is None: axes = plt.gca() A = np.atleast_2d(A) vmax = np.max(np.abs(A)) return axes.imshow(A, interpolation='nearest', cmap='RdBu_r', vmin=-vmax, vmax=vmax) def new_matrix(A, vmax=None): A = np.atleast_2d(A) if vmax is None: vmax = np.max(np.abs(A)) (M, N) = np.shape(A) for i in range(M): for j in range(N): pass def gaussian_hinton(X, rows=None, cols=None, scale=1, fig=None): """ Plot the Hinton diagram of a Gaussian node """ if fig is None: fig = plt.gcf() # Get mean and second moment X = X._ensure_moments(X, GaussianMoments, ndim=0) (x, xx) = X.get_moments() ndim = len(X.dims[0]) shape = X.get_shape(0) size = len(X.get_shape(0)) # Compute standard deviation xx = misc.get_diag(xx, ndim=ndim) std = np.sqrt(xx - x**2) # Force explicit elements when broadcasting x = x * np.ones(shape) std = std * np.ones(shape) if rows is None: rows = np.nan if cols is None: cols = np.nan # Preprocess the axes to 0,...,ndim if rows < 0: rows += size if cols < 0: cols += size if rows < 0 or rows >= size: raise ValueError("Row axis invalid") if cols < 0 or cols >= size: raise ValueError("Column axis invalid") # Remove non-row and non-column axes that have length 1 squeezed_shape = list(shape) for i in reversed(range(len(shape))): if shape[i] == 1 and i != rows and i != cols: squeezed_shape.pop(i) if i < cols: cols -= 1 if i < rows: rows -= 1 x = np.reshape(x, squeezed_shape) std = np.reshape(std, squeezed_shape) if np.ndim(x) < 2: cols += 2 - np.ndim(x) rows += 2 - np.ndim(x) x = np.atleast_2d(x) std = np.atleast_2d(std) size = np.ndim(x) if np.isnan(cols): if rows != size - 1: cols = size - 1 else: cols = size - 2 if np.isnan(rows): if cols != size - 1: rows = size - 1 else: rows = size - 2 # Put the row and column axes to the end axes = [i for i in range(size) if i not 
in (rows, cols)] + [rows, cols] x = np.transpose(x, axes=axes) std = np.transpose(std, axes=axes) vmax = np.max(np.abs(x) + scale*std) if scale == 0: _subplots(_hinton, (x, 2), fig=fig, kwargs=dict(vmax=vmax)) else: def plotfunc(z, e, **kwargs): return _hinton(z, error=e, **kwargs) _subplots(plotfunc, (x, 2), (scale*std, 2), fig=fig, kwargs=dict(vmax=vmax)) def _hinton_figure(x, rows=None, cols=None, fig=None, square=True): """ Plot the Hinton diagram of a Gaussian node """ scale = 0 std = 0 if fig is None: fig = plt.gcf() # Get mean and second moment shape = np.shape(x) size = np.ndim(x) if rows is None: rows = np.nan if cols is None: cols = np.nan # Preprocess the axes to 0,...,ndim if rows < 0: rows += size if cols < 0: cols += size if rows < 0 or rows >= size: raise ValueError("Row axis invalid") if cols < 0 or cols >= size: raise ValueError("Column axis invalid") # Remove non-row and non-column axes that have length 1 squeezed_shape = list(shape) for i in reversed(range(len(shape))): if shape[i] == 1 and i != rows and i != cols: squeezed_shape.pop(i) if i < cols: cols -= 1 if i < rows: rows -= 1 x = np.reshape(x, squeezed_shape) size = np.ndim(x) if np.isnan(cols): if rows != size - 1: cols = size - 1 else: cols = size - 2 if np.isnan(rows): if cols != size - 1: rows = size - 1 else: rows = size - 2 # Put the row and column axes to the end if np.ndim(x) >= 2: axes = [i for i in range(size) if i not in (rows, cols)] + [rows, cols] x = np.transpose(x, axes=axes) #std = np.transpose(std, axes=axes) vmax = np.max(np.abs(x) + scale*std) kw = dict(vmax=vmax, square=square) if scale == 0: _subplots(_hinton, (x, 2), fig=fig, kwargs=kw) else: def plotfunc(z, e, **kwargs): return _hinton(z, error=e, **kwargs) _subplots(plotfunc, (x, 2), (scale*std, 2), fig=fig, kwargs=kw) # For backwards compatibility: gaussian_array = gaussian_hinton def timeseries_categorical_mc(Z, fig=None): if fig is None: fig = plt.gcf() # Make sure that the node is categorical Z = 
Z._ensure_moments(Z, CategoricalMoments, categories=None) # Get expectations (and broadcast explicitly) z = Z._message_to_child()[0] * np.ones(Z.get_shape(0)) # Compute the subplot layout z = misc.atleast_nd(z, 4) if np.ndim(z) != 4: raise ValueError("Can not plot arrays with over 4 axes") M = np.shape(z)[0] N = np.shape(z)[1] # Plot Hintons for i in range(M): for j in range(N): axes = fig.add_subplot(M, N, i*N+j+1) _hinton(z[i,j].T, vmax=1.0, square=False, axes=axes) def gamma_hinton(alpha, square=True, **kwargs): """ Plot a gamma distributed random variable as a Hinton diagram """ # Make sure that the node is gamma alpha = alpha._ensure_moments(alpha, GammaMoments) # Get the expectation x = alpha.get_moments()[0] # Explicit broadcasting x = x * np.ones(alpha.plates) # Plot Hinton diagram return _hinton_figure(x, square=square, **kwargs) def beta_hinton(P, square=True): """ Plot a beta distributed random variable as a Hinton diagram """ # Make sure that the node is beta P = P._ensure_moments(P, BetaMoments) # Compute exp(<log p>) p = np.exp(P._message_to_child()[0][...,0]) # Explicit broadcasting p = p * np.ones(P.plates) # Plot Hinton diagram return _hinton(p, vmax=1.0, square=square) def dirichlet_hinton(P, square=True): """ Plot a Dirichlet distributed random variable as a Hinton diagram """ # Make sure that the node is Dirichlet P = P._ensure_moments(P, DirichletMoments) # Compute exp(<log p>) p = np.exp(P._message_to_child()[0]) # Explicit broadcasting p = p * np.ones(P.plates+(1,)) # Plot Hinton diagram return _hinton(p, vmax=1.0, square=square) def bernoulli_hinton(Z, square=True): """ Plot a Bernoulli distributed random variable as a Hinton diagram """ # Make sure that the node is Bernoulli Z = Z._ensure_moments(Z, BernoulliMoments) # Get <z> z = Z._message_to_child()[0] # Explicit broadcasting z = z * np.ones(Z.plates) # Plot Hinton diagram return _hinton(z, vmax=1.0, square=square) def categorical_hinton(Z, square=True): """ Plot a categorical distributed random variable as a Hinton diagram """ # Make sure that the node is categorical Z = Z._ensure_moments(Z, CategoricalMoments, categories=None) # Get <z> z = Z._message_to_child()[0] # Explicit broadcasting z = z * np.ones(Z.plates+(1,)) # Plot Hinton diagram return _hinton(np.squeeze(z), vmax=1.0, square=square) def hinton(X, **kwargs): r""" Plot the Hinton diagram of a node The keyword arguments depend on the node type. For some node types, the diagram also shows uncertainty with non-filled rectangles. Currently, Gaussian-like, gamma-like, beta-like, Dirichlet-like, Bernoulli-like and categorical-like nodes are supported. Parameters ---------- X : node """ if hasattr(X, "_ensure_moments"): try: X = X._ensure_moments(X, GaussianMoments, ndim=0) except Moments.NoConverterError: pass else: return gaussian_hinton(X, **kwargs) try: X = X._ensure_moments(X, GammaMoments) except Moments.NoConverterError: pass else: return gamma_hinton(X, **kwargs) try: X = X._ensure_moments(X, BetaMoments) except Moments.NoConverterError: pass else: return beta_hinton(X, **kwargs) try: X = X._ensure_moments(X, DirichletMoments) except Moments.NoConverterError: pass else: return dirichlet_hinton(X, **kwargs) try: X = X._ensure_moments(X, BernoulliMoments) except Moments.NoConverterError: pass else: return bernoulli_hinton(X, **kwargs) try: X = X._ensure_moments(X, CategoricalMoments, categories=None) except Moments.NoConverterError: pass else: return categorical_hinton(X, **kwargs) return _hinton_figure(X, **kwargs) class Plotter(): r""" Wrapper for plotting functions and base class for node plotters The purpose of this class is to collect all the parameters needed by a plotting function and provide a callable interface which needs only the node as the input. Plotter instances are callable objects that plot a given node using a specified plotting function. 
Parameters ---------- plotter : function Plotting function to use args : defined by the plotting function Additional inputs needed by the plotting function kwargs : defined by the plotting function Additional keyword arguments supported by the plotting function Examples -------- First, create a gamma variable: >>> import numpy as np >>> from bayespy.nodes import Gamma >>> x = Gamma(4, 5) The probability density function can be plotted as: >>> import bayespy.plot as bpplt >>> bpplt.pdf(x, np.linspace(0.1, 10, num=100)) # doctest: +ELLIPSIS [] However, this can be problematic when one needs to provide a plotting function for the inference engine as the inference engine gives only the node as input. Thus, we need to create a simple plotter wrapper: >>> p = bpplt.Plotter(bpplt.pdf, np.linspace(0.1, 10, num=100)) Now, this callable object ``p`` needs only the node as the input: >>> p(x) # doctest: +ELLIPSIS [] Thus, it can be given to the inference engine to use as a plotting function: >>> x = Gamma(4, 5, plotter=p) >>> x.plot() # doctest: +ELLIPSIS [] """ def __init__(self, plotter, *args, **kwargs): self._args = args self._kwargs = kwargs self._plotter = plotter def __call__(self, X, fig=None, **kwargs): """ Plot the node using the specified plotting function Parameters ---------- X : node The plotted node """ kwargs_all = self._kwargs.copy() kwargs_all.update(kwargs) return self._plotter(X, *self._args, fig=fig, **kwargs_all) class PDFPlotter(Plotter): r""" Plotter of probability density function of a scalar node Parameters ---------- x_grid : array Numerical grid on which the density function is computed and plotted See also -------- pdf """ def __init__(self, x_grid, **kwargs): super().__init__(pdf, x_grid, **kwargs) class ContourPlotter(Plotter): r""" Plotter of probability density function of a two-dimensional node Parameters ---------- x1_grid : array Grid for the first dimension x2_grid : array Grid for the second dimension See also -------- contour """ def 
__init__(self, x1_grid, x2_grid, **kwargs): super().__init__(contour, x1_grid, x2_grid, **kwargs) class HintonPlotter(Plotter): r""" Plotter of the Hinton diagram of a node See also -------- hinton """ def __init__(self, **kwargs): super().__init__(hinton, **kwargs) class FunctionPlotter(Plotter): r""" Plotter of a node as a 1-dimensional function See also -------- plot """ def __init__(self, **kwargs): super().__init__(plot, **kwargs) class GaussianMarkovChainPlotter(Plotter): r""" Plotter of a Gaussian Markov chain as a timeseries """ def __init__(self, **kwargs): super().__init__(timeseries_gaussian_mc, **kwargs) class GaussianTimeseriesPlotter(Plotter): r""" Plotter of a Gaussian node as a timeseries """ def __init__(self, **kwargs): super().__init__(timeseries_gaussian, **kwargs) class GaussianHintonPlotter(Plotter): r""" Plotter of a Gaussian node as a Hinton diagram """ def __init__(self, **kwargs): super().__init__(gaussian_array, **kwargs) class CategoricalMarkovChainPlotter(Plotter): r""" Plotter of a Categorical timeseries """ def __init__(self, **kwargs): super().__init__(timeseries_categorical_mc, **kwargs) def matrix_animation(A, filename=None, fps=25, fig=None, **kwargs): if fig is None: fig = plt.gcf() axes = fig.add_subplot(111) A = np.atleast_3d(A) vmax = np.max(np.abs(A)) x = axes.imshow(A[0], interpolation='nearest', cmap='RdBu_r', vmin=-vmax, vmax=vmax, **kwargs) s = axes.set_title('t = %d' % 0) def animate(nframe): s.set_text('t = %d' % nframe) x.set_array(A[nframe]) return (x, s) anim = animation.FuncAnimation(fig, animate, frames=np.shape(A)[0], interval=1000/fps, blit=False, repeat=False) return anim def save_animation(anim, filename, fps=25, bitrate=5000, fig=None): # A bug in numpy/matplotlib causes this not to work in python3.3: # https://github.com/matplotlib/matplotlib/issues/1891 # # So the following command does not work currently.. 
    #
    # anim.save(filename, fps=fps)

    if fig is None:
        fig = plt.gcf()

    writer = animation.FFMpegFileWriter(fps=fps, bitrate=bitrate)
    writer.setup(fig, filename, 100)
    anim.save(filename,
              fps=fps,
              writer=writer,
              bitrate=bitrate)
    return


def binary_matrix(A, axes=None):

    if axes is None:
        axes = plt.gca()

    A = np.atleast_2d(A)
    G = np.zeros(np.shape(A) + (3,))
    G[A] = [0,0,0]
    G[np.logical_not(A)] = [1,1,1]
    axes.imshow(G, interpolation='nearest')


def gaussian_mixture_logpdf(x, w, mu, Sigma):
    # Shape(x)      = (N, D)
    # Shape(w)      = (K,)
    # Shape(mu)     = (K, D)
    # Shape(Sigma)  = (K, D, D)
    # Shape(result) = (N,)

    # Dimensionality
    D = np.shape(x)[-1]

    # Cholesky decomposition of the covariance matrix
    U = linalg.chol(Sigma)

    # Reshape x:
    # Shape(x)    = (N, 1, D)
    x = np.expand_dims(x, axis=-2)

    # (x-mu) and (x-mu)'*inv(Sigma)*(x-mu):
    # Shape(v)    = (N, K, D)
    # Shape(z)    = (N, K)
    v = x - mu
    z = np.einsum('...i,...i', v, linalg.chol_solve(U, v))

    # Log-determinant of Sigma:
    # Shape(ldet) = (K,)
    ldet = linalg.chol_logdet(U)

    # Compute log pdf for each cluster:
    # Shape(lpdf) = (N, K)
    lpdf = misc.gaussian_logpdf(z, 0, 0, ldet, D)

    # Weight by the mixture probabilities and marginalize over the clusters.
    # (The original function stopped at lpdf without returning; this final
    # step is the standard mixture log-density computation.)
    # Shape(result) = (N,)
    return misc.logsumexp(lpdf + np.log(w), axis=-1)


def matrixplot(A, colorbar=False, axes=None):

    if axes is None:
        axes = plt.gca()

    if sp.issparse(A):
        A = A.toarray()
    axes.imshow(A, interpolation='nearest')
    if colorbar:
        plt.colorbar(ax=axes)


def contourplot(x1, x2, y, colorbar=False, filled=True, axes=None):
    """
    Plot a 2D contour plot.

    x1 and x2 are 1D vectors, y contains the function values. y.size must be
    x1.size*x2.size.
    """
    if axes is None:
        axes = plt.gca()

    y = np.reshape(y, (len(x2),len(x1)))
    if filled:
        axes.contourf(x1, x2, y)
    else:
        axes.contour(x1, x2, y)
    if colorbar:
        plt.colorbar(ax=axes)


def errorplot(y=None, error=None, x=None, lower=None, upper=None,
              color=(0,0,0,1), fillcolor=None, axes=None, **kwargs):

    if axes is None:
        axes = plt.gca()

    # Default inputs
    if x is None:
        x = np.arange(np.size(y))

    # Parse errors (lower=lower/error/upper, upper=upper/error/lower)
    if lower is None:
        if error is not None:
            lower = error
        elif upper is not None:
            lower = upper
    if upper is None:
        if error is not None:
            upper = error
        elif lower is not None:
            upper = lower

    # Plot errors
    if (lower is not None) and (upper is not None):
        l = y - lower
        u = y + upper
        if fillcolor is None:
            color = colors.ColorConverter().to_rgba(color)
            fillcolor = tuple(color[:3]) + (0.2 * color[3],)
        axes.fill_between(x,
                          l,
                          u,
                          facecolor=fillcolor,
                          edgecolor=(0, 0, 0, 0),
                          linewidth=1,
                          interpolate=True)

    # Plot function
    axes.plot(x, y, color=color, **kwargs)


def plotmatrix(X):
    """
    Creates a matrix of marginal plots.

    On diagonal, are marginal plots of each variable. Off-diagonal plot (i,j)
    shows the joint marginal density of x_i and x_j.
    """
    return X.plotmatrix()


def _pdf_t(mu, s2, nu, axes=None, scale=4, color='k'):

    if axes is None:
        axes = plt.gca()

    s = np.sqrt(s2)
    x = np.linspace(mu-scale*s, mu+scale*s, num=100)
    y2 = (x-mu)**2 / s2
    lpdf = random.t_logpdf(y2, np.log(s2), nu, 1)
    p = np.exp(lpdf)
    return axes.plot(x, p, color=color)


def _pdf_gamma(a, b, axes=None, scale=4, color='k'):

    if axes is None:
        axes = plt.gca()

    if np.size(a) != 1 or np.size(b) != 1:
        raise ValueError("Parameters must be scalars")
    mean = a/b
    v = scale*np.sqrt(a/b**2)
    m = max(0, mean-v)
    n = mean + v
    x = np.linspace(m, n, num=100)
    logx = np.log(x)
    lpdf = random.gamma_logpdf(b*x,
                               logx,
                               a*logx,
                               a*np.log(b),
                               special.gammaln(a))
    p = np.exp(lpdf)
    return axes.plot(x, p, color=color)


def _contour_t(mu, Cov, nu, axes=None, scale=4, transpose=False, colors='k'):

    if axes is None:
        axes = plt.gca()

    if np.shape(mu) != (2,) or np.shape(Cov) != (2,2) or np.shape(nu) != ():
        print(np.shape(mu), np.shape(Cov), np.shape(nu))
        raise ValueError("Only 2-d t-distribution allowed")

    if transpose:
        mu = mu[[1,0]]
        Cov = Cov[np.ix_([1,0],[1,0])]

    s = np.sqrt(np.diag(Cov))
    x0 = np.linspace(mu[0]-scale*s[0], mu[0]+scale*s[0], num=100)
    x1 = np.linspace(mu[1]-scale*s[1], mu[1]+scale*s[1], num=100)
    X0X1 = misc.grid(x0, x1)
    Y = X0X1 - mu
    L = linalg.chol(Cov)
    logdet_Cov = linalg.chol_logdet(L)
    Z = linalg.chol_solve(L, Y)
    Z = linalg.inner(Y, Z, ndim=1)
    lpdf = random.t_logpdf(Z, logdet_Cov, nu, 2)
    p = np.exp(lpdf)
    shape = (np.size(x0), np.size(x1))
    X0 = np.reshape(X0X1[:,0], shape)
    X1 = np.reshape(X0X1[:,1], shape)
    P = np.reshape(p, shape)
    return axes.contour(X0, X1, P, colors=colors)


def _contour_gaussian_gamma(mu, s2, a, b, axes=None, transpose=False):
    # Not implemented
    pass


def ellipse_from_cov(xy, cov, scale=2, **kwargs):
    """
    Create an ellipse from a covariance matrix.

    Parameters
    ----------
    xy : np.ndarray
        position of the ellipse
    cov : np.ndarray
        covariance matrix
    scale : float
        scale of the ellipse (default is two standard deviations)
    kwargs : dict
        keyword arguments passed on to `matplotlib.patches.Ellipse`

    Returns
    -------
    ellipse : matplotlib.patches.Ellipse
    """
    evals, evecs = np.linalg.eigh(cov)
    angle = np.arctan2(*evecs[::-1, 0])
    width, height = scale * np.sqrt(evals)
    return patches.Ellipse(xy, width, height, np.rad2deg(angle), **kwargs)


def ellipse_from_precision(xy, precision, scale=2, **kwargs):
    """
    Create an ellipse from a precision matrix.

    Parameters
    ----------
    xy : np.ndarray
        position of the ellipse
    precision : np.ndarray
        precision matrix (inverse of the covariance matrix)
    scale : float
        scale of the ellipse (default is two standard deviations)
    kwargs : dict
        keyword arguments passed on to `matplotlib.patches.Ellipse`

    Returns
    -------
    ellipse : matplotlib.patches.Ellipse
    """
    return ellipse_from_cov(xy, np.linalg.inv(precision), scale, **kwargs)


bayespy-0.6.2/bayespy/testing.py

# Copyright (c) 2015 Bernhard Thiel, Jaakko Luttinen
# MIT License
# From: https://github.com/Bernhard10/WarnAsError/blob/master/warnaserror.py

__author__ = 'Bernhard Thiel'

from nose.plugins import Plugin
import nose
import warnings


class WarnAsError(Plugin):

    enabled = False

    def options(self, parser, env):
        """
        Add options to command line.
        """
        super().options(parser, env)
        parser.add_option("--warn-as-error",
                          action="store_true",
                          default=False,
                          dest="warnaserror",
                          help="Treat warnings that occur WITHIN tests as errors.")

    def configure(self, options, conf):
        """
        Configure plugin.
        """
        super().configure(options, conf)
        if options.warnaserror:
            self.enabled = True

    def prepareTestRunner(self, runner):
        """
        Treat warnings as errors.
        """
        if self.enabled:
            return WarnAsErrorTestRunner(runner)
        else:
            return runner


class WarnAsErrorTestRunner(object):

    def __init__(self, runner):
        self.runner = runner

    def run(self, test):
        with warnings.catch_warnings():
            warnings.simplefilter("error")
            # Filter out some deprecation warnings inside nose 1.3.7 when run
            # on python 3.5b2. See
            # https://github.com/nose-devs/nose/issues/929
            warnings.filterwarnings(
                "ignore",
                message=".*getargspec.*",
                category=DeprecationWarning,
                module="nose|scipy"
            )
            # Filter out some deprecation warnings inside matplotlib on Python
            # 3.4
            warnings.filterwarnings(
                "ignore",
                message=".*elementwise.*",
                category=DeprecationWarning,
                module="matplotlib"
            )
            return self.runner.run(test)


bayespy-0.6.2/bayespy/tests/__init__.py

################################################################################
# Copyright (C) 2015 Hannu Hartikainen
#
# This file is licensed under the MIT License.
################################################################################


import bayespy.plot as bpplt


def setup():
    for i in bpplt.pyplot.get_fignums():
        fig = bpplt.pyplot.figure(i)
        fig.clear()


bayespy-0.6.2/bayespy/tests/baseline_images/test_plot/contour.png
[binary PNG image data omitted]
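The ellipse construction used by `ellipse_from_cov` above — eigenvectors give the orientation, scaled square roots of the eigenvalues give the axis lengths — can be checked in isolation with plain NumPy. The sketch below reimplements just that geometry; the helper name `ellipse_params_from_cov` is invented for illustration and is not part of bayespy.

```python
import numpy as np

def ellipse_params_from_cov(cov, scale=2):
    # Same math as ellipse_from_cov: principal axes from the
    # eigendecomposition, axis lengths scale * sqrt(eigenvalue).
    evals, evecs = np.linalg.eigh(cov)
    angle = np.arctan2(*evecs[::-1, 0])
    width, height = scale * np.sqrt(evals)
    return width, height, np.rad2deg(angle)

# For diag(4, 1) the 2-sigma axis lengths are 2*sqrt(1)=2 and 2*sqrt(4)=4,
# with the smaller axis rotated 90 degrees (eigh sorts eigenvalues ascending).
w, h, a = ellipse_params_from_cov(np.diag([4.0, 1.0]), scale=2)
print(w, h)  # 2.0 4.0
```

A correlated covariance such as `[[2, 1], [1, 2]]` would instead yield axes along the diagonals, which is why the angle term is needed at all.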
{÷î…ŸŸŸ¡KÒšJ¥B:u°téR/3Aß¾}ammý?ÿ[›6mЦMUdx&&&HHH0t8|ø0F…°°0äÏŸ7ƈ#€Â… C¡Ph=ÖëׯqíÚ5\¸p‘‘‘8}ú46n܈øøxXXX J•*¨U«êÕ«OOOÆÖ‡ƒƒ–,Y‚.]º G¨P¡~øáL˜0AÞÃBäz«V­ÂªU«þç{õꕪÉ9ÌI·i>3øòË/qæÌøúúºœ¥B… ðòòÂ_ýeã?{ö ={öÄš5kàãモY³fÈ“'O¦'!!gΜÁÁƒ†#GŽàÇpssCÆ Ñ´iST©R*Uöü·VRRæÌ™ƒaÆA¥RaòäÉèÔ©S¶…Q!„øÈõ[&ˆ,’/_>ƒML8|ø0Ê”)ƒýû÷céÒ¥8sæ Ú´i“éáøûNgÅŠ1xð`ìÙ³/_¾ÄŽ;ðÕW_aãÆ „ƒƒºvíŠ@­Vgz ÿ¤R©Ð£G\½zõêÕÃwß}‡jÕª¥<öB! €"‹(PÏŸ?ÏöãnÞ¼µjÕBÉ’%qþüy´oßJeöæfff¨_¿>fΜ‰{÷îáäÉ“øî»ïŠ5jÀÕÕC† ÁÕ«W³´{{{,_¾aaaxòä |||0lØ0ÄÅÅeéq…B|$Š,áàà€Geë1÷ïß–-[¢Q£Fؽ{7ìíí³õøÿ¦P(àçç‡I“&áÎ;GÆ 1wî\xzz¢R¥JX´hÞ½{—e5âüùó:t(&OžŒ2eÊ`ß¾}Yvœ%ãŸ;w^^^9nÑçìbccƒ &àÖ­[hÛ¶-† †bŊ᯿þBrr²ÞãÛÙÙ!$$+W®Dhh(J—.]»veBåB! M ÈRÕ«WÇÁƒ3%üÛÍ›7Q¢D‰L7-ïß¿ÇîÝ»1jÔ(´hÑeË–…ƒƒÌÌÌ P( T*aff†B… ÁËË µkׯ÷ßI“&aëÖ­ˆŠŠÊ’‰ööö˜1c®^½Š*UªàûᅦvîÜ©÷Ø …mÚ´ÁÅ‹áíízõê¡Gxÿþ}&T.„ÂPT†.@ü·Õ¬Y¿üò """àçç—©c?xð*TÈÔ1ÿ$ÂÂÂ0oÞàõë×xþü9>|ˆÈÈH¬Y³oÞ¼ð÷]µòåË# ÕªUƒŸŸ_¦í\R´hQ¬^½ýúõCÿþýñÕW_¡N:˜:u*¼½½õ»pá ÅÌ™31`À„……aÅŠðõõ͔څBd/ €"K•/_–––ؽ{w¦À/^ÀÎÎ.SÇü§#GŽ 88§N‚··7~ùå4hОžž:Í:&‰ 22§OŸÆ±cÇ0~üx :––– DPP‚‚‚àèè¨wÝ~~~8pà6oÞŒà‹/¾À?þˆÑ£Gëõ{) ôèÑ5kÖDÛ¶mQ¾|yŒ;ýû÷‡‘‘‘Þu !„È>òXd)cccÔ¬Y;vìÈô±ß¾} ‹L7!!}ûöE•*U{öìÁùóç1pà@”*UJç%g œÑ°aCŒ5 »wïFLL Ž?ŽAƒáÅ‹èÚµ+ .ŒŠ+bêÔ©¸wïž^ßA¡P I“&¸té&OžŒ•+W¢xñâøóÏ?‘””¤×Øžžž8~ü8úõë‡!C† V­Z¸ÿ¾^c !„È^E– ÂñãÇic’Drrr¦=>ýèíÛ·¨_¿>fΜ‰ßÿÇG­Zµ2}A•J…òåËãçŸÆáÇñüùs,]º… ÂÏ?ÿ WWWT­ZóæÍÓk‰ãÆhÙ²%z÷î²eËâàÁƒzÕobb‚‰'bß¾}¸yó&¾øâ ¬_¿^¯1…Bd €"Ë5hÐjµÛ·oÏô±Õju¦•€FáôéÓØ»w/úôé¥2{þ±±±Aûöí±iÓ&<{ö K—.…™™~úé'888 ]»v8xð`†'‘(PóæÍéS§`aaêÕ«ã›o¾ÁãÇõª»zõê8wîÑ¢E téÒE&ˆ!Äg@ Èrööö¨P¡6mÚ”ic* ˜ššâÇ™6fpp0ÂÃñmÛ6T­Z5ÓÆÕ•••Ú·o]»váþýû9r$Nž<‰êÕ«£T©R˜>}:^½z•¡±¿üòK„‡‡cÑ¢EØ»w/J”(?þøC¯ÇÂ666X»v-æÍ›‡åË—§ì#„"ç’(²E³fÍŠ·oßfÚ˜VVVBÿ¶oß>Ìœ9¿ýö[Ê»9££#Œk×®aß¾}(]º4úõë'''ôìÙ7nÜÐyL¥R‰N:áÚµkh×®úöí ???œ…B5j`Íš5ˆŠŠBß¾}‚%J I“&8zô¨ÎcæÏŸ³f͉' P(P¡Bôèѯ_¿Îpžžž8qâ:wîŒîÝ»£eË–™¶MBˆÌ#Pd 777øûûcõêÕ™6¦ƒƒƒÞï°À¶mÛpþüyLš4)Ó'{dGGGŒ=÷îÝüyópíÚ5 J•*رc‡ÎwÝ>Þýûí·ß°xñbxzzbãÆ®ÏÔÔ3gÎĆ œ8q"Ãã !„È|E¶iÓ¦ vî܉˜˜˜LÏÅÅQQQz3oÞ<øùù! 
ªÊ>¦¦¦øþûïqéÒ%lÚ´ III B¹rå°iÓ&&ȨT*ôéÓ—/_†¯¯/š5k†æÍ›ë°›6mŠÈÈH888 råÊøí·ßä‘°BäE¶iݺ5’’’°víÚLÏÝÝ7oÞÔkŒØØXìÚµ :tÈ”š A©T¢qãÆ8zô(ÂÂÂ`ee…¦M›Â××›7oÖ)t¹¸¸`Ë–- Á‘#Gàé鉅 f8¸¹ººâСCèÓ§úõë‡&MšàåË—K!Dæ‘(²½½=êÔ©ƒ%K–dÊx%K–ÄË—/ñüùó ±ÿ~$%%¡Aƒ™R“!) bÿþý8xð lllФIøûûc×®]Z‡8…BV­ZáÊ•+hÒ¤ :wuëâîÝ»ªËØØ“'OÆÖ­[qøðaøúúê5áD!„þ$ŠlÕ±cG=zׯ_×{¬ûÛ^¸p!Ãc;v ÎÎÎpssÓ»M^¿~³gÏbçÎX³f V¬Xl߾ǎÃÝ»w‘˜˜˜iÇ«Zµ*öíÛ‡}ûöÁÄÄõêÕC5püøq­Ç°±±ÁâÅ‹±sçN\½z¥K—ÆìÙ³3¼öbƒ pöìYØÛÛ£råÊøóÏ?å‘°B E†9s†xæÌC—òÙˆ‹‹c¾|ù8hÐ ½ÇJJJ¢™™§Nšá1êÖ­ËFé]Ë¿½ÿž!!!lß¾=ÝÜÜ ÝŒŒŒX¼xq6iÒ„¿üò ·oßΗ/_ê]‹Z­æÖ­[Yºti`³fÍxíÚ5Æxõê»téB ä;w2\O||<{÷îMüúë¯ùúõë %„!×oRîŠlejjŠvíÚañâÅzßñ222BÙ²eqêÔ© qûöm+VL¯:þ)66?ÿü3ñõ×_ãüùóhÚ´)–-[†ãÇãÞ½{ˆÅ»wïðêÕ+<|øçÏŸGhh(f̘   ¼}ûÓ§OGPPlllðÅ_ 88»víÊÐÂ× …"åîÛ’%KpêÔ)xyy¡W¯^Z/£cee…¹sçb÷îݸqãJ—.ùóçg螉‰ ¦M›†5kÖ`ÇŽðóóÃ¥K—tG!„ @?gò_sîÜ9àúõëõ+88˜®®®îoaaÁ)S¦è]I®_¿ž  ¹¹9ƒƒƒyãÆ ¥V«yãÆ .^¼˜:u¢““ÐÜܜ͚5ãòåËùêÕ« ýþý{Nœ8‘VVV´¶¶æo¿ýÆøøx­û¿zõŠ;w&Ö¯_Ÿ>ÌP$yíÚ5z{{ÓÜÜœ+W®Ìð8B¡ ¹~“õ 'PÆU¬X‘µjÕÒ{œuëÖïÝ»§sߤ¤$à‚ ôªA­VsøðáÀ¦M›òÑ£Gz—Ö1.\¸ÀñãÇÓßߟ˜'O6oÞœ7nÔ)À}ôìÙ3víÚ•J¥’ܾ}»Ný·mÛF{{{ÚØØ0$$Dçãôöí[¶mÛ–Ø«W¯ }!„Ð…\¿%êEN Œ[ºt)ðêÕ«zóìÙ3à²eËtîûþý{àòåËõªácø›8q"Õjµ^ci+**Š“'OfÙ²e €vvvìÛ·//_¾¬óXçÏŸg`` °Aƒ:ݹŒŽŽfË–- €mÛ¶eLLŒÎÇ'ÿ¸3fÌ ±±1ôº«(„é‘ë·¼( ¤eË–°³³Ã¬Y³ô§@(S¦ öîÝ›á1¨ÇLÔ7b̘1˜0a ”m;‰¸¸¸ ÿþˆˆˆÀ… СC,[¶ ¥J•BõêÕ±víZ$%%i5VéÒ¥±wï^¬[·çÏŸ‡——†ޏ¸¸tûÚÚÚ"$$Ë—/ǶmÛP¦L8p@çï£P(н{w|ø€Y³fáÇ„£££Îý³Ú_|… âÞ½{èÓ§/^Œ¢E‹¢sçÎZmŸW¼xq„††bõêÕ8zô(<==ñ矦ûw222 Aƒpüøq¼}û¾¾¾ÚJÎØØÓ¦MòeË‚*UªàÞ½{:!„"m…AãöíÛØ´iS†Ç011Aýúõ34FáÂ…µº»õoÛ·oGLL ºvíªsßìT¨P!Œ;÷îÝÃøñã±cÇ”(Q:tÀ74öU(øúë¯qåÊ´oß½{÷F¥J•´ÚyåË/¿DDDÚ´iƒÎ;£uëÖˆÕ¹þvíÚ!<<ÏŸ?G¹råpèÐ!ÇBñ) €Â Ê•+‡jÕªaòäÉz½ëÕ´iSDDDè¼_m‘"EpçηmÛ6x{{ÃÃÃC羆`ii‰þýûãöíÛ˜6mÂÂÂàéé‰ï¾û.Ýß,_¾|˜5kŽ9’rWoĈˆ×Ø/oÞ¼X°`BBB°k×.”-['NœÐ¹v___œ>}^^^¨Y³.2Y IDAT¦Þ‡„BH9À€pâÄ >|8ÃcÁÔÔk×®Õ©_ñâÅ3ôžÚ‘#GP£F û¥'11×®]Þ={°fÍ,[¶ +W®ÄæÍ›Ž»wïj=»75fffèÙ³'nݺ…©S§bûöíððð@ïÞ½ñìÙ3}+Uª„ˆˆüüóϘ8q"|}}µ t­ZµBdddÊÀ“'OÖy?á `÷îÝøé§ŸÐ½{wüøãšd"„âÿtšÏœ¬#”9’““éííÍúõëë5NóæÍéëë«SŸõë×?~¬uŸ·oß.\¨k‰©ŠŠŠâøñãY¹reæÉ“'Ý=ƒU*===ÙºukN™2…GÍðâÉoÞ¼á¸qãhmmM Ž=šoß¾M·ßùóçY®\9*•Jöïߟïß¿O·OBB 
”²ƒÈ³gÏ2Tó_ýEV®\™OŸ>ÍÐBˆÜM®ß²´^äÊ<Ë—/'FDDdxŒ»‚\¹rEë>7oÞ$îØ±Cë>/^$>|8#e¦¸{÷.Û¶mK¥RIsss6mÚ”Ó¦Mã¾}ûxëÖ-ÆÆÆ2..Žoß¾åÓ§OyéÒ%†††rÆŒìÞ½;+UªD33³”-âêÕ«Ç?þøƒ·oßÖ¹–èèhÓÄÄ„ŽŽŽ\¸p!“““5öILLäĉ™'O–(Q‚ÇŽÓêX;wî¤ .ÌC‡é\+I†‡‡³P¡BtqqáÙ³g34†"÷’ë·@½È ”yéîîÎ-ZdxŒ¸¸8æË—C‡ÕºZ­fþüù9jÔ(­û„……€^{ý®X±‚tppàŒ3øæÍ› “À'NpÒ¤I¬U«MLL€eÊ”á˜1ct®ñöíÛüúë¯ €eË–Õ* ]¾|™þþþT*•¢"w“ë·@½|<ªV­Ê† þÏ?+W®4tyŸ¥Y³fQ¡Pèôß?©Õjz{{³yóæZ÷9zô(ðøñãZµß²e ðÉ“':ÕöðáCZYY±M›6Ù~§êáÇ;v,]\\Rî ®ZµJ«ÐûêÕ+ÓÈȈܳgÆö>|àÀ©P(X¥JÞ¹s'ÝcDEEÑßߟ&&&œ={v†~Ÿ5kÖÐÌÌŒþþþ|ôè‘Îý…ÿM+W®üä]µjU €†.às&ÿ‘ù>|ø@ggg¶nÝ:ÃcLŸ>*•J똘H+++Ž=Z«öûöíËÐ;€?üðíììøòåKúe¦¤¤$nÚ´‰@þþûïZÍü½xñbÊÿi¶iÓ&ݙӤ««+---¹lÙ²tÇg÷îÝS¿{÷NëïõÑéÓ§éèèH'''™"„H“\¿%êEN ¬1gÎ* ^¼x1CýcbbhffÆ1cÆhݧY³f¬X±¢VmÏŸ?O¹~KÔ‹œ@Y#>>žnnnlÖ¬Y†ÇøþûïY¸pa&$$hÕþ¯¿þ¢B¡ÐjbÇ‹/€k×®Õºž™3gR¥RñùóçZ÷É.wïÞe=hjjJkkkŽ92Ý Íï¾ûŽX¹råtÙ¯X±‚VVVtsscxxxº5?žÅŠc¾|ù¸mÛ6¾ù÷ZM›6¥B¡ào¿ý&“C„ÿC®ßõ"'PÖY¼x1h=aáß"## €!!!Zµúô)•J%,Xn[µZM+++îæ5hЀ5jÔк½!Bˆÿ>¹~KÔ‹œ@Y'))‰žžž¬S§N†Ç¨V­+Uª¤S{m—ƒñóóc§N´j«V«Y @Ž1BëZ´ñâÅ ^¸pGexx8Ïœ9Ã;wîhµŸ&>d÷îÝillL{{{Ξ=[c°{ÿþ= D###–-[Vã»w‰‰‰üù矩P(˜î{šÉÉÉ5j 5j”¡es,X@•Jźuëfú²;BˆÏ“\¿%êEN ¬õq¹•°°° õÿ¸^Ÿ¶³{g̘A•JÅèèètÛ~ûí·Zo;M\³fVíÓòêÕ+.X°€7fÁ‚ÓÜ*N¡PÐÙÙ™uëÖåàÁƒ¹aÆ m»vûöm¶k׎ …‚%J”à¦M›4>J=}ú4K—.M•JÅ‘#GjÜž.,,Œ,P CCCÓ­eÛ¶m´²²b‰%xõêU¿Ëž={hmmÍÒ¥K§»¦¡â¿O®ßõ"'PÖR«Õô÷÷§ŸŸ_†ÞáJJJb±bÅ´Þ]äÉ“'422âìÙ³Óm;}útš˜˜hu·íããhmƒè¿ÅÅÅqôèÑ´²²¢R©d•*U8lØ0†„„0<<œ.\à¥K—xúôi†††ò¯¿þâ!CذaC:::¦Ã2eÊpàÀ<|ø°N eŸ={–µjÕ"Ö¨QCã"Ïñññ1bU*}||xîܹ4Û>}ú”uëÖ%2$ÝG´×®]£§§'­¬¬¸uëV­ëÿèÒ¥Ktuu¥ƒƒƒü;+D.'×o €z‘(ëíß¿Ÿ¸zõê õŸ={6•J¥ÖK¶Ô«Wé¶û¸à´6ûßJ¥Rc°ÊL111üã?èááA,W®W¯^î{yÇŽ£ »uë–æú¯_¿NY7°Y³fiN´IJJâÈ‘#©P(X¿~ýtÿÞ‘‘‘tqqa¡B…tZ˜›üû±ö”)S€mÛ¶ÍÐf!ÄçI®ßõ"'Pöyüø1-,,اOŸ õ6lÍÍ͵š {îÜ9àºuë4¶û¸#È®]»4¶»wïpÇŽZÕš@GGG­—™ÉLÉÉÉܱckÖ¬Itwwçœ9s4†ÜÄÄDN›6tppàÚµkÓ¼£¶aÃÚØØÐÑÑQãìîÐÐPÚØØÐÍÍ-ÝàüôéSÐÄÄD«-çþ-$$„yòäa`` V;•!>rý–¨9²×øñã©R©24™"::š8p Ví+UªÄÀÀ@mÔj5]]]Ù­[7íµž]L’Û·o'ƒïe{æÌ¶lÙ’ …‚NNNœ5k–Æ xÿþ}6nܘظqc>xð Õv<```  œæÖ»wï²\¹rÌ“'.\¨±Ö>°S§N)tu]4úàÁƒÌ—/K—.fÝBˆÿ¹~KÔ‹œ@Ù+..Ž®®®lРA†ú:”æææZm÷¶bÅ Hw?â^½zÑÑÑ1ÝÀQ´hQöë×O«:{ôèAww÷ó^Ú•+WøÍ7ßP¡PÐÕÕ• .LóѰZ­æÚµkY¨P!Z[[§ùÎ_rr2'NœH•JÅòåËóöíÛ©ŽÇï¿ÿžØµkWiÕj5'OžL…BÁfÍš¥»‹É¿]ºt‰ÎÎÎtvvæ¥K—tê+„ø¼Èõ[ 
^äÊ~kÖ¬!­þ·èèhZZZjõ.`||<Ø¥Kí:D<|ø°Æv 6dݺuµªÓÏÏ:tЪmvº|ù2[´hAôôôäæÍ›Ó ©/^¼`ÇŽ €uêÔaTTTªíŽ?N777Z[[kÜ[yÞ¼y411aÅŠÓ̳yófæÍ›—¾¾¾:ßÍ{ðàK—.ÍüùóóÈ‘#:õB|>äú-P/re?µZÍjÕª±dÉ’éNÒH͈#hjjªÕŒà1cÆÐÔÔTã{ƒÉÉÉ,\¸pº‡Î huW/_¾|œ0aBºíÒÇS§N1$$„sæÌáŒ3¸`Ánܸ‘§OŸÎð¶h§OŸNyG°jÕªßÑÛ±c .LKKË4ïÆÄİeË–Ànݺ1...Õ±Ž;FGGG:88¤;áãìÙ³trrbáÂ…¡Ó÷‹‰‰aµjÕhjjÊM›6éÔWñyë·@½È dçΣR©äÔ©SuîKvíÚ5ݶÑÑÑ433ã/¿ü¢±Ý€hkk«ññäÖ­[ €·nÝÒ8V||<pñâÅéÖ—š¤¤$®]»–uëÖ¥‰‰Éÿlgllüɶq®®®lÙ²%ÿøã^¼xQëÇÎjµš;w—°cÇŽi.W›ò~^PPPªíÔj5gÍšÅâ3g•J¥Æ) ÀB… i  î²'ÿtóæMZYY±V­Z:Ï|MZ­æ±cÇØ½{wÚØØ¤¼ë’î{—‘‘‘  ¶oß>Õ×jµš‹-¢……ÝÝÝSÝOY­Vó÷ß§J¥b•*UÒ¼Ó{àÀÚÙÙÑÝÝ=ÝÛ3gΤR©dãÆuÞ>îãß·uëÖ²`´ÿrý–¨9 +99™*T ···ÎBYªT)Ö¨Q#ÝG{—.]"~ýš>|xšmöïßO|øÀU«V±jÕª@'''þúë¯LNNNæ_ýEÚØØpáÂ…©þÆ7oÞd… hddÄ1cƤzgîÈ‘#tpp ½½}š3­ïܹÃÒ¥KÓÒÒ’Û¶mÓø}¶mÛÆ¼yóÒßß_«å€þiݺu411aݺu3%h ! G®ßõ"'áEDDP©TròäÉ:÷ݶmh5A E‹,R¤ˆÆ Ù­[7*T(Í;D‰‰‰ÌŸ??‡ªñXÞÞÞìÞ½{º5‘ä‰'€ëׯת½>Î;ÇN:ÑØØ˜VVV:t(Ÿ?žfûgÏž±]»vÀš5k¦:&11‘Æ £B¡`õêÕS]¶åÉ“'¬Zµ*U*ÿøãTÃäëׯٸqc* N:Uc¨?}ú4íííéîî®ó¢âaaa´°°`… 2´-¡"gë·@½È ”3ôêÕ‹yóæMs­¹´¨Õjjµ¤ÌÇíá4Í<½|ù2hÜŽ¬S§NôððÐPš4iÂZµj¥ÿø÷¢ÑNNNÙúnÚÇÙ¯_?æÍ›—yóæåàÁƒ5¾ƒJWWWš››sÚ´i©Öºÿ~:::ÒÖÖ6Õ»x ìÛ·oÊ£å÷ïßÒ&99™$~ÿý÷ÕÞ¹s‡žžž´µµÕùÉS§NÑÎÎŽ^^^Z-'$„Èyäú-P/rå ¯^½¢ƒƒ7n¬sß³gÏR¡PðÏ?ÿL·mË–-éêêª1XÔ©S‡¾¾¾i¼;w€Æµé†΂ j5ë´dÉ’Z-i“ž?ÎÁƒ3oÞ¼´²²âèÑ£ùæÍ›TÛ¾yó†=zôH™¡œÚ ìçÏŸ3((ˆØ¿ÿTCùŠ+hffF___Þ½{7Õc-Z´ˆÆÆÆ¬Q£_¾|™fý/^¼`•*UhjjÊ7jù­ÿvåÊ:99±H‘"i.Y#„ȹäú-P/råwÉȽ;w¦Mºô.]ºD…BÁ™3g¦Ù&44”Òœ›À‚ ²oß¾iޱiÓ&Ð8¡„ü{­:…B‘îzxYíéÓ§ìÓ§MLLX¨P!Î;7Í­â8@wwwš››sÆŒŸÜ T«Õœ2e U*+V¬˜êopöìYº¹¹ÑÎÎŽHõ8¤ =<<4.÷óÏõþ´Ý«ù£¨¨(zxxÐÞÞž.\Щ¯°äú-P/råjµš_}õœœR]ZD“'OžÐÒÒ’=zôH·mûöíiooŸæ$µZÍ2eʰ^½ziŽÑ§O,X0ÍÇÎOŸ>%®\¹Rc-7oÞ$îÝ»7ݺ³ÃÝ»wÙ¶m[ ··7÷ìÙ“j»7oÞð§Ÿ~"ÖªU‹÷ïßÿ¤Í±cÇèââB[[[îܹó“ÏŸ?ÎÀÀ@ªT*Κ5+Õãܸqƒ´µµÕ¸U_rr2{õêE6l˜Nëý=}ú”>>>ÌŸ???®u?!„aÉõ[ ^äÊYîܹCsssöìÙS羿þú+ŒŒÒ½“sûömsüøñi¶Y¾|¹ÆÇ¼ß'Ô´p´§§'øáµDDDh5«8=>äŽ;8þ|N›6³fÍâêÕ«yôèQPÓrêÔ©”¥`š6mÊ;wî¤Ún×®]tttdþüùòÉçÑÑѬ_¿> GŒñÉ,á„„„”àöã?¦úhþÅ‹¬V­MLL4jµZÍI“&;wîœæÌÔÄÄİR¥J´°°Hóί"g‘ë·@½È ”óL:• …Bç»1ñññôððÐjY˜ž={ÒÚÚ:͉‰‰‰twwg‹-Ò£|ùò¬[·®Æc¸¸¸h¬EŸÏY³f±téÒÿ³]œ™™ŒŒþg»8777~óÍ7\°`Ö“Ôj5W®\ÉÂ… ÓÔÔ”cÆŒá‡>i÷âÅ‹”}€;vìøÉÝÛäädŽ3† …‚uëÖMõ7_°`Y­ZµTg%ÇÇdz}ûöÀñãÇküM—,YB•JÅF¥:Ñ$-oß¾e:u˜'Ožt—¢Bž\¿%êEN 
œ'11‘_~ù%½¼¼t^°÷ãÕ«Wkl÷ìÙ3ZZZ²OŸ>i¶™?> Eš‹/Z´ˆÒ|?íc-çÏŸOó×®]#îß¿_c½ÿvãÆ –.]šJ¥’-Z´àš5kx÷îÝ”;ljµš111ŒŒŒäŠ+Ø·o_–+WŽJ¥’èççÇ)S¦¤ºdË¿½yó†¤J¥b‰%R­U­VsñâÅ´°°`Ñ¢EyòäÉOÚìÞ½›¶¶¶tuuMõ߷dz@twwç¥K—R=ƈ#€]ºtÑx‡oÇŽ477g@@€Nw@?|øÀ¦M›R¥R¥zGS‘sÈõ[ ^äÊ™"##iddÄQ£FéÜ·qãÆtttL÷=ÂqãÆÑØØ8ÍO¶jÕ*ÕÏß¿O[[[öîÝ;ÕÏ?|ø@KKKßáã¶qéÖŠŠŠ¢½½==<<©u?òïG²Ë—/g³fÍhjjJ¥RÉúõësãÆé>2½xñbÊcáï¾û.Õ 77nÜ ŸŸU*ýõ×O&ˆDEE±\¹r455MuOå;wîÐÛÛ›VVV MµŽE‹Q¥R±~ýúiÎX&ÿ~ÑÆÆ†ÞÞÞ:-õ’˜˜È¶mÛR©TrÑ¢EZ÷Bd/¹~KÔ‹œ@9×СCillœêÝ MîܹC333öïß_c»wïÞÑÉɉ͚5K³Í¼yó¨P(Ò¼‹7tèPZZZ¦¹«Æ7ß|Cooï4ÇW«Õ´¶¶æ„ 4ÖúÏöÕ«W§«««Î»`ü[ll,çÍ›G ‹‹ 'Ožœî!sçÎ¥µµ5 *”êâÕñññ0`°~ýú|öìÙÿ|Ço¿ý6eoÞO¤yýú5ƒ‚‚¨T*Ó\Úg÷îÝ´´´ä—_~É'Ož¤YïåË—éääD7773‰Sûž]ºt!Θ1Cë~Bˆì#×o €z‘(犋‹cÉ’%Y¾|ùT·ÓdܸqT©T¿’ä²eË4îEœ@www6iÒ$ÕÏ>|Hccã4w1Ù²e ðܹsiÖP¾|y¶k×NcíÙ³‡¸cÇ­ÚkëÌ™3ìØ±cÊ!C† ù$¸ýÓÇÙ¸qc`«V­Rm»sçNÚÙÙÑÑÑ‘‡úŸÏÔj5gΜI•JÅêÕ«Ò?)))eÑèž={¦ú÷ŒŒ¤ƒƒ‹)¢q7¨¨(–(Q‚ äÙ³gÓû)þ§Æ5dd—!DÖ’ë·@½È ”³…‡‡S¡PpÊ”):õ‹gÉ’% q‡äädúûûÓÇÇ'͹téRà‰'Rý¼S§NtttLu‚DBBíììœf ?ýô===ÓùFkß¾====uZæD>dÿþýiaaAsss4(͉2'‰ØØØ°`Á‚©.ÄüàÁV­Z•FFFœ4iÒ'‹C‡±@tuuMõqöìÙ³iddÄ   T÷FEE¥ì¢iÒгgÏX®\9ZYY}F5Q«ÕüùçŸ €£Gβß]¡;¹~KÔ‹œ@9_Ÿ>}hjjªóž¯û÷ï'Ο?_c»cÇŽçΛêçIIIôòòJsvñåË—©P(Ò}X @4'´,Y²„´Ú—ÖÉɉ H·¾¢££9tèДBÆŽ›æº‰?f£FRfÿûrbb"LlÔ¨cbbþçó¨¨(–-[–æææ\·nÝ'㇆†ÒÒÒ’>>>©NZyñâhff¦qöîëׯY£F š™™qûöíÚü )ÆŽK:t¨„@!r¹~KÔ‹œ@9ßÛ·oY¬X1VªTIçGÁ;vdþüùÓ}_®C‡´³³KsÆèÇG¹i=zmÖ¬‹-šêDŠ‹/@š³Jïܹ“îš‚äß—/_®±]fzúô){÷îMccc:::rñâÅ©ÞQU«A«ˆÖ IDATÕ\´h---éææ–ê¢Í[·ne¾|ùX´hÑOîö½{÷Ž­ZµJóNÛ¹sçèääD''§Të¿ÿž7¦‘‘‘ƉqqqlÔ¨U*•NoHrÊ”)À~ýúI"ë·@½È ôy8tè §NªS¿çÏŸÓÖÖ–mÚ´ÑØîÑ£G´°°Hs'µZͪU«ÒÛÛ;Õúq=¿%K–¤Ú¿J•*¬ZµjšÇ/^¼8»t颱Æ[·n@š»sd¥[·n¥´råÊñرc©¶»sç+W®L¥RÉáLjoݺEš™™qÙ²eÿó™Z­æèÑ£ €­[·þd ¿‡ÒÇLJVVV©þ‰‰‰üᇀ“&MJ3¤%$$°]»vïÚ¦åÏ?ÿ$öèÑCB &×o €z‘èóññQð•+Wtê÷ñ¾ô&NL:•J¥2͉'OžÔøH¹qãÆ,Z´hªÛÃ…„„@šË¶ôéÓ‡ŽŽŽßWü¸f`ZV²ÃáÇéëëKüöÛoSü‘””Ä1cÆÐÈȈ+Vüd‘÷ïß³C‡ÀÞ½{ò{­]»–fff,_¾<?~ü?Ÿ½~ýšõêÕ£J¥âÒ¥K?9¶Z­æ°aÃÒ½S—œœÌnݺûí7~ƒ¹sç¦ì\¢éï%„ÈZrý–¨9>ïÞ½£‡‡ýýýuÚæK­V³víÚtqqѸ6`BB½¼¼X¡B…4/ìmÛ¶e¡B…øêÕ«O>‹ŒŒ$.X° Õ±Ù±cÇTÇ=tè0<<<Íúîß¿O:¿¿–Ù’’’8{ölæÏŸŸ666ü믿R ZG¥››­­­?y·O­VsÆŒiÎ>uêèââòÉ#ß„„vîÜ™8nܸT=}úô”wÓ:WÔjuÊ»‰ºNðX¸p! 
;wî,!P‘ë·@½È ôy9zô(•J%ǧS¿Û·oÓÜÜ<ÍG¼.\¨ñ;hríÚ5Î;— `×®]Ù«W/Ž;–«W¯æÕ«Wu 3OŸ>MÙž­zõê¼~ýú'mbbbØ¢E `÷îÝ?™)}èÐ!,X...ŸÜy½ÿ>¿øâ ZZZ~²(´Z­æÈ‘# €ÝºuKõ±üÊ•+µÚnܸqÀê—.]J¥RÉN:éünªBrý–¨9>?C† ¡±±±Nkº‘äï¿ÿN…B‘ê…JoâÈÈ‘#ibb’êÂÂ×®]£‘‘Qªï*¾zõŠÖÖÖi. Ó¿ÚÙÙiÜþ®páÂ:t¨ÆúSÉjÕªŒŒX´hQ–-[–^^^´µµMÙ3ØÖÖ–­Zµâ²eË4.ýO{öì¡»»;MMM9eÊ”ÿcï¼Ã¢¸Ö?>³}Y`—Þ‘j¥DTA± ØKb7ÆÞcטĘnLLL·—˜Xb,cï×KŒP"Š”Ýùþþà7\fÏ̺ÜÕ$&çó<>Ï}ΔÏÍùúž÷ý¾„â=ÿT*5j„7nŽß¹sÑÑѰ³³ÃÆÇxSh¹\.Z¥½|ùrÈd2tëÖMTtÿøãÐjµhÕª•hÔ–çý÷߯©5Á«W¯†L&ÃÀ©¤PþdèúM MÐ ôüQZZŠˆˆ„……‰.úRFÄÆÆ¢víÚ#B<€³³³¤9sqq1üüüšš*z|øðápvv&ìN`Ö¬Y°³³Ãljc|µ°˜ Orr2:vì(y\Œ;v@­V#,, 6lµsÉÍÍÅîÝ»1kÖ,4nÜ Ã@­V£GعsçEQQQÆ–e+jÙsêÔ)Â`0`Û¶m‚cÅÅÅèÝ»7†Á¼yó‘¸ŠŠŠª|½éÓ§ϲuëVh4ÄÇÇ‹~óƒB¯×£qãÆ¢ß‡ïú2xðà‰¹µk×B&“aÀ€TR("tý¦Ð&èz>9þß îÂ… –^±Š±cÇÂÅÅE42f ÇáÈ‘#èÓ§är9œœœ0oÞ<‹÷+**ªŠØµoßžd&“ ¯¿þ:d2Ú·oO`oÚ´ vvvhÒ¤ qí†  V«ÑªU+â.^¼â·ß~#žëÊ•+ðóóCPPèßKõßP(èÑ£‡hE·ëׯ‡\.§‘@ åO‚®ßTÚ@Ï/F£ ðóó«‘À©¨¨@Ó¦MQ§N‹[Á¿þú+Ôj5&Ož,züÊ•+P*•˜3gq¬¨¨>>>èÞ½;q,77¢ÑËÒÒR¸»»ã¥—^ýÍ’’h4«zÓrGI¥¸}û6ÆŽ F½^… ZŒîܹžžžpuuÅÖ­[‰ã»w³3‚‚‚ˆ^ɧN‚··7üüüˆc„““™™)8vëÖ-Ô®]¢–;·nÝBHH|||,Ú }ÿý÷P*•èÚµ«ÅœLsÖ­[Gs)”? 
º~Sht=ßܾ}z½þ‰FÏæð[Á&L°xÞo¼™L†“'OŠŸ9s&T*•hÎÛÊ•+Á0Œh„rîܹP«Õ„€€ùóçC«ÕJæ«uêÔ -Z´°øÜ@å·a†È·³•¬¬,Œ;*• žžžX¾|¹¤Øyðà:uê†a0fÌbëöƈŒŒ-¹{÷.¢¢¢`oo;vŽ]ºt þþþðóóÃ/¿üBüftt4ôz=:D<Óýû÷Ñ A¸¹¹â²:?üðT*RSSEû>:uüY³fÍ_ýx+X½zõÿÔ"íí·ß˲•ËËË…ððpÑHPII ‚‚‚ЦM"ßÍd2¡Y³f'òÉ àêêŠÁƒ÷|øð!´Z­hd¾úê+°,+*«Ã÷8kö4¸yó&úõë†a)ù9ŽÃ’%K R©ETO¡W¯^`sæÌˆ¦ÂÂB¤¦¦ŠVß½{áááprr"ü ­VKˆG òGGGÃÉÉ 'Nœ|Ç;wB£Ñ %%¥F9”|u0õ ¤PžkÖ¬!Öèøøx*ÿêxž¡ÿ‚øgЯ_?8::]',a4ѲeKZ4ˆ>sæ  æÎ+z|çÎ`_ý5qìäÉ“`YK–,!Ž-Y²2™L4Ÿoܸq0 ¢Ö%ùùùÐh4xóÍ7-¼Ý= ¥ò ŸÇGLL †A¿~ýˆî<§OŸFhh(ˆhÇqxíµ×À²,ºuë&¨T6xùå—Á0 ^yåÐÎËËC||<´Z-QSRR‚N:A©T¿Ç_ GGG‹Ü?ýô4 Ú·oo1eÀœ+V€eYŒ1‚¶£Pžtý¦Ð&èúgŸŸ€€4oÞ¼F՛ׯ_‡N§Ã!C,ž7{öl( IïÁ¾}ûÂÙÙYÔ;pøðáÐëõ„0*++CHHÚ·oO\s÷î]¨T*,\¸Pô÷z÷îºuëZ|b©í막Éd—_~ WWWèõz,_¾\4òUPPPÕSxܸqDTuóæÍÐétˆŠŠD8«W÷ïß_p]II ºví ¹\N´‡+//Gß¾}!“ÉðÕW_ÏóÇ !!:Îb1QFF´Z-’’’j$¿üòKÚ;˜ByFÐõ› @› èŸÃáÇ!—Ë%·N¥X¾|9†Á–-[$Ï)++Cxx8"##E·‚srràìì,š‹˜›› Q_Áï¿ÿ^²OñèÑ£áää$jÈœ‘‘†a°oß>Ég~ðà†Á¦M›$ÏyÚäææbРAUÝAĪqù-a¥R‰-Z¶,ç΃¿¿?¼¼¼pêÔ)Á±õë×C¥R!11Q­¨¨ÀàÁƒÁ0 Þÿ}Á5F£ÇÃ0øðÉç)..FÛ¶m¡ÕjñÓO?I¾ÛÞ½{ÿ'È÷ž8q"ÊS„®ßTÚ@ÿ,æÏŸ™L†X} ÇqèÔ©ÜÜÜ-yÞéÓ§¡P(0{ölÑã¼…‹XÑ ÊÈÈ ~»uëÖ¨S§!,ïÝ»F#ú{Ç¡N:HOO·ø^ÎÎΘ7ožä9ÏŠ={ö 00Z­K–,9rÞÞÞðôô$Š5²²²Ð´iShµZ|ÿý÷‚cûöíƒ^¯GÆ QUŽã0uêÔª\Âêb‹ã8Lœ8 Èn?~ü)))P«Õ¢9ƒ<¼lß¾}r—.]*º…M¡PþwèúM MÐ ôÏÂh4"..¾¾¾„¿œ%rrràææ†Ž;Z\ çÍ›¹\.Z8ÀqRRRàããCØÒp‡¸¸8„††ÂáüùóÉd¢Ö.S¦LN§¦}ôd2™EO»ääd$''K–Vù&&&âÎ;Ä9YYYˆ‹‹ƒR©Ä'Ÿ|"8VRR‚ôôt°,‹÷Þ{OpìüùóðööF`` QT²hÑ¢ªªãê“ã8Ìž= Ã`îܹÄßsii):uê•J…íÛ·K¾¿œœœ\#øÎ;ï€aÌŸ?ßêk(Š4tý¦Ð&èúçqçÎ899¡k×®5жlÛ¶ Ãàã?–<§¼¼5BݺuE·ïܹÑêÞK—.A¥RaæÌ™Ä±1cÆÀÞÞžØýý÷ßa0D}‹ŠŠàââ‚1cÆH>ïo¼NW#/»§ÍO?ý lذ8^^^^Uä1räHù²ÉdªŠê;V`7sëÖ-Ô©SnnnÄÿ—-[–eñ /9¡o¼ñ†Á´iÓˆùQVV†®]»B¥RY´ÏÙ³g4 :tèP#‹˜×^{ Ã`ñâÅV_C¡PÄ¡ë7€6A'Ð?“Í›7ƒa,]º´F×½ôÒKÐjµ¸té’ä9¼‡àرcEó9…by}óæÍƒB¡ ¬YòòòàîîŽ^½z×¼õÖ[Ëå„ß?F#¹u}æÌ‹-í¬áÆX»v-–.]ŠÏ?ÿ»ví²¸U.Æï¿ÿŽ=z€a 2D´ñgŸ}¥R‰V­Z!77WplÙ²eÉdHKKiÓ¦ppp Š8V¯^ ¹\ŽôôtB¿ûî»`&LiiiP*•¢Ö<»wï†Z­F§Nj$°gÍšõ?ÍM …"„®ßTÚ@ÿ\F •J%Y¹+Fqq1êÕ«‡ÈÈH‹‘÷ß È p‡öíÛÃÛÛ=+--EýúõѸqc"2ÅçîÚµ‹¸&(()))Äo=zô’ÝJ8ŽC`` † &ù.Rܽ{)))` Ã@©TVýo†aP§NLž<'Ož´*ÒÊq¾øâ ØÙÙ¡^½z¸xñ"qÎàêêŠàà`B„oݺZ­Í›7lï¢mÛ¶P«ÕD!ÏæÍ›¡R©Ð±cG"bûÑGa¼üòËÄó———£[·nP*•#;vì€J¥B·nݬnÇq&L˜†aðå—_Zu …B!¡ë7€6A'Ð?—Ç£aÆ µèógΙ3g 
R©0~üxÉsL&áíí-škxçÎèõzÑÊߣG‚eYbï?Lˆ•ï¾û Ã^w@¥EV«%úæò̘1ƒ¡Fùj†¯¯/V­ZUõޏvíÖ®]‹¡C‡ÂÝÝ à ,, Ë–-í›lÎ¥K—­V‹o¾ù†8~ãÆ 4hÐz½{öì;vì\]]Q¯^=ܾ}»j¼´´Ý»w‡\.ÇÊ•+×ìÚµ Z­‰‰‰Dä‘Ö¾ôÒKD¡JyyyU$Pì»ólݺ …½{÷¶ºýÇq1bd2Ö¯_oÕ5 E]¿©´ :þÙüú믰··Gß¾}k”ÈGø,ܽ{NNNHOO½÷7ß|#iÃ2qâD¨Õj¢í•+W R©0}útÁ8ÇqHLLDHH™ÌË˃Á`À¨Q£DŸóêÕ«5î”2oÞ\³oß>ØÛÛ#..Žx¶/¾ø,Ëbøðᄬžh©:xÓ¦MËå8p Õ?L&ú÷ï…BñÔÛõQ(ÿèúM MÐ ôÏgÍš5`Ÿ}ö™Õ×p‡Ž;ÂÕÕ•(Ì¨ÎÆ%·ò8ŽCZZ\\\èââb„††¢Y³fDÔhÁ‚ËåÄÖõ/¿ü…BW_}•ø­7ß| …¿þú«ès¶nÝÍ›7—|s"""ð /X}>P¹5jT*\\\ðÞ{ï=17îóÏ?‡Z­F“&MˆÖv9r$†ÁÌ™3"ûþýûˆˆˆ€³³3Ž=Z5n2™0~üxQ»—#GŽÀÑѱ±±„·"ß^oذa¢"°sçÎP«Õs)×®][%$­ýÇFEEÒÒÒ V«±wï^«®¡P(•Ðõ› @› èßÁðááÑhpîÜ9«¯yð༼¼Ð¦M‹[{ƒ ‚N§_<€‡‡:tè@ˆ‚ǃeY,Z´H0^VV†°°0DGGy‚S¦LF£!"s%%%ðóóC·nÝDŸ‘7œ>~ü¸ÅwæÑh4„¡²µdffbذaÉd ybtëäÉ“ðó󃻻;<(8Æq/^,Ú$//-[¶„ o’㸪B s»—'NÀ`0 I“&D~æ×_- ,--E‡ Ñh, µ¯¾úªªË‰µ"°´´IIIÐét8vì˜U×P(º~TÚ@ÿJJJ‰ÐÐPÑþºRìÝ»,ËbÁ‚’ç"445xmß¾]t[¨t*•Ѝ >qâd2^ýuÁxQQüýýѾ}{B`¬\¹R²;ˆÑhDHHºwïnñ}yX–%|ùjÊÅ‹Ѷm[0 ƒÎ; röÌyðà T*±|ùrâøºuë R©Ð¶m[Áß_qq1:tè¥R‰o¿ýVp ï8eÊÁ·:}ú4œÑ¨Q#Bò‘À‘#Gß÷ñãÇHJJ‚!T«óñÇWE-­¥¨¨-Z´€““1(Š8tý¦Ð&èú÷ð믿ÂÁÁ={ö¬Q>àœ9s “É,¶];uê”J%&Mš$z|Ô¨QÐh4„•ËãÇÑ AѪã©S§B¥RÕ²¼_áêÕ«ã&“ 111ˆŒŒí‡üÙgŸeYÑê[s\]]E·šk Çqظq#¼½½¡Óéðá‡JæÈ•——㥗^ªòü3‡Ÿþz½QQQšòòrôéÓG´ßï|Pu¿êçgÏž…‹‹ ¢££ øÅ_Höï-..F«V­ààà`1šúÖ[oa¼ñÆ¿OuòòòаaCxzzŠ¶Ð£P(BèúM MÐ ôï‚ÏÙë +…ÑhDBB¼½½ñàÁÉóÞ~ûm0 #Z,`É^æÌ™3P*•˜:uª`üñãǨW¯7nLXŒôìÙ®®®xøð¡`üĉ`YVôýÊÊÊP«V-ôèÑã‰ïܼysôîÝû‰çYKAAAU>_BBnÞ¼)yîG}¹\Ž””"Z{îÜ9xyy!88X° n41lØ0Q½eË–‰VúZ|uðøñã XXXˆæÍ›Ã`0X´š;wnýþrrrP»vmZÌ=¥P(tý¨´ :þ}Œ7J¥²FùV÷î݃››’’’$#X&“ )))pss]¼Ïž= •J…qãÆÇ-Z–e‰(ã‰' —ˉöaÙÙÙprrBß¾}‰{ 6 z½ž(<*‹.¬™ï'N„¯¯ïSï[›‘‘ZµjÁÁÁ+V¬¼ÿîÝ»áèèˆððp¢…ÜÍ›7 OOOANgõ~¿æ;Ÿþ¹h~ßÙ³gáììŒÆíûxŸÀéÓ§Ï™ŸŸÆÃÕÕUÒ4¼úóˆÙÝHqûömøùù¡Aƒ5jgH¡üÛ ë7€6A'п²²24kÖ ~~~DÍ»wï˲·Fù‘V­Z‰Žðö2æ¾r|”ÑÏψFÍž= …'OžŒóÆÑææÇ¹¹¹puuEŸ>}ˆß¯¨¨@ݺuѶm[‹ânçÎ`¦FE3Ö’ŸŸ€aôíÛWÒ2æâÅ‹¨U«¼½½‰H[NN¢¢¢`0UÀÕ @Ìó6ùü¾¡C‡ Dà™3gàì쌦M›ÕÁ|ǹsçÏ—››‹ððpxyyInÙr‡¡C‡B&“á»ï¾³ø]ªsùòe¸ºº"&&………V_G¡ü› ë7€6A'п“ÌÌL¸¹¹¡]»vV›÷ÿÍÌÈÈø@p¬´´iiiP(ظq£à¿…;mÚ4üòË/Eû>|:íÚµ´Î«Þ¿×¼Ò¨²ˆŠŠ"¶‘yŠ‹‹«¬^¬©ÄæÙ´id2™hU2…òo†®ßTÚ@ÿnL&ºté½^k×®Y}Ý¡C‡ P(0qâDÉsª E±(%4ß¾,**B½zõ.è Ìoùººº úþr‡^½zÁ`0Û|ÎáÏ?ÿLüþøñãagg‡[·nI¾C||}ú@.—QÂ÷Þ{O´˜ƒ¯ô0a‚`ü矆F£A§Nb”ã8 
[binary data omitted: PNG image content from tar entry bayespy-0.6.2/bayespy/tests/baseline_images/test_plot/gaussian_mixture.png — a matplotlib baseline image used by BayesPy's plot tests]
½EA€\êÖ­«]»v);;ÛØÐÿÃHÎ ²{÷n=ZS§NUdd¤Þxã 9sÆk×  ͹ܴ4Çë¹DEEéÈ‘#úý÷ß­ÅÃþF ׄ ´{÷nõêÕK£FRÆ õñǰð@`s.ÏB\€ç%IÚ¶m›qu¡ÿ‡‘ªW¯®÷Þ{O[·nUdd¤z÷î­N:iëÖ­¦\'€ÀV„àW^y¥Ê•+gÜ͹ý?ŒÒ¸qc-[¶LŸ~ú©~ûí7]sÍ5ºçž{Œí ¯ Àƒ‚‚Ô¤IãF@ÜõÿpÖUÈþF°Ùlºå–[´mÛ6½ñÆJIIQݺuõæ›oêܹs¦Ö(y J†B.oÙ²¥¾üòKeeeÿÚññyïvåìÿQ„µ*ŪáÇ맟~Ò€4räHµjÕJ7n4üZ8@” …\Þ»woýòË/Ú°aƒ‹rQ„µ*F©\¹²Þ~ûm}ýõ×ÊÊÊRëÖ­uÿý÷³m/À+ _€_{íµºôÒK•’’bN&4+,HëÖ­µqãFMž©—_~Y7ÖÊ•+-­ àß ç4¬ùó盳=­ÅÍ óR¦LM˜0AÛ·o—ÝnW—.]4dÈ©Š„ù:t¨öï߯ýë_Þ½5+ÌK½zõôùçŸë½÷ÞÓ‚ Ô¸qc}öÙgV—ð3ÈG‹-Ô±cG½öÚkÞ½5+Ì‹ÍfÓСCµcÇ5nÜXݺuÓ]wÝ¥c­Uø`Ô¨QZ¿~½¾ýö[ï]¤kU¬P«V--[¶L3fÌÐâŋըQ#-]ºÔê²~€èÞ½»"##½? âgl6›´cÇ]sÍ5ºí¶Û4pà@=zÔêÒ>Œ ÒÃ?¬””¥ùÀZ _S³fM}òÉ'JLLÔ'Ÿ|¢† jñâÅV—ðQð@||¼*V¬¨7ß|ÓêR|’ÍfÓ Aƒôý÷ß«eË–êÕ«—ú÷ï¯#GŽX]ÀÇ@ÀåʕӰaÃôþûïëøñãV—㳪W¯®%K–höìÙZ¾|¹š4i¢+VX]À‡@ÀC>ø Îž=«W^yÅêR|šÍfS\\œö>óŒ:׫§®]»jĈ9»¨§¥I‰‰–Õ°[÷€ß#€@uéÒE:uÒ¨Q£rö¸€ÃêÕÒÌ™ïve·;ޝ^-IŠŒŒÔÚµkõØcéþ>ÒŽV­ôÇ–-9?ÃÖ½0 PD6›MS§NÕþýûõ /X]Žï‰Ï{«]»Ýñú?BCC5qâDMûä ÌÌÔîÖ­u²A©N©E ).Î}˜øCýúõõÄO襗^b[^tëÖMK·oW¹ÐP•ßµKÚ»WÚ¼YÊÈ |@€ €@13FµjÕÒ}÷ÝÇ‚tÔªUK ªUËqì\zºEÕŒF€b ÓÛo¿­Õ«WkÖ¬YV—l•+çøþû_UjjªEÕŒDÜtÓMŠÕ£>ªto<­OLt,Äv'Ðv‡ÊµuoFãÆ:¢Ø¶mõÎ;ï0Ê~.Äêp±‡~X+VÌq,66V±±±UÀ¯½öšêׯ¯Ñ£Gëý÷ß7öä11Ž] r/Ävî5s¦±×+¬ÄDGîÖi¤¥9v¼rYtž'×­{ÿ9W˜¤†?þ¨%:©Ë°aZ³fÞ}÷]…‡‡W?¸‘œœ¬äääÇŽ?nQ5Öͣ$Ÿ‘ššªèèhmÞ¼YÍ›7·ºEðöÛoëþûïך5ktíµ×{ò´4ÇnPÒ±cŽãaa9nÖ-ã„ò HžÔX@ù楗tSR’ªW¯®””5iÒÄòÀSܯS°À@÷Þ{¯Z·n­ûî»OgΜ1öäv»#|lÞì{»C9{{ÄÅ9¶Í-êö¹lÝÛæí·µiÓ&•.]Z­[·Ö|`Ð0  ¤wß}WÿýïõÜsÏcÇòÿÞJ&¤«¯¾Zß~û­ú÷﯄„Ýu×]ú믿 ½À{ `°¦M›jܸqš8q¢6nÜhìÉ+UÊÿ{«™Ê”)£÷ß_‰‰‰Z°`Z·n­;wzåZc@À ž|òI5kÖLñññÊÈÈ0椹v‡Rt´ãû¼vDz‚ÉiРAÚ°aƒ²²²Ô²eKÍ™3ǫ׼ 44T‰‰‰Ú³gÆWüºîµi“´gãkR’ã¸/„‹R£F´aÃõêÕKqqqzà”™™éÕkŠŽ^Ò¨Q#=ÿüóš4i’Ö®][¼“­^í~1·sñ÷êÕÅ;qYÊ—/¯Y³féwÞÑôéÓÕµkW=zÔ«× ¼hÔ¨Qj×® ¤'NýDìåQ oò€d³Ùtï½÷êóÏ?×Ö­[ÕºukíÚµËë×¼(88X³gÏÖ‘#G4bÄ«Ëñ H111Ú°aƒJ•*¥6mÚhùòå¦]P0xÙ•W^©7ß|S~ø¡-Zdu9%ÂUW]¥¯¿þZíÛ·W·nÝôÆoˆ¾»à `‚»îºK½zõÒ=÷Ü£ß~ûÍêrJ„ *hÉ’%5j”FŽ©{î¹Çøæ€B#€€ l6›¦OŸ®ÐÐPÅÇÇ+++Ëê’J„àà`½òÊ+úàƒ”˜˜¨›nºIGޱº,(Ñ `’ˆˆÍž=[Ÿþ¹&Nœhu9%Ê]wÝ¥ÿüç?Ú¹s§Zµj¥ï¿ÿÞê’ Ä"€€‰n¼ñF;VO?ý´V[½un Ó¾}{mܸQááájÛ¶­>ùä«K€‰&{æ™gÔ±cGõïß_‡²®Äļûs¤¥9^0µk×Öúõëuà 7è¶ÛnÓ«¯¾Êât0L¬¹sçêìÙ³8p uëAbbÜ7 t6Œ‰±¢*¯+_¾¼-Z¤'žxB=ö˜ô÷ß[]”°Àå—_®¤¤$­ZµJ/¾ø¢5E8›ÆÅI-ZHuê8¾ÆÅ¹o*@‚‚‚ô 
/())IÉÉɺᆬ€„¹é¦›ôÔSOiܸqZ³f5EØíRF†´y³´w¯ãkFF@‡W З_~©={ö¨eË–Ú¶m›Õ%@À#€€…žyæ]{íµŠµî ü±cùàÚ´i£7ªJ•*j×®þõ¯Y]4X($$DsçÎUff¦uëA*UÊÿûàŠ+®ÐÚµkÕµkWõêÕK'Ndq:x ,V½zuëÖƒ¤¥IaaRt´tÕUޝaayïŽe‹vç*W®œ>úè#7NcÆŒÑÀ•‘‘á•k@IFйsçóëAþýï›sQçnWIIÒ¦MÒž=ޝIIîwÇ2‹…»si„ š7ož.\¨ë®»N¿ýö›×®%|ÄøñãÕ©S'õë×O?ÿü³÷/¸zµûÝ®œ»cYÕ(ÑvçêׯŸÖ®]«¨U«VJMMõú5 ¤ €|~Òp/88XóæÍSÅŠÕ«W/ýõ×_Þ½`||Þ7óv»ãu«øÀî\-Z´ÐÆuùå—«C‡Z¸p¡i×€@Føü¨á^åÊ•µxñbíÞ½[C† ñ…ÐV8Ø«zõêZ½zµzô衾}ûêÙgŸõß ø±« ¯sÒ“‘ḑ­TɱØ:)Éçz^4iÒD‰‰‰ºýöÛÕ¼ys=þøãÖtäˆ+%'çüg•–æ8Þ·¯w®ë#»s•)SFsçÎU£F4nÜ8ýôÓOš1c†BCC-©ü@Éà:¥Ç):Úç‡Sß¾}5fÌ=ñÄŠŠŠR×®]­+¦OiáBÇW›íB€“¤R¥Çæº;—k`LK³äwf³Ù4vìXÕ­[Wƒ Ò¡C‡”’’¢òåË›^ ø;€’æôƳÏ>«-[¶(66V7nTdd¤5…ØíÒܹRãÆ’ëº”rå¤;Œ®»såqIH0m!º;ýúõSÕªUÕ³gO]wÝuúôÓOu饗ZR ø+Ö€(9|dJ§‚ƒƒ5gΜó7¼þù§uÅØíRffÎc™™Þ ¾º;×?n¸á­Y³FTûöíµgÏKëCP2øbÃ=\rÉ%ú׿þ¥ýû÷+>>ÞšNéfóåݹþѬY3}ýõ× V»ví´Ùuj _ÏWîy¨Aƒš={¶>þøc½ð Ö‘–&å^têóÿì¼Én·kýúõºòÊ+£•+WZ]ø€ÀçãSz<Ñ£G?^ãÆÓ’%K̽¸s·« rŽ 5hà8î/!Ä Û GDDèßÿþ·bbbÔ­[7%%%«D(  ŸLéñĸqãÔ»woÅÆÆšÛ™;%űûÕ‚9G,pOI1¯–âðR?˜råÊiñâÅ8p ¨W^y…^!vÁ?¤Ù³g+&&FÝ»w×·ß~«š5kzÿÂU«:vÁr7‚4w®_Œ Iòj?˜ÐÐP͘1CÕ«W×ã?®_ýU“&MRPÏù 7ø‘²eËjÉ’%jݺµn½õV­]»VáááÞ½h~#Dv»ÏöRqË‹ý`l6›žþyU¯^]>ø ~ûí7%&&ªtéÒÅ>7Í€Ÿ¹üòËõé§ŸjïÞ½ŠÕÙ³gÍ-À k)L½†—ûÁÜÿýZ°`/^¬›o¾Y'Nœ0ôüàï à‡š4i¢ hùòå5j”¹÷ÒZ Ó®aB?˜>}úhåÊ•JMMÕõ×_¯Ã‡~ ðWLÁ?Õ¥KM:UÆ Sݺu5|øps.ìŵ^»Fbâ…Ðâìã<§$Mš$=òHñëvѱcG­^½Z]ºtQ‡´jÕ*ÕªUËÐk€?büØ}÷ݧQ£FiäÈ‘úôÓOÍ»°ëZн{_Oœp Š:eÊn—ÌyŒŒ ×(Ìycb¤þýÛ»öƒII‘J•r|õÂvÂM›6ÕºuëtæÌµoß^»ví2üào àç^~ùeuïÞ]ýúõÓ–-[Ì»pîµ»w?eêÜ9÷×,ìyív©wo)3SêÛWªSGjÑÂ1Â2w®”œìµÝ¼"##µnÝ:U¬XQ:tЦM›¼rðLÁ?¬9sæ(&&F·Þz«¾ýö[Õ¨QÃûνv¢n]ã§eU«&8páû_~q‡¢œ÷ÑG¥yóòÞË‹»yÕ¨QCkÖ¬Ñ-·Ü¢ë¯¿^K–,Ñõ×_ïµë€/c@¹rå´dÉÙl6uïÞ]'OžôîÓÒ.¬¥pvF¯PÁ1 +¯)SŹF©RŽcgÎï¼^Þ+?•+WÖ矮¶mÛªk×®Z¼x±i×_B€Q½zu}úé§Ú½{·î¸ãeffzçBÎéO®k)6mr|¿{wÎ÷õ?÷5r7\¬áÇëÒK/Õã?îÝ zëߨóæ7j3s¦ãuˆ$Ùl6Mœ8QUªTÑc=¦ôôtM›6MÁEÝ?Bà?Ìh€@|ðA:tH£GVDD„¼w1oÝàuÞüFÆìvKÿì<ú裪T©’î¹çýñÇš={¶BCC-«¼À¿¸6ÀsrÝJ9L˜0A‡ÒÝwß­ˆˆÝvÛmÞ¹·nð}88iÈ!ºä’Ktçw*++KsæÌ!„X,Bà,ÜJÕߨl6M›6M½zõR¿~ý´víZc/˜˜÷"î¢v@/¡úôé£ èã?Ö€¼·‹XŒÀÿX¼•ª¿ VRR’Ú¶m«îÝ»kÛ¶mÆÜ¹.'¿èf„” B={ö$„xþÅG¶Rõ7aaaZ¼x±®ºê*uíÚU?ÿü³1'v]—Ó¢…T§Žãk\Ü…µž„”â2ã&!„tþ#¿xîn>ÝHMMU=tíµ×êé§ŸVFF†×Ëö*TвeËT¶lYuîÜY‡ŠÚÌ/7×u9î: {RŒ¨ÁÛ×ð”£1¹CÈÙ³g 
-¬Dà?<Ù)o¿ý¶Ú´i£={ö¨Fz饗ԴiS¥• z.þy IDATÑ“K/½T+W®ÔÉ“'uóÍ7ëĉÆœ¸ u9…#ØíÒÁƒy_ì©XƸ†þýûB ÿŸ÷ «ÝžïŽI_ýµ|ðA 2D©©©š7ož¾ûî;effªoß¾%j$䪫®ÒòåËõÓO?©W¯^úûï¿‹ROÖ嘱yÀ¹sî¯aæT,Gc!@À;yò¤âââÔªU+M™2E¥J•’$5lØP)))Ú±c‡}ôQ‹«4WÓ¦MµtéR­_¿^qqq:—ûƽ0<]—cÆæÕªåüþ—_¬™ŠeàˆOÏž=õÑGB €€7}útíß¿_³gÏVHHÎöGÍ›7× /¼ ·ß~[?þø£EZ£cÇŽš?¾-Z¤|PÙÙÙ…?‰§ërÌØ<Àõÿ„L9ãé^ž0pħW¯^„ƒ  effêõ×_Wÿþýéö=÷ß¿ªW¯®gŸ}Öäê¬×£GMŸ>]ï¼óŽ&L˜Pøx².Ç€Í ”û5kæ|ݨ÷…aðˆkaa:FÐ>úè#8p ß)Vaaazê©§”œœ¬;wšXo2dˆ&Nœ¨ &è­·Þ*܇=Y—SÌÍ<’û¹oöƒƒ‹ÂðÒˆ3„,Z´ˆÀo@´™3gªS§NjÒ¤I¾ïKHHPDD„Þÿ}“*ó-£GÖÈ‘#õàƒ*))ÉØ“có€"]ÃÝÍæõŠñòˆ!€¿ )ø-0ÛÃ?¬Š+æ8«ØØX‹*üSzzºV¯^­iÓ¦øÞR¥J)66VsçÎÕK/½tÑZ‘@g³Ù4iÒ$?~\ñññ²Ùl0`€ÕežëÍ¿kèq7c!º'#>Ŭ¡W¯^š?¾úõë§hΜ9%îÏ,`†ääd%''ç8vüøq‹ª ¶ì"­:„7¤¦¦*::Z›7oVóæÍ­.ð{³fÍR||¼~ýõW]~ùå¾Ó¦MjÙ²¥–/_®.]º˜P¡ïÉÊÊÒСC•˜˜¨Y³fù_ILtlµëî?-ÍqóoĈ‹X´h‘úõë§>}ú())‰˜€ûµâã¿TÖâŋզM‡$EGGkôå—kÙÛo» x›[PPÐùihƒ ’$ÿ !ùýnìvówÂò²Þ½{Ÿ ‘DàX :uJË—/W¯^½<þŒÍfÓ¨×Ò¥úëûïs¾hf#;‹9CH||¼ ¤9sæX]rKL<¿–ÄB.\¨¸¸8ýé's:¾@ñ˜@@Zµj•NŸ>­ž={êsÝx@×½úª¾íÕKå*Tpôn¨Tɱ¨9÷º‚–{$Äf³©ÿþW…óbbr¬iq†Çï¸C»þóÕ_¿žÿÁðYü÷ @@úì³ÏtõÕW«^½z…úœÝn×åmÚèÏÔ¥»w_x!:ºÄ„'gÉÎÎÖÀ%É÷CHIYâ\Ðçh²xì˜zWª¤këÖUÛÝ»ÕbìX¦cðYü— @@Ú´i“ÚµkW¤Ï^ýõ Ù°!çÁbt±ög®#!~Br œçº V °Ûácóæó‡ªFGë¥ùóuçwÊf³iöìÙ„>‡ÿ*8gΜÑöíÛ5xðà"}¾cÇŽ:„˜½(ÜÑIùOÓ2‚7Ï_ÈŽï÷Þ{¯:¤±cǪjÕªºçž{Œ« ‰  ìܹSjÞ¼y±ÎS¹re5iÒDk֬ѠAƒ ª.°*„xs4ÀU!Fò¦eoŸ¿ÆŽ«C‡iذaŠˆˆPïÞ½-­@ÉEP6ÿs“{Í5×û\;vÔÊ•+‹}ž@æqñöhƒ“§#žLÓ*oŸ¿l6›Þxã >|X±±±Z¾|¹®¿þzKjP²± €€’ššªzõê)<<¼ØçêØ±£vïÞ­ß~û̀ʗ3„ 8PqqqJNN¾øMy˜¹e­ëyÓ´6m2~§3oŸ¿‚‚‚4kÖ,ÅÄĨGúî»ï,«@ÉEPе=—Ž;J’Ö®]kÈù™3„ÄÅÅ)..NóæÍ»ðb~ÛÅšµe­+O¦iùòù‹©T©RZ¸p¡êׯ¯®]»ê§Ÿ~²´%S°ŒììlíØ±Cݺu3ä|—]v™jÖ¬©-[¶èŽ;î0äœ,88X3ÿÙÕjÀ€’¤;Û´)xQ¸ÙË ¹€ÛçÎo€ððp}úé§êСƒºté¢õë×ë²Ë.³º,%# ƱcÇtüøqEFFüf§þDEEiÛ¶m†Õèœ!$..N Ð7/½äÙ¢ðü:—Ã+ªV­ª•+W*##C]»vÕñãÇ­. @ A0öîÝ+Iºêª« ~³‡S …çBÚOŸ®yß|ãþvû…ѳ©ÄŠ5)ª]»¶V®\©ýû÷ë¶ÛnSFF†Õ%( Æž={$y@\ûS´h!Õ©ãø—ã‰}TT”8 cVÝû©Ü#!9Ö„¸SЖµf+Ö¤X¬Q£Fúä“O´qãFÅÆÆêìÙ³V— À@Œ½{÷ªR¥Jªäi¿¦þDEEI’¶oßn|Á.wùðÃÝ¿1¿EêNfƒi i×®,X ¥K—jذaÊÎζº$ŒEèÆÞ½{=ýpUÀÔŸzõê©T©RÚ¶mÛù]±à9g)]º´¬cÇŽéᇾðO;—›ÙQÝÌÆ‰>¤[·nš9s¦âããU­Z5ýßÿýŸÕ%P£H¤€©?¡¡¡jР¶nÝZÌêJ®àà`½ûR¥ŠF¥ôôt=÷Üs²Ùl…ë\nf0ð•5)&4h>¬G}TÕªUÓˆ#¬. 
@"€{÷îUË–-=ÿ€‡Ýª7n¬~øÁèrK›Í¦‰'ªJ•*zì±Ç”žž®©S§*¸°[Öš Z“Àyä:tH#GŽTÕªUÕ¿«K` ¹sçôË/¿¨V­Zž}ÀÓ©?’4h Ï>ûLÙÙÙŽ§ö(²G}T•+WÖÝwß­cÇŽiÖ¬Y*Uª”ç'0#xLÙ‹/¾¨Ã‡+>>^UªTQ—.]¬. @a:€€ðûï¿ëìÙ³ºâŠ+<û@!ºUׯ__ÇŽÓ¡C‡ «·$KHHЂ ôñÇ«Gú믿<û '‹Õ‹Ë5˜nÚ$íÙãøš”ä~|€²Ùlš>}ººvíªÛo¿MŠ 8p@’<‰Ïûi¶k 9F@$içÎŨ®z÷î­Ï>ûLëÖ­SçÎ ÞæØ¬`Pˆ`èBBB”œœ¬ÈÈHuëÖM¿ýö›Õ%aÿþý’äùH!DFF*88˜b°N:é‹/¾Ð?þ¨˜˜˜üopÍ …¦%AùòåµtéReee©{÷îžV@> ÂT®\9Ï{€B©R¥©]»v~î’®eË–Z³fŽ=ª:œïf_ %¬Sº$Õ¨QCŸ|ò‰víÚ¥¸¸8;wÎê’ø9€€°ÿ~]qÅ^[$Þ AF@¼¤aÆZ¿~½‚ƒƒÕ¾}{ß^oP;¥KR³fÍ4oÞ<-Y²D£G¶º~Ž 8pÀóõE@ñ®Úµkkݺuºì²ËÔ±cG}õÕWV—ä^îNéÕªIQQî;¥؈ȭ·ÞªÉ“'kÒ¤Iz÷Ýw­.€#€¿þú«ªW¯îµó_yå•:xð Îœ9ãµk”tÕªUÓ—_~©¨¨(Ýxãúä“O¬.É=׆ˆ‡KÛ·K'N¸ßÎ9ÀFDzè! >\<ð€V¬Xau9ü@@HOOWDD„×Î_»vmeggëàÁƒ^»¤Š+jùòåêÚµ«zöì©™3gîf­ÑȽk×ÿë©SÇñÕ݈H€˜}ºî¸ãeddüA×5FHäÕ±^½œïóÂvо¦r¯^ú¼V-ý¹}»âãã/ŒRå(̇ü€ß;räˆ$å@ zòZ»víó až¡C‡jñâÅZ¶l™g]Ó¥‹G Œ‘p×1-MªPAjÒDªZÕ1"f\wvw|a:“Ý®ð ´Ön×ã))ú£J•‚…7Ã!¿Aà÷ÒÓÓ%)ï)X=y­]»6# ¹õÖ[õÅ_èûï¿×µ×^«_~ù%ÿä0jD"wCDבmÛ¤C‡#"IIîC¯Q|e:“Ý®ˆråÔRR¥cÇ< Þ ‡ü€ßs|¡ðäµV­Z µiÓFëׯ×É“'Õ¶m[}ÿý÷îßèn†·F$܈HBïêÕÆ_Óõü¾0)W€8sèPþï÷V8à7 üžs VåÊ•óc1Ÿ¼Ö®][þYÙ?ÿìþ ìäãuõë××W_}¥Ê•+«C‡Z·n]Î7äµFÃ[#¹GD\ÙíŽ×½5]ÊW¦3å ?>¬'N¸¯™á€Ï"€ð{éééªX±¢BCCóc1Ÿ¼Ö®][ŸŸ=«Ì­ŸúR‚U¯^]kÖ¬Q³fÍtÓM7iñâÅ^´jD"?Þœ.eõt¦\ât£F:yæŒíÛWYYY¿×ÌpÀg…X]WzzzÁ=@\o”Žs„ç“׬‘ôãªY\œãi³ë¹’’XLkgêOŸ>š2eŠî¿ÿ~LjC^ìvk~?®Ó¥Œþ3cåt&×@ñÏÏPFÒé™3uç!zsÔ(|ýõ ï÷$òïP"@ø½›º¹QÊqÜÃ9óÎ^ »33ÕÌ9õÅ):š›'“•.]ZóæÍÓ¨Q£ôÀhçΚë_xåýÕ§âââôÃ?{]~Àï8Ë“…°Ùl¶âe'Ÿrï½÷jåÊ•JMMU›6môßÿþ×ê’.fäŸw¡:&FzúiéÙgs†j Ö'Ùl6}øá‡²ÛíêÙ³§þøãÓ® À÷@ø½§`¹Sĉj×®­Ó;w²“ºþúëõí·ß*88X­[·ÖªU«¬.é£wrª#zO-.S¦ŒÊ•+§ªŸ}V¤>"°Vpp°^|ñE%&&*))I7Þx£¹7Á>ºFÃLÎõ gΜa=PB@øµÓ§OK*Bq> vw3èúºU«VÕÖK.)Rø†Aƒé?ÿù~üñGµlÙRÛ·o÷þE}|†™X”l~­ØS°Šp3X­Z5íÎÌôþŽFðªvíÚiãÆºä’KÔ®];-]ºÔ»ôƒ5fr]²nÝ:«Ë`"¿æ eÊ”ñüC‰‰RJJÞA¡[·|o«V­êX„îc]§QxµjÕÒºuëtÓM7©Gzùå—½·8ÝOÖhœ—˜èõi†ãÇWÛ¶m§'Nû|ü€_;uê”BBBêù‡bb¤Ï>»ø¸sT¤OŸ§`_7àc]§QxåË—WJJŠÆŒ£Ñ£Gë®»îrßh²¤q×8Q2tšaHHˆfÍš¥£Gꡇ*öùø‡« €â8uêTÑ{€ÄÅ9F,ŽsìD昂UÀFÕªU/l!êƒ]§QxAAAzþùçÕ°aC%$$h÷îÝZ´h‘.»ì2ã.’˜è¸iw÷ç+-Í1êæK£ Åü÷ÄSW^y¥Þ|óM } 9/ßEñA?ü°*V¬˜ãXll¬bcc-ªð]E RÎéSNÑÑÝTqÝÑÈõæ,-iX~ªÿþŠŒŒT=ÔªU+-Y²DÍš53æäÎ…ÜSÿœ# 3gs#ƒN1þ=)Œøøx-]ºT÷ÜsÚ¶m«êÕ«z~ ¨’““•œœœãØñãÇ-ª&p@|Ðäɓռys«ËüB‘ˆTäéSÕªUS¥'”u×] 
Êý$Øõf’â—Zµj¥7ªGjß¾½fÏž­Þ½{ÿÄ&(tL˜fh³Ùôî»ïªI“&JHHвeËd³Ù ¿PX¦*::Ú¢Šk@øµÓ§O=€qúTÕªU#é÷‰ÙÑ(@Õ¬YSk×®U·nÝÔ§O=óÌ3:wî\ñOlÆÆ®ALjÚLšf¡™3gjÅŠzï½÷¼r ¾À¯y¤ áªV­ªY’þçì’›/îh„B+[¶¬æÏŸ¯çŸ^Ï?ÿ¼ºté¢ßÿ½ø'6c㣂ŽÉo¾ùf :T<òˆ~þùg¯\€õ üÚ©S§ ·¯tq§ŸvlË뮈›íF«V­*IævІ%l6›žzê)­ZµJ;vìP³fÍ´º¸£[fm\PÜ cDãÄ"lå;iÒ$U®\Yƒ¦K:  üÚE# žÜðänçœ3/åœ>•Çv£’ç†nЖ-[T¿~}Ýpà zá…ŠvslæˆBqƒŽ‹°•o… ôÁhõêÕš:ujájàX„À¯:uJåË—¿pÀ“¸yÝP¹.ž2%ÏÅÁåÊ•SÙ²e %Ìe—]¦U«Vi„ ;v¬Ö®]«Ù³g+""³¸Ž(x{ã#vhËo¡ÝîÙyЏðþ†nЃ>¨'žxB·Þz«®ºê*Ïjàà×.)ê\OæÌ»LÉÑŒ%FHHˆž{î9-[¶L›6mÒ5×\£¯¾úʳ1¢à #¦N©ˆëQ&Nœ¨ˆˆ=øàƒÞëNÀ~íôéÓ¯±Û¥ƒ/¾áY½ÚqóånîyZš´Îc®sæsM©R¥ŠŽ9bì¿Ñ¥K}÷Ýwª]»¶bbbôꫯ|“Ÿ6j㳂Naa=Jùòå5eÊ-[¶L .ôRa¬À,~íÌ™3*UªÔÅ/äÞ2õØ1Gxèß_ÊΖ\K9ÃÅe—I®£ÿûŸcÅÍ”‘ *èÏ?ÿ4üçÿ¨Y³¦þóŸÿhìØ±zì±Ç´víZ}øá‡ªä­Eåž2bê”ÑŠ¸¥Gºí¶Û4bÄuîÜY*TðBqÌÆ¿–••¥  7ÿ)«V-ç÷¿ü"õí+9#ef:þÞuzÖ³ÏJ*\X\¶¬têTžSFÂÃà Phh¨^zé%-Y²Dk×®UóæÍµqãF«Ëò-Å\x?eÊ?~\cÇŽõj™ÌCàײ³³/ ®7<ÎÑ‘3gAÂÉuz։ޭx]çÌ_vYÎsæš2B«îÝ»ë»ï¾SµjÕÔ¾}{M™2…u ’!ëQjÕª¥ &hÚ´iÚ´i“×Kà}~-ÇHb¢´n]Ξš5s~àС‹çŸÿïÏ™/`ʹծ][k×®Õý÷߯‡zHwÜq‡Ž?nuYÖ2h=ʈ#Ô¤IÝwß}Æt¤`)¿–#€ÄÄHwßí˜Nå¼áÉ$‚ƒ/>V«ÖÅ»]0e„wJ•*¥×_])))Z¹r¥Z´h¡-[¶X]–u Zx¢wÞyG©©©zë­· +€5 üZŽb·KË–IO<áXÛQ«–´k—T®œtÅŽ Q¥Šd³å.<œ2BA~úôé£ÔÔT…‡‡«M›6š>}ºñS²ŠÐeÜŸµiÓF÷Þ{¯žzê)\zñEÇt­gŸÍ¹‹PT”Ô¹s»9[ñ¢¸z÷î­íÛ·«Q£Fêܹ³F¥ŒŒŒ‚?hP—qöâ‹/jïÞ½zï½÷¬.@!@ø—PqÓ%—èÇËûɯ»›´˜˜ ácÏžœ»mßî2,PgFª^½º–/_®É“'kÚ´ijÕª•¶mÛ–ÿ‡ ê2n©b6Slܸ±âããõÜsÏyÚøÿã*.ꄞûÉoî›4ç ÍÌ™ŽŽéS¦H¹û üõW%”.]Z¡¡¡&((H#GŽäÞ5ß IDATÔÆ%IÑÑÑš0aB`O/2 ™â“O>©C‡iΜ9^)€ñ üK¨ÈÊÊR“ÔÔ 70¹Ÿüæ~Šê¼á‘.Œ|œ=›óüî"Nᢢ¢´qãF=ùä“zî¹çÔªU+mÙ²Å겼ÀfŠõêÕS÷îÝõÚk¯o71¦!€ðkYYY:YðSTçTמíÛs¾?8øÂZ’W_-pjH…  ðŠÒ¥KëÙgŸÕ† ”­–-[ê™gž ÌÑš)>òÈ#úᇴråJ¯• À8~-++K'#" ~Šê:ÕÃyÓûf®Y³ kI-’bcó 5Œ€ÀÛš7o®7ê©§žÒ /¼ -ZhóæÍV—e¼b6S¼öÚk­I“&Xo!€ðkç×€ô5÷TÜSZJ•º°‹–Ý.Í+õí›o¨!€À ¥J•ÒøñãµqãF«uëÖzê©§ô÷ß[]šqŠÙLÑf³é‘GѪU«´=÷È&ŸCà×rÌù.è)ªkH9w.çkMšäÜEËn—y$ßPS¦L>}ÚàŸp¯Y³fÚ°aƒÆ¯W^yEÑÑÑç¬û5ƒš)öíÛW5kÖÔäÉ“½R&ã@øµs†‰üž¢:×€ä%!!Žðqü¸ãûÜ»håjJ—.XO¡áóBCC5vìXmÞ¼Y¥K—V›6môÄOøï´6S Õˆ#4gÎýþûï^+@ñ@øµeffüÕ¹$44ç 4""¤>¸°;–ë.Zù„¬Ò¤I}óÍ7zî¹ç4yòd5iÒ¤p]Ô}…Á͇ ¢ììl-X°À°À¯…††ªÜáÃ?EµÛ÷í“Ê”qŒ|”-+íÞ- &uèpñ ϺuÒ‰y†šR¥J@`™ÐÐP3F[¶lQÍš5Õ¹sgÝyçúõ×_­.Ís7S¬T©’n¼ñF¥¤¤»4ÞCà×BBBTã§Ÿ ~Šš–æè~^·®tú´£÷Ç©SRíÚÒ=÷8†ë ϺuR÷îŽÏçj/hР¾øâ 
Íš5K_|ñ…êׯ¯7ß|óÂÔĦoß¾Z³fþ÷¿ÿY] €<@øµÐÐPmnܸ২Ω¹»œgfJK—JƒçÜíjð`Çñ.>ß?¡¦téÒÙ—~Çf³iàÀúñÇ5`À9R­Zµ ŒEê…Ô³gOkÑ¢EV— ~íöS§žžîþE×.èΩîÖttè U¬˜s·«Š/Nÿ„F@àk*Uª¤·ß~[_ýµ²²²ÔºukÝÿýúã?¬.Í4•+WV§NXø0¿–®ÞŸ~št×cy-T/B#4|UëÖ­µqãFMžŸ}Ö±0ÝudäÄ ÇBtwþ™ÚE)œ=lÜqf˜‡5jhþüùZ¾|¹~ùå5kÖLÆ ÓáÇ /ÕWDDD¨N:Úºu«Õ¥pƒÀ¯…††êìÙ³EšB%Éq÷ØcÒ}÷å™9Ó± Vîâ2µ‹S8{Øx2Í0]ºtÑŽ;4iÒ$Í›7O‘‘‘š4iRÀn¤¥mÛ¶Y]7 üZhh¨ã*¿.èRÞ7q))’Í&õé“óx‡îwÇr™Úâ?€79w^+hš¡J•*¥‘#Gj÷îÝ8p F­FiáÂ…·>$**J[·n ¸Ÿ ~­téÒŽ]°òZ\âî&.*Jš>]š;×ýMœ»Ý±\¦v@`»½ài†…¡©S§jëÖ­ŠŒŒTß¾}Õ¡C}õÕW†–m¥¦M›*==~ €"€(šbÎK7ÊçÎéžo¾É{qy:9»¡»ÞÄmßî*ÎF…î:”ó{—©]¡¡¡ÊÌÌôÖäTÔi†ùhÔ¨‘–-[¦U«VéÔ©Sjß¾½úöí«Ý»wûÜV‹ŠŠ’$¦a>ˆ h š—^\ÍÿüS“›4É» úž=9G>r/Jýë¯ü–Üý\¦vùýˆ„Hx¨ i†Åpã7jóæÍJLLÔ† Ô°aCÝ}÷ÝúùçŸ »†ÙjÖ¬)IŒ€>(Äêø)×)Mާ±•*9F’’òœ’­ÌÌLedd(##Cÿý÷ù¿w÷—óõsçÎIrt|vý+)8X§~ÿ]|ðÁE¯Ùl6ÙBB¤õëÕõ×_UÅÍÍÔÉR¥´ûØ1•òIÕìÓGÁgÏ*øøq©reËV¯žã®?ß?£)¡¡¡ÊÎÎVVV–‚‚üðyŽ3xå^Gà ‘3gZUrsíaãæÏ¢‚‚‚4hРÿoïÎ㢬×ÿñ¿†}QVÁÄ%ÓA35)L-ËÒ’2M³ÓnuÌO'ûÚ9Yvú•Õé”-§“Išhn•š)Z‘zlR1w`Qd ¹ÜÞÌÂÌ0û ãëùxÌcœ{fîû=ê}Ýï÷u]˜9s&>øà¼ñÆh]µ 3fàÑ×_GŸ>}Ú)+Klò邤~~~N ébBDf¹rå ª««ÑºjÎ%$ oYBóóÛžWvë†O>‰˜ÂBlðóCMMM» Â’¤P‚`ð½óçÏ7úþ3Â578 ñÄ <0lŠü `¸ô‚¢"\pƒŸ*»tA`` ._F|S^¾þz¼;dŽ74-Z„nݺ!,, ¡¡¡Z·°°0„„„ÀÓÓÓìÏlw‘ä`š=l Š6üYùûûcÑ¢Exì±Ç°îµ×0à70qóf¤Ì‹_|qqq"HeB亀] ÒÓÅ«ÝúNR®^Ŭ½óN”””´»={UUU¨®®FUUêëë±V¸@s!ˆom-úßÿðé¨QÀ××~~~ín†¶k>çããÓn†A F,X€£Gâÿû_Û6Í›ôZß#€?þh{¿*) a[¶ 5?G^|U7Ü€7jå|x„…áÑ—^‚wi)ºŸ8}û¢¾¾kÊË‘\R‚߯–/ݱcjjjPSSÓ6S£+((H+(Ñ TÂÃÃÑ«W/ôêÕ QQQˆˆˆp̬Šf^ŒD¡`ðáJŒõ°YµJ|Þ?¯€€,xõU4Ü?~œ2•ŸaÕ*†‡£‡\ÿM›\ú÷¤±±€X¨‚ˆ\ ¢kAJ TóæáìË/ãdSòóóQPP€KGŽ`þxÀÑl{¹‡‡zöì‰èèhôîÝýû÷GXXX»[ÆFDO›\¾ÜöÞž>>è™›‹÷í|b"-±ò÷÷Ç•+WàåeäŸ3¥hjÅœÐPøúù!ÞÓHMúõCÐæÍb3B ]úôÁÓwÜÑv¥÷nÏÔsÓ&Ìœ9¿þú+BBB þüóOÔÔÔ ººº-(‘nºÛŠŠŠ´kÎîxzz¢gÏžmA‰¡[XXd2™u_¦’›É†Œ-q’ËíøˆÞááè-å UU!»ª ,[†çŸýû÷·ëñ-uùê¿K @ˆ\"7ÔÚÚŠ'N ''ÙÙÙÈÎÎFun.>IIA€‰ê}|àáïŒiÓðÀ!ˆ‰‰Att4¢££Ñ«W/ã'ô¥Rì¡áD¾¾¾Æ©IKE23/_¹ûn`ýzàøq@ªluâ0c†Ø+Dz¯Æl’ôI‰è²¢"tÍÊB×¹scÖçhiiAyy9Î;×v+--mûóþýûqîÜ9TVVb€,Eû:h$ŠðpŒljæÎEß¾}Ñ«W¯ŽgRì˜ÜLnB'(íŽ;wâ³Ï>äI“ðÔSOaòäÉ.• µoß>xxx`РAÎ é`BÔÉ©T*œ:uJ+Øøý÷ßÛ–JõïßÉÉɸçž{ôÁ–Êk^¹$&bù_Xv`é>,LœU„…Ù~]º‘%dÝ0µ²Òð{MY¾ˆcÞ¸Q 8¤åHW®§Ni¿/>˜<ع³-inn¶zM¼——W[aLSS*~û 
[binary data removed: tail of a truncated PNG baseline image]

bayespy-0.6.2/bayespy/tests/baseline_images/test_plot/hinton_r.png  [binary PNG data removed]
bayespy-0.6.2/bayespy/tests/baseline_images/test_plot/hinton_z.png  [binary PNG data removed]
bayespy-0.6.2/bayespy/tests/baseline_images/test_plot/pdf.png  [binary PNG data removed]

bayespy-0.6.2/bayespy/utils/

bayespy-0.6.2/bayespy/utils/__init__.py:

################################################################################
# Copyright (C) 2011-2013 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

from . import misc
from . import linalg
from . import random
from . import optimize

bayespy-0.6.2/bayespy/utils/covfunc/

bayespy-0.6.2/bayespy/utils/covfunc/__init__.py:  [empty file]

bayespy-0.6.2/bayespy/utils/covfunc/covariance.py:

################################################################################
# Copyright (C) 2011-2012 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

import itertools

import numpy as np
#import scipy as sp
import scipy.sparse as sp  # prefer CSC format
#import scipy.spatial.distance as dist

from bayespy.utils import misc
#from bayespy.utils.covfunc import distance
from scipy.spatial import distance

# Covariance matrices can be either arrays or matrices, so be careful
# with products and powers! Use explicit multiply or dot instead of the
# *-operator.

def gp_cov_se(D2, overwrite=False):
    if overwrite:
        K = D2
        K *= -0.5
        np.exp(K, out=K)
    else:
        K = np.exp(-0.5*D2)
    return K

def gp_cov_pp2_new(r, d, derivative=False):
    # Dimension-dependent parameter
    q = 2
    j = np.floor(d/2) + q + 1

    # Polynomial coefficients
    a2 = j**2 + 4*j + 3
    a1 = 3*j + 6
    a0 = 3

    # Two parts of the covariance function
    k1 = (1-r) ** (j+2)
    k2 = (a2*r**2 + a1*r + 3)

    # TODO: Check that derivative is 0, 1 or 2!
    if derivative == 0:
        # Return covariance
        return k1 * k2 / 3

    dk1 = - (j+2) * (1-r)**(j+1)
    dk2 = 2*a2*r + a1

    if derivative == 1:
        # Return first derivative of the covariance
        return (k1 * dk2 + dk1 * k2) / 3

    ddk1 = (j+2) * (j+1) * (1-r)**j
    ddk2 = 2*a2

    if derivative == 2:
        # Return second derivative of the covariance
        return (ddk1*k2 + 2*dk1*dk2 + k1*ddk2) / 3


def gp_cov_pp2(r, d, gradient=False):
    # Dimension dependent parameter
    j = np.floor(d/2) + 2 + 1

    # Polynomial coefficients
    a2 = j**2 + 4*j + 3
    a1 = 3*j + 6
    a0 = 3

    # Two parts of the covariance function
    k1 = (1-r) ** (j+2)
    k2 = (a2*r**2 + a1*r + 3)

    # The covariance function
    k = k1 * k2 / 3

    if gradient:
        # The gradient w.r.t. r
        dk = k * (j+2) / (r-1) + k1 * (2*a2*r + a1) / 3
        return (k, dk)
    else:
        return k


def gp_cov_delta(N):
    # TODO: Use sparse matrices here!
    if N > 0:
        return sp.identity(N)
    else:
        # Sparse matrices do not allow zero-length dimensions
        return np.identity(N)


def squared_distance(x1, x2):
    (m1, n1) = x1.shape
    (m2, n2) = x2.shape
    if m1 == 0 or m2 == 0:
        D2 = np.empty((m1, m2))
    else:
        D2 = distance.cdist(x1, x2, metric='sqeuclidean')
    return D2


# General rule for the parameters for covariance functions:
#
# (value, [ [dvalue1, ...], [dvalue2, ...], [dvalue3, ...], ...])
#
# For instance,
#
# k = covfunc_se((1.0, []), (15, [ [1,update_grad] ]))
# K = k((x1, [ [dx1,update_grad] ]), (x2, []))
#
# Plain values are converted as:
# value -> (value, [])


def gp_standardize_input(x):
    if np.size(x) == 0:
        x = np.reshape(x, (0, 0))
    elif np.ndim(x) == 0:
        x = np.reshape(x, (1, 1))
    elif np.ndim(x) == 1:
        x = np.reshape(x, (-1, 1))
    elif np.ndim(x) == 2:
        x = np.atleast_2d(x)
    else:
        raise Exception("Standard GP inputs must be 2-dimensional")
    return x


def gp_preprocess_inputs(x1, x2=None):
    if x2 is None:
        x1 = gp_standardize_input(x1)
        return x1
    else:
        if x1 is x2:
            x1 = gp_standardize_input(x1)
            x2 = x1
        else:
            x1 = gp_standardize_input(x1)
            x2 = gp_standardize_input(x2)
        return (x1, x2)


# TODO:
# General syntax for these covariance functions:
# covfunc(hyper1,
#         hyper2,
#         ...
#         hyperN,
#         x1,
#         x2=None,
#         gradient=list_of_booleans_for_each_hyperparameter)


def covfunc_zeros(x1, x2=None, gradient=False):
    # Compute distance and covariance matrix
    if x2 is None:
        x1 = gp_preprocess_inputs(x1)
        # Only variance vector asked
        N = np.shape(x1)[0]
        # TODO: Use sparse matrices!
        K = np.zeros(N)
    else:
        (x1, x2) = gp_preprocess_inputs(x1, x2)
        # Full covariance matrix asked
        # Number of inputs x1
        N1 = np.shape(x1)[0]
        N2 = np.shape(x2)[0]
        # TODO: Use sparse matrices!
        K = np.zeros((N1, N2))

    if gradient is not False:
        return (K, [])
    else:
        return K


def covfunc_delta(amplitude, x1, x2=None, gradient=False):
    # Make sure that amplitude is a scalar, not an array object
    amplitude = misc.array_to_scalar(amplitude)

    # Compute distance and covariance matrix
    if x2 is None:
        x1 = gp_preprocess_inputs(x1)
        # Only variance vector asked
        N = np.shape(x1)[0]
        K = np.ones(N) * amplitude**2
    else:
        (x1, x2) = gp_preprocess_inputs(x1, x2)
        # Full covariance matrix asked
        # Number of inputs x1
        N1 = np.shape(x1)[0]
        # x1 == x2?
        if x1 is x2:
            delta = True
            # Delta covariance
            #
            # FIXME: Broadcasting doesn't work with sparse matrices,
            # so must use scalar multiplication
            K = gp_cov_delta(N1) * amplitude**2
        else:
            delta = False
            # Number of inputs x2
            N2 = np.shape(x2)[0]
            # Zero covariance
            if N1 > 0 and N2 > 0:
                K = sp.csc_matrix((N1, N2))
            else:
                K = np.zeros((N1, N2))

    # Gradient w.r.t. amplitude
    if gradient:
        # FIXME: Broadcasting doesn't work with sparse matrices,
        # so must use scalar multiplication
        gradient_amplitude = K * (2/amplitude)
        print("noise grad", gradient_amplitude)
        return (K, (gradient_amplitude,))
    else:
        return K


def covfunc_pp2(amplitude, lengthscale, x1, x2=None, gradient=False):
    # Make sure that hyperparameters are scalars, not array objects
    amplitude = misc.array_to_scalar(amplitude)
    lengthscale = misc.array_to_scalar(lengthscale)

    # Compute covariance matrix
    if x2 is None:
        x1 = gp_preprocess_inputs(x1)
        # Compute variance vector
        K = np.ones(np.shape(x1)[:-1])
        K *= amplitude**2
        # Compute gradient w.r.t. lengthscale
        if gradient:
            gradient_lengthscale = np.zeros(np.shape(x1)[:-1])
    else:
        (x1, x2) = gp_preprocess_inputs(x1, x2)
        # Compute (sparse) distance matrix
        #
        # FIXME: sparse_pdist/sparse_cdist are not provided by
        # scipy.spatial.distance; this branch relies on the (commented-out)
        # bayespy.utils.covfunc.distance module.
        if x1 is x2:
            x1 = x1 / (lengthscale)
            x2 = x1
            D2 = distance.sparse_pdist(x1, 1.0, form="full", format="csc")
        else:
            x1 = x1 / (lengthscale)
            x2 = x2 / (lengthscale)
            D2 = distance.sparse_cdist(x1, x2, 1.0, format="csc")
        r = np.sqrt(D2.data)

        N1 = np.shape(x1)[0]
        N2 = np.shape(x2)[0]

        # Compute the covariances
        if gradient:
            (k, dk) = gp_cov_pp2(r, np.shape(x1)[-1], gradient=True)
        else:
            k = gp_cov_pp2(r, np.shape(x1)[-1])
        k *= amplitude**2

        # Compute gradient w.r.t. lengthscale
        if gradient:
            if N1 >= 1 and N2 >= 1:
                dk *= r * (-amplitude**2 / lengthscale)
                gradient_lengthscale = sp.csc_matrix((dk, D2.indices, D2.indptr),
                                                     shape=(N1, N2))
            else:
                gradient_lengthscale = np.empty((N1, N2))

        # Form sparse covariance matrix
        if N1 >= 1 and N2 >= 1:
            K = sp.csc_matrix((k, D2.indices, D2.indptr), shape=(N1, N2))
        else:
            K = np.empty((N1, N2))

    # Gradient w.r.t. amplitude
    if gradient:
        gradient_amplitude = K * (2 / amplitude)

    # Return values
    if gradient:
        print("pp2 grad", gradient_lengthscale)
        return (K, (gradient_amplitude, gradient_lengthscale))
    else:
        return K


def covfunc_se(amplitude, lengthscale, x1, x2=None, gradient=False):
    # Make sure that hyperparameters are scalars, not array objects
    amplitude = misc.array_to_scalar(amplitude)
    lengthscale = misc.array_to_scalar(lengthscale)

    # Compute covariance matrix
    if x2 is None:
        x1 = gp_preprocess_inputs(x1)
        # Compute variance vector
        N = np.shape(x1)[0]
        K = np.ones(N)
        np.multiply(K, amplitude**2, out=K)
        # Compute gradient w.r.t. lengthscale
        if gradient:
            # TODO: Use sparse matrices?
            gradient_lengthscale = np.zeros(N)
    else:
        (x1, x2) = gp_preprocess_inputs(x1, x2)
        x1 = x1 / (lengthscale)
        x2 = x2 / (lengthscale)
        # Compute distance matrix
        K = squared_distance(x1, x2)
        # Compute gradient partly
        if gradient:
            gradient_lengthscale = np.divide(K, lengthscale)
        # Compute covariance matrix
        gp_cov_se(K, overwrite=True)
        np.multiply(K, amplitude**2, out=K)
        # Compute gradient w.r.t. lengthscale
        if gradient:
            gradient_lengthscale *= K

    # Gradient w.r.t. amplitude
    if gradient:
        gradient_amplitude = K * (2 / amplitude)

    # Return values
    if gradient:
        print("se grad", gradient_amplitude, gradient_lengthscale)
        return (K, (gradient_amplitude, gradient_lengthscale))
    else:
        return K

bayespy-0.6.2/bayespy/utils/linalg.py

################################################################################
# Copyright (C) 2011-2014 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

"""
General numerical functions and methods.
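The Cholesky helpers in this module wrap SciPy's factor-then-solve pattern and add broadcasting over stacks of matrices. As a minimal standalone sketch of that underlying pattern (using SciPy directly, not the BayesPy wrappers; the matrix and right-hand side are made up for illustration):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

# A small symmetric positive-definite system
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

# Factor once, then solve; chol() and chol_solve() below perform exactly
# this for each matrix in a (broadcast) collection
c_and_lower = cho_factor(A)
x = cho_solve(c_and_lower, b)

# The solution satisfies the original system
assert np.allclose(A @ x, b)
```

Factoring once and reusing the factor is the point of keeping `chol` and `chol_solve` separate: the factorization is O(D^3) while each solve is only O(D^2).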
"""

import itertools

import numpy as np
import scipy as sp
#import scipy.linalg.decomp_cholesky as decomp
import scipy.linalg as linalg
import scipy.special as special
import scipy.optimize as optimize
import scipy.sparse as sparse
#import scikits.sparse.cholmod as cholmod

# THIS IS SOME NEW GENERALIZED UFUNC FOR LINALG FEATURE, NOT IN OFFICIAL NUMPY
# REPO YET
#import numpy.linalg._gufuncs_linalg as gula
#import numpy.core.gufuncs_linalg as gula

from . import misc


def chol(C, ndim=1):
    if sparse.issparse(C):
        if ndim != 1:
            raise NotImplementedError()
        # Sparse Cholesky decomposition (returns a Factor object).
        # NOTE: the cholmod import above is commented out, so this branch
        # raises NameError unless scikits.sparse is available and imported.
        return cholmod.cholesky(C)
    else:
        # Computes Cholesky decomposition for a collection of matrices.
        # The last 2*ndim axes of C are considered as the matrix.
        if ndim == 0:
            return np.sqrt(C)

        shape_original = np.shape(C)[-ndim:]
        C = (
            C if ndim == 1 else
            misc.flatten_axes(C, ndim, ndim)
        )

        if np.shape(C)[-1] != np.shape(C)[-2]:
            raise ValueError("Not square matrix w.r.t. ndim sense")

        U = np.empty(np.shape(C))
        for i in misc.nested_iterator(np.shape(U)[:-2]):
            try:
                # Handle (0, 0) matrices.
See: # https://github.com/scipy/scipy/issues/8056 U[i] = ( np.empty((0,0)) if np.size(C[i]) == 0 else linalg.cho_factor(C[i])[0] ) except np.linalg.linalg.LinAlgError: raise Exception("Matrix not positive definite") return ( U if ndim == 1 else misc.reshape_axes(U, shape_original, shape_original) ) def chol_solve(U, b, out=None, matrix=False, ndim=1): if isinstance(U, np.ndarray) or np.isscalar(U): if sparse.issparse(b): b = b.toarray() if ndim == 0: return (b / U) / U shape = np.shape(U)[-ndim:] U = ( U if ndim == 1 else misc.flatten_axes(U, ndim, ndim) ) if matrix: shape_b = np.shape(b)[-ndim:] B = ( b if ndim == 1 else misc.flatten_axes(b, ndim, ndim) ) B = transpose(B, ndim=1) U = U[...,None,:,:] else: B = ( b if ndim == 1 else misc.flatten_axes(b, ndim) ) # Allocate memory sh_u = U.shape[:-2] sh_b = B.shape[:-1] l_u = len(sh_u) l_b = len(sh_b) # Check which axis are iterated over with B along with U ind_b = [slice(None)] * l_b l_min = min(l_u, l_b) jnd_b = tuple(i for i in range(-l_min,0) if sh_b[i]==sh_u[i]) if out == None: # Shape of the result (broadcasting rules) sh = misc.broadcasted_shape(sh_u, sh_b) #out = np.zeros(np.shape(B)) out = np.zeros(sh + B.shape[-1:]) for i in misc.nested_iterator(np.shape(U)[:-2]): # The goal is to run Cholesky solver once for all vectors of B # for which the matrices of U are the same (according to the # broadcasting rules). Thus, we collect all the axes of B for # which U is singleton and form them as a 2-D matrix and then # run the solver once. 
# Select those axes of B for which U and B are not singleton for j in jnd_b: ind_b[j] = i[j] # Collect all the axes for which U is singleton b = B[tuple(ind_b) + (slice(None),)] # Reshape it to a 2-D (or 1-D) array orig_shape = b.shape if b.ndim > 1: b = b.reshape((-1, b.shape[-1])) # slice(None) to all preceeding axes and ellipsis for the last # axis: if len(ind_b) < len(sh): ind_out = (slice(None),) + tuple(ind_b) + (slice(None),) else: ind_out = tuple(ind_b) + (slice(None),) out[ind_out] = ( # Handle (0, 0) matrices. See: # https://github.com/scipy/scipy/issues/8056 np.empty(orig_shape) if np.size(U[i]) == 0 else linalg.cho_solve( (U[i], False), b.T ).T.reshape(orig_shape) ) if matrix: out = transpose(out, ndim=1) out = ( out if ndim == 1 else misc.reshape_axes(out, shape, shape_b) ) else: out = ( out if ndim == 1 else misc.reshape_axes(out, shape) ) return out elif isinstance(U, cholmod.Factor): if ndim != 1: raise NotImplementedError() if matrix: raise NotImplementedError() if sparse.issparse(b): b = b.toarray() return U.solve_A(b) else: raise ValueError("Unknown type of Cholesky factor") def chol_inv(U, ndim=1): if isinstance(U, np.ndarray) or np.isscalar(U): if ndim == 0: return (1 / U) / U shape = np.shape(U)[-ndim:] U = ( U if ndim == 1 else misc.flatten_axes(U, ndim, ndim) ) # Allocate memory V = np.tile(np.identity(np.shape(U)[-1]), np.shape(U)[:-2]+(1,1)) for i in misc.nested_iterator(np.shape(U)[:-2]): V[i] = ( # Handle (0, 0) matrices. 
See: # https://github.com/scipy/scipy/issues/8056 np.empty((0, 0)) if np.size(V[i]) == 0 else linalg.cho_solve( (U[i], False), V[i], overwrite_b=True # This would need Fortran order ) ) V = ( V if ndim == 1 else misc.reshape_axes(V, shape, shape) ) return V elif isinstance(U, cholmod.Factor): raise NotImplementedError if ndim != 1: raise NotImplementedError() else: raise ValueError("Unknown type of Cholesky factor") def chol_logdet(U, ndim=1): if isinstance(U, np.ndarray) or np.isscalar(U): if ndim == 0: return 2 * np.log(U) U = ( U if ndim == 1 else misc.flatten_axes(U, ndim, ndim) ) return 2*np.sum(np.log(np.einsum('...ii->...i',U)), axis=-1) elif isinstance(U, cholmod.Factor): if ndim != 1: raise NotImplementedError() return np.sum(np.log(U.D())) else: raise ValueError("Unknown type of Cholesky factor") def logdet_chol(U): if isinstance(U, np.ndarray): # Computes Cholesky decomposition for a collection of matrices. return 2*np.sum(np.log(np.einsum('...ii->...i', U)), axis=(-1,)) elif isinstance(U, cholmod.Factor): return np.sum(np.log(U.D())) def logdet_tri(R): """ Logarithm of the absolute value of the determinant of a triangular matrix. 
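    For a triangular matrix the determinant is the product of the diagonal
    entries, so the log-determinant reduces to a sum of logs. A standalone
    sketch of the same computation with plain NumPy (the matrix is made up
    for illustration):

```python
import numpy as np

# An upper-triangular matrix: det(R) = product of diagonal entries,
# so log|det(R)| = sum of log|diagonal entries|
R = np.array([[2.0, 1.0],
              [0.0, 3.0]])
logdet = np.sum(np.log(np.abs(np.diag(R))))
```

    This avoids forming the determinant itself, which would underflow or
    overflow for large matrices.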
""" return np.sum(np.log(np.abs(np.einsum('...ii->...i', R)))) def logdet_cov(C, ndim=1): return chol_logdet(chol(C, ndim=ndim), ndim=ndim) def solve_triangular(U, B, ndim=1, **kwargs): if ndim != 1: raise NotImplementedError("Not yet implemented for ndim!=1") # Allocate memory U = np.atleast_2d(U) B = np.atleast_1d(B) sh_u = U.shape[:-2] sh_b = B.shape[:-1] l_u = len(sh_u) l_b = len(sh_b) # Check which axis are iterated over with B along with U ind_b = [slice(None)] * l_b l_min = min(l_u, l_b) jnd_b = tuple(i for i in range(-l_min,0) if sh_b[i]==sh_u[i]) # Shape of the result (broadcasting rules) sh = misc.broadcasted_shape(sh_u, sh_b) out = np.zeros(sh + B.shape[-1:]) for i in misc.nested_iterator(np.shape(U)[:-2]): # The goal is to run triangular solver once for all vectors of # B for which the matrices of U are the same (according to the # broadcasting rules). Thus, we collect all the axes of B for # which U is singleton and form them as a 2-D matrix and then # run the solver once. # Select those axes of B for which U and B are not singleton for j in jnd_b: ind_b[j] = i[j] # Collect all the axes for which U is singleton b = B[tuple(ind_b) + (slice(None),)] # Reshape it to a 2-D (or 1-D) array orig_shape = b.shape if b.ndim > 1: b = b.reshape((-1, b.shape[-1])) # slice(None) to all preceeding axes and ellipsis for the last # axis: if len(ind_b) < len(sh): ind_out = (slice(None),) + tuple(ind_b) + (slice(None),) else: ind_out = tuple(ind_b) + (slice(None),) out[ind_out] = linalg.solve_triangular(U[i], b.T, **kwargs).T.reshape(orig_shape) return out def inner(*args, ndim=1): """ Compute inner product. The number of arrays is arbitrary. The number of dimensions is arbitrary. """ axes = tuple(range(-ndim,0)) return misc.sum_product(*args, axes_to_sum=axes) def outer(A, B, ndim=1): """ Computes outer product over the last axes of A and B. The other axes are broadcasted. Thus, if A has shape (..., N) and B has shape (..., M), then the result has shape (..., N, M). 
Using the argument `ndim` it is possible to change that how many axes trailing axes are used for the outer product. For instance, if ndim=3, A and B have shapes (...,N1,N2,N3) and (...,M1,M2,M3), the result has shape (...,N1,M1,N2,M2,N3,M3). """ if not isinstance(ndim, int) or ndim < 0: raise ValueError('ndim must be non-negative integer') if ndim > 0: if ndim > np.ndim(A): raise ValueError('Argument ndim larger than ndim of the first ' 'parameter') if ndim > np.ndim(B): raise ValueError('Argument ndim larger than ndim of the second ' 'parameter') shape_A = np.shape(A) + (1,)*ndim shape_B = np.shape(B)[:-ndim] + (1,)*ndim + np.shape(B)[-ndim:] A = np.reshape(A, shape_A) B = np.reshape(B, shape_B) return np.asanyarray(A) * np.asanyarray(B) def _dot(A, B): """ Dot product which handles broadcasting properly. Future NumPy will have a better built-in implementation for this. """ A_plates = np.shape(A)[:-2] B_plates = np.shape(B)[:-2] M = np.shape(A)[-2] N = np.shape(B)[-1] Y_plates = misc.broadcasted_shape(A_plates, B_plates) if Y_plates == (): return np.dot(A, B) indices = misc.nested_iterator(Y_plates) Y_shape = Y_plates + (M, N) Y = np.zeros(Y_shape) for i in indices: Y[i] = np.dot(A[misc.safe_indices(i, A_plates)], B[misc.safe_indices(i, B_plates)]) return Y def dot(*arrays): """ Compute matrix-matrix product. You can give multiple arrays, the dot product is computed from left to right: A1*A2*A3*...*AN. The dot product is computed over the last two axes of each arrays. All other axes must be broadcastable. 
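    As an illustration of the broadcasting behaviour described above
    (a standalone NumPy sketch equivalent to what `_dot` computes; the
    arrays are made up for illustration):

```python
import numpy as np

# A stack of three 2x2 matrices times a single (broadcast) 2x2 matrix;
# the product is over the last two axes, the leading axis broadcasts
A = np.arange(12.0).reshape(3, 2, 2)
B = np.eye(2)
Y = np.einsum('...ik,...kj->...ij', A, B)
# Y has shape (3, 2, 2); since B is the identity, Y equals A
```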
""" if len(arrays) == 0: return 0 else: Y = np.asanyarray(arrays[0]) for X in arrays[1:]: X = np.asanyarray(X) if np.ndim(Y) < 2 or np.ndim(X) < 2: raise ValueError("Must be at least 2-D arrays") if np.shape(Y)[-1] != np.shape(X)[-2]: raise ValueError("Dimensions do not match") # Replace this with numpy.dot when NumPy implements broadcasting in dot Y = _dot(Y, X) #Y = np.einsum('...ik,...kj->...ij', Y, X) #Y = gula.matrix_multiply(Y, X) return Y def tracedot(A, B): """ Computes trace(A*B). """ return np.einsum('...ij,...ji->...', A, B) def inv(A, ndim=1): """ General array inversion. Supports broadcasting and inversion of multidimensional arrays. For instance, an array with shape (4,3,2,3,2) could mean that there are four (3*2) x (3*2) matrices to be inverted. This can be done by inv(A, ndim=2). For inverting scalars, ndim=0. For inverting matrices, ndim=1. """ A = np.asanyarray(A) if ndim == 0: return 1 / A elif ndim == 1: return np.linalg.inv(A) else: raise NotImplementedError() def mvdot(A, b, ndim=1): """ Compute matrix-vector product. Applies broadcasting. """ # TODO/FIXME: A bug in inner1d: # https://github.com/numpy/numpy/issues/3338 # # b = np.asanyarray(b) # return gula.inner1d(A, b[...,np.newaxis,:]) # # Use einsum instead: if ndim > 0: b = misc.add_axes(b, num=ndim, axis=-1-ndim) return inner(A, b, ndim=ndim) ## if ndim != 1: ## raise NotImplementedError("mvdot not yet implemented for ndim!=1") ## return _dot(A, b[...,None])[...,0] ## #return np.einsum('...ik,...k->...i', A, b) def mmdot(A, B, ndim=1): """ Compute matrix-matrix product. Applies broadcasting. """ if ndim == 0: return A * B elif ndim == 1: return _dot(A, B) else: raise Exception("mmdot not yet implemented for ndim>1") #return np.einsum('...ik,...kj->...ij', A, B) def transpose(X, ndim=1): """ Transpose the matrix. 
""" for n in range(ndim): X = np.swapaxes(X, -1-n, -1-ndim-n) return X ## if ndim != 1: ## raise Exception("transpose not yet implemented for ndim!=1") ## return np.swapaxes(X, -1, -2) def m_dot(A,b): raise DeprecationWarning() # Compute matrix-vector product over the last two axes of A and # the last axes of b. Other axes are broadcasted. If A has shape # (..., M, N) and b has shape (..., N), then the result has shape # (..., M) #b = reshape(b, shape(b)[:-1] + (1,) + shape(b)[-1:]) #return np.dot(A, b) return np.einsum('...ik,...k->...i', A, b) # TODO: Use einsum!! #return np.sum(A*b[...,np.newaxis,:], axis=(-1,)) def block_banded_solve(A, B, y): """ Invert symmetric, banded, positive-definite matrix. A contains the diagonal blocks. B contains the superdiagonal blocks (their transposes are the subdiagonal blocks). Shapes: A: (..., N, D, D) B: (..., N-1, D, D) y: (..., N, D) The algorithm is basically LU decomposition. Computes only the diagonal and super-diagonal blocks of the inverse. The true inverse is dense, in general. Assume each block has the same size. Return: * inverse blocks * solution to the system * log-determinant """ # Number of time instance and dimensionality N = np.shape(y)[-2] D = np.shape(y)[-1] # Check the shape of the diagonal blocks if np.shape(A)[-3] != N: raise ValueError("The number of diagonal blocks is incorrect") if np.shape(A)[-2:] != (D,D): raise ValueError("The diagonal blocks have wrong shape") # Check the shape of the super-diagonal blocks if np.shape(B)[-3] != N-1: raise ValueError("The number of super-diagonal blocks is incorrect") if np.shape(B)[-2:] != (D,D): raise ValueError("The diagonal blocks have wrong shape") plates_VC = misc.broadcasted_shape(np.shape(A)[:-3], np.shape(B)[:-3]) plates_y = misc.broadcasted_shape(plates_VC, np.shape(y)[:-2]) V = np.empty(plates_VC+(N,D,D)) C = np.empty(plates_VC+(N-1,D,D)) x = np.empty(plates_y+(N,D)) # # Forward recursion # # In the forward recursion, store the Cholesky factor in V. 
So you # don't need to recompute them in the backward recursion. # TODO: This whole algorithm could be implemented as in-place operation. # Might be a nice feature (optional?) # TODO/FIXME: chol_solve has quite a high overhead because it uses shape # manipulations. Use some more raw method instead. x[...,0,:] = y[...,0,:] V[...,0,:,:] = chol(A[...,0,:,:]) ldet = chol_logdet(V[...,0,:,:]) for n in range(N-1): # Compute the solution of the system x[...,n+1,:] = (y[...,n+1,:] - mvdot(misc.T(B[...,n,:,:]), chol_solve(V[...,n,:,:], x[...,n,:]))) # Compute the superdiagonal block of the inverse C[...,n,:,:] = chol_solve(V[...,n,:,:], B[...,n,:,:], matrix=True) # Compute the diagonal block V[...,n+1,:,:] = (A[...,n+1,:,:] - mmdot(misc.T(B[...,n,:,:]), C[...,n,:,:])) # Ensure symmetry by 0.5*(V+V.T) V[...,n+1,:,:] = 0.5 * (V[...,n+1,:,:] + misc.T(V[...,n+1,:,:])) # Compute and store the Cholesky factor of the diagonal block V[...,n+1,:,:] = chol(V[...,n+1,:,:]) # Compute the log-det term here, too ldet += chol_logdet(V[...,n+1,:,:]) # # Backward recursion # x[...,-1,:] = chol_solve(V[...,-1,:,:], x[...,-1,:]) V[...,-1,:,:] = chol_inv(V[...,-1,:,:]) for n in reversed(range(N-1)): # Compute the solution of the system x[...,n,:] = chol_solve(V[...,n,:,:], x[...,n,:] - mvdot(B[...,n,:,:], x[...,n+1,:])) # Compute the diagonal block of the inverse V[...,n,:,:] = (chol_inv(V[...,n,:,:]) + mmdot(C[...,n,:,:], mmdot(V[...,n+1,:,:], misc.T(C[...,n,:,:])))) C[...,n,:,:] = - mmdot(C[...,n,:,:], V[...,n+1,:,:]) # Ensure symmetry by 0.5*(V+V.T) V[...,n,:,:] = 0.5 * (V[...,n,:,:] + misc.T(V[...,n,:,:])) return (V, C, x, ldet) ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/utils/misc.py0000644000175100001770000012760100000000000020131 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2011-2013 Jaakko Luttinen # # This file is 
licensed under the MIT License. ################################################################################ """ General numerical functions and methods. """ from scipy.optimize import approx_fprime import functools import itertools import operator import sys import getopt import numpy as np import scipy as sp import scipy.linalg as linalg import scipy.special as special import scipy.optimize as optimize import scipy.sparse as sparse import tempfile as tmp import unittest from numpy import testing def flatten_axes(X, *ndims): ndim = sum(ndims) if np.ndim(X) < ndim: raise ValueError("Not enough ndims in the array") if len(ndims) == 0: return X shape = np.shape(X) i = np.ndim(X) - ndim plates = shape[:i] nd_sums = i + np.cumsum((0,) + ndims) sizes = tuple( np.prod(shape[i:j]) for (i, j) in zip(nd_sums[:-1], nd_sums[1:]) ) return np.reshape(X, plates + sizes) def reshape_axes(X, *shapes): ndim = len(shapes) if np.ndim(X) < ndim: raise ValueError("Not enough ndims in the array") i = np.ndim(X) - ndim sizes = tuple(np.prod(sh) for sh in shapes) if np.shape(X)[i:] != sizes: raise ValueError("Shapes inconsistent with sizes") shape = tuple(i for sh in shapes for i in sh) return np.reshape(X, np.shape(X)[:i] + shape) def find_set_index(index, set_lengths): """ Given set sizes and an index, returns the index of the set The given index is for the concatenated list of the sets. """ # Negative indices to positive if index < 0: index += np.sum(set_lengths) # Indices must be on range (0, N-1) if index >= np.sum(set_lengths) or index < 0: raise Exception("Index out bounds") return np.searchsorted(np.cumsum(set_lengths), index, side='right') def parse_command_line_arguments(mandatory_args, *optional_args_list, argv=None): """ Parse command line arguments of style "--parameter=value". Parameter specification is tuple: (name, converter, description). 
Some special handling: * If converter is None, the command line does not accept any value for it, but instead use either "--option" to enable or "--no-option" to disable. * If argument name contains hyphens, those are converted to underscores in the keys of the returned dictionaries. Parameters ---------- mandatory_args : list of tuples Specs for mandatory arguments optional_args_list : list of lists of tuples Specs for each optional arguments set argv : list of strings (optional) The command line arguments. By default, read sys.argv. Returns ------- args : dictionary The parsed mandatory arguments kwargs : dictionary The parsed optional arguments Examples -------- >>> from pprint import pprint as print >>> from bayespy.utils import misc >>> (args, kwargs) = misc.parse_command_line_arguments( ... # Mandatory arguments ... [ ... ('name', str, "Full name"), ... ('age', int, "Age (years)"), ... ('employed', None, "Working"), ... ], ... # Optional arguments ... [ ... ('phone', str, "Phone number"), ... ('favorite-color', str, "Favorite color") ... ], ... argv=['--name=John Doe', ... '--age=42', ... '--no-employed', ... '--favorite-color=pink'] ... ) >>> print(args) {'age': 42, 'employed': False, 'name': 'John Doe'} >>> print(kwargs) {'favorite_color': 'pink'} It is possible to have several optional argument sets: >>> (args, kw_info, kw_fav) = misc.parse_command_line_arguments( ... # Mandatory arguments ... [ ... ('name', str, "Full name"), ... ], ... # Optional arguments (contact information) ... [ ... ('phone', str, "Phone number"), ... ('email', str, "E-mail address") ... ], ... # Optional arguments (preferences) ... [ ... ('favorite-color', str, "Favorite color"), ... ('favorite-food', str, "Favorite food") ... ], ... argv=['--name=John Doe', ... '--favorite-color=pink', ... '--email=john.doe@email.com', ... '--favorite-food=spaghetti'] ... 
) >>> print(args) {'name': 'John Doe'} >>> print(kw_info) {'email': 'john.doe@email.com'} >>> print(kw_fav) {'favorite_color': 'pink', 'favorite_food': 'spaghetti'} """ if argv is None: argv = sys.argv[1:] mandatory_arg_names = [arg[0] for arg in mandatory_args] # Sizes of each optional argument list optional_args_lengths = [len(opt_args) for opt_args in optional_args_list] all_args = mandatory_args + functools.reduce(operator.add, optional_args_list, []) # Create a list of arg names for the getopt parser arg_list = [] for arg in all_args: arg_name = arg[0].lower() if arg[1] is None: arg_list.append(arg_name) arg_list.append("no-" + arg_name) else: arg_list.append(arg_name + "=") if len(set(arg_list)) < len(arg_list): raise Exception("Argument names are not unique") # Use getopt parser try: (cl_opts, cl_args) = getopt.getopt(argv, "", arg_list) except getopt.GetoptError as err: print(err) print("Usage:") for arg in all_args: if arg[1] is None: print("--{0}\t{1}".format(arg[0].lower(), arg[2])) else: print("--{0}=<{1}>\t{2}".format(arg[0].lower(), str(arg[1].__name__).upper(), arg[2])) sys.exit(2) # A list of all valid flag names: ["--first-argument", "--another-argument"] valid_flags = [] valid_flag_arg_indices = [] for (ind, arg) in enumerate(all_args): valid_flags.append("--" + arg[0].lower()) valid_flag_arg_indices.append(ind) if arg[1] is None: valid_flags.append("--no-" + arg[0].lower()) valid_flag_arg_indices.append(ind) # Go through all the given command line arguments and store them in the # correct dictionaries args = dict() kwargs_list = [dict() for i in range(len(optional_args_list))] handled_arg_names = [] for (cl_opt, cl_arg) in cl_opts: # Get the index of the argument try: ind = valid_flag_arg_indices[valid_flags.index(cl_opt.lower())] except ValueError: print("Invalid command line argument: {0}".format(cl_opt)) raise Exception("Invalid argument given") # Check that the argument wasn't already given and then mark the # argument as handled if 
all_args[ind][0] in handled_arg_names: raise Exception("Same argument given multiple times") else: handled_arg_names.append(all_args[ind][0]) # Check whether to add the argument to the mandatory or optional # argument dictionary if ind < len(mandatory_args): dict_to = args else: dict_index = find_set_index(ind - len(mandatory_args), optional_args_lengths) dict_to = kwargs_list[dict_index] # Convert and store the argument convert_function = all_args[ind][1] arg_name = all_args[ind][0].replace('-', '_') if convert_function is None: if cl_opt[:5] == "--no-": dict_to[arg_name] = False else: dict_to[arg_name] = True else: dict_to[arg_name] = convert_function(cl_arg) # Check if some mandatory argument was not given for arg_name in mandatory_arg_names: if arg_name not in handled_arg_names: raise Exception("Mandatory argument --{0} not given".format(arg_name)) return tuple([args] + kwargs_list) def composite_function(function_list): """ Construct a function composition from a list of functions. Given a list of functions [f,g,h], constructs a function :math:`h \circ g \circ f`. That is, returns a function :math:`z`, for which :math:`z(x) = h(g(f(x)))`. """ def composite(X): for function in function_list: X = function(X) return X return composite def ceildiv(a, b): """ Compute a divided by b and rounded up. """ return -(-a // b) def rmse(y1, y2, axis=None): return np.sqrt(np.mean((y1-y2)**2, axis=axis)) def is_callable(f): return hasattr(f, '__call__') def atleast_nd(X, d): if np.ndim(X) < d: sh = (d-np.ndim(X))*(1,) + np.shape(X) X = np.reshape(X, sh) return X def T(X): """ Transpose the matrix. """ return np.swapaxes(X, -1, -2) class TestCase(unittest.TestCase): """ Simple base class for unit testing. Adds NumPy's features to Python's unittest. 
""" def assertAllClose(self, A, B, msg="Arrays not almost equal", rtol=1e-4, atol=0): self.assertEqual(np.shape(A), np.shape(B), msg=msg) testing.assert_allclose(A, B, err_msg=msg, rtol=rtol, atol=atol) pass def assertArrayEqual(self, A, B, msg="Arrays not equal"): self.assertEqual(np.shape(A), np.shape(B), msg=msg) testing.assert_array_equal(A, B, err_msg=msg) pass def assertMessage(self, M1, M2): if len(M1) != len(M2): self.fail("Message lists have different lengths") for (m1, m2) in zip(M1, M2): self.assertAllClose(m1, m2) pass def assertMessageToChild(self, X, u): self.assertMessage(X._message_to_child(), u) pass def _get_pack_functions(self, plates, dims): inds = np.concatenate( [ [0], np.cumsum( [ np.prod(dimi) * np.prod(plates) for dimi in dims ] ) ] ).astype(int) def pack(x): return [ np.reshape(x[start:end], plates + dimi) for (start, end, dimi) in zip(inds[:-1], inds[1:], dims) ] def unpack(u): return np.concatenate( [ np.broadcast_to(ui, plates + dimi).ravel() for (ui, dimi) in zip(u, dims) ] ) return (pack, unpack) def assert_message_to_parent(self, child, parent, postprocess=lambda u: u, eps=1e-6, rtol=1e-4, atol=0): (pack, unpack) = self._get_pack_functions(parent.plates, parent.dims) def cost(x): parent.u = pack(x) return child.lower_bound_contribution() d = postprocess(pack(unpack(parent._message_from_children()))) d_num = postprocess( pack( approx_fprime( unpack(parent.u), cost, eps ) ) ) # for (i, j) in zip(postprocess(pack(d)), postprocess(pack(d_num))): # print(i) # print(j) assert len(d_num) == len(d) for i in range(len(d)): self.assertAllClose(d[i], d_num[i], rtol=rtol, atol=atol) def assert_moments(self, node, postprocess=lambda u: u, eps=1e-6, rtol=1e-4, atol=0): (u, g) = node._distribution.compute_moments_and_cgf(node.phi) (pack, unpack) = self._get_pack_functions(node.plates, node.dims) def cost(x): (_, g) = node._distribution.compute_moments_and_cgf(pack(x)) return -np.sum(g) u_num = pack( approx_fprime( unpack(node.phi), cost, eps ) ) 
        assert len(u_num) == len(u)

        up = postprocess(u)
        up_num = postprocess(u_num)
        for i in range(len(up)):
            self.assertAllClose(up[i], up_num[i], rtol=rtol, atol=atol)

        pass


def symm(X):
    """
    Make X symmetric.
    """
    return 0.5 * (X + np.swapaxes(X, -1, -2))


def unique(l):
    """
    Remove duplicate items from a list while preserving order.
    """
    seen = set()
    seen_add = seen.add
    return [x for x in l if x not in seen and not seen_add(x)]


def tempfile(prefix='', suffix=''):
    return tmp.NamedTemporaryFile(prefix=prefix, suffix=suffix).name


def write_to_hdf5(group, data, name):
    """
    Writes the given array into the HDF5 file.
    """
    try:
        # Try using compression. It doesn't work for scalars.
        group.create_dataset(name, data=data, compression='gzip')
    except TypeError:
        group.create_dataset(name, data=data)
    except ValueError:
        raise ValueError('Could not write %s' % data)


def nans(size=()):
    return np.tile(np.nan, size)


def trues(shape):
    # NOTE: np.bool was removed in NumPy 1.24, so use the builtin bool.
    return np.ones(shape, dtype=bool)


def identity(*shape):
    return np.reshape(np.identity(np.prod(shape)), shape + shape)


def array_to_scalar(x):
    # This transforms an N-dimensional array to a scalar. It's most
    # useful when you know that the array has only one element and you
    # want it out as a scalar.
    return np.ravel(x)[0]


def put(x, indices, y, axis=-1, ufunc=np.add):
    """A kind of inverse mapping of `np.take`

    In simple terms, the operation can be thought of as:

    .. code-block:: python

        x[indices] += y

    with the exception that all entries of `y` are used instead of just
    the first occurrence corresponding to a particular element. That is,
    the results are accumulated, and the accumulation function can be
    changed by providing `ufunc`. For instance, `np.multiply` corresponds
    to:

    .. code-block:: python

        x[indices] *= y

    Whereas `np.take` picks indices along an axis and returns the
    resulting array, `put` similarly picks indices along an axis but
    accumulates the given values to those entries.

    Example
    -------

    ..
code-block:: python >>> x = np.zeros(3) >>> put(x, [2, 2, 0, 2, 2], 1) array([1., 0., 4.]) `y` must broadcast to the shape of `np.take(x, indices)`: .. code-block:: python >>> x = np.zeros((3,4)) >>> put(x, [[2, 2, 0, 2, 2], [1, 2, 1, 2, 1]], np.ones((2,1,4)), axis=0) array([[1., 1., 1., 1.], [3., 3., 3., 3.], [6., 6., 6., 6.]]) """ #x = np.copy(x) ndim = np.ndim(x) if not isinstance(axis, int): raise ValueError("Axis must be an integer") # Make axis index positive: [0, ..., ndim-1] if axis < 0: axis = axis + ndim if axis < 0 or axis >= ndim: raise ValueError("Axis out of bounds") indices = axis*(slice(None),) + (indices,) + (ndim-axis-1)*(slice(None),) #y = add_trailing_axes(y, ndim-axis-1) ufunc.at(x, indices, y) return x def put_simple(y, indices, axis=-1, length=None): """An inverse operation of `np.take` with accumulation and broadcasting. Compared to `put`, the difference is that the result array is initialized with an array of zeros whose shape is determined automatically and `np.add` is used as the accumulator. """ if length is None: # Try to determine the original length of the axis by finding the # largest index. It is more robust to give the length explicitly. indices = np.copy(indices) indices[indices<0] = np.abs(indices[indices<0]) - 1 length = np.amax(indices) + 1 if not isinstance(axis, int): raise ValueError("Axis must be an integer") # Make axis index negative: [-ndim, ..., -1] if axis >= 0: raise ValueError("Axis index must be negative") y = atleast_nd(y, abs(axis)-1) shape_y = np.shape(y) end_before = axis - np.ndim(indices) + 1 start_after = axis + 1 if end_before == 0: shape_x = shape_y + (length,) elif start_after == 0: shape_x = shape_y[:end_before] + (length,) else: shape_x = shape_y[:end_before] + (length,) + shape_y[start_after:] x = np.zeros(shape_x) return put(x, indices, y, axis=axis) def grid(x1, x2): """ Returns meshgrid as a (M*N,2)-shape array. 
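As a small illustrative sketch of `grid` (standalone, duplicating the helper's body so it runs on its own):

```python
import numpy as np

# Sketch of grid: all (x1, x2) pairs as rows of an (M*N, 2) array.
def grid(x1, x2):
    (X1, X2) = np.meshgrid(x1, x2)
    return np.hstack((X1.reshape((-1, 1)), X2.reshape((-1, 1))))

G = grid(np.array([0, 1]), np.array([10, 20]))
# Rows: (0, 10), (1, 10), (0, 20), (1, 20)
```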
""" (X1, X2) = np.meshgrid(x1, x2) return np.hstack((X1.reshape((-1,1)),X2.reshape((-1,1)))) # class CholeskyDense(): # def __init__(self, K): # self.U = linalg.cho_factor(K) # def solve(self, b): # if sparse.issparse(b): # b = b.toarray() # return linalg.cho_solve(self.U, b) # def logdet(self): # return 2*np.sum(np.log(np.diag(self.U[0]))) # def trace_solve_gradient(self, dK): # return np.trace(self.solve(dK)) # class CholeskySparse(): # def __init__(self, K): # self.LD = cholmod.cholesky(K) # def solve(self, b): # if sparse.issparse(b): # b = b.toarray() # return self.LD.solve_A(b) # def logdet(self): # return self.LD.logdet() # #np.sum(np.log(LD.D())) # def trace_solve_gradient(self, dK): # # WTF?! numpy.multiply doesn't work for two sparse # # matrices.. It returns a result but it is incorrect! # # Use the identity trace(K\dK)=sum(inv(K).*dK) by computing # # the sparse inverse (lower triangular part) # iK = self.LD.spinv(form='lower') # return (2*iK.multiply(dK).sum() # - iK.diagonal().dot(dK.diagonal())) # # Multiply by two because of symmetry (remove diagonal once # # because it was taken into account twice) # #return np.multiply(self.LD.inv().todense(),dK.todense()).sum() # #return self.LD.inv().multiply(dK).sum() # THIS WORKS # #return np.multiply(self.LD.inv(),dK).sum() # THIS NOT WORK!! WTF?? # iK = self.LD.spinv() # return iK.multiply(dK).sum() # #return (2*iK.multiply(dK).sum() # # - iK.diagonal().dot(dK.diagonal())) # #return (2*np.multiply(iK, dK).sum() # # - iK.diagonal().dot(dK.diagonal())) # THIS NOT WORK!! 
# #return np.trace(self.solve(dK)) # def cholesky(K): # if isinstance(K, np.ndarray): # return CholeskyDense(K) # elif sparse.issparse(K): # return CholeskySparse(K) # else: # raise Exception("Unsupported covariance matrix type") # Computes log probability density function of the Gaussian # distribution def gaussian_logpdf(y_invcov_y, y_invcov_mu, mu_invcov_mu, logdetcov, D): return (-0.5*D*np.log(2*np.pi) -0.5*logdetcov -0.5*y_invcov_y +y_invcov_mu -0.5*mu_invcov_mu) def zipper_merge(*lists): """ Combines lists by alternating elements from them. Combining lists [1,2,3], ['a','b','c'] and [42,666,99] results in [1,'a',42,2,'b',666,3,'c',99] The lists should have equal length or they are assumed to have the length of the shortest list. This is known as alternating merge or zipper merge. """ return list(sum(zip(*lists), ())) def remove_whitespace(s): return ''.join(s.split()) def is_numeric(a): return (np.isscalar(a) or isinstance(a, list) or isinstance(a, np.ndarray)) def is_scalar_integer(x): t = np.asanyarray(x).dtype.type return np.ndim(x) == 0 and issubclass(t, np.integer) def isinteger(x): t = np.asanyarray(x).dtype.type return ( issubclass(t, np.integer) or issubclass(t, np.bool_) ) def is_string(s): return isinstance(s, str) def multiply_shapes(*shapes): """ Compute element-wise product of lists/tuples. Shorter lists are concatenated with leading 1s in order to get lists with the same length. """ # Make the shapes equal length shapes = make_equal_length(*shapes) # Compute element-wise product f = lambda X,Y: (x*y for (x,y) in zip(X,Y)) shape = functools.reduce(f, shapes) return tuple(shape) def make_equal_length(*shapes): """ Make tuples equal length. Add leading 1s to shorter tuples. 
""" # Get maximum length max_len = max((len(shape) for shape in shapes)) # Make the shapes equal length shapes = ((1,)*(max_len-len(shape)) + tuple(shape) for shape in shapes) return shapes def make_equal_ndim(*arrays): """ Add trailing unit axes so that arrays have equal ndim """ shapes = [np.shape(array) for array in arrays] shapes = make_equal_length(*shapes) arrays = [np.reshape(array, shape) for (array, shape) in zip(arrays, shapes)] return arrays def sum_to_dim(A, dim): """ Sum leading axes of A such that A has dim dimensions. """ dimdiff = np.ndim(A) - dim if dimdiff > 0: axes = np.arange(dimdiff) A = np.sum(A, axis=axes) return A def broadcasting_multiplier(plates, *args): """ Compute the plate multiplier for given shapes. The first shape is compared to all other shapes (using NumPy broadcasting rules). All the elements which are non-unit in the first shape but 1 in all other shapes are multiplied together. This method is used, for instance, for computing a correction factor for messages to parents: If this node has non-unit plates that are unit plates in the parent, those plates are summed. However, if the message has unit axis for that plate, it should be first broadcasted to the plates of this node and then summed to the plates of the parent. In order to avoid this broadcasting and summing, it is more efficient to just multiply by the correct factor. This method computes that factor. The first argument is the full plate shape of this node (with respect to the parent). The other arguments are the shape of the message array and the plates of the parent (with respect to this node). """ # Check broadcasting of the shapes for arg in args: broadcasted_shape(plates, arg) # Check that each arg-plates are a subset of plates? 
for arg in args: if not is_shape_subset(arg, plates): print("Plates:", plates) print("Args:", args) raise ValueError("The shapes in args are not a sub-shape of " "plates") r = 1 for j in range(-len(plates),0): mult = True for arg in args: # if -j <= len(arg) and arg[j] != 1: if not (-j > len(arg) or arg[j] == 1): mult = False if mult: r *= plates[j] return r def sum_multiply_to_plates(*arrays, to_plates=(), from_plates=None, ndim=0): """ Compute the product of the arguments and sum to the target shape. """ arrays = list(arrays) def get_plates(x): if ndim == 0: return x else: return x[:-ndim] plates_arrays = [get_plates(np.shape(array)) for array in arrays] product_plates = broadcasted_shape(*plates_arrays) if from_plates is None: from_plates = product_plates r = 1 else: r = broadcasting_multiplier(from_plates, product_plates, to_plates) for ind in range(len(arrays)): plates_others = plates_arrays[:ind] + plates_arrays[(ind+1):] plates_without = broadcasted_shape(to_plates, *plates_others) ax = axes_to_collapse(plates_arrays[ind], #get_plates(np.shape(arrays[ind])), plates_without) if ax: ax = tuple([a-ndim for a in ax]) arrays[ind] = np.sum(arrays[ind], axis=ax, keepdims=True) plates_arrays = [get_plates(np.shape(array)) for array in arrays] product_plates = broadcasted_shape(*plates_arrays) ax = axes_to_collapse(product_plates, to_plates) if ax: ax = tuple([a-ndim for a in ax]) y = sum_multiply(*arrays, axis=ax, keepdims=True) else: y = functools.reduce(np.multiply, arrays) y = squeeze_to_dim(y, len(to_plates) + ndim) return r * y def multiply(*arrays): return functools.reduce(np.multiply, arrays, 1) def sum_multiply(*args, axis=None, sumaxis=True, keepdims=False): # Computes sum(arg[0]*arg[1]*arg[2]*..., axis=axes_to_sum) without # explicitly computing the intermediate product if len(args) == 0: raise ValueError("You must give at least one input array") # Dimensionality of the result max_dim = 0 for k in range(len(args)): max_dim = max(max_dim, np.ndim(args[k])) 
if sumaxis: if axis is None: # Sum all axes axes = [] else: if np.isscalar(axis): axis = [axis] axes = [i for i in range(max_dim) if i not in axis and (-max_dim+i) not in axis] else: if axis is None: # Keep all axes axes = list(range(max_dim)) else: # Find axes that are kept if np.isscalar(axis): axes = [axis] axes = [i if i >= 0 else i+max_dim for i in axis] axes = sorted(axes) if len(axes) > 0 and (min(axes) < 0 or max(axes) >= max_dim): raise ValueError("Axis index out of bounds") # Form a list of pairs: the array in the product and its axes pairs = list() for i in range(len(args)): a = args[i] a_dim = np.ndim(a) pairs.append(a) pairs.append(range(max_dim-a_dim, max_dim)) # Output axes are those which are not summed pairs.append(axes) # Compute the sum-product try: # Set optimize=False to work around a einsum broadcasting bug in NumPy 1.14.0: # https://github.com/numpy/numpy/issues/10343 # Perhaps it'll be fixed in 1.14.1? y = np.einsum(*pairs, optimize=False) except ValueError as err: if str(err) == ("If 'op_axes' or 'itershape' is not NULL in " "theiterator constructor, 'oa_ndim' must be greater " "than zero"): # TODO/FIXME: Handle a bug in NumPy. If all arguments to einsum are # scalars, it raises an error. For scalars we can just use multiply # and forget about summing. Hopefully, in the future, einsum handles # scalars properly and this try-except becomes unnecessary. 
y = functools.reduce(np.multiply, args) else: raise err # Restore summed axes as singleton axes if keepdims: d = 0 s = () for k in range(max_dim): if k in axes: # Axis not summed s = s + (np.shape(y)[d],) d += 1 else: # Axis was summed s = s + (1,) y = np.reshape(y, s) return y def sum_product(*args, axes_to_keep=None, axes_to_sum=None, keepdims=False): if axes_to_keep is not None: return sum_multiply(*args, axis=axes_to_keep, sumaxis=False, keepdims=keepdims) else: return sum_multiply(*args, axis=axes_to_sum, sumaxis=True, keepdims=keepdims) def moveaxis(A, axis_from, axis_to): """ Move the axis `axis_from` to position `axis_to`. """ if ((axis_from < 0 and abs(axis_from) > np.ndim(A)) or (axis_from >= 0 and axis_from >= np.ndim(A)) or (axis_to < 0 and abs(axis_to) > np.ndim(A)) or (axis_to >= 0 and axis_to >= np.ndim(A))): raise ValueError("Can't move axis %d to position %d. Axis index out of " "bounds for array with shape %s" % (axis_from, axis_to, np.shape(A))) axes = np.arange(np.ndim(A)) axes[axis_from:axis_to] += 1 axes[axis_from:axis_to:-1] -= 1 axes[axis_to] = axis_from return np.transpose(A, axes=axes) def safe_indices(inds, shape): """ Makes sure that indices are valid for given shape. The shorter shape determines the length. For instance, .. testsetup:: from bayespy.utils.misc import safe_indices >>> safe_indices( (3, 4, 5), (1, 6) ) (0, 5) """ m = min(len(inds), len(shape)) if m == 0: return () inds = inds[-m:] maxinds = np.array(shape[-m:]) - 1 return tuple(np.fmin(inds, maxinds)) def broadcasted_shape(*shapes): """ Computes the resulting broadcasted shape for a given set of shapes. Uses the broadcasting rules of NumPy. Raises an exception if the shapes do not broadcast. 
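    A runnable sketch of the rule described above (the helper re-stated
    standalone; `np.broadcast_shapes` requires NumPy >= 1.20 and is used
    here only as an independent check):

```python
import numpy as np

def broadcasted_shape(*shapes):
    # Standalone re-statement of the helper above (NumPy broadcasting rules)
    dim = 0
    for a in shapes:
        dim = max(dim, len(a))
    S = ()
    for i in range(-dim, 0):
        s = 1
        for a in shapes:
            if -i <= len(a):
                if s == 1:
                    s = a[i]
                elif a[i] != 1 and a[i] != s:
                    raise ValueError("Shapes %s do not broadcast" % (shapes,))
        S = S + (s,)
    return S

# Agrees with NumPy's own rule:
assert broadcasted_shape((3, 1), (1, 4)) == np.broadcast_shapes((3, 1), (1, 4))
```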
""" dim = 0 for a in shapes: dim = max(dim, len(a)) S = () for i in range(-dim,0): s = 1 for a in shapes: if -i <= len(a): if s == 1: s = a[i] elif a[i] != 1 and a[i] != s: raise ValueError("Shapes %s do not broadcast" % (shapes,)) S = S + (s,) return S def broadcasted_shape_from_arrays(*arrays): """ Computes the resulting broadcasted shape for a given set of arrays. Raises an exception if the shapes do not broadcast. """ shapes = [np.shape(array) for array in arrays] return broadcasted_shape(*shapes) def is_shape_subset(sub_shape, full_shape): """ """ if len(sub_shape) > len(full_shape): return False for i in range(len(sub_shape)): ind = -1 - i if sub_shape[ind] != 1 and sub_shape[ind] != full_shape[ind]: return False return True def add_axes(X, num=1, axis=0): for i in range(num): X = np.expand_dims(X, axis=axis) return X shape = np.shape(X)[:axis] + num*(1,) + np.shape(X)[axis:] return np.reshape(X, shape) def add_leading_axes(x, n): return add_axes(x, axis=0, num=n) def add_trailing_axes(x, n): return add_axes(x, axis=-1, num=n) def nested_iterator(max_inds): s = [range(i) for i in max_inds] return itertools.product(*s) def first(L): """ """ for (n,l) in enumerate(L): if l: return n return None def squeeze(X): """ Remove leading axes that have unit length. For instance, a shape (1,1,4,1,3) will be reshaped to (4,1,3). 
""" shape = np.array(np.shape(X)) inds = np.nonzero(shape != 1)[0] if len(inds) == 0: shape = () else: shape = shape[inds[0]:] return np.reshape(X, shape) def squeeze_to_dim(X, dim): s = tuple(range(np.ndim(X)-dim)) return np.squeeze(X, axis=s) def axes_to_collapse(shape_x, shape_to): # Solves which axes of shape shape_x need to be collapsed in order # to get the shape shape_to s = () for j in range(-len(shape_x), 0): if shape_x[j] != 1: if -j > len(shape_to) or shape_to[j] == 1: s += (j,) elif shape_to[j] != shape_x[j]: print('Shape from: ' + str(shape_x)) print('Shape to: ' + str(shape_to)) raise Exception('Incompatible shape to squeeze') return tuple(s) def sum_to_shape(X, s): """ Sum axes of the array such that the resulting shape is as given. Thus, the shape of the result will be s or an error is raised. """ # First, sum and remove axes that are not in s if np.ndim(X) > len(s): axes = tuple(range(-np.ndim(X), -len(s))) else: axes = () Y = np.sum(X, axis=axes) # Second, sum axes that are 1 in s but keep the axes axes = () for i in range(-np.ndim(Y), 0): if s[i] == 1: if np.shape(Y)[i] > 1: axes = axes + (i,) else: if np.shape(Y)[i] != s[i]: raise ValueError("Shape %s can't be summed to shape %s" % (np.shape(X), s)) Y = np.sum(Y, axis=axes, keepdims=True) return Y def repeat_to_shape(A, s): # Current shape t = np.shape(A) if len(t) > len(s): raise Exception("Can't repeat to a smaller shape") # Add extra axis t = tuple([1]*(len(s)-len(t))) + t A = np.reshape(A,t) # Repeat for i in reversed(range(len(s))): if s[i] != t[i]: if t[i] != 1: raise Exception("Can't repeat non-singular dimensions") else: A = np.repeat(A, s[i], axis=i) return A def multidigamma(a, d): """ Returns the derivative of the log of multivariate gamma. 
""" return np.sum(special.digamma(a[...,None] - 0.5*np.arange(d)), axis=-1) m_digamma = multidigamma def diagonal(A): return np.diagonal(A, axis1=-2, axis2=-1) def make_diag(X, ndim=1, ndim_from=0): """ Create a diagonal array given the diagonal elements. The diagonal array can be multi-dimensional. By default, the last axis is transformed to two axes (diagonal matrix) but this can be changed using ndim keyword. For instance, an array with shape (K,L,M,N) can be transformed to a set of diagonal 4-D tensors with shape (K,L,M,N,M,N) by giving ndim=2. If ndim=3, the result has shape (K,L,M,N,L,M,N), and so on. Diagonality means that for the resulting array Y holds: Y[...,i_1,i_2,..,i_ndim,j_1,j_2,..,j_ndim] is zero if i_n!=j_n for any n. """ if ndim < 0: raise ValueError("Parameter ndim must be non-negative integer") if ndim_from < 0: raise ValueError("Parameter ndim_to must be non-negative integer") if ndim_from > ndim: raise ValueError("Parameter ndim_to must not be greater than ndim") if ndim == 0: return X if np.ndim(X) < 2 * ndim_from: raise ValueError("The array does not have enough axes") if ndim_from > 0: if np.shape(X)[-ndim_from:] != np.shape(X)[-2*ndim_from:-ndim_from]: raise ValueError("The array X is not square") if ndim == ndim_from: return X X = atleast_nd(X, ndim+ndim_from) if ndim > 0: if ndim_from > 0: I = identity(*(np.shape(X)[-(ndim_from+ndim):-ndim_from])) else: I = identity(*(np.shape(X)[-ndim:])) X = add_axes(X, axis=np.ndim(X)-ndim_from, num=ndim-ndim_from) X = I * X return X def get_diag(X, ndim=1, ndim_to=0): """ Get the diagonal of an array. If ndim>1, take the diagonal of the last 2*ndim axes. 
""" if ndim < 0: raise ValueError("Parameter ndim must be non-negative integer") if ndim_to < 0: raise ValueError("Parameter ndim_to must be non-negative integer") if ndim_to > ndim: raise ValueError("Parameter ndim_to must not be greater than ndim") if ndim == 0: return X if np.ndim(X) < 2*ndim: raise ValueError("The array does not have enough axes") if np.shape(X)[-ndim:] != np.shape(X)[-2*ndim:-ndim]: raise ValueError("The array X is not square") if ndim == ndim_to: return X n_plate_axes = np.ndim(X) - 2 * ndim n_diag_axes = ndim - ndim_to axes = tuple(range(0, np.ndim(X) - ndim + ndim_to)) lengths = [0, n_plate_axes, n_diag_axes, ndim_to, ndim_to] cutpoints = list(np.cumsum(lengths)) axes_plates = axes[cutpoints[0]:cutpoints[1]] axes_diag= axes[cutpoints[1]:cutpoints[2]] axes_dims1 = axes[cutpoints[2]:cutpoints[3]] axes_dims2 = axes[cutpoints[3]:cutpoints[4]] axes_input = axes_plates + axes_diag + axes_dims1 + axes_diag + axes_dims2 axes_output = axes_plates + axes_diag + axes_dims1 + axes_dims2 return np.einsum(X, axes_input, axes_output) def diag(X, ndim=1): """ Create a diagonal array given the diagonal elements. The diagonal array can be multi-dimensional. By default, the last axis is transformed to two axes (diagonal matrix) but this can be changed using ndim keyword. For instance, an array with shape (K,L,M,N) can be transformed to a set of diagonal 4-D tensors with shape (K,L,M,N,M,N) by giving ndim=2. If ndim=3, the result has shape (K,L,M,N,L,M,N), and so on. Diagonality means that for the resulting array Y holds: Y[...,i_1,i_2,..,i_ndim,j_1,j_2,..,j_ndim] is zero if i_n!=j_n for any n. """ X = atleast_nd(X, ndim) if ndim > 0: I = identity(*(np.shape(X)[-ndim:])) X = add_axes(X, axis=np.ndim(X), num=ndim) X = I * X return X def m_dot(A,b): # Compute matrix-vector product over the last two axes of A and # the last axes of b. Other axes are broadcasted. 
    # If A has shape (..., M, N) and b has shape (..., N), then the
    # result has shape (..., M).
    return np.einsum('...ik,...k->...i', A, b)


def block_banded(D, B):
    """
    Construct a symmetric block-banded matrix.

    `D` contains the square diagonal blocks and `B` contains the
    super-diagonal blocks. For N diagonal blocks, the resulting matrix
    is::

        D[0],   B[0],   0,      0,    ...,  0,        0
        B[0].T, D[1],   B[1],   0,    ...,  0,        0
        0,      B[1].T, D[2],   B[2], ...,  0,        0
        ...,    ...,    ...,    ...,  ...,  D[N-2],   B[N-2]
        0,      0,      0,      0,    ...,  B[N-2].T, D[N-1]
    """
    D = [np.atleast_2d(d) for d in D]
    B = [np.atleast_2d(b) for b in B]

    # Number of diagonal blocks
    N = len(D)

    if len(B) != N - 1:
        raise ValueError("The number of super-diagonal blocks must contain "
                         "exactly one block less than the number of diagonal "
                         "blocks")

    # Compute the size of the full matrix
    M = 0
    for i in range(N):
        if np.ndim(D[i]) != 2:
            raise ValueError("Blocks must be 2 dimensional arrays")
        d = np.shape(D[i])
        if d[0] != d[1]:
            raise ValueError("Diagonal blocks must be square")
        M += d[0]

    for i in range(N - 1):
        if np.ndim(B[i]) != 2:
            raise ValueError("Blocks must be 2 dimensional arrays")
        b = np.shape(B[i])
        if b[0] != np.shape(D[i])[1] or b[1] != np.shape(D[i + 1])[0]:
            raise ValueError("Shapes of the super-diagonal blocks do not match "
                             "the shapes of the diagonal blocks")

    A = np.zeros((M, M))
    k = 0

    for i in range(N - 1):
        (d0, d1) = np.shape(B[i])
        # Diagonal block
        A[k:k + d0, k:k + d0] = D[i]
        # Super-diagonal block
        A[k:k + d0, k + d0:k + d0 + d1] = B[i]
        # Sub-diagonal block
        A[k + d0:k + d0 + d1, k:k + d0] = B[i].T
        k += d0

    A[k:, k:] = D[-1]

    return A


def dist_haversine(c1, c2, radius=6372795):

    # Convert coordinates to radians
    lat1 = np.atleast_1d(c1[0])[..., :, None] * np.pi / 180
    lon1 = np.atleast_1d(c1[1])[..., :, None] * np.pi / 180
    lat2 = np.atleast_1d(c2[0])[..., None, :] * np.pi / 180
    lon2 = np.atleast_1d(c2[1])[..., None, :] * np.pi / 180

    dlat = lat2 - lat1
dlon = lon2 - lon1 A = np.sin(dlat/2)**2 + np.cos(lat1)*np.cos(lat2)*(np.sin(dlon/2)**2) C = 2 * np.arctan2(np.sqrt(A), np.sqrt(1-A)) return radius * C def logsumexp(X, axis=None, keepdims=False): """ Compute log(sum(exp(X)) in a numerically stable way """ X = np.asanyarray(X) maxX = np.amax(X, axis=axis, keepdims=True) if np.ndim(maxX) > 0: maxX[~np.isfinite(maxX)] = 0 elif not np.isfinite(maxX): maxX = 0 X = X - maxX if not keepdims: maxX = np.squeeze(maxX, axis=axis) return np.log(np.sum(np.exp(X), axis=axis, keepdims=keepdims)) + maxX def normalized_exp(phi): """Compute exp(phi) so that exp(phi) sums to one. This is useful for computing probabilities from log evidence. """ logsum_p = logsumexp(phi, axis=-1, keepdims=True) logp = phi - logsum_p p = np.exp(logp) # Because of small numerical inaccuracy, normalize the probabilities # again for more accurate results return ( p / np.sum(p, axis=-1, keepdims=True), logsum_p ) def invpsi(x): r""" Inverse digamma (psi) function. The digamma function is the derivative of the log gamma function. This calculates the value Y > 0 for a value X such that digamma(Y) = X. For the new version, see Appendix C: http://research.microsoft.com/en-us/um/people/minka/papers/dirichlet/minka-dirichlet.pdf For the previous implementation, see: http://www4.ncsu.edu/~pfackler/ Are there speed/accuracy differences between the methods? """ x = np.asanyarray(x) y = np.where( x >= -2.22, np.exp(x) + 0.5, -1/(x - special.psi(1)) ) for i in range(5): y = y - (special.psi(y) - x) / special.polygamma(1, y) return y # # Previous implementation. Is it worse? Is there difference? # L = 1.0 # y = np.exp(x) # while (L > 1e-10): # y += L*np.sign(x-special.psi(y)) # L /= 2 # # Ad hoc by Jaakko # y = np.where(x < -100, -1 / x, y) # return y def invgamma(x): r""" Inverse gamma function. 
    See: http://mathoverflow.net/a/28977
    """
    k = 1.461632
    c = 0.036534
    L = np.log((x + c) / np.sqrt(2 * np.pi))
    W = special.lambertw(L / np.exp(1))
    return L / W + 0.5


def mean(X, axis=None, keepdims=False):
    """
    Compute the mean, ignoring NaNs.
    """
    if np.ndim(X) == 0:
        if axis is not None:
            raise ValueError("Axis out of bounds")
        return X
    X = np.asanyarray(X)
    nans = np.isnan(X)
    X = X.copy()
    X[nans] = 0
    m = (np.sum(X, axis=axis, keepdims=keepdims) /
         np.sum(~nans, axis=axis, keepdims=keepdims))
    return m


def gradient(f, x, epsilon=1e-6):
    return optimize.approx_fprime(x, f, epsilon)


def broadcast(*arrays, ignore_axis=None):
    """
    Explicitly broadcast arrays to the same shape.

    It is possible to ignore some axes so that the arrays are not
    broadcasted along those axes.
    """
    shapes = [np.shape(array) for array in arrays]
    if ignore_axis is None:
        full_shape = broadcasted_shape(*shapes)
    else:
        try:
            ignore_axis = tuple(ignore_axis)
        except TypeError:
            ignore_axis = (ignore_axis,)
        if len(ignore_axis) != len(set(ignore_axis)):
            raise ValueError("Indices must be unique")
        if any(i >= 0 for i in ignore_axis):
            raise ValueError("Indices must be negative")
        # Put lengths of ignored axes to 1
        cut_shapes = [
            tuple(
                1 if i in ignore_axis else shape[i]
                for i in range(-len(shape), 0)
            )
            for shape in shapes
        ]
        full_shape = broadcasted_shape(*cut_shapes)
    return [np.ones(full_shape) * array for array in arrays]


def block_diag(*arrays):
    """
    Form a block diagonal array from the given arrays.

    Compared to SciPy's block_diag, this utilizes broadcasting and
    accepts arrays with more than two dimensions.
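    A minimal sketch of the block-diagonal layout for two plain 2-D
    blocks (the function above additionally broadcasts any leading
    "plate" axes, which this sketch ignores):

```python
import numpy as np

# Two blocks placed on the diagonal; off-diagonal entries stay zero.
A = np.ones((2, 2))
B = 2 * np.ones((1, 3))
Y = np.zeros((2 + 1, 2 + 3))
Y[:2, :2] = A
Y[2:, 2:] = B
```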
""" arrays = broadcast(*arrays, ignore_axis=(-1, -2)) plates = np.shape(arrays[0])[:-2] M = sum(np.shape(array)[-2] for array in arrays) N = sum(np.shape(array)[-1] for array in arrays) Y = np.zeros(plates + (M, N)) i_start = 0 j_start = 0 for array in arrays: i_end = i_start + np.shape(array)[-2] j_end = j_start + np.shape(array)[-1] Y[...,i_start:i_end,j_start:j_end] = array i_start = i_end j_start = j_end return Y def concatenate(*arrays, axis=-1): """ Concatenate arrays along a given axis. Compared to NumPy's concatenate, this utilizes broadcasting. """ # numpy.concatenate doesn't do broadcasting, so we need to do it explicitly return np.concatenate( broadcast(*arrays, ignore_axis=axis), axis=axis ) ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/utils/optimize.py0000644000175100001770000000306700000000000021035 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2011-2013 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ import numpy as np from scipy import optimize _epsilon = np.sqrt(np.finfo(float).eps) def minimize(f, x0, maxiter=None, verbose=False): """ Simple wrapper for SciPy's optimize. The given function must return a tuple: (value, gradient). """ options = {'disp': verbose} if maxiter is not None: options['maxiter'] = maxiter opt = optimize.minimize(f, x0, jac=True, method='CG', options=options) return opt.x def check_gradient(f, x0, verbose=True, epsilon=_epsilon, return_abserr=False): """ Simple wrapper for SciPy's gradient checker. The given function must return a tuple: (value, gradient). 
    Returns absolute and relative errors
    """
    df = f(x0)[1]
    df_num = optimize.approx_fprime(x0, lambda x: f(x)[0], epsilon)

    abserr = np.linalg.norm(df - df_num)
    norm_num = np.linalg.norm(df_num)
    if abserr == 0 and norm_num == 0:
        err = 0
    else:
        err = abserr / norm_num

    if verbose:
        print("Norm of numerical gradient: %g" % np.linalg.norm(df_num))
        print("Norm of function gradient:  %g" % np.linalg.norm(df))
        print("Gradient relative error = %g and absolute error = %g" %
              (err, abserr))

    return (abserr, err)


# bayespy-0.6.2/bayespy/utils/random.py
################################################################################
# Copyright (C) 2013 Jaakko Luttinen
#
# This file is licensed under the MIT License.
################################################################################

r"""
General functions for random sampling and distributions.
"""

import numpy as np
from scipy import special

from . import linalg
from . import misc


def intervals(N, length, amount=1, gap=0):
    r"""
    Return random non-overlapping parts of a sequence.

    For instance, N=16, length=2 and amount=4:

      [0, |1, 2|, 3, 4, 5, |6, 7|, 8, 9, |10, 11|, |12, 13|, 14, 15]

    that is, [1,2,6,7,10,11,12,13].

    However, the function returns only the indices of the beginning of
    the sequences, that is, in the example: [1,6,10,12]
    """
    if length * amount + gap * (amount - 1) > N:
        raise ValueError("Too short sequence")

    # In practice, we draw the sizes of the gaps between the sequences
    total_gap = N - length * amount - gap * (amount - 1)
    gaps = np.random.multinomial(total_gap, np.ones(amount + 1) / (amount + 1))

    # And then we get the beginning index of each sequence
    intervals = np.cumsum(gaps[:-1]) + np.arange(amount) * (length + gap)

    return intervals


def mask(*shape, p=0.5):
    r"""
    Return a boolean array of the given shape.
    Parameters
    ----------
    d0, d1, ..., dn : int
        Shape of the output.
    p : value in range [0,1]
        A probability that the elements are `True`.
    """
    return np.random.rand(*shape) < p


def wishart(nu, V):
    r"""
    Draw a random sample from the Wishart distribution.

    Parameters
    ----------
    nu : int
        Degrees of freedom.
    V : (D,D) ndarray
        Scale matrix.
    """
    # TODO/FIXME: Check that this is correct.
    D = np.shape(V)[0]
    if nu < D:
        raise ValueError("Degrees of freedom must be equal or greater than the "
                         "dimensionality of the matrix.")
    X = np.random.multivariate_normal(np.zeros(D), V, size=nu)
    return np.dot(X, X.T)


wishart_rand = wishart


def invwishart_rand(nu, V):
    # TODO/FIXME: Check that this is correct.
    return np.linalg.inv(wishart_rand(nu, V))


def covariance(D, size=(), nu=None):
    r"""
    Draw a random covariance matrix.

    Draws from the inverse-Wishart distribution. The distribution of
    each element is independent of the dimensionality of the matrix:

        C ~ Inv-W(I, D)

    Parameters
    ----------
    D : int
        Dimensionality of the covariance matrix.

    Returns
    -------
    C : (D,D) ndarray
        Positive-definite symmetric :math:`D\times D` matrix.
    """
    if nu is None:
        nu = D
    if nu < D:
        raise ValueError("nu must be greater than or equal to D")
    try:
        size = tuple(size)
    except TypeError:
        size = (size,)
    shape = size + (D, nu)
    C = np.random.randn(*shape)
    C = linalg.dot(C, np.swapaxes(C, -1, -2)) / nu
    return linalg.inv(C)


def correlation(D):
    r"""
    Draw a random correlation matrix.
    """
    X = np.random.randn(D, D)
    s = np.sqrt(np.sum(X**2, axis=-1, keepdims=True))
    X = X / s
    return np.dot(X, X.T)


def gaussian_logpdf(yVy, yVmu, muVmu, logdet_V, D):
    r"""
    Log-density of a Gaussian distribution.
:math:`\mathcal{G}(\mathbf{y}|\boldsymbol{\mu},\mathbf{V}^{-1})` Parameters ----------- yVy : ndarray or double :math:`\mathbf{y}^T\mathbf{Vy}` yVmu : ndarray or double :math:`\mathbf{y}^T\mathbf{V}\boldsymbol{\mu}` muVmu : ndarray or double :math:`\boldsymbol{\mu}^T\mathbf{V}\boldsymbol{\mu}` logdet_V : ndarray or double Log-determinant of the precision matrix, :math:`\log|\mathbf{V}|`. D : int Dimensionality of the distribution. """ return -0.5*yVy + yVmu - 0.5*muVmu + 0.5*logdet_V - 0.5*D*np.log(2*np.pi) def gaussian_entropy(logdet_V, D): r""" Compute the entropy of a Gaussian distribution. If you want to get the gradient, just let each parameter be a gradient of that term. Parameters ---------- logdet_V : ndarray or double The log-determinant of the precision matrix. D : int The dimensionality of the distribution. """ return -0.5*logdet_V + 0.5*D + 0.5*D*np.log(2*np.pi) def gamma_logpdf(bx, logx, a_logx, a_logb, gammaln_a): r""" Log-density of :math:`\mathcal{G}(x|a,b)`. If you want to get the gradient, just let each parameter be a gradient of that term. Parameters ---------- bx : ndarray :math:`bx` logx : ndarray :math:`\log(x)` a_logx : ndarray :math:`a \log(x)` a_logb : ndarray :math:`a \log(b)` gammaln_a : ndarray :math:`\log\Gamma(a)` """ return a_logb - gammaln_a + a_logx - logx - bx #def gamma_logpdf(a, log_b, gammaln_a, def gamma_entropy(a, log_b, gammaln_a, psi_a, a_psi_a): r""" Entropy of :math:`\mathcal{G}(a,b)`. If you want to get the gradient, just let each parameter be a gradient of that term. Parameters ---------- a : ndarray :math:`a` log_b : ndarray :math:`\log(b)` gammaln_a : ndarray :math:`\log\Gamma(a)` psi_a : ndarray :math:`\psi(a)` a_psi_a : ndarray :math:`a\psi(a)` """ return a - log_b + gammaln_a + psi_a - a_psi_a def orth(D): r""" Draw random orthogonal matrix. """ Q = np.random.randn(D,D) (Q, _) = np.linalg.qr(Q) return Q def svd(s): r""" Draw a random matrix given its singular values. 
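    A standalone sketch of the construction used by `svd` below: a
    matrix built as U*diag(s)*V.T from random orthogonal U and V has
    exactly the prescribed singular values (the `orth` helper is
    re-stated here so the sketch runs on its own):

```python
import numpy as np

# Random orthogonal matrix via QR decomposition.
def orth(D):
    (Q, _) = np.linalg.qr(np.random.randn(D, D))
    return Q

s = np.array([3.0, 2.0, 1.0])
A = np.dot(orth(3) * s, orth(3).T)   # U * diag(s) * V.T
```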
""" D = len(s) U = orth(D) * s V = orth(D) return np.dot(U, V.T) def sphere(N=1): r""" Draw random points uniformly on a unit sphere. Returns (latitude,longitude) in degrees. """ lon = np.random.uniform(-180, 180, N) lat = (np.arccos(np.random.uniform(-1, 1, N)) * 180 / np.pi) - 90 return (lat, lon) def bernoulli(p, size=None): r""" Draw random samples from the Bernoulli distribution. """ if isinstance(size, int): size = (size,) if size is None: size = np.shape(p) return (np.random.rand(*size) < p) def categorical(p, size=None): r""" Draw random samples from a categorical distribution. """ if size is None: size = np.shape(p)[:-1] if isinstance(size, int): size = (size,) if np.any(np.asanyarray(p)<0): raise ValueError("Array contains negative probabilities") if not misc.is_shape_subset(np.shape(p)[:-1], size): raise ValueError("Probability array shape and requested size are " "inconsistent") size = tuple(size) # Normalize probabilities p = p / np.sum(p, axis=-1, keepdims=True) # Compute cumulative probabilities (p_1, p_1+p_2, ..., p_1+...+p_N): P = np.cumsum(p, axis=-1) # Draw samples from interval [0,1] x = np.random.rand(*size) # For simplicity, repeat p to the size of the output (plus probability axis) K = np.shape(p)[-1] P = P * np.ones(tuple(size)+(K,)) if size == (): z = np.searchsorted(P, x) else: # Seach the indices z = np.zeros(size) inds = misc.nested_iterator(size) for ind in inds: z[ind] = np.searchsorted(P[ind], x[ind]) return z.astype(int) def multinomial(n, p, size=None): plates_n = np.shape(n) plates_p = np.shape(p)[:-1] k = np.shape(p)[-1] if size is None: size = misc.broadcasted_shape(plates_n, plates_p) if not misc.is_shape_subset(plates_n, size): raise ValueError("Shape of n does not broadcast to the given size") if not misc.is_shape_subset(plates_p, size): raise ValueError("Shape of p does not broadcast to the given size") # This isn't a very efficient implementation. 
# One could use NumPy's multinomial once for all those plates for which n and p are the same. n = np.broadcast_to(n, size) p = np.broadcast_to(p, size + (k,)) x = np.empty(size + (k,)) for i in misc.nested_iterator(size): x[i] = np.random.multinomial(n[i], p[i]) return x.astype(int) def gamma(a, b, size=None): r""" Draw random samples from the gamma distribution. """ x = np.random.gamma(a, b, size=size) if np.any(x == 0): raise RuntimeError( "Numerically zero samples. Try using a larger shape parameter in " "the gamma distribution." ) return x def dirichlet(alpha, size=None): r""" Draw random samples from the Dirichlet distribution. """ if isinstance(size, int): size = (size,) if size is None: size = np.shape(alpha) else: size = size + np.shape(alpha)[-1:] p = np.random.gamma(alpha, size=size) sump = np.sum(p, axis=-1, keepdims=True) if np.any(sump == 0): raise RuntimeError( "Numerically zero samples. Try using a larger Dirichlet " "concentration parameter value." ) p /= sump return p def logodds_to_probability(x): r""" Solve p from log(p/(1-p)), that is, compute the logistic sigmoid function. """ return 1 / (1 + np.exp(-x)) def alpha_beta_recursion(logp0, logP): r""" Compute the alpha-beta recursion for a Markov chain. Initial state log-probabilities are in `logp0` and state transition log-probabilities are in `logP`.
The probabilities do not need to be scaled to sum to one, but they are interpreted as below: logp0 = log P(z_0) + log P(y_0|z_0) logP[...,n,:,:] = log P(z_{n+1}|z_n) + log P(y_{n+1}|z_{n+1}) """ logp0 = misc.atleast_nd(logp0, 1) logP = misc.atleast_nd(logP, 3) D = np.shape(logp0)[-1] N = np.shape(logP)[-3] plates = misc.broadcasted_shape(np.shape(logp0)[:-1], np.shape(logP)[:-3]) if np.shape(logP)[-2:] != (D,D): raise ValueError("Dimension mismatch %s != %s" % (np.shape(logP)[-2:], (D,D))) # # Run the recursion algorithm # # Allocate memory logalpha = np.zeros(plates+(N,D)) logbeta = np.zeros(plates+(N,D)) g = np.zeros(plates) # Forward recursion logalpha[...,0,:] = logp0 for n in range(1,N): # Compute: P(z_{n-1},z_n|x_1,...,x_n) v = logalpha[...,n-1,:,None] + logP[...,n-1,:,:] c = misc.logsumexp(v, axis=(-1,-2)) # Sum over z_{n-1} to get: log P(z_n|x_1,...,x_n) logalpha[...,n,:] = misc.logsumexp(v - c[...,None,None], axis=-2) g -= c # Compute the normalization of the last term v = logalpha[...,N-1,:,None] + logP[...,N-1,:,:] g -= misc.logsumexp(v, axis=(-1,-2)) # Backward recursion logbeta[...,N-1,:] = 0 for n in reversed(range(N-1)): v = logbeta[...,n+1,None,:] + logP[...,n+1,:,:] c = misc.logsumexp(v, axis=(-1,-2)) logbeta[...,n,:] = misc.logsumexp(v - c[...,None,None], axis=-1) v = logalpha[...,:,:,None] + logbeta[...,:,None,:] + logP[...,:,:,:] c = misc.logsumexp(v, axis=(-1,-2)) zz = np.exp(v - c[...,None,None]) # The logsumexp normalization is not numerically accurate, so do # normalization again: zz /= np.sum(zz, axis=(-1,-2), keepdims=True) z0 = np.sum(zz[...,0,:,:], axis=-1) z0 /= np.sum(z0, axis=-1, keepdims=True) return (z0, zz, g) def gaussian_gamma_to_t(mu, Cov, a, b, ndim=1): r""" Integrate out the gamma distribution to obtain the parameters of the Student-t distribution. """ alpha = a/b nu = 2*a S = Cov / misc.add_trailing_axes(alpha, 2*ndim) return (mu, S, nu) def t_logpdf(z2, logdet_cov, nu, D): r""" Log-density of the Student-t distribution. """ return (special.gammaln((nu+D)/2) - special.gammaln(nu/2) - 0.5 * D
* np.log(nu*np.pi) - 0.5 * logdet_cov - 0.5 * (nu+D) * np.log(1 + z2/nu)) bayespy-0.6.2/bayespy/utils/tests/test_linalg.py ################################################################################ # Copyright (C) 2013 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ """ Unit tests for bayespy.utils.linalg module. """ import numpy as np from .. import misc from .. import linalg class TestDot(misc.TestCase): def test_dot(self): """ Test dot product of multiple multi-dimensional arrays.
""" # If no arrays, return 0 self.assertAllClose(linalg.dot(), 0) # If only one array, return itself self.assertAllClose(linalg.dot([[1,2,3], [4,5,6]]), [[1,2,3], [4,5,6]]) # Basic test of two arrays: (2,3) * (3,2) self.assertAllClose(linalg.dot([[1,2,3], [4,5,6]], [[7,8], [9,1], [2,3]]), [[31,19], [85,55]]) # Basic test of four arrays: (2,3) * (3,2) * (2,1) * (1,2) self.assertAllClose(linalg.dot([[1,2,3], [4,5,6]], [[7,8], [9,1], [2,3]], [[4], [5]], [[6,7]]), [[1314,1533], [3690,4305]]) # Test broadcasting: (2,2,2) * (2,2,2,2) self.assertAllClose(linalg.dot([[[1,2], [3,4]], [[5,6], [7,8]]], [[[[1,2], [3,4]], [[5,6], [7,8]]], [[[9,1], [2,3]], [[4,5], [6,7]]]]), [[[[ 7, 10], [ 15, 22]], [[ 67, 78], [ 91, 106]]], [[[ 13, 7], [ 35, 15]], [[ 56, 67], [ 76, 91]]]]) # Inconsistent shapes: (2,3) * (2,3) self.assertRaises(ValueError, linalg.dot, [[1,2,3], [4,5,6]], [[1,2,3], [4,5,6]]) # Other axes do not broadcast: (2,2,2) * (3,2,2) self.assertRaises(ValueError, linalg.dot, [[[1,2], [3,4]], [[5,6], [7,8]]], [[[1,2], [3,4]], [[5,6], [7,8]], [[9,1], [2,3]]]) # Do not broadcast matrix axes: (2,1) * (3,2) self.assertRaises(ValueError, linalg.dot, [[1], [2]], [[1,2,3], [4,5,6]]) # Do not accept less than 2-D arrays: (2) * (2,2) self.assertRaises(ValueError, linalg.dot, [1,2], [[1,2,3], [4,5,6]]) class TestBandedSolve(misc.TestCase): def test_block_banded_solve(self): """ Test the Gaussian elimination algorithm for block-banded matrices. 
""" # # Create a block-banded matrix # # Number of blocks N = 40 # Random sizes of the blocks #D = np.random.randint(5, 10, size=N) # Fixed sizes of the blocks D = 5*np.ones(N, dtype=np.int64) # Some helpful variables to create the covariances W = [np.random.randn(D[i], 2*D[i]) for i in range(N)] # The diagonal blocks (covariances) A = [np.dot(W[i], W[i].T) for i in range(N)] # The superdiagonal blocks (cross-covariances) B = [np.dot(W[i][:,-1:], W[i+1][:,:1].T) for i in range(N-1)] C = misc.block_banded(A, B) # Create the system to be solved: y=C*x x_true = np.random.randn(np.sum(D)) y = np.dot(C, x_true) x_true = np.reshape(x_true, (N, -1)) y = np.reshape(y, (N, -1)) # # Run tests # # The correct inverse invC = np.linalg.inv(C) # Inverse from the function that is tested (invA, invB, x, ldet) = linalg.block_banded_solve(np.asarray(A), np.asarray(B), np.asarray(y)) # Check that you get the correct number of blocks self.assertEqual(len(invA), N) self.assertEqual(len(invB), N-1) # Check each block i0 = 0 for i in range(N-1): i1 = i0 + D[i] i2 = i1 + D[i+1] # Check diagonal block self.assertTrue(np.allclose(invA[i], invC[i0:i1, i0:i1])) # Check super-diagonal block self.assertTrue(np.allclose(invB[i], invC[i0:i1, i1:i2])) i0 = i1 # Check last block self.assertTrue(np.allclose(invA[-1], invC[i0:, i0:])) # Check the solution of the system self.assertTrue(np.allclose(x_true, x)) # Check the log determinant self.assertAlmostEqual(ldet/np.linalg.slogdet(C)[1], 1) ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/bayespy/utils/tests/test_misc.py0000644000175100001770000004133700000000000022333 0ustar00runnerdocker00000000000000################################################################################ # Copyright (C) 2013 Jaakko Luttinen # # This file is licensed under the MIT License. 
################################################################################ """ Unit tests for bayespy.utils.misc module. """ import unittest import warnings import numpy as np from scipy.special import psi from numpy import testing from .. import misc class TestCeilDiv(misc.TestCase): def test_ceildiv(self): """ Test the ceil division """ self.assertEqual(misc.ceildiv(3, 1), 3) self.assertEqual(misc.ceildiv(6, 3), 2) self.assertEqual(misc.ceildiv(7, 3), 3) self.assertEqual(misc.ceildiv(8, 3), 3) self.assertEqual(misc.ceildiv(-6, 3), -2) self.assertEqual(misc.ceildiv(-7, 3), -2) self.assertEqual(misc.ceildiv(-8, 3), -2) self.assertEqual(misc.ceildiv(-9, 3), -3) self.assertEqual(misc.ceildiv(6, -3), -2) self.assertEqual(misc.ceildiv(7, -3), -2) self.assertEqual(misc.ceildiv(8, -3), -2) self.assertEqual(misc.ceildiv(9, -3), -3) class TestAddAxes(misc.TestCase): def test_add_axes(self): """ Test the add_axes method. """ f = lambda X, **kwargs: np.shape(misc.add_axes(X, **kwargs)) # By default, add one leading axis self.assertEqual(f(np.ones((3,))), (1,3)) # By default, add leading axes self.assertEqual(f(np.ones((3,)), num=3), (1,1,1,3)) # By default, add one axis self.assertEqual(f(np.ones((3,)), axis=1), (3,1)) # Add axes to the beginning self.assertEqual(f(np.ones((2,3,4,)), axis=0, num=3), (1,1,1,2,3,4)) # Add axes to the middle self.assertEqual(f(np.ones((2,3,4,)), axis=1, num=3), (2,1,1,1,3,4)) # Test negative axis index self.assertEqual(f(np.ones((2,3,4,)), axis=-4, num=3), (1,1,1,2,3,4)) self.assertEqual(f(np.ones((2,3,4,)), axis=-1, num=1), (2,3,4,1)) self.assertEqual(f(np.ones((2,3,4,)), axis=-2, num=3), (2,3,1,1,1,4)) # Add axes to the end self.assertEqual(f(np.ones((2,3,4,)), axis=3, num=3), (2,3,4,1,1,1)) class TestBroadcasting(unittest.TestCase): def test_is_shape_subset(self): f = misc.is_shape_subset self.assertTrue(f( (), () )) self.assertTrue(f( (), (3,) )) self.assertTrue(f( (1,), (1,) )) self.assertTrue(f( (1,), (3,) )) self.assertTrue(f( 
(1,), (4,1) )) self.assertTrue(f( (1,), (4,3) )) self.assertTrue(f( (1,), (1,3) )) self.assertTrue(f( (3,), (1,3) )) self.assertTrue(f( (3,), (4,3) )) self.assertTrue(f( (5,1,3), (6,5,4,3) )) self.assertTrue(f( (5,4,3), (6,5,4,3) )) self.assertFalse(f( (1,), () )) self.assertFalse(f( (3,), (1,) )) self.assertFalse(f( (4,3,), (3,) )) self.assertFalse(f( (4,3,), (1,3,) )) self.assertFalse(f( (6,1,4,3,), (6,1,1,3,) )) class TestMultiplyShapes(unittest.TestCase): def test_multiply_shapes(self): f = lambda *shapes: tuple(misc.multiply_shapes(*shapes)) # Basic test self.assertEqual(f((2,), (3,)), (6,)) # Test multiple arguments self.assertEqual(f((2,), (3,), (4,)), (24,)) # Test different lengths and multiple arguments self.assertEqual(f(( 2,3,), (4,5,6,), ( 7,)), (4,10,126,)) # Test empty shapes self.assertEqual(f((), ()), ()) self.assertEqual(f((), (5,)), (5,)) class TestSumMultiply(unittest.TestCase): def check_sum_multiply(self, *shapes, **kwargs): # The set of arrays x = list() for (ind, shape) in enumerate(shapes): x += [np.random.randn(*shape)] # Result from the function yh = misc.sum_multiply(*x, **kwargs) axis = kwargs.get('axis', None) sumaxis = kwargs.get('sumaxis', True) keepdims = kwargs.get('keepdims', False) # Compute the product y = 1 for xi in x: y = y * xi # Compute the sum if sumaxis: y = np.sum(y, axis=axis, keepdims=keepdims) else: axes = np.arange(np.ndim(y)) # TODO/FIXME: np.delete has a bug that it doesn't accept negative # indices. Thus, transform negative axes to positive axes. if len(axis) > 0: axis = [i if i >= 0 else i+np.ndim(y) for i in axis] elif axis < 0: axis += np.ndim(y) axes = np.delete(axes, axis) axes = tuple(axes) if len(axes) > 0: y = np.sum(y, axis=axes, keepdims=keepdims) # Check the result testing.assert_allclose(yh, y, err_msg="Incorrect value.") def test_sum_multiply(self): """ Test misc.sum_multiply. 
""" # Check empty list returns error self.assertRaises(ValueError, self.check_sum_multiply) # Check scalars self.check_sum_multiply(()) self.check_sum_multiply((), (), ()) # Check doing no summation self.check_sum_multiply((3,), axis=()) self.check_sum_multiply((3,1,5), ( 4,1), ( 5,), ( ), axis=(), keepdims=True) # Check AXES_TO_SUM self.check_sum_multiply((3,1), (1,4), (3,4), axis=(1,)) self.check_sum_multiply((3,1), (1,4), (3,4), axis=(-2,)) self.check_sum_multiply((3,1), (1,4), (3,4), axis=(1,-2)) # Check AXES_TO_SUM and KEEPDIMS self.check_sum_multiply((3,1), (1,4), (3,4), axis=(1,), keepdims=True) self.check_sum_multiply((3,1), (1,4), (3,4), axis=(-2,), keepdims=True) self.check_sum_multiply((3,1), (1,4), (3,4), axis=(1,-2,), keepdims=True) self.check_sum_multiply((3,1,5,6), ( 4,1,6), ( 4,1,1), ( ), axis=(1,-2), keepdims=True) # Check AXES_TO_KEEP self.check_sum_multiply((3,1), (1,4), (3,4), sumaxis=False, axis=(1,)) self.check_sum_multiply((3,1), (1,4), (3,4), sumaxis=False, axis=(-2,)) self.check_sum_multiply((3,1), (1,4), (3,4), sumaxis=False, axis=(1,-2)) # Check AXES_TO_KEEP and KEEPDIMS self.check_sum_multiply((3,1), (1,4), (3,4), sumaxis=False, axis=(1,), keepdims=True) self.check_sum_multiply((3,1), (1,4), (3,4), sumaxis=False, axis=(-2,), keepdims=True) self.check_sum_multiply((3,1), (1,4), (3,4), sumaxis=False, axis=(1,-2,), keepdims=True) self.check_sum_multiply((3,1,5,6), ( 4,1,6), ( 4,1,1), ( ), sumaxis=False, axis=(1,-2,), keepdims=True) # Check errors # Inconsistent shapes self.assertRaises(ValueError, self.check_sum_multiply, (3,4), (3,5)) # Axis index out of bounds self.assertRaises(ValueError, self.check_sum_multiply, (3,4), (3,4), axis=(-3,)) self.assertRaises(ValueError, self.check_sum_multiply, (3,4), (3,4), axis=(2,)) self.assertRaises(ValueError, self.check_sum_multiply, (3,4), (3,4), sumaxis=False, axis=(-3,)) self.assertRaises(ValueError, self.check_sum_multiply, (3,4), (3,4), sumaxis=False, axis=(2,)) # Same axis several times 
self.assertRaises(ValueError, self.check_sum_multiply, (3,4), (3,4), axis=(1,-1)) self.assertRaises(ValueError, self.check_sum_multiply, (3,4), (3,4), sumaxis=False, axis=(1,-1)) class TestLogSumExp(misc.TestCase): def test_logsumexp(self): """ Test the log-sum-exp computation """ self.assertAllClose(misc.logsumexp(3), np.log(np.sum(np.exp(3)))) with warnings.catch_warnings(): warnings.simplefilter("ignore", RuntimeWarning) self.assertAllClose(misc.logsumexp(-np.inf), -np.inf) self.assertAllClose(misc.logsumexp(np.inf), np.inf) with warnings.catch_warnings(): warnings.simplefilter("ignore", RuntimeWarning) self.assertAllClose(misc.logsumexp(np.nan), np.nan) with warnings.catch_warnings(): warnings.simplefilter("ignore", RuntimeWarning) self.assertAllClose(misc.logsumexp([-np.inf, -np.inf]), -np.inf) self.assertAllClose(misc.logsumexp([[1e10, 1e-10], [-1e10, -np.inf]], axis=-1), [1e10, -1e10]) # Test keeping dimensions self.assertAllClose(misc.logsumexp([[1e10, 1e-10], [-1e10, -np.inf]], axis=-1, keepdims=True), [[1e10], [-1e10]]) # Test multiple axes self.assertAllClose(misc.logsumexp([[1e10, 1e-10], [-1e10, -np.inf]], axis=(-1,-2)), 1e10) pass class TestMean(misc.TestCase): def test_mean(self): """ Test the NaN-propagating mean computation """ self.assertAllClose(misc.mean(3), 3) with warnings.catch_warnings(): warnings.simplefilter("ignore", RuntimeWarning) self.assertAllClose(misc.mean(np.nan), np.nan) self.assertAllClose(misc.mean([[2,3], [np.nan,np.nan]], axis=-1), [2.5,np.nan]) self.assertAllClose(misc.mean([[2,3], [np.nan,np.nan]], axis=-1, keepdims=True), [[2.5],[np.nan]]) self.assertAllClose(misc.mean([[2,3], [np.nan,np.nan]], axis=-2), [2,3]) self.assertAllClose(misc.mean([[2,3], [np.nan,np.nan]], axis=-2, keepdims=True), [[2,3]]) self.assertAllClose(misc.mean([[2,3], [np.nan,np.nan]]), 2.5) self.assertAllClose(misc.mean([[2,3], [np.nan,np.nan]], axis=(-1,-2)), 2.5) self.assertAllClose(misc.mean([[2,3], [np.nan,np.nan]], keepdims=True), [[2.5]]) pass class TestInvPsi(misc.TestCase):
def test_invpsi(self): x = 1000 y = psi(x) self.assertAllClose(misc.invpsi(y), x) x = 1/1000 y = psi(x) self.assertAllClose(misc.invpsi(y), x, rtol=1e-3) x = 50*np.random.rand(5) y = psi(x) self.assertAllClose(misc.invpsi(y), x) pass class TestPutSimple(misc.TestCase): def test_put_simple(self): # Scalar indices self.assertAllClose( misc.put_simple( 42, 2, ), [0, 0, 42], ) # Simple vectors, automatic length self.assertAllClose( misc.put_simple( [1, 0.1, 0.01, 0.001, 0.0001], [3, 3, 1, 3, 0], ), [0.0001, 0.01, 0, 1.101], ) # Matrix indices self.assertAllClose( misc.put_simple( [[1, 0.1], [0.01, 0.001]], [[4, 1], [1, 3]], ), [0, 0.11, 0, 0.001, 1], ) # Test axis self.assertAllClose( misc.put_simple( [[1, 0.1], [0.01, 0.001], [0.0001, 0.00001]], [3, 3, 0], axis=-2, ), [[0.0001, 0.00001], [0, 0], [0, 0], [1.01, 0.101]], ) # Test explicit length self.assertAllClose( misc.put_simple( [1, 0.1, 0.01, 0.001, 0.0001], [3, 3, 1, 3, 0], length=6, ), [0.0001, 0.01, 0, 1.101, 0, 0], ) # Test broadcasting self.assertAllClose( misc.put_simple( 2, [3, 3, 1, 3, 0], ), [2, 2, 0, 6], ) # Test leading axes in y self.assertAllClose( misc.put_simple( [[1, 0.1], [0.01, 0.001], [0.0001, 0.00001]], [2, 0], ), [[0.1, 0, 1], [0.001, 0, 0.01], [0.00001, 0, 0.0001]], ) pass bayespy-0.6.2/bayespy/utils/tests/test_random.py ################################################################################ # Copyright (C) 2014 Jaakko Luttinen # # This file is licensed under the MIT License. ################################################################################ """ Unit tests for bayespy.utils.random module. """ import numpy as np from .. import misc from ..
import random class TestCategorical(misc.TestCase): def test_categorical(self): # Test dummy one category y = random.categorical([1]) self.assertEqual(y, 0) # Test multiple categories y = random.categorical([1,0,0]) self.assertEqual(y, 0) y = random.categorical([0,1,0]) self.assertEqual(y, 1) y = random.categorical([0,0,1]) self.assertEqual(y, 2) # Test un-normalized probabilities y = random.categorical([0,0.1234]) self.assertEqual(y, 1) # Test multiple distributions y = random.categorical([ [1,0,0], [0,0,1], [0,1,0] ]) self.assertArrayEqual(y, [0,2,1]) # Test multiple samples y = random.categorical([0,1,0], size=(4,)) self.assertArrayEqual(y, [1,1,1,1]) # # ERRORS # # Negative probabilities self.assertRaises(ValueError, random.categorical, [0, -1]) # Requested size and probability array size mismatch self.assertRaises(ValueError, random.categorical, [[1,0],[0,1]], size=(3,)) pass class TestDirichlet(misc.TestCase): """ Unit tests for the Dirichlet random sampling """ def test(self): """ Test random sampling from the Dirichlet distribution.
""" # Test computations p = random.dirichlet([1e-10, 1e-10, 1e10, 1e-10]) self.assertAllClose(p, [0, 0, 1, 0], atol=1e-5) p = random.dirichlet([1e20, 1e20, 1e20, 5*1e20]) self.assertAllClose(p, [0.125, 0.125, 0.125, 0.625]) # Test array p = random.dirichlet([ [1e20, 1e-20], [3*1e20, 1e20] ]) self.assertAllClose(p, [[1.0, 0.0], [0.75, 0.25]]) # Test size argument p = random.dirichlet([ [1e20, 1e-20] ], size=3) self.assertAllClose(p, [[1, 0], [1, 0], [1, 0]]) p = random.dirichlet([ [3*1e20, 1e20] ], size=(2,3)) self.assertAllClose(p, [ [[0.75, 0.25], [0.75, 0.25], [0.75, 0.25]], [[0.75, 0.25], [0.75, 0.25], [0.75, 0.25]] ]) pass class TestAlphaBetaRecursion(misc.TestCase): def test(self): """ Test the results of alpha-beta recursion for Markov chains """ np.seterr(divide='ignore') # Deterministic oscillator p0 = np.array([1.0, 0.0]) P = np.array(3*[[[0.0, 1.0], [1.0, 0.0]]]) (z0, zz, g) = random.alpha_beta_recursion(np.log(p0), np.log(P)) self.assertAllClose(z0, [1.0, 0]) self.assertAllClose(zz, [ [[0.0, 1.0], [0.0, 0.0]], [[0.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [0.0, 0.0]] ]) self.assertAllClose(g, -np.log(np.einsum('a,ab,bc,cd->', p0, P[0], P[1], P[2])), msg="Cumulant generating function incorrect") # Maximum randomness p0 = np.array([0.5, 0.5]) P = np.array(3*[[[0.5, 0.5], [0.5, 0.5]]]) (z0, zz, g) = random.alpha_beta_recursion(np.log(p0), np.log(P)) self.assertAllClose(z0, [0.5, 0.5]) self.assertAllClose(zz, [ [[0.25, 0.25], [0.25, 0.25]], [[0.25, 0.25], [0.25, 0.25]], [[0.25, 0.25], [0.25, 0.25]] ]) self.assertAllClose(g, -np.log(np.einsum('a,ab,bc,cd->', p0, P[0], P[1], P[2])), msg="Cumulant generating function incorrect") # Unnormalized probabilities p0 = np.array([2, 2]) P = np.array([ [[4, 4], [4, 4]], [[8, 8], [8, 8]], [[20, 20], [20, 20]] ]) (z0, zz, g) = random.alpha_beta_recursion(np.log(p0), np.log(P)) self.assertAllClose(z0, [0.5, 0.5]) self.assertAllClose(zz, [ [[0.25, 0.25], [0.25, 0.25]], [[0.25, 0.25], [0.25, 0.25]], [[0.25, 0.25], [0.25, 0.25]] ]) 
self.assertAllClose(g, -np.log(np.einsum('a,ab,bc,cd->', p0, P[0], P[1], P[2])), msg="Cumulant generating function incorrect") p0 = np.array([2, 6]) P = np.array([ [[0, 3], [4, 1]], [[3, 5], [6, 4]], [[9, 2], [8, 1]] ]) (z0, zz, g) = random.alpha_beta_recursion(np.log(p0), np.log(P)) y0 = np.einsum('a,ab,bc,cd->a', p0, P[0], P[1], P[2]) y1 = np.einsum('a,ab,bc,cd->ab', p0, P[0], P[1], P[2]) y2 = np.einsum('a,ab,bc,cd->bc', p0, P[0], P[1], P[2]) y3 = np.einsum('a,ab,bc,cd->cd', p0, P[0], P[1], P[2]) self.assertAllClose(z0, y0 / np.sum(y0)) self.assertAllClose(zz, [ y1 / np.sum(y1), y2 / np.sum(y2), y3 / np.sum(y3) ]) self.assertAllClose(g, -np.log(np.einsum('a,ab,bc,cd->', p0, P[0], P[1], P[2])), msg="Cumulant generating function incorrect") # Test plates p0 = np.array([ [1.0, 0.0], [0.5, 0.5] ]) P = np.array([ [ [[0.0, 1.0], [1.0, 0.0]] ], [ [[0.5, 0.5], [0.5, 0.5]] ] ]) (z0, zz, g) = random.alpha_beta_recursion(np.log(p0), np.log(P)) self.assertAllClose(z0, [[1.0, 0.0], [0.5, 0.5]]) self.assertAllClose(zz, [ [ [[0.0, 1.0], [0.0, 0.0]] ], [ [[0.25, 0.25], [0.25, 0.25]] ] ]) self.assertAllClose(g, -np.log(np.einsum('...a,...ab->...', p0, P[...,0,:,:])), msg="Cumulant generating function incorrect") # Test overflow logp0 = np.array([1e5, -np.inf]) logP = np.array([[[-np.inf, 1e5], [-np.inf, 1e5]]]) (z0, zz, g) = random.alpha_beta_recursion(logp0, logP) self.assertAllClose(z0, [1.0, 0]) self.assertAllClose(zz, [ [[0.0, 1.0], [0.0, 0.0]] ]) ## self.assertAllClose(g, ## -np.log(np.einsum('a,ab,bc,cd->', ## p0, P[0], P[1], P[2]))) # Test underflow logp0 = np.array([-1e5, -np.inf]) logP = np.array([[[-np.inf, -1e5], [-np.inf, -1e5]]]) (z0, zz, g) = random.alpha_beta_recursion(logp0, logP) self.assertAllClose(z0, [1.0, 0]) self.assertAllClose(zz, [ [[0.0, 1.0], [0.0, 0.0]] ]) ## self.assertAllClose(g, ## -np.log(np.einsum('a,ab,bc,cd->', ## p0, P[0], P[1], P[2]))) # Test stability of the algorithm logp0 = np.array([-1e5, -np.inf]) logP = np.array(10*[[[-np.inf, 1e5], [1e0, 
-np.inf]]]) (z0, zz, g) = random.alpha_beta_recursion(logp0, logP) self.assertTrue(np.all(~np.isnan(z0)), msg="Nans in results, algorithm not stable") self.assertTrue(np.all(~np.isnan(zz)), msg="Nans in results, algorithm not stable") self.assertTrue(np.all(~np.isnan(g)), msg="Nans in results, algorithm not stable") pass bayespy-0.6.2/bayespy.egg-info/PKG-INFO Metadata-Version: 2.1 Name: bayespy Version: 0.6.2 Summary: Variational Bayesian inference tools for Python Home-page: http://bayespy.org Author: Jaakko Luttinen Author-email: jaakko.luttinen@iki.fi License: UNKNOWN Description: BayesPy - Bayesian Python ========================= BayesPy provides tools for Bayesian inference with Python. The user constructs a model as a Bayesian network, observes data and runs posterior inference. The goal is to provide a tool which is efficient, flexible and extendable enough for expert use but also accessible for more casual users. Currently, only variational Bayesian inference for conjugate-exponential family (variational message passing) has been implemented. Future work includes variational approximations for other types of distributions and possibly other approximate inference methods such as expectation propagation, Laplace approximations, Markov chain Monte Carlo (MCMC) and other methods. Contributions are welcome. Project information ------------------- Copyright (C) 2011-2017 Jaakko Luttinen and other contributors (see below) BayesPy including the documentation is licensed under the MIT License.
See LICENSE file for a text of the license or visit http://opensource.org/licenses/MIT. .. |chat| image:: https://badges.gitter.im/Join%20Chat.svg :target: https://gitter.im/bayespy/bayespy?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge .. |release| image:: https://badge.fury.io/py/bayespy.svg :target: https://pypi.python.org/pypi/bayespy .. |conda-release| image:: https://anaconda.org/conda-forge/bayespy/badges/installer/conda.svg :target: https://anaconda.org/conda-forge/bayespy ============== ============================================= Latest release |release| |conda-release| Documentation http://bayespy.org Repository https://github.com/bayespy/bayespy.git Bug reports https://github.com/bayespy/bayespy/issues Author Jaakko Luttinen jaakko.luttinen@iki.fi Chat |chat| Mailing list bayespy@googlegroups.com ============== ============================================= Continuous integration ++++++++++++++++++++++ .. |travismaster| image:: https://travis-ci.org/bayespy/bayespy.svg?branch=master :target: https://travis-ci.org/bayespy/bayespy/ :align: middle .. |travisdevelop| image:: https://travis-ci.org/bayespy/bayespy.svg?branch=develop :target: https://travis-ci.org/bayespy/bayespy/ :align: middle .. |covermaster| image:: https://coveralls.io/repos/bayespy/bayespy/badge.svg?branch=master :target: https://coveralls.io/r/bayespy/bayespy?branch=master :align: middle .. |coverdevelop| image:: https://coveralls.io/repos/bayespy/bayespy/badge.svg?branch=develop :target: https://coveralls.io/r/bayespy/bayespy?branch=develop :align: middle .. |docsmaster| image:: https://img.shields.io/badge/docs-master-blue.svg?style=flat :target: http://www.bayespy.org/en/stable/ :align: middle .. 
|docsdevelop| image:: https://img.shields.io/badge/docs-develop-blue.svg?style=flat :target: http://www.bayespy.org/en/latest/ :align: middle ==================== =============== ============== ============= Branch Test status Test coverage Documentation ==================== =============== ============== ============= **master (stable)** |travismaster| |covermaster| |docsmaster| **develop (latest)** |travisdevelop| |coverdevelop| |docsdevelop| ==================== =============== ============== ============= Similar projects ---------------- `VIBES <http://vibes.sourceforge.net/>`_ allows variational inference to be performed automatically on a Bayesian network. It is implemented in Java and released under the revised BSD license. `Bayes Blocks <http://research.ics.aalto.fi/bayes/software/>`_ is a C++/Python implementation of the variational building block framework. The framework allows easy learning of a wide variety of models using variational Bayesian learning. It is available as free software under the GNU General Public License. `Infer.NET <http://research.microsoft.com/infernet/>`_ is a .NET framework for machine learning. It provides message-passing algorithms and statistical routines for performing Bayesian inference. It is partly closed source and licensed for non-commercial use only. `PyMC <https://github.com/pymc-devs/pymc>`_ provides MCMC methods in Python. It is released under the Academic Free License. `OpenBUGS <http://www.openbugs.info>`_ is a software package for performing Bayesian inference using Gibbs sampling. It is released under the GNU General Public License. `Dimple <http://dimple.probprog.org/>`_ provides Gibbs sampling, belief propagation and a few other inference algorithms for Matlab and Java. It is released under the Apache License. `Stan <http://mc-stan.org/>`_ provides inference using MCMC with an interface for R and Python. It is released under the New BSD License.
`PBNT - Python Bayesian Network Toolbox <http://pbnt.berlios.de/>`_ is a Bayesian network library in Python supporting static networks with discrete variables. No licensing information was available. Contributors ------------ The list of contributors: * Jaakko Luttinen * Hannu Hartikainen * Deebul Nair * Christopher Cramer * Till Hoffmann Each file or the git log can be used for more detailed information. Keywords: variational Bayes,probabilistic programming,Bayesian networks,graphical models,variational message passing Platform: UNKNOWN Classifier: Programming Language :: Python :: 3 :: Only Classifier: Programming Language :: Python :: 3.3 Classifier: Programming Language :: Python :: 3.4 Classifier: Development Status :: 4 - Beta Classifier: Environment :: Console Classifier: Intended Audience :: Developers Classifier: Intended Audience :: Science/Research Classifier: License :: OSI Approved :: MIT License Classifier: Operating System :: OS Independent Classifier: Topic :: Scientific/Engineering Classifier: Topic :: Scientific/Engineering :: Information Analysis Provides-Extra: doc Provides-Extra: dev bayespy-0.6.2/bayespy.egg-info/SOURCES.txt CHANGELOG.rst INSTALL.rst LICENSE MANIFEST.in README.rst setup.cfg setup.py versioneer.py bayespy/__init__.py bayespy/_meta.py bayespy/_version.py bayespy/discrete_example.py bayespy/plot.py bayespy/testing.py bayespy.egg-info/PKG-INFO bayespy.egg-info/SOURCES.txt bayespy.egg-info/dependency_links.txt bayespy.egg-info/entry_points.txt bayespy.egg-info/requires.txt bayespy.egg-info/top_level.txt bayespy/demos/__init__.py bayespy/demos/annealing.py bayespy/demos/black_box.py bayespy/demos/categorical.py bayespy/demos/collapsed_cg.py bayespy/demos/gamma_shape.py bayespy/demos/hmm.py bayespy/demos/lda.py bayespy/demos/lssm.py
bayespy/demos/lssm_sd.py bayespy/demos/lssm_tvd.py bayespy/demos/mog.py bayespy/demos/pattern_search.py bayespy/demos/pca.py bayespy/demos/saving.py bayespy/demos/stochastic_inference.py bayespy/inference/__init__.py bayespy/inference/vmp/__init__.py bayespy/inference/vmp/transformations.py bayespy/inference/vmp/vmp.py bayespy/inference/vmp/nodes/CovarianceFunctions.py bayespy/inference/vmp/nodes/GaussianProcesses.py bayespy/inference/vmp/nodes/__init__.py bayespy/inference/vmp/nodes/add.py bayespy/inference/vmp/nodes/bernoulli.py bayespy/inference/vmp/nodes/beta.py bayespy/inference/vmp/nodes/binomial.py bayespy/inference/vmp/nodes/categorical.py bayespy/inference/vmp/nodes/categorical_markov_chain.py bayespy/inference/vmp/nodes/concat_gaussian.py bayespy/inference/vmp/nodes/concatenate.py bayespy/inference/vmp/nodes/constant.py bayespy/inference/vmp/nodes/converters.py bayespy/inference/vmp/nodes/deterministic.py bayespy/inference/vmp/nodes/dirichlet.py bayespy/inference/vmp/nodes/dot.py bayespy/inference/vmp/nodes/expfamily.py bayespy/inference/vmp/nodes/exponential.py bayespy/inference/vmp/nodes/gamma.py bayespy/inference/vmp/nodes/gate.py bayespy/inference/vmp/nodes/gaussian.py bayespy/inference/vmp/nodes/gaussian_markov_chain.py bayespy/inference/vmp/nodes/gp.py bayespy/inference/vmp/nodes/logistic.py bayespy/inference/vmp/nodes/logpdf.py bayespy/inference/vmp/nodes/mixture.py bayespy/inference/vmp/nodes/ml.py bayespy/inference/vmp/nodes/multinomial.py bayespy/inference/vmp/nodes/node.py bayespy/inference/vmp/nodes/point_estimate.py bayespy/inference/vmp/nodes/poisson.py bayespy/inference/vmp/nodes/stochastic.py bayespy/inference/vmp/nodes/take.py bayespy/inference/vmp/nodes/wishart.py bayespy/inference/vmp/nodes/tests/__init__.py bayespy/inference/vmp/nodes/tests/test_bernoulli.py bayespy/inference/vmp/nodes/tests/test_beta.py bayespy/inference/vmp/nodes/tests/test_binomial.py bayespy/inference/vmp/nodes/tests/test_categorical.py 
bayespy/inference/vmp/nodes/tests/test_categorical_markov_chain.py bayespy/inference/vmp/nodes/tests/test_concatenate.py bayespy/inference/vmp/nodes/tests/test_deterministic.py bayespy/inference/vmp/nodes/tests/test_dirichlet.py bayespy/inference/vmp/nodes/tests/test_dot.py bayespy/inference/vmp/nodes/tests/test_gamma.py bayespy/inference/vmp/nodes/tests/test_gate.py bayespy/inference/vmp/nodes/tests/test_gaussian.py bayespy/inference/vmp/nodes/tests/test_gaussian_markov_chain.py bayespy/inference/vmp/nodes/tests/test_mixture.py bayespy/inference/vmp/nodes/tests/test_multinomial.py bayespy/inference/vmp/nodes/tests/test_node.py bayespy/inference/vmp/nodes/tests/test_poisson.py bayespy/inference/vmp/nodes/tests/test_take.py bayespy/inference/vmp/nodes/tests/test_wishart.py bayespy/inference/vmp/tests/__init__.py bayespy/inference/vmp/tests/test_annealing.py bayespy/inference/vmp/tests/test_transformations.py bayespy/nodes/__init__.py bayespy/tests/__init__.py bayespy/tests/baseline_images/test_plot/contour.png bayespy/tests/baseline_images/test_plot/gaussian_mixture.png bayespy/tests/baseline_images/test_plot/hinton_p.png bayespy/tests/baseline_images/test_plot/hinton_r.png bayespy/tests/baseline_images/test_plot/hinton_z.png bayespy/tests/baseline_images/test_plot/pdf.png bayespy/utils/__init__.py bayespy/utils/linalg.py bayespy/utils/misc.py bayespy/utils/optimize.py bayespy/utils/random.py bayespy/utils/covfunc/__init__.py bayespy/utils/covfunc/covariance.py bayespy/utils/tests/__init__.py bayespy/utils/tests/test_linalg.py bayespy/utils/tests/test_misc.py bayespy/utils/tests/test_random.py doc/Makefile doc/source/conf.py doc/source/demos.rst doc/source/index.rst doc/source/intro.rst doc/source/nodes.rst doc/source/references.rst doc/source/_templates/autosummary/class.rst doc/source/_templates/autosummary/module.rst doc/source/_templates/autosummary/short_module.rst doc/source/dev_api/dev_api.rst doc/source/dev_api/distributions.rst 
doc/source/dev_api/moments.rst doc/source/dev_api/nodes.rst doc/source/dev_api/utils.rst doc/source/dev_guide/advanced.rst doc/source/dev_guide/dev_guide.rst doc/source/dev_guide/engine.rst doc/source/dev_guide/vmp.rst doc/source/dev_guide/workflow.rst doc/source/dev_guide/writingnodes.rst doc/source/dev_guide/vmp/vmp_gamma.rst doc/source/dev_guide/vmp/vmp_gaussian.rst doc/source/dev_guide/vmp/vmp_gaussian_gamma.rst doc/source/dev_guide/vmp/vmp_gaussian_wishart.rst doc/source/dev_guide/vmp/vmp_mixture.rst doc/source/dev_guide/vmp/vmp_normal.rst doc/source/dev_guide/vmp/vmp_normal_gamma.rst doc/source/dev_guide/vmp/vmp_wishart.rst doc/source/examples/additive_fhmm.rst doc/source/examples/bmm.rst doc/source/examples/examples.rst doc/source/examples/gmm.rst doc/source/examples/hmm.rst doc/source/examples/lda.rst doc/source/examples/lssm.rst doc/source/examples/pca.rst doc/source/user_api/user_api.rst doc/source/user_guide/advanced.rst doc/source/user_guide/inference.rst doc/source/user_guide/install.rst doc/source/user_guide/modelconstruct.rst doc/source/user_guide/plot.rst doc/source/user_guide/quickstart.rst doc/source/user_guide/quickstartbackup.py doc/source/user_guide/quickstartbackup.rst doc/source/user_guide/user_guide.rst././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273981.0 bayespy-0.6.2/bayespy.egg-info/dependency_links.txt0000644000175100001770000000000100000000000023215 0ustar00runnerdocker00000000000000 ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273981.0 bayespy-0.6.2/bayespy.egg-info/entry_points.txt0000644000175100001770000000007200000000000022444 0ustar00runnerdocker00000000000000[nose.plugins] warnaserror = bayespy.testing:WarnAsError ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273981.0 bayespy-0.6.2/bayespy.egg-info/requires.txt0000644000175100001770000000025700000000000021553 
0ustar00runnerdocker00000000000000numpy>=1.10.0 scipy>=0.13.0 h5py truncnorm [dev] nose nosebook [doc] sphinx>=1.4.0 sphinxcontrib-tikz>=0.4.2 sphinxcontrib-bayesnet sphinxcontrib-bibtex nbsphinx matplotlib ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273981.0 bayespy-0.6.2/bayespy.egg-info/top_level.txt0000644000175100001770000000001000000000000021670 0ustar00runnerdocker00000000000000bayespy ././@PaxHeader0000000000000000000000000000003300000000000011451 xustar000000000000000027 mtime=1725273981.417372 bayespy-0.6.2/doc/0000755000175100001770000000000000000000000014546 5ustar00runnerdocker00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/doc/Makefile0000644000175100001770000001572100000000000016214 0ustar00runnerdocker00000000000000# Makefile for Sphinx documentation # # You can set these variables from the command line. SPHINXOPTS = -v SPHINXBUILD = python -m sphinx # sphinx-build PAPER = BUILDDIR = build # User-friendly check for sphinx-build #ifeq ($(shell which ${SPHINXBUILD%% *} >/dev/null 2>&1; echo $$?), 1) #$(error The '$(SPHINXBUILD)' command was not found. Make sure you have Sphinx installed, then set the SPHINXBUILD environment variable to point to the full path of the '${SPHINXBUILD%% *}' executable. Alternatively you can add the directory with the executable to your PATH. If you don't have Sphinx installed, grab it from http://sphinx-doc.org/) #endif # Internal variables. 
PAPEROPT_a4 = -D latex_paper_size=a4 PAPEROPT_letter = -D latex_paper_size=letter ALLSPHINXOPTS = -d $(BUILDDIR)/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) source # the i18n builder cannot share the environment and doctrees with the others I18NSPHINXOPTS = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) source .PHONY: help clean html dirhtml singlehtml pickle json htmlhelp qthelp devhelp epub latex latexpdf text man changes linkcheck doctest gettext help: @echo "Please use \`make <target>' where <target> is one of" @echo " html to make standalone HTML files" @echo " dirhtml to make HTML files named index.html in directories" @echo " singlehtml to make a single large HTML file" @echo " pickle to make pickle files" @echo " json to make JSON files" @echo " htmlhelp to make HTML files and a HTML help project" @echo " qthelp to make HTML files and a qthelp project" @echo " devhelp to make HTML files and a Devhelp project" @echo " epub to make an epub" @echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter" @echo " latexpdf to make LaTeX files and run them through pdflatex" @echo " latexpdfja to make LaTeX files and run them through platex/dvipdfmx" @echo " text to make text files" @echo " man to make manual pages" @echo " texinfo to make Texinfo files" @echo " info to make Texinfo files and run them through makeinfo" @echo " gettext to make PO message catalogs" @echo " changes to make an overview of all changed/added/deprecated items" @echo " xml to make Docutils-native XML files" @echo " pseudoxml to make pseudoxml-XML files for display purposes" @echo " linkcheck to check all external links for integrity" @echo " doctest to run all doctests embedded in the documentation (if enabled)" cleangenerated: rm -rf source/*/generated clean: cleangenerated rm -rf $(BUILDDIR)/* rm -rf source/*/*_files gh-pages: make clean cd build && git clone https://github.com/bayespy/bayespy.git -b gh-pages html make latexpdf make html cd build/html && git add -f *.png *.html *.txt *.js *.pdf *.py *.css
* && git commit -a && git push html: $(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html @echo @echo "Build finished. The HTML pages are in $(BUILDDIR)/html." dirhtml: $(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml @echo @echo "Build finished. The HTML pages are in $(BUILDDIR)/dirhtml." singlehtml: $(SPHINXBUILD) -b singlehtml $(ALLSPHINXOPTS) $(BUILDDIR)/singlehtml @echo @echo "Build finished. The HTML page is in $(BUILDDIR)/singlehtml." pickle: $(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) $(BUILDDIR)/pickle @echo @echo "Build finished; now you can process the pickle files." json: $(SPHINXBUILD) -b json $(ALLSPHINXOPTS) $(BUILDDIR)/json @echo @echo "Build finished; now you can process the JSON files." htmlhelp: $(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) $(BUILDDIR)/htmlhelp @echo @echo "Build finished; now you can run HTML Help Workshop with the" \ ".hhp project file in $(BUILDDIR)/htmlhelp." qthelp: $(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) $(BUILDDIR)/qthelp @echo @echo "Build finished; now you can run "qcollectiongenerator" with the" \ ".qhcp project file in $(BUILDDIR)/qthelp, like this:" @echo "# qcollectiongenerator $(BUILDDIR)/qthelp/BayesPy.qhcp" @echo "To view the help file:" @echo "# assistant -collectionFile $(BUILDDIR)/qthelp/BayesPy.qhc" devhelp: $(SPHINXBUILD) -b devhelp $(ALLSPHINXOPTS) $(BUILDDIR)/devhelp @echo @echo "Build finished." @echo "To view the help file:" @echo "# mkdir -p $$HOME/.local/share/devhelp/BayesPy" @echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/BayesPy" @echo "# devhelp" epub: $(SPHINXBUILD) -b epub $(ALLSPHINXOPTS) $(BUILDDIR)/epub @echo @echo "Build finished. The epub file is in $(BUILDDIR)/epub." latex: $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex @echo @echo "Build finished; the LaTeX files are in $(BUILDDIR)/latex." @echo "Run \`make' in that directory to run these through (pdf)latex" \ "(use \`make latexpdf' here to do that automatically)." 
latexpdf: $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex @echo "Running LaTeX files through pdflatex..." $(MAKE) -C $(BUILDDIR)/latex all-pdf @echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex." latexpdfja: $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex @echo "Running LaTeX files through platex and dvipdfmx..." $(MAKE) -C $(BUILDDIR)/latex all-pdf-ja @echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex." text: $(SPHINXBUILD) -b text $(ALLSPHINXOPTS) $(BUILDDIR)/text @echo @echo "Build finished. The text files are in $(BUILDDIR)/text." man: $(SPHINXBUILD) -b man $(ALLSPHINXOPTS) $(BUILDDIR)/man @echo @echo "Build finished. The manual pages are in $(BUILDDIR)/man." texinfo: $(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo @echo @echo "Build finished. The Texinfo files are in $(BUILDDIR)/texinfo." @echo "Run \`make' in that directory to run these through makeinfo" \ "(use \`make info' here to do that automatically)." info: $(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo @echo "Running Texinfo files through makeinfo..." make -C $(BUILDDIR)/texinfo info @echo "makeinfo finished; the Info files are in $(BUILDDIR)/texinfo." gettext: $(SPHINXBUILD) -b gettext $(I18NSPHINXOPTS) $(BUILDDIR)/locale @echo @echo "Build finished. The message catalogs are in $(BUILDDIR)/locale." changes: $(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) $(BUILDDIR)/changes @echo @echo "The overview file is in $(BUILDDIR)/changes." linkcheck: $(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck @echo @echo "Link check complete; look for any errors in the above output " \ "or in $(BUILDDIR)/linkcheck/output.txt." doctest: $(SPHINXBUILD) -b doctest $(ALLSPHINXOPTS) $(BUILDDIR)/doctest @echo "Testing of doctests in the sources finished, look at the " \ "results in $(BUILDDIR)/doctest/output.txt." xml: $(SPHINXBUILD) -b xml $(ALLSPHINXOPTS) $(BUILDDIR)/xml @echo @echo "Build finished. 
The XML files are in $(BUILDDIR)/xml." pseudoxml: $(SPHINXBUILD) -b pseudoxml $(ALLSPHINXOPTS) $(BUILDDIR)/pseudoxml @echo @echo "Build finished. The pseudo-XML files are in $(BUILDDIR)/pseudoxml." ././@PaxHeader0000000000000000000000000000003300000000000011451 xustar000000000000000027 mtime=1725273981.417372 bayespy-0.6.2/doc/source/0000755000175100001770000000000000000000000016046 5ustar00runnerdocker00000000000000././@PaxHeader0000000000000000000000000000003300000000000011451 xustar000000000000000027 mtime=1725273981.401372 bayespy-0.6.2/doc/source/_templates/0000755000175100001770000000000000000000000020203 5ustar00runnerdocker00000000000000././@PaxHeader0000000000000000000000000000003300000000000011451 xustar000000000000000027 mtime=1725273981.417372 bayespy-0.6.2/doc/source/_templates/autosummary/0000755000175100001770000000000000000000000022571 5ustar00runnerdocker00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/doc/source/_templates/autosummary/class.rst0000644000175100001770000000110500000000000024425 0ustar00runnerdocker00000000000000{{ fullname }} {{ underline }} .. currentmodule:: {{ module }} .. autoclass:: {{ objname }} {% block methods %} .. automethod:: __init__ {% if methods %} .. rubric:: Methods .. autosummary:: :toctree: generated/ {% for item in methods %} ~{{ name }}.{{ item }} {%- endfor %} {% endif %} {% endblock %} {% block attributes %} {% if attributes %} .. rubric:: Attributes .. autosummary:: :toctree: generated/ {% for item in attributes %} ~{{ name }}.{{ item }} {%- endfor %} {% endif %} {% endblock %} ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/doc/source/_templates/autosummary/module.rst0000644000175100001770000000131400000000000024607 0ustar00runnerdocker00000000000000{{ fullname }} {{ underline }} .. 
automodule:: {{ fullname }} {% block functions %} {% if functions %} .. rubric:: Functions .. autosummary:: :toctree: generated/ {% for item in functions %} {{ item }} {%- endfor %} {% endif %} {% endblock %} {% block classes %} {% if classes %} .. rubric:: Classes .. autosummary:: :toctree: generated/ {% for item in classes %} {{ item }} {%- endfor %} {% endif %} {% endblock %} {% block exceptions %} {% if exceptions %} .. rubric:: Exceptions .. autosummary:: :toctree: generated/ {% for item in exceptions %} {{ item }} {%- endfor %} {% endif %} {% endblock %} ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/doc/source/_templates/autosummary/short_module.rst0000644000175100001770000000007700000000000026033 0ustar00runnerdocker00000000000000{{ fullname }} {{ underline }} .. automodule:: {{ fullname }} ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/doc/source/conf.py0000644000175100001770000002553400000000000017356 0ustar00runnerdocker00000000000000# -*- coding: utf-8 -*- # # BayesPy documentation build configuration file, created by # sphinx-quickstart on Mon Aug 27 12:22:11 2012. # # This file is execfile()d with the current directory set to its containing dir. # # Note that not all possible configuration values are present in this # autogenerated file. # # All configuration values have a default; values that are commented out # serve to show the default. 
import sys, os ON_RTD = os.environ.get('READTHEDOCS') == 'True' # Use some dummy modules on Read the Docs because they are not available # (requires some C libraries) # http://read-the-docs.readthedocs.org/en/latest/faq.html#i-get-import-errors-on-libraries-that-depend-on-c-modules if ON_RTD: from unittest.mock import MagicMock MOCK_MODULES = ['h5py'] sys.modules.update((mod_name, MagicMock()) for mod_name in MOCK_MODULES) # -- General configuration ----------------------------------------------------- import bayespy as bp # If your documentation needs a minimal Sphinx version, state it here. #needs_sphinx = '1.0' # Add any Sphinx extension module names here, as strings. They can be extensions # coming with Sphinx (named 'sphinx.ext.*') or your custom ones. extensions = [ 'sphinx.ext.autodoc', 'sphinx.ext.imgmath', 'sphinx.ext.todo', 'sphinx.ext.viewcode', 'sphinx.ext.doctest', 'sphinx.ext.napoleon', 'matplotlib.sphinxext.plot_directive', 'sphinx.ext.autosummary', 'sphinxcontrib.tikz', 'sphinxcontrib.bayesnet', 'sphinxcontrib.bibtex', 'nbsphinx', ] bibtex_bibfiles = ["references.bib"] # Image format for math imgmath_image_format = 'svg' # Choose the image processing 'suite', either 'Netpbm', 'pdf2svg', 'GhostScript', 'ImageMagick' ('Netpbm' by default): # If you want your documentation to be built on http://readthedocs.org, you have to choose GhostScript. # All suites produce png images, except 'pdf2svg', which produces svg. if ON_RTD: tikz_proc_suite = 'GhostScript' else: tikz_proc_suite = 'pdf2svg' if ON_RTD: # For some reason, RTD needs these to be set explicitly although they # should have default values math_number_all = False numpydoc_show_class_members = False # Include TODOs in the documentation?
todo_include_todos = True # Generate autosummary stub pages automatically # Or manually: sphinx-autogen -o source/generated source/*.rst #autosummary_generate = False import glob autosummary_generate = glob.glob("*.rst") + glob.glob("*/*.rst") + glob.glob("*/*/*.rst") + glob.glob("*/*/*/*.rst") # Add any paths that contain templates here, relative to this directory. templates_path = ['_templates'] # The suffix of source filenames. source_suffix = '.rst' # The encoding of source files. #source_encoding = 'utf-8-sig' # The master toctree document. master_doc = 'index' # General information about the project. project = "BayesPy" copyright = bp.__copyright__ # The version info for the project you're documenting, acts as replacement for # |version| and |release|, also used in various other places throughout the # built documents. # # The short X.Y version. version = bp.__version__ # The full version, including alpha/beta/rc tags. release = bp.__version__ # The language for content autogenerated by Sphinx. Refer to documentation # for a list of supported languages. #language = None # There are two options for replacing |today|: either, you set today to some # non-false value, then it is used: #today = '' # Else, today_fmt is used as the format for a strftime call. #today_fmt = '%B %d, %Y' # List of patterns, relative to source directory, that match files and # directories to ignore when looking for source files. exclude_patterns = [ '**.ipynb_checkpoints' ] # The reST default role (used for this markup: `text`) to use for all documents. #default_role = None # If true, '()' will be appended to :func: etc. cross-reference text. #add_function_parentheses = True # If true, the current module name will be prepended to all description # unit titles (such as .. function::). #add_module_names = True # If true, sectionauthor and moduleauthor directives will be shown in the # output. They are ignored by default. 
#show_authors = False # The name of the Pygments (syntax highlighting) style to use. pygments_style = 'sphinx' # A list of ignored prefixes for module index sorting. #modindex_common_prefix = [] # -- Options for HTML output --------------------------------------------------- # Sphinx-TikZ extension tikz_latex_preamble = r""" \usepackage{amsmath} """ # The theme to use for HTML and HTML Help pages. See the documentation for # a list of builtin themes. #html_theme = 'sphinxdoc' #html_theme = 'nature' #html_theme = 'default' # Theme options are theme-specific and customize the look and feel of a theme # further. For a list of options available for each theme, see the # documentation. # html_theme_options = { # "sidebarwidth": 300 # } # Add any paths that contain custom themes here, relative to this directory. #html_theme_path = [] # The name for this set of Sphinx documents. If None, it defaults to # "<project> v<release> documentation". html_title = "BayesPy v%s Documentation" % (version) # A shorter title for the navigation bar. Default is the same as html_title. #html_short_title = None # The name of an image file (relative to this directory) to place at the top # of the sidebar. #html_logo = None # The name of an image file (within the static path) to use as favicon of the # docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32 # pixels large. #html_favicon = None # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". html_static_path = ['_static'] # If not '', a 'Last updated on:' timestamp is inserted at every page bottom, # using the given strftime format. #html_last_updated_fmt = '%b %d, %Y' # If true, SmartyPants will be used to convert quotes and dashes to # typographically correct entities.
#html_use_smartypants = True # Custom sidebar templates, maps document names to template names. #html_sidebars = {} # Additional templates that should be rendered to pages, maps page names to # template names. #html_additional_pages = {} # If false, no module index is generated. #html_domain_indices = True # If false, no index is generated. #html_use_index = True # If true, the index is split into individual pages for each letter. #html_split_index = False # If true, links to the reST sources are added to the pages. #html_show_sourcelink = True # If true, "Created using Sphinx" is shown in the HTML footer. Default is True. #html_show_sphinx = True # If true, "(C) Copyright ..." is shown in the HTML footer. Default is True. #html_show_copyright = True # If true, an OpenSearch description file will be output, and all pages will # contain a <link> tag referring to it. The value of this option must be the # base URL from which the finished HTML is served. #html_use_opensearch = '' # This is the file name suffix for HTML files (e.g. ".xhtml"). #html_file_suffix = None # Output file base name for HTML help builder. htmlhelp_basename = 'BayesPydoc' # -- Options for LaTeX output -------------------------------------------------- latex_elements = { # The paper size ('letterpaper' or 'a4paper'). #'papersize': 'letterpaper', # The font size ('10pt', '11pt' or '12pt'). #'pointsize': '10pt', # Additional stuff for the LaTeX preamble. 'preamble': r''' \usepackage{tikz} \usepackage{amssymb} \usepackage{amsmath} \usepackage{svg} \usetikzlibrary{shapes} \usetikzlibrary{fit} \usetikzlibrary{chains} \usetikzlibrary{arrows} ''', # Do not use [T1]{fontenc} because it does not work on libre systems 'fontenc': '' } #latex_additional_files = ['images/bayesnet.sty',] # Grouping the document tree into LaTeX files. List of tuples # (source start file, target name, title, author, documentclass [howto/manual]).
latex_documents = [ ('index', 'BayesPy.tex', u'BayesPy Documentation', u'Jaakko Luttinen', 'manual'), ] # The name of an image file (relative to this directory) to place at the top of # the title page. #latex_logo = None # For "manual" documents, if this is true, then toplevel headings are parts, # not chapters. #latex_use_parts = False # If true, show page references after internal links. #latex_show_pagerefs = False # If true, show URL addresses after external links. #latex_show_urls = False # Documents to append as an appendix to all manuals. #latex_appendices = [] # If false, no module index is generated. #latex_domain_indices = True # -- Options for manual page output -------------------------------------------- # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). man_pages = [ ('index', 'bayespy', u'BayesPy Documentation', [u'Jaakko Luttinen'], 1) ] # If true, show URL addresses after external links. #man_show_urls = False # -- Options for Texinfo output ------------------------------------------------ # Grouping the document tree into Texinfo files. List of tuples # (source start file, target name, title, author, # dir menu entry, description, category) texinfo_documents = [ ('index', 'BayesPy', u'BayesPy Documentation', u'Jaakko Luttinen', 'BayesPy', 'One line description of project.', 'Miscellaneous'), ] # Documents to append as an appendix to all manuals. #texinfo_appendices = [] # If false, no module index is generated. #texinfo_domain_indices = True # How to display URL addresses: 'footnote', 'no', or 'inline'. #texinfo_show_urls = 'footnote' # -- Options for Epub output --------------------------------------------------- # Bibliographic Dublin Core info. epub_title = u'BayesPy' epub_author = bp.__author__ epub_publisher = bp.__author__ epub_copyright = bp.__copyright__ # The language of the text. It defaults to the language option # or en if the language is not set. 
#epub_language = '' # The scheme of the identifier. Typical schemes are ISBN or URL. #epub_scheme = '' # The unique identifier of the text. This can be an ISBN number # or the project homepage. #epub_identifier = '' # A unique identification for the text. #epub_uid = '' # A tuple containing the cover image and cover page html template filenames. #epub_cover = () # HTML files that should be inserted before the pages created by sphinx. # The format is a list of tuples containing the path and title. #epub_pre_files = [] # HTML files that should be inserted after the pages created by sphinx. # The format is a list of tuples containing the path and title. #epub_post_files = [] # A list of files that should not be packed into the epub file. #epub_exclude_files = [] # The depth of the table of contents in toc.ncx. #epub_tocdepth = 3 # Allow duplicate toc entries. #epub_tocdup = True # Read the docs fails to import _tkinter so use Agg backend import matplotlib matplotlib.use('agg')

bayespy-0.6.2/doc/source/demos.rst

.. Copyright (C) 2014 Jaakko Luttinen This file is licensed under the MIT License. See LICENSE for a text of the license. Examples ******** .. toctree:: _notebooks/mog.rst _notebooks/pca.rst

bayespy-0.6.2/doc/source/dev_api/dev_api.rst

.. Copyright (C) 2014 Jaakko Luttinen This file is licensed under the MIT License. See LICENSE for a text of the license.
Developer API ============= This chapter contains API specifications which are relevant to BayesPy developers and contributors. .. toctree:: :maxdepth: 1 nodes moments distributions utils ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/doc/source/dev_api/distributions.rst0000644000175100001770000000176700000000000023124 0ustar00runnerdocker00000000000000.. Copyright (C) 2014 Jaakko Luttinen This file is licensed under the MIT License. See LICENSE for a text of the license. Distributions ============= .. currentmodule:: bayespy.inference.vmp.nodes .. autosummary:: :toctree: generated/ stochastic.Distribution expfamily.ExponentialFamilyDistribution gaussian.GaussianDistribution gaussian.GaussianARDDistribution gaussian.GaussianGammaDistribution gaussian.GaussianWishartDistribution gaussian_markov_chain.GaussianMarkovChainDistribution gaussian_markov_chain.SwitchingGaussianMarkovChainDistribution gaussian_markov_chain.VaryingGaussianMarkovChainDistribution gamma.GammaDistribution wishart.WishartDistribution beta.BetaDistribution dirichlet.DirichletDistribution bernoulli.BernoulliDistribution binomial.BinomialDistribution categorical.CategoricalDistribution categorical_markov_chain.CategoricalMarkovChainDistribution multinomial.MultinomialDistribution poisson.PoissonDistribution ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1725273975.0 bayespy-0.6.2/doc/source/dev_api/moments.rst0000644000175100001770000000131100000000000021665 0ustar00runnerdocker00000000000000.. Copyright (C) 2014 Jaakko Luttinen This file is licensed under the MIT License. See LICENSE for a text of the license. Moments ======= .. currentmodule:: bayespy.inference.vmp.nodes .. 
autosummary::
   :toctree: generated/

   node.Moments
   gaussian.GaussianMoments
   gaussian_markov_chain.GaussianMarkovChainMoments
   gaussian.GaussianGammaMoments
   gaussian.GaussianWishartMoments
   gamma.GammaMoments
   wishart.WishartMoments
   beta.BetaMoments
   dirichlet.DirichletMoments
   bernoulli.BernoulliMoments
   binomial.BinomialMoments
   categorical.CategoricalMoments
   categorical_markov_chain.CategoricalMarkovChainMoments
   multinomial.MultinomialMoments
   poisson.PoissonMoments

.. File: bayespy-0.6.2/doc/source/dev_api/nodes.rst

.. Copyright (C) 2014 Jaakko Luttinen

   This file is licensed under the MIT License. See LICENSE for a text of
   the license.

Developer nodes
===============

.. currentmodule:: bayespy.inference.vmp.nodes

The following base classes are useful when writing new nodes:

.. autosummary::
   :toctree: generated/

   node.Node
   stochastic.Stochastic
   expfamily.ExponentialFamily
   deterministic.Deterministic

The following nodes are examples of special nodes that remain hidden from the
user although they are often used implicitly:

.. autosummary::
   :toctree: generated/

   constant.Constant
   gaussian.GaussianToGaussianGamma
   gaussian.GaussianGammaToGaussianWishart
   gaussian.WrapToGaussianGamma
   gaussian.WrapToGaussianWishart

.. File: bayespy-0.6.2/doc/source/dev_api/utils.rst

.. Copyright (C) 2014 Jaakko Luttinen

   This file is licensed under the MIT License. See LICENSE for a text of
   the license.

Utility functions
=================

.. currentmodule:: bayespy.utils

.. autosummary::
   :toctree: generated/

   linalg
   random
   optimize
   misc

.. File: bayespy-0.6.2/doc/source/dev_guide/advanced.rst

.. Copyright (C) 2014-2015 Jaakko Luttinen

   This file is licensed under the MIT License. See LICENSE for a text of
   the license.

.. testsetup::

   import numpy as np
   np.random.seed(1)
   # This is the PCA model from the previous sections
   from bayespy.nodes import GaussianARD, Gamma, Dot
   D = 3
   X = GaussianARD(0, 1, shape=(D,), plates=(1,100), name='X')
   alpha = Gamma(1e-3, 1e-3, plates=(D,), name='alpha')
   C = GaussianARD(0, alpha, shape=(D,), plates=(10,1), name='C')
   F = Dot(C, X)
   tau = Gamma(1e-3, 1e-3, name='tau')
   Y = GaussianARD(F, tau, name='Y')
   c = np.random.randn(10, 2)
   x = np.random.randn(2, 100)
   data = np.dot(c, x) + 0.1*np.random.randn(10, 100)
   Y.observe(data)
   from bayespy.inference import VB
   import bayespy.plot as bpplt
   Q = None

Advanced topics
===============

The VB lower bound and its gradients
------------------------------------

The VB lower bound is

.. math::

   \mathcal{L} &= \underbrace{\langle \log p(X,Z) \rangle}_{\equiv \mathcal{L}_p} - \underbrace{\langle \log q(Z) \rangle}_{\equiv \mathcal{L}_q}.

The child nodes pass the gradient to the parent node so that the parent node
can optimize its parameters.  In general, :math:`\mathcal{L}_p` can be an
almost arbitrarily complex function of :math:`Z`:

.. math::

   \mathcal{L}_p = \langle \log p(X,Z) \rangle.

The gradient is

.. math::

   \frac{\partial}{\partial \xi} \mathcal{L}_p
   &= \frac{\partial}{\partial \xi} \langle \log p(X,Z) \rangle \\
   &= \langle \log p(X,Z) \frac{\partial}{\partial \xi} \log q(Z) \rangle,

which can be computed, for instance, by sampling from :math:`q(Z)`.  Note that
:math:`\xi` can represent, for instance, the expectation parameters
:math:`\bar{u}` of :math:`q(Z)` in order to obtain the Riemannian gradient for
an exponential family :math:`q(Z)`.

Often, :math:`\mathcal{L}_p` has a simpler form (or it can be further lower
bounded by a simpler form).  If :math:`\mathcal{L}_p` can be written as a
function of :math:`\bar{u}` as

.. math::

   \mathcal{L}_p = \bar{u}^T \psi + \mathrm{const},

the gradient with respect to the moments is

.. math::

   \frac{\partial}{\partial \bar{u}} \mathcal{L}_p = \psi.

Sometimes :math:`\psi` can be computed exactly from the moments of the other
nodes; otherwise, it needs to be approximated by sampling from the
distributions of the other nodes.  To summarize, the gradient message can be a
numerical gradient, an approximate stochastic gradient (obtained by sampling
the other nodes) or a function which can be used to compute an approximate
stochastic gradient by sampling the node itself (and possibly other nodes).

Riemannian gradient
-------------------

In principle, the VB lower bound can be maximized with respect to any
parameterization of the approximate distribution.  However, the standard
gradient can perform badly, because it does not take into account the geometry
of the space of probability distributions.  This can be fixed by using the
Riemannian (i.e., natural) gradient.  In general, the Riemannian gradient is
defined as

.. math::

   \tilde{\nabla}_\xi \mathcal{L} = G^{-1} \nabla_\xi \mathcal{L},

where

.. math::

   [G]_{ij} = \left\langle \frac{\partial \log q(Z)}{\partial \xi_i} \frac{\partial \log q(Z)}{\partial \xi_j} \right\rangle = - \left\langle \frac{\partial^2 \log q(Z)}{\partial \xi_i \partial \xi_j} \right\rangle.
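To make the definition concrete, the following standalone NumPy sketch
(illustrative values only; this is not BayesPy code) estimates the metric
:math:`G` by Monte Carlo for a univariate Gaussian
:math:`q(z)=\mathcal{N}(m, s^2)` with parameters :math:`\xi = (m, s)`, and then
forms the Riemannian gradient :math:`G^{-1} \nabla_\xi \mathcal{L}` for an
arbitrary Euclidean gradient:

```python
import numpy as np

# Illustrative sketch (not BayesPy code): estimate the Riemannian metric
# G_ij = < d log q / d xi_i * d log q / d xi_j > for a univariate Gaussian
# q(z) = N(m, s^2) with parameters xi = (m, s), by Monte Carlo sampling.
rng = np.random.default_rng(0)
m, s = 1.0, 2.0
z = rng.normal(m, s, size=200_000)

# Score functions: gradients of log q(z) with respect to m and s
score_m = (z - m) / s**2
score_s = ((z - m)**2 - s**2) / s**3
scores = np.stack([score_m, score_s])

G = scores @ scores.T / z.size               # Monte Carlo estimate of the metric
G_exact = np.diag([1.0 / s**2, 2.0 / s**2])  # known Fisher information of N(m, s^2)

grad = np.array([0.3, -0.1])                 # some Euclidean gradient (made up)
nat_grad = np.linalg.solve(G, grad)          # Riemannian gradient G^{-1} grad
```

With :math:`s=2` the exact metric is :math:`\mathrm{diag}(0.25, 0.5)`, so the
natural gradient rescales each coordinate according to the local curvature of
:math:`q`.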
For exponential family distributions, the Riemannian gradient with respect to
the natural parameters :math:`\phi` can be computed easily by taking the
gradient with respect to the moments :math:`\bar{u}`:

.. math::

   \tilde{\nabla}_\phi = G^{-1} \nabla_\phi \mathcal{L} = \nabla_{\bar{u}} \mathcal{L}.

Note that :math:`G` depends only on the approximate distribution
:math:`q(Z)`.  Thus, in order to use this property, the model itself does not
need to be in the exponential family; only the approximation does.  The
Riemannian gradient of :math:`\mathcal{L}_q` for an exponential family
distribution :math:`q(Z)` is

.. math::

   \tilde{\nabla}_\phi \mathcal{L}_q = \nabla_{\bar{u}} \mathcal{L}_q = \nabla_{\bar{u}} [ \bar{u}^T \phi + \langle f(Z) \rangle + g(\phi) ] = \phi.

Thus, the Riemannian gradient of the lower bound is

.. math::

   \tilde{\nabla}_\phi \mathcal{L} = \nabla_{\bar{u}} \mathcal{L}_p - \phi.

.. todo::

   Should f(Z) be taken into account?  It cancels out if the prior and q are
   in the same family, but if they are not, it does not cancel out.  Does it
   affect the gradient?

Nonlinear conjugate gradient methods :cite:`Hensman:2012`:

* Fletcher-Reeves:

  .. math::

     \beta_n = \frac { \langle \tilde{g}_n, \tilde{g}_n \rangle_n } { \langle \tilde{g}_{n-1}, \tilde{g}_{n-1} \rangle_{n-1} } = \frac { \langle g_n, \tilde{g}_n \rangle } { \langle g_{n-1}, \tilde{g}_{n-1} \rangle }

* Polak-Ribiere:

  .. math::

     \beta_n = \frac { \langle \tilde{g}_n, \tilde{g}_n - \tilde{g}_{n-1} \rangle_n } { \langle \tilde{g}_{n-1}, \tilde{g}_{n-1} \rangle_{n-1} } = \frac { \langle g_n, \tilde{g}_n - \tilde{g}_{n-1} \rangle } { \langle g_{n-1}, \tilde{g}_{n-1} \rangle }

* Hestenes-Stiefel:

  .. math::

     \beta_n = - \frac { \langle \tilde{g}_n, \tilde{g}_n - \tilde{g}_{n-1} \rangle_n } { \langle \tilde{g}_{n-1}, \tilde{g}_{n-1} \rangle_{n-1} } = - \frac { \langle g_n, \tilde{g}_n - \tilde{g}_{n-1} \rangle } { \langle g_{n-1}, \tilde{g}_{n-1} \rangle }

where :math:`\langle \cdot, \cdot \rangle_n` denotes the inner product in the
Riemannian geometry, :math:`\langle \cdot, \cdot \rangle` denotes the inner
product in the Euclidean space, :math:`\tilde{g}` denotes the Riemannian
gradient and :math:`g` denotes the Euclidean gradient.  The second form of
each formula uses the following property:

.. math::

   \langle \tilde{g}_n, \tilde{x} \rangle_n = \tilde{g}_n^T G_n \tilde{x} = g_n^T G^{-1}_n G_n \tilde{x} = g_n^T \tilde{x} = \langle g_n, \tilde{x} \rangle.

TODO
----

* simulated annealing
* Riemannian (conjugate) gradient
* black box variational inference
* stochastic variational inference
* pattern search
* fast inference
* parameter expansion

.. File: bayespy-0.6.2/doc/source/dev_guide/dev_guide.rst

.. Copyright (C) 2011,2012 Jaakko Luttinen

   This file is licensed under the MIT License. See LICENSE for a text of
   the license.

Developer guide
***************

This chapter provides basic information for developers about contributing, the
theoretical background and the core structure.  It is assumed that the reader
has read and is familiar with :ref:`sec-user-guide`.

.. toctree::
   :maxdepth: 2

   workflow
   vmp
   engine
   writingnodes

.. File: bayespy-0.6.2/doc/source/dev_guide/engine.rst

.. Copyright (C) 2014 Jaakko Luttinen

   This file is licensed under the MIT License. See LICENSE for a text of
   the license.
Implementing inference engines
==============================

Currently, only the variational Bayesian inference engine is implemented.
This implementation is not very modular, that is, the inference engine is not
well separated from the model construction.  Thus, it is not straightforward
to implement other inference engines at the moment.  Improving the modularity
of the inference engine and the model construction is future work with high
priority.  In any case, BayesPy aims to be an efficient, simple and modular
package for at least variational Bayesian inference.

.. File: bayespy-0.6.2/doc/source/dev_guide/vmp/vmp_gamma.rst

Gamma distribution
------------------

.. File: bayespy-0.6.2/doc/source/dev_guide/vmp/vmp_gaussian.rst

Multivariate normal distribution
--------------------------------

.. math::

   \mathbf{x} &\sim \mathcal{N}(\boldsymbol{\mu}, \mathbf{\Lambda}),

.. math::

   \mathbf{x},\boldsymbol{\mu} \in \mathbb{R}^{D}, \quad \mathbf{\Lambda} \in \mathbb{R}^{D \times D}, \quad \mathbf{\Lambda} \text{ symmetric positive definite}

.. math::

   \log\mathcal{N}( \mathbf{x} | \boldsymbol{\mu}, \mathbf{\Lambda} ) &= - \frac{1}{2} \mathbf{x}^{\mathrm{T}} \mathbf{\Lambda} \mathbf{x} + \mathbf{x}^{\mathrm{T}} \mathbf{\Lambda} \boldsymbol{\mu} - \frac{1}{2} \boldsymbol{\mu}^{\mathrm{T}} \mathbf{\Lambda} \boldsymbol{\mu} + \frac{1}{2} \log |\mathbf{\Lambda}| - \frac{D}{2} \log (2\pi)

.. math::

   \mathbf{u} (\mathbf{x}) &= \left[ \begin{matrix} \mathbf{x} \\ \mathbf{xx}^{\mathrm{T}} \end{matrix} \right] \\
   \boldsymbol{\phi} (\boldsymbol{\mu}, \mathbf{\Lambda}) &= \left[ \begin{matrix} \mathbf{\Lambda} \boldsymbol{\mu} \\ - \frac{1}{2} \mathbf{\Lambda} \end{matrix} \right] \\
   \boldsymbol{\phi}_{\boldsymbol{\mu}} (\mathbf{x}, \mathbf{\Lambda}) &= \left[ \begin{matrix} \mathbf{\Lambda} \mathbf{x} \\ - \frac{1}{2} \mathbf{\Lambda} \end{matrix} \right] \\
   \boldsymbol{\phi}_{\mathbf{\Lambda}} (\mathbf{x}, \boldsymbol{\mu}) &= \left[ \begin{matrix} - \frac{1}{2} \mathbf{xx}^{\mathrm{T}} + \frac{1}{2} \mathbf{x}\boldsymbol{\mu}^{\mathrm{T}} + \frac{1}{2} \boldsymbol{\mu}\mathbf{x}^{\mathrm{T}} - \frac{1}{2} \boldsymbol{\mu\mu}^{\mathrm{T}} \\ \frac{1}{2} \end{matrix} \right] \\
   g (\boldsymbol{\mu}, \mathbf{\Lambda}) &= - \frac{1}{2} \operatorname{tr}(\boldsymbol{\mu\mu}^{\mathrm{T}} \mathbf{\Lambda} ) + \frac{1}{2} \log |\mathbf{\Lambda}| \\
   g_{\boldsymbol{\phi}} (\boldsymbol{\phi}) &= \frac{1}{4} \boldsymbol{\phi}^{\mathrm{T}}_1 \boldsymbol{\phi}^{-1}_2 \boldsymbol{\phi}_1 + \frac{1}{2} \log | -2 \boldsymbol{\phi}_2 | \\
   f(\mathbf{x}) &= - \frac{D}{2} \log(2\pi) \\
   \overline{\mathbf{u}} (\boldsymbol{\phi}) &= \left[ \begin{matrix} - \frac{1}{2} \boldsymbol{\phi}^{-1}_2 \boldsymbol{\phi}_1 \\ \frac{1}{4} \boldsymbol{\phi}^{-1}_2 \boldsymbol{\phi}_1 \boldsymbol{\phi}^{\mathrm{T}}_1 \boldsymbol{\phi}^{-1}_2 - \frac{1}{2} \boldsymbol{\phi}^{-1}_2 \end{matrix} \right]

.. File: bayespy-0.6.2/doc/source/dev_guide/vmp/vmp_gaussian_gamma.rst

Gaussian-Gamma distribution
---------------------------
.. File: bayespy-0.6.2/doc/source/dev_guide/vmp/vmp_gaussian_wishart.rst

Gaussian-Wishart distribution
-----------------------------

.. File: bayespy-0.6.2/doc/source/dev_guide/vmp/vmp_mixture.rst

Mixture distribution
--------------------

.. math::

   \mathbf{x} &\sim \mathrm{Mix}_{\mathcal{D}} \left( \lambda, \left\{ \mathbf{\Theta}^{(n)}_1, \ldots, \mathbf{\Theta}^{(n)}_K \right\}^N_{n=1} \right)

.. math::

   \lambda \in \{1, \ldots, N\}, \quad \mathcal{D} \text{ is an exp.fam. distribution}, \quad \mathbf{\Theta}^{(n)}_k \text{ are parameters of } \mathcal{D}

.. math::

   \log\mathrm{Mix}_{\mathcal{D}} \left( \mathbf{x} \left| \lambda, \left\{ \mathbf{\Theta}^{(n)}_1, \ldots, \mathbf{\Theta}^{(n)}_K \right\}^N_{n=1} \right. \right) &= \sum^N_{n=1} [\lambda=n] \mathbf{u}_{\mathcal{D}}(\mathbf{x})^{\mathrm{T}} \boldsymbol{\phi}_{\mathcal{D}} \left( \mathbf{\Theta}^{(n)}_1, \ldots, \mathbf{\Theta}^{(n)}_K \right) \\
   & \quad + \sum^N_{n=1} [\lambda=n] g_{\mathcal{D}} \left( \mathbf{\Theta}^{(n)}_1, \ldots, \mathbf{\Theta}^{(n)}_K \right) + f_{\mathcal{D}} (\mathbf{x})

.. math::

   \mathbf{u} (\mathbf{x}) &= \mathbf{u}_{\mathcal{D}} (\mathbf{x}) \\
   \boldsymbol{\phi} \left( \lambda, \left\{ \mathbf{\Theta}^{(n)}_1, \ldots, \mathbf{\Theta}^{(n)}_K \right\}^N_{n=1} \right) &= \sum^N_{n=1} [\lambda=n] \boldsymbol{\phi}_{\mathcal{D}} \left( \mathbf{\Theta}^{(n)}_1, \ldots, \mathbf{\Theta}^{(n)}_K \right) \\
   \boldsymbol{\phi}_{\lambda} \left( \mathbf{x}, \left\{ \mathbf{\Theta}^{(n)}_1, \ldots, \mathbf{\Theta}^{(n)}_K \right\}^N_{n=1} \right) &= \left[\begin{matrix} \mathbf{u}_{\mathcal{D}} (\mathbf{x})^{\mathrm{T}} \boldsymbol{\phi}_{\mathcal{D}} \left( \mathbf{\Theta}^{(1)}_1, \ldots, \mathbf{\Theta}^{(1)}_K \right) + g_{\mathcal{D}} \left( \mathbf{\Theta}^{(1)}_1, \ldots, \mathbf{\Theta}^{(1)}_K \right) \\ \vdots \\ \mathbf{u}_{\mathcal{D}} (\mathbf{x})^{\mathrm{T}} \boldsymbol{\phi}_{\mathcal{D}} \left( \mathbf{\Theta}^{(N)}_1, \ldots, \mathbf{\Theta}^{(N)}_K \right) + g_{\mathcal{D}} \left( \mathbf{\Theta}^{(N)}_1, \ldots, \mathbf{\Theta}^{(N)}_K \right) \end{matrix}\right] \\
   \boldsymbol{\phi}_{\mathbf{\Theta}^{(m)}_l} \left( \mathbf{x}, \lambda, \left\{ \mathbf{\Theta}^{(n)}_1, \ldots, \mathbf{\Theta}^{(n)}_K \right\}^N_{n=1} \setminus \left\{ \mathbf{\Theta}^{(m)}_l \right\} \right) &= [\lambda=m] \boldsymbol{\phi}_{\mathcal{D}\rightarrow\mathbf{\Theta}_l} \left( \mathbf{x}, \left\{ \mathbf{\Theta}^{(m)}_k \right\}_{k\neq l} \right) \\
   g \left( \lambda, \left\{ \mathbf{\Theta}^{(n)}_1, \ldots, \mathbf{\Theta}^{(n)}_K \right\}^N_{n=1} \right) &= \sum^N_{n=1} [\lambda=n] g_{\mathcal{D}} \left( \mathbf{\Theta}^{(n)}_1, \ldots, \mathbf{\Theta}^{(n)}_K \right) \\
   g (\boldsymbol{\phi}) &= g_{\mathcal{D}} (\boldsymbol{\phi}) \\
   f(\mathbf{x}) &= f_{\mathcal{D}} (\mathbf{x}) \\
   \overline{\mathbf{u}} (\boldsymbol{\phi}) &= \overline{\mathbf{u}}_{\mathcal{D}} (\boldsymbol{\phi})
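Since the expectation of the indicator is
:math:`\langle [\lambda=n] \rangle = q(\lambda=n)`, the expected natural
parameter vector of the mixture is a responsibility-weighted sum of the
component natural parameters.  A minimal NumPy sketch with made-up numbers
(this is not BayesPy code):

```python
import numpy as np

# Illustrative sketch: the expectation of the mixture's natural parameters
# under q(lambda) is a responsibility-weighted sum of the component natural
# parameters, since <[lambda=n]> = q(lambda=n).
r = np.array([0.2, 0.5, 0.3])          # q(lambda=n), sums to one
phi = np.array([[ 1.0, -0.5],          # phi^{(n)} for each of 3 components
                [ 2.0, -1.0],
                [-1.0, -2.0]])

phi_expected = r @ phi                 # sum_n q(lambda=n) * phi^{(n)}
```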
.. File: bayespy-0.6.2/doc/source/dev_guide/vmp/vmp_normal.rst

Univariate normal distribution
------------------------------

.. File: bayespy-0.6.2/doc/source/dev_guide/vmp/vmp_normal_gamma.rst

Normal-Gamma distribution
-------------------------

.. File: bayespy-0.6.2/doc/source/dev_guide/vmp/vmp_wishart.rst

Wishart distribution
--------------------

.. math::

   \mathbf{\Lambda} &\sim \mathcal{W}(n, \mathbf{V}),

.. math::

   n > D-1, \quad \mathbf{\Lambda}, \mathbf{V} \in \mathbb{R}^{D \times D}, \quad \mathbf{\Lambda}, \mathbf{V} \text{ symmetric positive definite}

.. math::

   \log\mathcal{W}( \mathbf{\Lambda} | n, \mathbf{V} ) &= - \frac{1}{2} \operatorname{tr} (\mathbf{\Lambda V}) + \frac{n}{2} \log |\mathbf{\Lambda}| + \frac{n}{2} \log |\mathbf{V}| - \frac{D+1}{2} \log |\mathbf{\Lambda}| - \frac{nD}{2} \log 2 - \log \Gamma_D \left(\frac{n}{2}\right)

.. math::

   \mathbf{u} (\mathbf{\Lambda}) &= \left[ \begin{matrix} \mathbf{\Lambda} \\ \log |\mathbf{\Lambda}| \end{matrix} \right] \\
   \boldsymbol{\phi} (n, \mathbf{V}) &= \left[ \begin{matrix} - \frac{1}{2} \mathbf{V} \\ \frac{1}{2} n \end{matrix} \right] \\
   \boldsymbol{\phi}_{n} (\mathbf{\Lambda}, \mathbf{V}) &= \left[ \begin{matrix} \frac{1}{2}\log|\mathbf{\Lambda}| + \frac{1}{2}\log|\mathbf{V}| + \frac{D}{2} \log 2 \\ -1 \end{matrix} \right] \\
   \boldsymbol{\phi}_{\mathbf{V}} (\mathbf{\Lambda}, n) &= \left[ \begin{matrix} - \frac{1}{2} \mathbf{\Lambda} \\ \frac{1}{2} n \end{matrix} \right] \\
   g (n, \mathbf{V}) &= \frac{n}{2} \log|\mathbf{V}| - \frac{nD}{2}\log 2 - \log \Gamma_D \left(\frac{n}{2}\right) \\
   g_{\boldsymbol{\phi}} (\boldsymbol{\phi}) &= \boldsymbol{\phi}_2 \log|-\boldsymbol{\phi}_1| - \log \Gamma_D (\boldsymbol{\phi}_2) \\
   f(\mathbf{\Lambda}) &= - \frac{D+1}{2} \log|\mathbf{\Lambda}| \\
   \overline{\mathbf{u}} (\boldsymbol{\phi}) &= \left[ \begin{matrix} - \boldsymbol{\phi}_2 \boldsymbol{\phi}^{-1}_1 \\ - \log|-\boldsymbol{\phi}_1| + \psi_D(\boldsymbol{\phi}_2) \end{matrix} \right]

.. File: bayespy-0.6.2/doc/source/dev_guide/vmp.rst

.. Copyright (C) 2012 Jaakko Luttinen

   This file is licensed under the MIT License. See LICENSE for a text of
   the license.

Variational message passing
===========================

This section briefly describes the variational message passing (VMP)
framework, which is currently the only inference engine implemented in
BayesPy.  The variational Bayesian (VB) inference engine in BayesPy assumes
that the posterior approximation factorizes with respect to nodes and plates.
VMP updates one node at a time (the plates in one node can be updated
simultaneously) and iterates over all nodes in turn until convergence.
Standard update equation
------------------------

The general update equation for the factorized approximation of node
:math:`\boldsymbol{\theta}` is the following:

.. math::
   :label: vmp_general_update

   \log q(\boldsymbol{\theta}) &= \langle \log p\left( \boldsymbol{\theta} | \mathrm{pa}(\boldsymbol{\theta}) \right) \rangle + \sum_{\mathbf{x} \in \mathrm{ch}(\boldsymbol{\theta})} \langle \log p(\mathbf{x}|\mathrm{pa}(\mathbf{x})) \rangle + \mathrm{const},

where :math:`\mathrm{pa}(\boldsymbol{\theta})` and
:math:`\mathrm{ch}(\boldsymbol{\theta})` are the sets of parents and children
of :math:`\boldsymbol{\theta}`, respectively.  Thus, the posterior
approximation of a node is updated by taking the sum of the expectations of
all log densities in which the node variable appears.  The expectations are
over the approximate distribution of all variables other than
:math:`\boldsymbol{\theta}`.  Actually, not all the variables are needed,
because the non-constant part depends only on the Markov blanket of
:math:`\boldsymbol{\theta}`.  This leads to a local optimization scheme which
uses messages from neighbouring nodes.

The messages are simple for conjugate exponential family models.  An
exponential family distribution has the following log probability density
function:

.. math::
   :label: likelihood

   \log p(\mathbf{x}|\mathbf{\Theta}) &= \mathbf{u}_{\mathbf{x}}(\mathbf{x})^{\mathrm{T}} \boldsymbol{\phi}_{\mathbf{x}}(\mathbf{\Theta}) + g_{\mathbf{x}}(\mathbf{\Theta}) + f_{\mathbf{x}}(\mathbf{x}),

where :math:`\mathbf{\Theta}=\{\boldsymbol{\theta}_j\}` is the set of parents,
:math:`\mathbf{u}` is the sufficient statistic vector,
:math:`\boldsymbol{\phi}` is the natural parameter vector, :math:`g` is the
negative log normalizer, and :math:`f` is the log base measure.  Note that the
log density is linear with respect to the terms that are functions of
:math:`\mathbf{x}`, namely :math:`\mathbf{u}` and :math:`f`.
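As a concrete check of the exponential family form (a standalone NumPy sketch
with made-up values, not BayesPy code), the multivariate normal log-density in
the precision parameterization can be evaluated both directly and as
:math:`\mathbf{u}(\mathbf{x})^{\mathrm{T}} \boldsymbol{\phi} + g + f`:

```python
import numpy as np

# Illustrative check: the multivariate normal log-density written in the
# exponential-family form u(x)^T phi + g + f agrees with the direct formula,
# using the precision parameterization N(mu, Lambda).
D = 2
mu = np.array([1.0, -1.0])
Lam = np.array([[2.0, 0.3],
                [0.3, 1.0]])
x = np.array([0.5, 0.2])

# Direct evaluation of log N(x | mu, Lambda)
d = x - mu
logpdf = (-0.5 * d @ Lam @ d
          + 0.5 * np.linalg.slogdet(Lam)[1]
          - 0.5 * D * np.log(2 * np.pi))

# Exponential-family form: u = (x, x x^T), phi = (Lambda mu, -Lambda/2)
u1, u2 = x, np.outer(x, x)
phi1, phi2 = Lam @ mu, -0.5 * Lam
g = -0.5 * mu @ Lam @ mu + 0.5 * np.linalg.slogdet(Lam)[1]
f = -0.5 * D * np.log(2 * np.pi)
logpdf_expfam = u1 @ phi1 + np.sum(u2 * phi2) + g + f
```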
If a parent has a conjugate prior, :eq:`likelihood` is also linear with
respect to the parent's sufficient statistic vector.  Thus, :eq:`likelihood`
can be re-organized with respect to a parent :math:`\boldsymbol{\theta}_j` as

.. math::

   \log p(\mathbf{x}|\mathbf{\Theta}) &= \mathbf{u}_{\boldsymbol{\theta}_j}(\boldsymbol{\theta}_j)^{\mathrm{T}} \boldsymbol{\phi}_{\mathbf{x}\rightarrow\boldsymbol{\theta}_j} (\mathbf{x}, \{\boldsymbol{\theta}_k\}_{k\neq j}) + \mathrm{const},

where :math:`\mathbf{u}_{\boldsymbol{\theta}_j}` is the sufficient statistic
vector of :math:`\boldsymbol{\theta}_j` and the constant part is constant with
respect to :math:`\boldsymbol{\theta}_j`.  Thus, the update equation
:eq:`vmp_general_update` for :math:`\boldsymbol{\theta}_j` can be written as

.. math::

   \log q(\boldsymbol{\theta}_j) &= \mathbf{u}_{\boldsymbol{\theta}_j}(\boldsymbol{\theta}_j)^{\mathrm{T}} \langle \boldsymbol{\phi}_{\boldsymbol{\theta}_j} \rangle + f_{\boldsymbol{\theta}_j}(\boldsymbol{\theta}_j) + \mathbf{u}_{\boldsymbol{\theta}_j}(\boldsymbol{\theta}_j)^{\mathrm{T}} \sum_{\mathbf{x} \in \mathrm{ch}(\boldsymbol{\theta}_j)} \langle \boldsymbol{\phi}_{\mathbf{x}\rightarrow\boldsymbol{\theta}_j} \rangle + \mathrm{const} \\
   &= \mathbf{u}_{\boldsymbol{\theta}_j}(\boldsymbol{\theta}_j)^{\mathrm{T}} \left( \langle \boldsymbol{\phi}_{\boldsymbol{\theta}_j} \rangle + \sum_{\mathbf{x} \in \mathrm{ch}(\boldsymbol{\theta}_j)} \langle \boldsymbol{\phi}_{\mathbf{x}\rightarrow\boldsymbol{\theta}_j} \rangle \right) + f_{\boldsymbol{\theta}_j}(\boldsymbol{\theta}_j) + \mathrm{const},

where the summation is over all the child nodes of
:math:`\boldsymbol{\theta}_j`.  Because of the conjugacy,
:math:`\langle\boldsymbol{\phi}_{\boldsymbol{\theta}_j}\rangle` depends
(multi)linearly on the parents' sufficient statistic vectors.
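The update can be sketched numerically for the textbook conjugate pair of an
unknown normal mean with known precision (all values below are made up; this
is not BayesPy code).  The posterior natural parameters are the prior natural
parameters plus one message per observation:

```python
import numpy as np

# Sketch of the update above for a toy conjugate pair: unknown mean theta
# with prior N(m0, 1/lam0) and observations x_i ~ N(theta, 1/lam) with known
# precision lam.  The sufficient statistics of theta are (theta, theta^2).
m0, lam0 = 0.0, 1.0          # prior mean and precision
lam = 4.0                    # known observation precision
x = np.array([0.9, 1.1, 1.3])

phi_prior = np.array([lam0 * m0, -0.5 * lam0])             # <phi_theta>
messages = np.array([[lam * xi, -0.5 * lam] for xi in x])  # phi_{x -> theta}
phi_post = phi_prior + messages.sum(axis=0)                # the update rule

post_precision = -2.0 * phi_post[1]
post_mean = phi_post[0] / post_precision

# Agrees with the standard Bayesian update for a normal mean:
expected_mean = (lam0 * m0 + lam * x.sum()) / (lam0 + lam * x.size)
```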
Similarly, :math:`\langle \boldsymbol{\phi}_{\mathbf{x}\rightarrow\boldsymbol{\theta}_j} \rangle`
depends (multi)linearly on the expectations of the children's and co-parents'
sufficient statistics.  This gives the following update equation for the
natural parameter vector of the posterior approximation
:math:`q(\boldsymbol{\theta}_j)`:

.. math::
   :label: update_phi

   \tilde{\boldsymbol{\phi}}_j &= \langle \boldsymbol{\phi}_{\boldsymbol{\theta}_j} \rangle + \sum_{\mathbf{x} \in \mathrm{ch}(\boldsymbol{\theta}_j)} \langle \boldsymbol{\phi}_{\mathbf{x}\rightarrow\boldsymbol{\theta}_j} \rangle.

Variational messages
--------------------

The update equation :eq:`update_phi` leads to a message passing scheme: the
term :math:`\langle \boldsymbol{\phi}_{\boldsymbol{\theta}_j} \rangle` is a
function of the parents' sufficient statistic vectors and the term
:math:`\langle \boldsymbol{\phi}_{\mathbf{x}\rightarrow\boldsymbol{\theta}_j} \rangle`
can be interpreted as a message from the child node :math:`\mathbf{x}`.  Thus,
the message from the child node :math:`\mathbf{x}` to the parent node
:math:`\boldsymbol{\theta}` is

.. math::

   \mathbf{m}_{\mathbf{x}\rightarrow\boldsymbol{\theta}} &\equiv \langle \boldsymbol{\phi}_{\mathbf{x}\rightarrow\boldsymbol{\theta}} \rangle,

which can be computed as a function of the sufficient statistic vectors of the
co-parent nodes of :math:`\boldsymbol{\theta}` and of the child node
:math:`\mathbf{x}`.  The message from the parent node
:math:`\boldsymbol{\theta}` to the child node :math:`\mathbf{x}` is simply the
expectation of the sufficient statistic vector:

.. math::

   \mathbf{m}_{\boldsymbol{\theta}\rightarrow\mathbf{x}} &\equiv \langle \mathbf{u}_{\boldsymbol{\theta}} \rangle.

In order to compute the expectation of the sufficient statistic vector, we
write :math:`q(\boldsymbol{\theta})` as

.. math::

   \log q(\boldsymbol{\theta}) &= \mathbf{u}(\boldsymbol{\theta})^{\mathrm{T}} \tilde{\boldsymbol{\phi}} + \tilde{g}(\tilde{\boldsymbol{\phi}}) + f(\boldsymbol{\theta}),

where :math:`\tilde{\boldsymbol{\phi}}` is the natural parameter vector of
:math:`q(\boldsymbol{\theta})`.  Now, the expectation of the sufficient
statistic vector is defined as

.. math::
   :label: moments

   \langle \mathbf{u}_{\boldsymbol{\theta}} \rangle &= - \frac{\partial \tilde{g}}{\partial \tilde{\boldsymbol{\phi}}_{\boldsymbol{\theta}}} (\tilde{\boldsymbol{\phi}}_{\boldsymbol{\theta}}).

We call this expectation of the sufficient statistic vector the moments
vector.

Lower bound
-----------

Computing the VB lower bound is not necessary for finding the posterior
approximation, but it is extremely useful in monitoring convergence and
catching possible bugs.  The VB lower bound can be written as

.. math::

   \mathcal{L} = \langle \log p(\mathbf{Y}, \mathbf{X}) \rangle - \langle \log q(\mathbf{X}) \rangle,

where :math:`\mathbf{Y}` is the set of all observed variables and
:math:`\mathbf{X}` is the set of all latent variables.  It can also be written
as

.. math::

   \mathcal{L} = \sum_{\mathbf{y} \in \mathbf{Y}} \langle \log p(\mathbf{y} | \mathrm{pa}(\mathbf{y})) \rangle + \sum_{\mathbf{x} \in \mathbf{X}} \left[ \langle \log p(\mathbf{x} | \mathrm{pa}(\mathbf{x})) \rangle - \langle \log q(\mathbf{x}) \rangle \right],

which shows that observed and latent variables contribute differently to the
lower bound.  These contributions have simple forms for exponential family
nodes.  Observed exponential family nodes contribute to the lower bound as
follows:

.. math::

   \langle \log p(\mathbf{y}|\mathrm{pa}(\mathbf{y})) \rangle &= \mathbf{u}(\mathbf{y})^T \langle \boldsymbol{\phi} \rangle + \langle g \rangle + f(\mathbf{y}),

where :math:`\mathbf{y}` is the observed data.  On the other hand, latent
exponential family nodes contribute to the lower bound as follows:

.. math::

   \langle \log p(\mathbf{x}|\boldsymbol{\theta}) \rangle - \langle \log q(\mathbf{x}) \rangle &= \langle \mathbf{u} \rangle^T (\langle \boldsymbol{\phi} \rangle - \tilde{\boldsymbol{\phi}} ) + \langle g \rangle - \tilde{g}.

If a node is partially observed and partially unobserved, these formulas are
applied plate-wise as appropriate.

.. _sec-vmp-terms:

Terms
-----

To summarize, implementing VMP requires one to write, for each stochastic
exponential family node:

:math:`\langle \boldsymbol{\phi} \rangle` : the expectation of the prior natural parameter vector
    Computed as a function of the messages from the parents.

:math:`\tilde{\boldsymbol{\phi}}` : the natural parameter vector of the posterior approximation
    Computed as the sum of :math:`\langle \boldsymbol{\phi} \rangle` and the
    messages from the children.

:math:`\langle \mathbf{u} \rangle` : the posterior moments vector
    Computed as a function of :math:`\tilde{\boldsymbol{\phi}}` as defined in
    :eq:`moments`.

:math:`\mathbf{u}(\mathbf{x})` : the moments vector for given data
    Computed as a function of the observed data :math:`\mathbf{x}`.

:math:`\langle g \rangle` : the expectation of the negative log normalizer of the prior
    Computed as a function of the parent moments.

:math:`\tilde{g}` : the negative log normalizer of the posterior approximation
    Computed as a function of :math:`\tilde{\boldsymbol{\phi}}`.

:math:`f(\mathbf{x})` : the log base measure for given data
    Computed as a function of the observed data :math:`\mathbf{x}`.

:math:`\langle \boldsymbol{\phi}_{\mathbf{x}\rightarrow\boldsymbol{\theta}} \rangle` : the message to the parent :math:`\boldsymbol{\theta}`
    Computed as a function of the moments of this node and of the other
    parents.

Deterministic nodes require only the following terms:

:math:`\langle \mathbf{u} \rangle` : the posterior moments vector
    Computed as a function of the messages from the parents.
:math:`\mathbf{m}` : the message to a parent
    Computed as a function of the messages from the other parents and all
    children.

.. File: bayespy-0.6.2/doc/source/dev_guide/workflow.rst

.. Copyright (C) 2014 Jaakko Luttinen

   This file is licensed under the MIT License. See LICENSE for a text of
   the license.

Workflow
========

The main forum for BayesPy development is `GitHub
<https://github.com/bayespy/bayespy>`_.  Bugs and other issues can be reported
at https://github.com/bayespy/bayespy/issues.  Contributions to the code and
documentation are welcome and should be given as pull requests at
https://github.com/bayespy/bayespy/pulls.  In order to create pull requests,
it is recommended to fork the git repository, make local changes and submit
these changes as a pull request.

The style guide for writing docstrings follows the style guide of NumPy,
available at
https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt.
Detailed instructions on the development workflow can be found in the NumPy
guide, available at
http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html.

BayesPy uses the following acronyms to start the commit message:

* API: an (incompatible) API change
* BLD: change related to building BayesPy
* BUG: bug fix
* DEMO: modification in demo code
* DEP: deprecate something, or remove a deprecated object
* DEV: development tool or utility
* DOC: documentation
* ENH: enhancement
* MAINT: maintenance commit (refactoring, typos, etc.)
* REV: revert an earlier commit
* STY: style fix (whitespace, PEP8)
* TST: addition or modification of tests
* REL: related to releasing

Since version 0.3.7, we have followed `Vincent Driessen's branching model
<http://nvie.com/posts/a-successful-git-branching-model/>`_ in how git is
used.

Making releases
---------------

* Commit any current changes to git.
* Start a release branch: ``git flow release start x.y.z``
* Edit the version number in setup.py and commit.
* Add the changes to CHANGELOG.rst and commit.
* Publish the release branch: ``git flow release publish x.y.z``
* Finish the release: ``git flow release finish x.y.z``.  Write the following
  commit message: ``REL: Version x.y.z``.
* Push to GitHub: ``git push && git push --tags``
* Download the release tarball from GitHub and use that in the phases below.
  This avoids having local garbage in the release.
* Publish in PyPI: ``python setup.py release_pypi``
* Update the documentation web page: ``cd doc && make gh-pages``
* Publish in mloss.org.
* Announcements to bayespy@googlegroups.com, scipy-user@scipy.org and
  numpy-discussion@scipy.org.

.. File: bayespy-0.6.2/doc/source/dev_guide/writingnodes.rst

.. Copyright (C) 2014 Jaakko Luttinen

   This file is licensed under the MIT License. See LICENSE for a text of
   the license.

Implementing nodes
==================

The main goal of BayesPy is to provide a package which enables easy and
flexible construction of simple and complex models with efficient inference.
However, users may sometimes be unable to construct their models because the
built-in nodes do not implement some specific feature.  Thus, one may need to
implement new nodes in order to construct the model.  BayesPy aims to make the
implementation of new nodes both simple and fast.  Typically, a large complex
model can be constructed almost completely with the built-in nodes, and the
user needs to implement only a few new ones.

Moments
-------

.. currentmodule:: bayespy.inference.vmp.nodes.node

In order to implement nodes, it is important to understand the messaging
framework of the nodes.  A node is a unit of calculation which communicates
with its parent and child nodes using messages.
These messages have types that need to match between nodes, that is, the child
node needs to understand the messages its parents are sending and vice versa.
Thus, a node defines which message type it requires from each of its parents,
and only nodes that have that type of output message (i.e., the message to a
child node) are valid parent nodes for that node.

The message type is defined by the moments of the parent node.  The moments
are a collection of expectations: :math:`\{ \langle f_1(X) \rangle, \ldots,
\langle f_N(X) \rangle \}`.  The functions :math:`f_1, \ldots, f_N` (and the
number of functions) define the message type, and they form the sufficient
statistic discussed in the previous section.  Different message types are
represented by the :class:`Moments` class hierarchy.  For instance,
:class:`GaussianMoments` represents a message type with parent moments
:math:`\{\langle \mathbf{x} \rangle, \langle \mathbf{xx}^T \rangle \}` and
:class:`WishartMoments` a message type with parent moments
:math:`\{\langle \mathbf{\Lambda} \rangle, \langle \log |\mathbf{\Lambda}| \rangle\}`.

.. currentmodule:: bayespy.nodes

Let us give an example: a :class:`Gaussian` node outputs
:class:`GaussianMoments` messages and a :class:`Wishart` node outputs
:class:`WishartMoments` messages.  A :class:`Gaussian` node requires
:class:`GaussianMoments` messages from its mean parent node and
:class:`WishartMoments` messages from its precision parent node.  Thus,
:class:`Gaussian` and :class:`Wishart` are valid node classes as the mean and
precision parent nodes of a :class:`Gaussian` node.  Note that several nodes
may have the same output message type, and some message types can be
transformed to other message types using deterministic converter nodes.
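To illustrate what a Gaussian-type moments vector contains, here is a
standalone sketch (not the actual BayesPy classes): for a *fixed* value
:math:`\mathbf{x}`, the Gaussian moments are simply
:math:`\{\mathbf{x}, \mathbf{x}\mathbf{x}^T\}`, which is what a moments class
computes when wrapping a constant numeric array:

```python
import numpy as np

# Standalone sketch (not the actual BayesPy Moments classes): the
# Gaussian-type moments of a fixed value x are {<x>, <x x^T>} = {x, x x^T}.
def gaussian_fixed_moments(x):
    x = np.asarray(x, dtype=float)
    return [x, np.outer(x, x)]

u = gaussian_fixed_moments([1.0, 2.0])
```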
For instance, :class:`Gaussian` and :class:`GaussianARD` nodes both output
:class:`GaussianMoments` messages, the deterministic :class:`SumMultiply` node
also outputs :class:`GaussianMoments` messages, and the deterministic
converter :class:`_MarkovChainToGaussian` converts
:class:`GaussianMarkovChainMoments` to :class:`GaussianMoments`.

.. currentmodule:: bayespy.inference.vmp.nodes.node

Each node specifies the message type requirements of its parents in the
:attr:`Node._parent_moments` attribute, which is a list of :class:`Moments`
sub-class instances.  These moments objects serve a few purposes when creating
the node:

1. check that the parents are sending proper messages;
2. if a parent uses a different message type, try to add a converter which
   converts the messages to the correct type, if possible;
3. if given parents are not nodes but numeric arrays, convert them to constant
   nodes with the correct output message type.

When implementing a new node, it is not always necessary to implement a new
moments class.  If another node has the same sufficient statistic vector, and
thus the same moments, that moments class can be used.  Otherwise, one must
implement a simple moments class which has the following methods:

* :func:`Moments.compute_fixed_moments`

  Computes the moments for a known value.  This is used to compute the moments
  of constant numeric arrays and wrap them into constant nodes.

* :func:`Moments.compute_dims_from_values`

  Given a known value of the variable, returns the shape of the variable
  dimensions in the moments.  This is used to solve the shape of the moments
  array for constant nodes.

Distributions
-------------

.. currentmodule:: bayespy.inference.vmp.stochastic

In order to implement a stochastic exponential family node, one must first
write down the log probability density function of the node and derive the
terms discussed in section :ref:`sec-vmp-terms`.  These terms are implemented
and collected as a class which is a subclass of :class:`Distribution`.
The main reason to implement these methods in a separate class instead of the node class itself is that the methods can then be used without creating a node, for instance, in the :class:`Mixture` class.

.. currentmodule:: bayespy.inference.vmp.nodes.expfamily

For exponential family distributions, the distribution class is a subclass of :class:`ExponentialFamilyDistribution`, and the relation between the terms in section :ref:`sec-vmp-terms` and the methods is as follows:

* :func:`ExponentialFamilyDistribution.compute_phi_from_parents`

  Computes the expectation of the natural parameters :math:`\langle \boldsymbol{\phi} \rangle` in the prior distribution given the moments of the parents.

* :func:`ExponentialFamilyDistribution.compute_cgf_from_parents`

  Computes the expectation of the negative log normalizer :math:`\langle g \rangle` of the prior distribution given the moments of the parents.

* :func:`ExponentialFamilyDistribution.compute_moments_and_cgf`

  Computes the moments :math:`\langle \mathbf{u} \rangle` and the negative log normalizer :math:`\tilde{g}` of the posterior distribution given the natural parameters :math:`\tilde{\boldsymbol{\phi}}`.

* :func:`ExponentialFamilyDistribution.compute_message_to_parent`

  Computes the message :math:`\langle \boldsymbol{\phi}_{\mathbf{x}\rightarrow\boldsymbol{\theta}} \rangle` from the node :math:`\mathbf{x}` to its parent node :math:`\boldsymbol{\theta}` given the moments of the node and of the other parents.

* :func:`ExponentialFamilyDistribution.compute_fixed_moments_and_f`

  Computes :math:`\mathbf{u}(\mathbf{x})` and :math:`f(\mathbf{x})` for a given observed value :math:`\mathbf{x}`. Without this method, variables from this distribution cannot be observed.

For each stochastic exponential family node, one must write a distribution class which implements these methods. After that, the node class is basically a simple wrapper which also stores the moments and the natural parameters of the current posterior approximation.
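To make the role of ``compute_moments_and_cgf`` concrete, here is a self-contained sketch for a univariate Gaussian written in plain NumPy (a simplified stand-in, not BayesPy's actual method, which operates on lists of plate-shaped arrays):

```python
import numpy as np

def compute_moments_and_cgf(phi):
    """Sketch of ``compute_moments_and_cgf`` for a univariate Gaussian.

    Natural parameters: phi = [tau*mu, -tau/2], where tau is the precision.
    Moments:            u   = [<x>, <x^2>].
    g is the negative log normalizer of the distribution.
    """
    phi1, phi2 = phi
    mean = -phi1 / (2 * phi2)     # <x>
    var = -1 / (2 * phi2)         # Var(x)
    u = [mean, mean**2 + var]     # [<x>, <x^2>]
    g = phi1**2 / (4 * phi2) + 0.5 * np.log(-2 * phi2)
    return u, g

# Gaussian with mean 2 and variance 4, i.e., tau = 0.25:
u, g = compute_moments_and_cgf([0.5, -0.125])
# u == [2.0, 8.0]  since <x^2> = mean^2 + var = 4 + 4
```

The other methods in the list above follow the same pattern: each is a direct transcription of one term of the derived VMP formulas.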
Note that the distribution classes do not store node-specific information; they are more like static collections of methods. However, sometimes the implementations depend on some information, such as the dimensionality of the variable, and this information must be provided, if needed, when constructing the distribution object.

In addition to the methods listed above, it is sometimes necessary to implement a few more methods. This happens when the plates of the parent do not map to the plates of the child directly, as discussed in section :ref:`sec-irregular-plates`. Then, one must write methods that implement this plate mapping and apply the same mapping to the mask array:

* :func:`ExponentialFamilyDistribution.plates_from_parent`

  Given the plates of the parent, return the resulting plates of the child.

* :func:`ExponentialFamilyDistribution.plates_to_parent`

  Given the plates of the child, return the plates of the parent that would have resulted in them.

* :func:`ExponentialFamilyDistribution.compute_mask_to_parent`

  Given the mask array of the child, apply the plate mapping.

It is important to understand when one must implement these methods, because the default implementations in the base class would otherwise lead to errors or incorrect results.

Stochastic exponential family nodes
-----------------------------------

After implementing the distribution class, the next task is to implement the node class. First, we need to explain a few important attributes. Stochastic exponential family nodes have two attributes that store the state of the posterior distribution:

* ``phi``

  The natural parameter vector :math:`\tilde{\boldsymbol{\phi}}` of the posterior approximation.

* ``u``

  The moments :math:`\langle \mathbf{u} \rangle` of the posterior approximation.

Instead of storing these two variables as vectors (as in the mathematical formulas), they are stored as lists of arrays with convenient shapes.
For instance, the :class:`Gaussian` node stores the moments as a list consisting of a vector :math:`\langle \mathbf{x} \rangle` and a matrix :math:`\langle \mathbf{xx}^T \rangle` instead of reshaping and concatenating these into a single vector. The same applies to the natural parameters ``phi``, which has the same shapes as ``u``.

The shapes of the arrays in the lists ``u`` and ``phi`` consist of the shape caused by the plates and the shape caused by the variable itself. For instance, the moments of the :class:`Gaussian` node have shapes ``(D,)`` and ``(D, D)``, where ``D`` is the dimensionality of the Gaussian vector. In addition, if the node has plates, they are prepended to these shapes. Thus, for instance, if the :class:`Gaussian` node has plates ``(3, 7)`` and ``D`` is 5, the shape of ``u[0]`` and ``phi[0]`` would be ``(3, 7, 5)`` and the shape of ``u[1]`` and ``phi[1]`` would be ``(3, 7, 5, 5)``. This shape information is stored in the following attributes:

* ``plates`` : a tuple

  The plates of the node. In our example, ``(3, 7)``.

* ``dims`` : a list of tuples

  The shape of each of the moments arrays (or natural parameter arrays) without plates. In our example, ``[ (5,), (5, 5) ]``.

Finally, three attributes define VMP for the node:

* ``_moments`` : :class:`Moments` sub-class instance

  An object defining the moments of the node.

* ``_parent_moments`` : list of :class:`Moments` sub-class instances

  A list defining the moments requirements for each parent.

* ``_distribution`` : :class:`Distribution` sub-class instance

  An object implementing the VMP formulas.

Basically, a node class is a collection of the above attributes, and when a node is created, these attributes are defined. The base class for exponential family nodes, :class:`ExponentialFamily`, provides a simple default constructor which does not need to be overwritten if ``dims``, ``_moments``, ``_parent_moments`` and ``_distribution`` can be provided as static class attributes.
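Returning to the shape bookkeeping above, the convention can be checked with plain NumPy arrays (this snippet only illustrates the ``plates + dims[i]`` layout; it is not BayesPy code):

```python
import numpy as np

# Plates and per-moment dimensions of the Gaussian example above
plates = (3, 7)
dims = [(5,), (5, 5)]

# Each array in u (and, identically, in phi) has shape plates + dims[i]
u = [np.zeros(plates + d) for d in dims]

print(u[0].shape)  # (3, 7, 5)
print(u[1].shape)  # (3, 7, 5, 5)
```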
For instance, the :class:`Gamma` node defines these attributes statically. However, usually at least one of these attributes cannot be defined statically in the class. In that case, one must implement a class method which overloads :func:`ExponentialFamily._constructor`. The purpose of this method is to define all the attributes given the parent nodes. A class method is used instead of the ``__init__`` method in order to be able to use the class constructors statically, for instance, in the :class:`Mixture` class. This construction allows users to create mixtures of any exponential family distribution with simple syntax.

The parents of a node must be converted so that they have the correct message type, because the user may have provided numeric arrays or nodes with an incorrect message type. Numeric arrays should be converted to constant nodes with the correct message type, and nodes with an incorrect message type should be converted to the correct message type if possible. Thus, the constructor should use the ``Node._ensure_moments`` method to make sure each parent is a node with the correct message type. Instead of calling this method for each parent node in the constructor, one can use the ``ensureparents`` decorator to do this automatically. However, the decorator requires that the ``_parent_moments`` attribute has already been defined statically. If this is not possible, the parent nodes must be converted manually in the constructor, because one should never assume that the parent nodes given to the constructor are nodes with the correct message type, or even nodes at all.

Deterministic nodes
-------------------

Deterministic nodes are nodes that do not correspond to any probability distribution but rather to a deterministic function. They do not have any moments or natural parameters to store. A deterministic node is implemented as a subclass of the :class:`Deterministic` base class.
The new node class must implement the following methods:

* :func:`Deterministic._compute_moments`

  Computes the moments given the moments of the parents.

* :func:`Deterministic._compute_message_to_parent`

  Computes the message to a parent node given the message from the children and the moments of the other parents.

In some cases, one may want to implement :func:`Deterministic._compute_message_and_mask_to_parent` or :func:`Deterministic._message_to_parent` instead in order to gain more control over efficient computation. As in the :class:`Distribution` class, if the node handles plates irregularly, it is important to implement the following methods:

* :func:`Deterministic._plates_from_parent`

  Given the plates of the parent, return the resulting plates of the child.

* :func:`Deterministic._plates_to_parent`

  Given the plates of the child, return the plates of the parent that would have resulted in them.

* :func:`Deterministic._compute_weights_to_parent`

  Given the mask array, convert it to a plate mask of the parent.

Converter nodes
+++++++++++++++

Sometimes a node has an incorrect message type, but the message can be converted into the correct type. For instance, :class:`GaussianMarkovChain` has the :class:`GaussianMarkovChainMoments` message type, which means moments :math:`\{ \langle \mathbf{x}_n \rangle, \langle \mathbf{x}_n \mathbf{x}_n^T \rangle, \langle \mathbf{x}_n \mathbf{x}_{n-1}^T \rangle \}^N_{n=1}`. These moments can be converted to :class:`GaussianMoments` by ignoring the third element and considering the time axis as a plate axis. Thus, if a node requires :class:`GaussianMoments` messages from its parent, :class:`GaussianMarkovChain` is a valid parent if its messages are modified properly. This conversion is implemented in the :class:`_MarkovChainToGaussian` converter class. Converter nodes are simple deterministic nodes that have one parent node and convert the messages to another message type.
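Finding a chain of converters between two message types is essentially a shortest-path search over registered conversions. The following is a simplified stand-in for that lookup (the registry layout, the second conversion target and the name ``SomeConverter`` are hypothetical; BayesPy keeps the registrations on the ``Moments`` classes themselves):

```python
from collections import deque

# Hypothetical registry: converters[src][dst] = converter class name.
converters = {
    "GaussianMarkovChainMoments": {"GaussianMoments": "_MarkovChainToGaussian"},
    "GaussianMoments": {"GaussianGammaMoments": "SomeConverter"},  # hypothetical
}

def conversion_chain(src, dst):
    """Breadth-first search for the shortest chain of converters."""
    queue = deque([(src, [])])
    seen = {src}
    while queue:
        moments, chain = queue.popleft()
        if moments == dst:
            return chain
        for nxt, conv in converters.get(moments, {}).items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, chain + [conv]))
    raise ValueError("No conversion from {0} to {1}".format(src, dst))

print(conversion_chain("GaussianMarkovChainMoments", "GaussianMoments"))
# ['_MarkovChainToGaussian']
```

Because breadth-first search visits conversions in order of chain length, the first chain found is guaranteed to be a shortest one.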
For the user, it would not be convenient if the exact message type had to be known and an explicit converter node created. Thus, the conversions are done automatically and the user is unaware of them. In order to enable this automation, when writing a converter node, one should register the converter to the moments class using :func:`Moments.add_converter`. For instance, a class ``X`` which converts moments ``A`` to moments ``B`` is registered as ``A.add_converter(B, X)``. After that, the :func:`Node._ensure_moments` and :func:`Node._convert` methods perform the conversions automatically. The conversion can consist of several consecutive converter nodes, and the smallest number of conversions is used.

.. Copyright (C) 2014 Jaakko Luttinen

   This file is licensed under the MIT License. See LICENSE for a text of the license.

Additive factorial hidden Markov model
======================================

.. code:: python

    import numpy as np
    from bayespy.nodes import (Dirichlet,
                               CategoricalMarkovChain,
                               GaussianARD,
                               Gate,
                               SumMultiply,
                               Gamma)

    D = 4    # dimensionality of the data vectors (and mu vectors)
    N = 5    # number of chains
    K = 3    # number of states in each chain
    T = 100  # length of each chain

    # Markov chain parameters.
    # Use known values
    p0 = np.ones(K) / K
    P = np.ones((K,K)) / K

    # Or set Dirichlet prior.
    p0 = Dirichlet(np.ones(K), plates=(N,))
    P = Dirichlet(np.ones(K), plates=(N,1,K))

    # N Markov chains with K possible states, and length T
    X = CategoricalMarkovChain(p0, P, states=T, plates=(N,))

    # For each of the N chains, have K different D-dimensional mu's
    # Unknown mu's
    mu = GaussianARD(0, 1e-3, plates=(D,1,1,K), shape=(N,))

    # Gate/select mu's
    print(mu.plates, mu.dims[0], X.plates, X.dims[1])
    Z = Gate(X, mu)

    # Sum the mu's of different chains
    F = SumMultiply('i->', Z)
    print(mu.plates, mu.dims[0], X.plates, X.dims[0], Z.plates, Z.dims[0], F.plates, F.dims[0])

    # Known observation noise inverse covariance
    tau = np.ones(D)
    # or unknown observation noise inverse covariance
    tau = Gamma(1e-3, 1e-3, plates=(D,))

    # Observed process
    Y = GaussianARD(F, tau)

    # Data
    data = np.random.randn(T, D)
    Y.observe(data)

    from bayespy.inference import VB
    Q = VB(Y, X, p0, P, mu, tau)
    Q.update(repeat=10)

.. parsed-literal::

    (4, 1, 1, 3) (5,) (5,) (99, 3, 3)
    (4, 1, 1, 3) (5,) (5,) (3,) (4, 5, 100) (5,) (4, 5, 100) ()

::

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    in ()
         41
         42 # Observed process
    ---> 43 Y = GaussianARD(F, tau)
         44
         45 # Data

    /home/jluttine/workspace/bayespy/bayespy/inference/vmp/nodes/expfamily.py in constructor_decorator(self, *args, **kwargs)
         81
         82     (args, kwargs, dims, plates, dist, stats, pstats) = \
    ---> 83         self._constructor(*args, **kwargs)
         84
         85     self.dims = dims

    /home/jluttine/workspace/bayespy/bayespy/inference/vmp/nodes/gaussian.py in _constructor(cls, mu, alpha, ndim, shape, **kwargs)
        890     plates = cls._total_plates(kwargs.get('plates'),
        891                                distribution.plates_from_parent(0, mu.plates),
    --> 892                                distribution.plates_from_parent(1, alpha.plates))
        893
        894     parents = [mu, alpha]

    /home/jluttine/workspace/bayespy/bayespy/inference/vmp/nodes/node.py in _total_plates(cls, plates, *parent_plates)
        244         return utils.broadcasted_shape(*parent_plates)
        245     except ValueError:
    --> 246         raise ValueError("The
plates of the parents do not broadcast.")
        247     else:
        248         # Check that the parent_plates are a subset of plates.

    ValueError: The plates of the parents do not broadcast.

.. testsetup::

    import numpy
    numpy.random.seed(1)

Bernoulli mixture model
=======================

This example considers data generated from a Bernoulli mixture model. One simple example process could be a questionnaire for election candidates. We observe a set of binary vectors, where each vector represents a candidate in the election and each element in these vectors corresponds to the candidate's answer to a yes-or-no question. The goal is to find groups of similar candidates and to analyze the answer patterns of these groups.

Data
----

First, we generate artificial data to analyze. Let us assume that the questionnaire contains ten yes-or-no questions and that there are three groups with similar opinions. These groups could represent parties. The groups have the following answering patterns, represented as vectors of probabilities of a candidate answering yes to each question:

>>> p0 = [0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9]
>>> p1 = [0.1, 0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9, 0.9]
>>> p2 = [0.9, 0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1, 0.1]

Thus, the candidates in the first group are likely to answer no to questions 1, 3, 5, 7 and 9, and yes to questions 2, 4, 6, 8 and 10. The candidates in the second group are likely to answer yes to the last five questions, whereas the candidates in the third group are likely to answer yes to the first five questions.
For convenience, we form a NumPy array of these vectors:

>>> import numpy as np
>>> p = np.array([p0, p1, p2])

Next, we generate a hundred candidates. First, we randomly select the group for each candidate:

>>> from bayespy.utils import random
>>> z = random.categorical([1/3, 1/3, 1/3], size=100)

Using the group patterns, we generate yes-or-no answers for the candidates:

>>> x = random.bernoulli(p[z])

This is our simulated data to be analyzed.

Model
-----

Now, we construct a model for learning the structure in the data. We have a dataset of a hundred 10-dimensional binary vectors:

>>> N = 100
>>> D = 10

We will create a Bernoulli mixture model. We assume that the true number of groups is unknown to us, so we use a large enough number of clusters:

>>> K = 10

We use the categorical distribution for the group assignments and give the group assignment probabilities an uninformative Dirichlet prior:

>>> from bayespy.nodes import Categorical, Dirichlet
>>> R = Dirichlet(K*[1e-5],
...               name='R')
>>> Z = Categorical(R,
...                 plates=(N,1),
...                 name='Z')

Each group has a probability of a yes answer for each question. These probabilities are given beta priors:

>>> from bayespy.nodes import Beta
>>> P = Beta([0.5, 0.5],
...          plates=(D,K),
...          name='P')

The answers of the candidates are modelled with the Bernoulli distribution:

>>> from bayespy.nodes import Mixture, Bernoulli
>>> X = Mixture(Z, Bernoulli, P)

Here, ``Z`` defines the group assignments and ``P`` the answering probability patterns for each group. Note how the plates of the nodes are matched: ``Z`` has plates ``(N,1)`` and ``P`` has plates ``(D,K)``, but in the mixture node the last plate axis of ``P`` is discarded, and thus the node broadcasts plates ``(N,1)`` and ``(D,)``, resulting in plates ``(N,D)`` for ``X``.
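This plate matching follows NumPy's broadcasting rules, which can be checked directly (a quick aside for illustration, not part of the model code):

```python
import numpy as np

# Z has plates (N, 1); after the mixture axis K of P is discarded,
# P contributes plates (D,). Broadcasting the two gives (N, D).
N, D = 100, 10
shape = np.broadcast_shapes((N, 1), (D,))
print(shape)  # (100, 10)
```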
Inference
---------

In order to infer the variables in our model, we construct a variational Bayesian inference engine:

>>> from bayespy.inference import VB
>>> Q = VB(Z, R, X, P)

This also gives the default update order of the nodes. In order to find different groups, they must be initialized differently, thus we use random initialization for the group probability patterns:

>>> P.initialize_from_random()

We provide our simulated data:

>>> X.observe(x)

Now, we can run inference:

>>> Q.update(repeat=1000)
Iteration 1: loglike=-6.872145e+02 (... seconds)
...
Iteration 17: loglike=-5.236921e+02 (... seconds)
Converged at iteration 17.

The algorithm converges in 17 iterations.

Results
-------

Now we can examine the approximate posterior distribution. First, let us plot the group assignment probabilities:

>>> import bayespy.plot as bpplt
>>> bpplt.hinton(R)

.. plot::

    import numpy
    numpy.random.seed(1)
    p0 = [0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9]
    p1 = [0.1, 0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9, 0.9]
    p2 = [0.9, 0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1, 0.1]
    import numpy as np
    p = np.array([p0, p1, p2])
    from bayespy.utils import random
    z = random.categorical([1/3, 1/3, 1/3], size=100)
    x = random.bernoulli(p[z])
    N = 100
    D = 10
    K = 10
    from bayespy.nodes import Categorical, Dirichlet
    R = Dirichlet(K*[1e-5], name='R')
    Z = Categorical(R, plates=(N,1), name='Z')
    from bayespy.nodes import Beta
    P = Beta([0.5, 0.5], plates=(D,K), name='P')
    from bayespy.nodes import Mixture, Bernoulli
    X = Mixture(Z, Bernoulli, P)
    from bayespy.inference import VB
    Q = VB(Z, R, X, P)
    P.initialize_from_random()
    X.observe(x)
    Q.update(repeat=1000)
    import bayespy.plot as bpplt
    bpplt.hinton(R)
    bpplt.pyplot.show()

This plot shows that there are three dominant groups, which is equal to the true number of groups used to generate the data. However, there are still two smaller groups, as the data does not give enough evidence to prune them out.
The yes-or-no answer probability patterns for the groups can be plotted as:

>>> bpplt.hinton(P)

.. plot::

    import numpy
    numpy.random.seed(1)
    p0 = [0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9]
    p1 = [0.1, 0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9, 0.9]
    p2 = [0.9, 0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1, 0.1]
    import numpy as np
    p = np.array([p0, p1, p2])
    from bayespy.utils import random
    z = random.categorical([1/3, 1/3, 1/3], size=100)
    x = random.bernoulli(p[z])
    N = 100
    D = 10
    K = 10
    from bayespy.nodes import Categorical, Dirichlet
    R = Dirichlet(K*[1e-5], name='R')
    Z = Categorical(R, plates=(N,1), name='Z')
    from bayespy.nodes import Beta
    P = Beta([0.5, 0.5], plates=(D,K), name='P')
    from bayespy.nodes import Mixture, Bernoulli
    X = Mixture(Z, Bernoulli, P)
    from bayespy.inference import VB
    Q = VB(Z, R, X, P)
    P.initialize_from_random()
    X.observe(x)
    Q.update(repeat=1000)
    import bayespy.plot as bpplt
    bpplt.hinton(P)
    bpplt.pyplot.show()

The three dominant groups have found the true patterns accurately. The patterns of the two minor groups are some kind of mixtures of the three true patterns; they exist because the generated data happened to contain a few samples giving evidence for these groups.

Finally, we can plot the group assignment probabilities for the candidates:

>>> bpplt.hinton(Z)

..
plot::

    import numpy
    numpy.random.seed(1)
    p0 = [0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9]
    p1 = [0.1, 0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9, 0.9]
    p2 = [0.9, 0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1, 0.1]
    import numpy as np
    p = np.array([p0, p1, p2])
    from bayespy.utils import random
    z = random.categorical([1/3, 1/3, 1/3], size=100)
    x = random.bernoulli(p[z])
    N = 100
    D = 10
    K = 10
    from bayespy.nodes import Categorical, Dirichlet
    R = Dirichlet(K*[1e-5], name='R')
    Z = Categorical(R, plates=(N,1), name='Z')
    from bayespy.nodes import Beta
    P = Beta([0.5, 0.5], plates=(D,K), name='P')
    from bayespy.nodes import Mixture, Bernoulli
    X = Mixture(Z, Bernoulli, P)
    from bayespy.inference import VB
    Q = VB(Z, R, X, P)
    P.initialize_from_random()
    X.observe(x)
    Q.update(repeat=1000)
    import bayespy.plot as bpplt
    bpplt.hinton(Z)
    bpplt.pyplot.show()

.. currentmodule:: bayespy.plot

This plot shows the clustering of the candidates. It is possible to use :class:`HintonPlotter` to enable monitoring during the VB iteration by providing ``plotter=HintonPlotter()`` for ``Z``, ``P`` and ``R`` when creating the nodes.

Examples
********

.. toctree::
   :maxdepth: 1

   multinomial
   regression
   gmm
   bmm
   hmm
   pca
   lssm
   lda

..
testsetup:: import numpy numpy.random.seed(1) Gaussian mixture model ====================== This example demonstrates the use of Gaussian mixture model for flexible density estimation, clustering or classification. Data ---- First, let us generate some artificial data for the analysis. The data are two-dimensional vectors from one of the four different Gaussian distributions: >>> import numpy as np >>> y0 = np.random.multivariate_normal([0, 0], [[2, 0], [0, 0.1]], size=50) >>> y1 = np.random.multivariate_normal([0, 0], [[0.1, 0], [0, 2]], size=50) >>> y2 = np.random.multivariate_normal([2, 2], [[2, -1.5], [-1.5, 2]], size=50) >>> y3 = np.random.multivariate_normal([-2, -2], [[0.5, 0], [0, 0.5]], size=50) >>> y = np.vstack([y0, y1, y2, y3]) Thus, there are 200 data vectors in total. The data looks as follows: >>> import bayespy.plot as bpplt >>> bpplt.pyplot.plot(y[:,0], y[:,1], 'rx') [] .. plot:: import numpy as np np.random.seed(1) y0 = np.random.multivariate_normal([0, 0], [[1, 0], [0, 0.02]], size=50) y1 = np.random.multivariate_normal([0, 0], [[0.02, 0], [0, 1]], size=50) y2 = np.random.multivariate_normal([2, 2], [[1, -0.9], [-0.9, 1]], size=50) y3 = np.random.multivariate_normal([-2, -2], [[0.1, 0], [0, 0.1]], size=50) y = np.vstack([y0, y1, y2, y3]) import bayespy.plot as bpplt bpplt.pyplot.plot(y[:,0], y[:,1], 'rx') bpplt.pyplot.show() Model ----- For clarity, let us denote the number of the data vectors with ``N`` >>> N = 200 and the dimensionality of the data vectors with ``D``: >>> D = 2 We will use a "large enough" number of Gaussian clusters in our model: >>> K = 10 Cluster assignments ``Z`` and the prior for the cluster assignment probabilities ``alpha``: >>> from bayespy.nodes import Dirichlet, Categorical >>> alpha = Dirichlet(1e-5*np.ones(K), ... name='alpha') >>> Z = Categorical(alpha, ... plates=(N,), ... 
name='z') The mean vectors and the precision matrices of the clusters: >>> from bayespy.nodes import Gaussian, Wishart >>> mu = Gaussian(np.zeros(D), 1e-5*np.identity(D), ... plates=(K,), ... name='mu') >>> Lambda = Wishart(D, 1e-5*np.identity(D), ... plates=(K,), ... name='Lambda') If either the mean or precision should be shared between clusters, then that node should not have plates, that is, ``plates=()``. The data vectors are from a Gaussian mixture with cluster assignments ``Z`` and Gaussian component parameters ``mu`` and ``Lambda``: >>> from bayespy.nodes import Mixture >>> Y = Mixture(Z, Gaussian, mu, Lambda, ... name='Y') >>> Z.initialize_from_random() >>> from bayespy.inference import VB >>> Q = VB(Y, mu, Lambda, Z, alpha) Inference --------- Before running the inference algorithm, we provide the data: >>> Y.observe(y) Then, run VB iteration until convergence: >>> Q.update(repeat=1000) Iteration 1: loglike=-1.402345e+03 (... seconds) ... Iteration 61: loglike=-8.888464e+02 (... seconds) Converged at iteration 61. The algorithm converges very quickly. Note that the default update order of the nodes was such that ``mu`` and ``Lambda`` were updated before ``Z``, which is what we wanted because ``Z`` was initialized randomly. Results ------- .. currentmodule:: bayespy.plot For two-dimensional Gaussian mixtures, the mixture components can be plotted using :func:`gaussian_mixture_2d`: >>> bpplt.gaussian_mixture_2d(Y, alpha=alpha, scale=2) .. 
plot:: import numpy numpy.random.seed(1) import numpy as np y0 = np.random.multivariate_normal([0, 0], [[1, 0], [0, 0.02]], size=50) y1 = np.random.multivariate_normal([0, 0], [[0.02, 0], [0, 1]], size=50) y2 = np.random.multivariate_normal([2, 2], [[1, -0.9], [-0.9, 1]], size=50) y3 = np.random.multivariate_normal([-2, -2], [[0.1, 0], [0, 0.1]], size=50) y = np.vstack([y0, y1, y2, y3]) import bayespy.plot as bpplt bpplt.pyplot.plot(y[:,0], y[:,1], 'rx') N = 200 D = 2 K = 10 from bayespy.nodes import Dirichlet, Categorical alpha = Dirichlet(1e-5*np.ones(K), name='alpha') Z = Categorical(alpha, plates=(N,), name='z') from bayespy.nodes import Gaussian, Wishart mu = Gaussian(np.zeros(D), 1e-5*np.identity(D), plates=(K,), name='mu') Lambda = Wishart(D, 1e-5*np.identity(D), plates=(K,), name='Lambda') from bayespy.nodes import Mixture Y = Mixture(Z, Gaussian, mu, Lambda, name='Y') Z.initialize_from_random() from bayespy.inference import VB Q = VB(Y, mu, Lambda, Z, alpha) Y.observe(y) Q.update(repeat=1000) bpplt.gaussian_mixture_2d(Y, alpha=alpha, scale=2) bpplt.pyplot.show() The function is called with ``scale=2`` which means that each ellipse shows two standard deviations. From the ten cluster components, the model uses effectively the correct number of clusters (4). These clusters capture the true density accurately. In addition to clustering and density estimation, this model could also be used for classification by setting the known class assignments as observed. Advanced next steps ------------------- Joint node for mean and precision +++++++++++++++++++++++++++++++++ .. currentmodule:: bayespy.nodes The next step for improving the results could be to use :class:`GaussianWishart` node for modelling the mean vectors ``mu`` and precision matrices ``Lambda`` jointly without factorization. This should improve the accuracy of the posterior approximation and the speed of the VB estimation. However, the implementation is a bit more complex. 
Fast collapsed inference
++++++++++++++++++++++++

.. MOVE THE FOLLOWING TO, FOR INSTANCE, MOG OR PCA EXAMPLE:

   >>> def reset():
   ...     alpha.initialize_from_prior()
   ...     C.initialize_from_prior()
   ...     X.initialize_from_random()
   ...     tau.initialize_from_prior()
   ...     return VB(Y, C, X, alpha, tau)

   >>> Q = reset()
   >>> Q.update(repeat=1000)
   ...
   >>> bpplt.pyplot.plot(Q.L, 'k-')

   >>> Q = reset()
   >>> Q.optimize(X, C, alpha, tau, maxiter=1000)
   ...
   >>> bpplt.pyplot.plot(Q.L, 'r--')

   .. plot::

      import numpy as np
      np.random.seed(1)
      # This is the PCA model from the previous sections
      from bayespy.nodes import GaussianARD, Gamma, Dot
      D = 3
      X = GaussianARD(0, 1, shape=(D,), plates=(1,100), name='X')
      alpha = Gamma(1e-3, 1e-3, plates=(D,), name='alpha')
      C = GaussianARD(0, alpha, shape=(D,), plates=(10,1), name='C')
      F = Dot(C, X)
      tau = Gamma(1e-3, 1e-3, name='tau')
      Y = GaussianARD(F, tau, name='Y')
      c = np.random.randn(10, 2)
      x = np.random.randn(2, 100)
      data = np.dot(c, x) + 0.1*np.random.randn(10, 100)
      Y.observe(data)
      from bayespy.inference import VB
      import bayespy.plot as bpplt
      Q = None
      def reset():
          alpha.initialize_from_prior()
          C.initialize_from_prior()
          X.initialize_from_random()
          tau.initialize_from_prior()
          return VB(Y, C, X, alpha, tau)
      Q = reset()
      Q.update(repeat=1000)
      bpplt.pyplot.plot(np.cumsum(Q.cputime), Q.L, 'k-')
      Q = reset()
      Q.optimize(X, C, alpha, tau, maxiter=1000)
      bpplt.pyplot.plot(np.cumsum(Q.cputime), Q.L, 'b--')
      Q = reset()
      Q.optimize(C, tau, maxiter=1000, collapsed=[X, alpha])
      bpplt.pyplot.plot(np.cumsum(Q.cputime), Q.L, 'r:')
      bpplt.pyplot.ylim(-100, 100)

..
testsetup::

    import numpy
    numpy.random.seed(1)

Hidden Markov model
===================

In this example, we will demonstrate the use of a hidden Markov model in the case of known and unknown parameters. We will also use two different emission distributions to demonstrate the flexibility of the model construction.

Known parameters
----------------

This example follows the one presented in `Wikipedia `__.

Model
+++++

Each day, the state of the weather is either 'rainy' or 'sunny'. The weather follows a first-order discrete Markov process. It has the following initial state probabilities

>>> a0 = [0.6, 0.4] # p(rainy)=0.6, p(sunny)=0.4

and state transition probabilities:

>>> A = [[0.7, 0.3], # p(rainy->rainy)=0.7, p(rainy->sunny)=0.3
...      [0.4, 0.6]] # p(sunny->rainy)=0.4, p(sunny->sunny)=0.6

We will be observing one hundred samples:

>>> N = 100

The discrete first-order Markov chain is constructed as:

>>> from bayespy.nodes import CategoricalMarkovChain
>>> Z = CategoricalMarkovChain(a0, A, states=N)

However, instead of observing this process directly, we observe whether Bob is 'walking', 'shopping' or 'cleaning'. The probability of each activity depends on the current weather as follows:

>>> P = [[0.1, 0.4, 0.5],
...      [0.6, 0.3, 0.1]]

where the first row contains the activity probabilities for rainy weather and the second row contains the activity probabilities for sunny weather. Using these emission probabilities, the observed process is constructed as:

>>> from bayespy.nodes import Categorical, Mixture
>>> Y = Mixture(Z, Categorical, P)

Data
++++

In order to test our method, we'll generate artificial data from the model itself.
First, draw a realization of the weather process:

>>> weather = Z.random()

Then, using this weather, draw realizations of the activities:

>>> activity = Mixture(weather, Categorical, P).random()

Inference
+++++++++

Now, using this data, we set our variable :math:`Y` to be observed:

>>> Y.observe(activity)

In order to run inference, we construct a variational Bayesian inference engine:

>>> from bayespy.inference import VB
>>> Q = VB(Y, Z)

Note that we need to give all random variables to ``VB``. In this case, the only random variables are ``Y`` and ``Z``. Next we run the inference, that is, compute our posterior distribution:

>>> Q.update()
Iteration 1: loglike=-1.095883e+02 (... seconds)

In this case, because there is only one unobserved random variable, we recover the exact posterior distribution and there is no need to iterate more than one step.

Results
+++++++

.. currentmodule:: bayespy.plot

One way to plot a 2-class categorical timeseries is to use the basic :func:`plot` function:

>>> import bayespy.plot as bpplt
>>> bpplt.plot(Z)
>>> bpplt.plot(1-weather, color='r', marker='x')

.. plot::

    import numpy
    numpy.random.seed(1)
    from bayespy.nodes import CategoricalMarkovChain
    a0 = [0.6, 0.4] # p(rainy)=0.6, p(sunny)=0.4
    A = [[0.7, 0.3], # p(rainy->rainy)=0.7, p(rainy->sunny)=0.3
         [0.4, 0.6]] # p(sunny->rainy)=0.4, p(sunny->sunny)=0.6
    N = 100
    Z = CategoricalMarkovChain(a0, A, states=N)
    from bayespy.nodes import Categorical, Mixture
    P = [[0.1, 0.4, 0.5],
         [0.6, 0.3, 0.1]]
    Y = Mixture(Z, Categorical, P)
    weather = Z.random()
    activity = Mixture(weather, Categorical, P).random()
    Y.observe(activity)
    from bayespy.inference import VB
    Q = VB(Y, Z)
    Q.update()
    import bayespy.plot as bpplt
    bpplt.plot(Z)
    bpplt.plot(1-weather, color='r', marker='x')
    bpplt.pyplot.show()

The black line shows the posterior probability of rain, and the red line and crosses show the true state.
Clearly, the method is not able to infer the weather very accurately in this
case because the activities do not give that much information about the
weather.

Unknown parameters
------------------

In this example, we consider unknown parameters for the Markov process and a
different emission distribution.

Data
++++

We generate data from three 2-dimensional Gaussian distributions with
different mean vectors and a common standard deviation:

>>> import numpy as np
>>> mu = np.array([ [0,0], [3,4], [6,0] ])
>>> std = 2.0

Thus, the number of clusters is three:

>>> K = 3

And the number of samples is 200:

>>> N = 200

Each initial state is equally probable:

>>> p0 = np.ones(K) / K

The state transition matrix is such that with probability 0.9 the process
stays in the same state.  The probability of moving to either of the other two
states is 0.05:

>>> q = 0.9
>>> r = (1-q) / (K-1)
>>> P = q*np.identity(K) + r*(np.ones((3,3))-np.identity(3))

Simulate the data:

>>> y = np.zeros((N,2))
>>> z = np.zeros(N)
>>> state = np.random.choice(K, p=p0)
>>> for n in range(N):
...     z[n] = state
...     y[n,:] = std*np.random.randn(2) + mu[state]
...     state = np.random.choice(K, p=P[state])

Then, let us visualize the data:

>>> bpplt.pyplot.figure()
>>> bpplt.pyplot.axis('equal')
(...)
>>> colors = [ [[1,0,0], [0,1,0], [0,0,1]][int(state)] for state in z ]
>>> bpplt.pyplot.plot(y[:,0], y[:,1], 'k-', zorder=-10)
[]
>>> bpplt.pyplot.scatter(y[:,0], y[:,1], c=colors, s=40)

..
plot:: import numpy numpy.random.seed(1) from bayespy.nodes import CategoricalMarkovChain a0 = [0.6, 0.4] # p(rainy)=0.6, p(sunny)=0.4 A = [[0.7, 0.3], # p(rainy->rainy)=0.7, p(rainy->sunny)=0.3 [0.4, 0.6]] # p(sunny->rainy)=0.4, p(sunny->sunny)=0.6 N = 100 Z = CategoricalMarkovChain(a0, A, states=N) from bayespy.nodes import Categorical, Mixture P = [[0.1, 0.4, 0.5], [0.6, 0.3, 0.1]] Y = Mixture(Z, Categorical, P) weather = Z.random() from bayespy.inference import VB import bayespy.plot as bpplt import numpy as np mu = np.array([ [0,0], [3,4], [6,0] ]) std = 2.0 K = 3 N = 200 p0 = np.ones(K) / K q = 0.9 r = (1-q)/(K-1) P = q*np.identity(K) + r*(np.ones((3,3))-np.identity(3)) y = np.zeros((N,2)) z = np.zeros(N) state = np.random.choice(K, p=p0) for n in range(N): z[n] = state y[n,:] = std*np.random.randn(2) + mu[state] state = np.random.choice(K, p=P[state]) bpplt.pyplot.figure() bpplt.pyplot.axis('equal') colors = [ [[1,0,0], [0,1,0], [0,0,1]][int(state)] for state in z ] bpplt.pyplot.plot(y[:,0], y[:,1], 'k-', zorder=-10) bpplt.pyplot.scatter(y[:,0], y[:,1], c=colors, s=40) bpplt.pyplot.show() Consecutive states are connected by a solid black line and the dot color shows the true class. Model +++++ Now, assume that we do not know the parameters of the process (initial state probability and state transition probabilities). We give these parameters quite non-informative priors, but it is possible to provide more informative priors if such information is available: >>> from bayespy.nodes import Dirichlet >>> a0 = Dirichlet(1e-3*np.ones(K)) >>> A = Dirichlet(1e-3*np.ones((K,K))) The discrete Markov chain is constructed as: >>> Z = CategoricalMarkovChain(a0, A, states=N) Now, instead of using categorical emission distribution as before, we'll use Gaussian distribution. For simplicity, we use the true parameters of the Gaussian distributions instead of giving priors and estimating them. 
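The "sticky" transition matrix used to simulate the data generalizes to any number of states; a small helper function (hypothetical, not part of BayesPy) makes the construction explicit:

```python
import numpy as np

def sticky_transition_matrix(K, stay_prob):
    """K-state transition matrix that stays in the current state with
    probability ``stay_prob`` and moves to each other state uniformly."""
    move_prob = (1.0 - stay_prob) / (K - 1)
    return (stay_prob * np.identity(K)
            + move_prob * (np.ones((K, K)) - np.identity(K)))

P = sticky_transition_matrix(3, 0.9)
```

Each row of the result is a proper probability distribution over the next state.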
The known standard deviation can be converted to a precision matrix as: >>> Lambda = std**(-2) * np.identity(2) Thus, the observed process is a Gaussian mixture with cluster assignments from the hidden Markov process ``Z``: >>> from bayespy.nodes import Gaussian >>> Y = Mixture(Z, Gaussian, mu, Lambda) Note that ``Lambda`` does not have cluster plate axis because it is shared between the clusters. Inference +++++++++ Let us use the simulated data: >>> Y.observe(y) Because ``VB`` takes all the random variables, we need to provide ``A`` and ``a0`` also: >>> Q = VB(Y, Z, A, a0) Then, run VB iteration until convergence: >>> Q.update(repeat=1000) Iteration 1: loglike=-9.963054e+02 (... seconds) ... Iteration 8: loglike=-9.235053e+02 (... seconds) Converged at iteration 8. Results +++++++ Plot the classification of the data similarly as the data: >>> bpplt.pyplot.figure() >>> bpplt.pyplot.axis('equal') (...) >>> colors = Y.parents[0].get_moments()[0] >>> bpplt.pyplot.plot(y[:,0], y[:,1], 'k-', zorder=-10) [] >>> bpplt.pyplot.scatter(y[:,0], y[:,1], c=colors, s=40) .. 
plot:: import numpy numpy.random.seed(1) from bayespy.nodes import CategoricalMarkovChain a0 = [0.6, 0.4] # p(rainy)=0.6, p(sunny)=0.4 A = [[0.7, 0.3], # p(rainy->rainy)=0.7, p(rainy->sunny)=0.3 [0.4, 0.6]] # p(sunny->rainy)=0.4, p(sunny->sunny)=0.6 N = 100 Z = CategoricalMarkovChain(a0, A, states=N) from bayespy.nodes import Categorical, Mixture P = [[0.1, 0.4, 0.5], [0.6, 0.3, 0.1]] Y = Mixture(Z, Categorical, P) weather = Z.random() from bayespy.inference import VB import bayespy.plot as bpplt import numpy as np mu = np.array([ [0,0], [3,4], [6,0] ]) std = 2.0 K = 3 N = 200 p0 = np.ones(K) / K q = 0.9 r = (1-q)/(K-1) P = q*np.identity(K) + r*(np.ones((3,3))-np.identity(3)) y = np.zeros((N,2)) z = np.zeros(N) state = np.random.choice(K, p=p0) for n in range(N): z[n] = state y[n,:] = std*np.random.randn(2) + mu[state] state = np.random.choice(K, p=P[state]) from bayespy.nodes import Dirichlet a0 = Dirichlet(1e-3*np.ones(K)) A = Dirichlet(1e-3*np.ones((K,K))) Z = CategoricalMarkovChain(a0, A, states=N) Lambda = std**(-2) * np.identity(2) from bayespy.nodes import Gaussian Y = Mixture(Z, Gaussian, mu, Lambda) Y.observe(y) Q = VB(Y, Z, A, a0) Q.update(repeat=1000) bpplt.pyplot.figure() bpplt.pyplot.axis('equal') colors = Y.parents[0].get_moments()[0] bpplt.pyplot.plot(y[:,0], y[:,1], 'k-', zorder=-10) bpplt.pyplot.scatter(y[:,0], y[:,1], c=colors, s=40) bpplt.pyplot.show() The data has been classified quite correctly. Even samples that are more in the region of another cluster are classified correctly if the previous and next sample provide enough evidence for the correct class. We can also plot the state transition matrix: >>> bpplt.hinton(A) .. 
plot::

    import numpy
    numpy.random.seed(1)
    from bayespy.nodes import CategoricalMarkovChain
    a0 = [0.6, 0.4]  # p(rainy)=0.6, p(sunny)=0.4
    A = [[0.7, 0.3],  # p(rainy->rainy)=0.7, p(rainy->sunny)=0.3
         [0.4, 0.6]]  # p(sunny->rainy)=0.4, p(sunny->sunny)=0.6
    N = 100
    Z = CategoricalMarkovChain(a0, A, states=N)
    from bayespy.nodes import Categorical, Mixture
    P = [[0.1, 0.4, 0.5],
         [0.6, 0.3, 0.1]]
    Y = Mixture(Z, Categorical, P)
    weather = Z.random()
    from bayespy.inference import VB
    import bayespy.plot as bpplt
    import numpy as np
    mu = np.array([ [0,0], [3,4], [6,0] ])
    std = 2.0
    K = 3
    N = 200
    p0 = np.ones(K) / K
    q = 0.9
    r = (1-q)/(K-1)
    P = q*np.identity(K) + r*(np.ones((3,3))-np.identity(3))
    y = np.zeros((N,2))
    z = np.zeros(N)
    state = np.random.choice(K, p=p0)
    for n in range(N):
        z[n] = state
        y[n,:] = std*np.random.randn(2) + mu[state]
        state = np.random.choice(K, p=P[state])
    from bayespy.nodes import Dirichlet
    a0 = Dirichlet(1e-3*np.ones(K))
    A = Dirichlet(1e-3*np.ones((K,K)))
    Z = CategoricalMarkovChain(a0, A, states=N)
    Lambda = std**(-2) * np.identity(2)
    from bayespy.nodes import Gaussian
    Y = Mixture(Z, Gaussian, mu, Lambda)
    Y.observe(y)
    Q = VB(Y, Z, A, a0)
    Q.update(repeat=1000)
    bpplt.hinton(A)
    bpplt.pyplot.show()

Clearly, the learned state transition matrix is close to the true matrix.

The models described above could also be used for classification by providing
the known class assignments as observed data to ``Z`` and the unknown class
assignments as missing data.

.. Copyright (C) 2015 Jaakko Luttinen

   This file is licensed under the MIT License. See LICENSE for a text of the
   license.

Latent Dirichlet allocation
===========================

Latent Dirichlet allocation is a widely used topic model.
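As described below, this example represents the corpus as two flat index vectors (a document index and a vocabulary index per word). Such vectors can be built from a nested list-of-documents representation (a sketch with a hypothetical toy corpus):

```python
import numpy as np

# a hypothetical toy corpus: each document is a list of vocabulary indices
documents = [
    [0, 2, 2, 1],     # document 0
    [3, 3, 0],        # document 1
    [1, 4, 4, 4, 2],  # document 2
]

# flatten into the two equal-length vectors the model expects
word_documents = np.array([d for d, doc in enumerate(documents) for _ in doc])
corpus = np.array([w for doc in documents for w in doc])
```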
The data is a collection of documents which contain words.  The goal of the
analysis is to find topics (distributions over words) and the topic
distribution of each document.

Data
----

The data consists of two vectors of equal length, whose elements correspond to
the words in all documents combined.  Let :math:`M` be the total number of
documents.  The first vector gives each word a document index :math:`i\in
\{0,\ldots,M-1\}` defining to which document the word belongs.  Let :math:`N`
be the size of the whole available vocabulary.  The second vector gives each
word a vocabulary index :math:`j\in \{0,\ldots,N-1\}` defining which word of
the vocabulary it is.  For instance, if each of the :math:`M` documents had
:math:`K` words, both vectors would contain :math:`M \cdot K` elements.

For this demo, we will just generate an artificial dataset for simplicity.  We
use the LDA model itself to generate the dataset.  First, import the relevant
packages:

>>> import numpy as np
>>> from bayespy import nodes

Let us decide the number of documents and the number of words in those
documents:

>>> n_documents = 10
>>> n_words = 10000

Randomly choose the document to which each word belongs:

>>> word_documents = nodes.Categorical(np.ones(n_documents)/n_documents,
...                                    plates=(n_words,)).random()

Let us also decide the size of our vocabulary:

>>> n_vocabulary = 100

Also, let us decide the true number of topics:

>>> n_topics = 5

Generate some random distributions for the topics in each document:

>>> p_topic = nodes.Dirichlet(1e-1*np.ones(n_topics),
...                           plates=(n_documents,)).random()

Generate some random distributions for the words in each topic:

>>> p_word = nodes.Dirichlet(1e-1*np.ones(n_vocabulary),
...                          plates=(n_topics,)).random()

Sample topic assignments for each word in each document:

>>> topic = nodes.Categorical(p_topic[word_documents],
...
plates=(n_words,)).random() And finally, draw vocabulary indices for each word in all the documents: >>> corpus = nodes.Categorical(p_word[topic], ... plates=(n_words,)).random() Now, our dataset consists of ``word_documents`` and ``corpus``, which define the document and vocabulary indices for each word in our dataset. .. todo:: Use some large real-world dataset, for instance, Wikipedia. Model ----- Variable for learning the topic distribution for each document: >>> p_topic = nodes.Dirichlet(np.ones(n_topics), ... plates=(n_documents,), ... name='p_topic') Variable for learning the word distribution for each topic: >>> p_word = nodes.Dirichlet(np.ones(n_vocabulary), ... plates=(n_topics,), ... name='p_word') The document indices for each word in the corpus: >>> from bayespy.inference.vmp.nodes.categorical import CategoricalMoments >>> document_indices = nodes.Constant(CategoricalMoments(n_documents), word_documents, ... name='document_indices') Variable for learning the topic assignments of each word in the corpus: >>> topics = nodes.Categorical(nodes.Gate(document_indices, p_topic), ... plates=(len(corpus),), ... name='topics') The vocabulary indices for each word in the corpus: >>> words = nodes.Categorical(nodes.Gate(topics, p_word), ... name='words') Inference --------- Observe the corpus: >>> words.observe(corpus) Break symmetry by random initialization: >>> p_topic.initialize_from_random() >>> p_word.initialize_from_random() Construct inference engine: >>> from bayespy.inference import VB >>> Q = VB(words, topics, p_word, p_topic, document_indices) Run the VB learning algorithm: >>> Q.update(repeat=1000) Iteration ... 
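The two learned distributions combine into each document's marginal word distribution: :math:`p(\mathrm{word}=j \mid \mathrm{doc}=i) = \sum_k p^{\mathrm{topic}}_{ik}\, p^{\mathrm{word}}_{kj}`, which is what makes LDA a mixture model over topics. A plain-NumPy sketch with hypothetical point estimates (not the VB posterior computed above):

```python
import numpy as np

rng = np.random.default_rng(0)
n_documents, n_topics, n_vocabulary = 4, 3, 6

# hypothetical point estimates (rows drawn from Dirichlet distributions)
p_topic = rng.dirichlet(0.1 * np.ones(n_topics), size=n_documents)
p_word = rng.dirichlet(0.1 * np.ones(n_vocabulary), size=n_topics)

# marginal word distribution of each document: mixture over topics
p_word_given_doc = p_topic @ p_word   # shape (n_documents, n_vocabulary)

# log-likelihood of one word with vocabulary index j in document i
i, j = 2, 5
loglik = np.log(p_word_given_doc[i, j])
```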
Results
-------

Use ``bayespy.plot`` to plot the results:

>>> import bayespy.plot as bpplt

Plot the topic distributions for each document:

>>> bpplt.pyplot.figure()
>>> bpplt.hinton(Q['p_topic'])
>>> bpplt.pyplot.title("Posterior topic distribution for each document")
>>> bpplt.pyplot.xlabel("Topics")
>>> bpplt.pyplot.ylabel("Documents")

Plot the word distributions for each topic:

>>> bpplt.pyplot.figure()
>>> bpplt.hinton(Q['p_word'])
>>> bpplt.pyplot.title("Posterior word distributions for each topic")
>>> bpplt.pyplot.xlabel("Words")
>>> bpplt.pyplot.ylabel("Topics")

.. todo:: Create more illustrative plots.

Stochastic variational inference
--------------------------------

LDA is a popular example for stochastic variational inference (SVI).  Using
SVI for LDA is quite simple in BayesPy.  In SVI, only a subset of the dataset
is used at each iteration step, but this subset is "repeated" to get the same
size as the original dataset.  Let us define a size for the subset:

>>> subset_size = 1000

Thus, our subset will be repeated this many times:

>>> plates_multiplier = n_words / subset_size

Note that this multiplier doesn't need to be an integer.

Now, let us repeat the model construction with only one minor addition.  The
following variables are identical to before:

>>> p_topic = nodes.Dirichlet(np.ones(n_topics),
...                           plates=(n_documents,),
...                           name='p_topic')
>>> p_word = nodes.Dirichlet(np.ones(n_vocabulary),
...                          plates=(n_topics,),
...                          name='p_word')

The document indices vector is now a bit shorter, using only a subset:

>>> document_indices = nodes.Constant(CategoricalMoments(n_documents),
...                                   word_documents[:subset_size],
...                                   name='document_indices')

Note that at this point, it doesn't matter which elements we chose for the
subset.
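The step lengths used in the SVI loop follow the classic Robbins–Monro schedule ``step = (n + delay) ** (-forgetting_rate)``; for stochastic approximation to converge, the forgetting rate should lie in (0.5, 1]. The schedule is easy to inspect on its own:

```python
# step sizes for stochastic gradient ascent, as used in the example
delay = 1
forgetting_rate = 0.7

steps = [(n + delay) ** (-forgetting_rate) for n in range(1000)]
```

The steps start at 1.0 and decay slowly enough that every mini-batch keeps contributing, but fast enough that the iterates settle down.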
For the topic assignments of each word in the corpus we need to use
``plates_multiplier`` because these topic assignments for the subset are
"repeated" to recover the full dataset:

>>> topics = nodes.Categorical(nodes.Gate(document_indices, p_topic),
...                            plates=(subset_size,),
...                            plates_multiplier=(plates_multiplier,),
...                            name='topics')

Finally, the vocabulary indices for each word in the corpus are constructed as
before:

>>> words = nodes.Categorical(nodes.Gate(topics, p_word),
...                           name='words')

This node inherits the plates and multipliers from its parent ``topics``, so
there is no need to define them here.

Again, break symmetry by random initialization:

>>> p_topic.initialize_from_random()
>>> p_word.initialize_from_random()

Construct the inference engine:

>>> from bayespy.inference import VB
>>> Q = VB(words, topics, p_word, p_topic, document_indices)

In order to use SVI, we need to disable some lower bound checks, because the
lower bound no longer necessarily increases at each iteration step:

>>> Q.ignore_bound_checks = True

For the stochastic gradient ascent, we'll define some learning parameters:

>>> delay = 1
>>> forgetting_rate = 0.7

Run the inference:

>>> for n in range(1000):
...     # Observe a random mini-batch
...     subset = np.random.choice(n_words, subset_size)
...     Q['words'].observe(corpus[subset])
...     Q['document_indices'].set_value(word_documents[subset])
...     # Learn intermediate variables
...     Q.update('topics')
...     # Set step length
...     step = (n + delay) ** (-forgetting_rate)
...     # Stochastic gradient for the global variables
...     Q.gradient_step('p_topic', 'p_word', scale=step)
Iteration 1: ...

If one is interested, the lower bound values during the SVI algorithm can be
plotted as:

>>> bpplt.pyplot.figure()
>>> bpplt.pyplot.plot(Q.L)
[]

The other results can be plotted as before.
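The essence of the ``gradient_step`` update — rescale the mini-batch sufficient statistics by the plate multiplier and blend them into the natural parameters with a decaying step size — can be sketched for a plain Dirichlet posterior in NumPy. This is a toy analogue for intuition, not the BayesPy internals:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy setting: estimate a Dirichlet posterior over 4 categories from counts
prior = np.ones(4)
true_p = np.array([0.1, 0.2, 0.3, 0.4])
data = rng.choice(4, size=10000, p=true_p)

subset_size = 100
multiplier = len(data) / subset_size   # how many times the subset is "repeated"
lam = prior.copy()                     # natural parameter of the approximation

for n in range(200):
    batch = rng.choice(data, size=subset_size)
    counts = np.bincount(batch, minlength=4)
    # intermediate estimate: prior plus rescaled mini-batch statistics
    lam_hat = prior + multiplier * counts
    step = (n + 1.0) ** (-0.7)
    # natural-gradient step: convex combination in natural-parameter space
    lam = (1 - step) * lam + step * lam_hat

estimate = lam / lam.sum()   # posterior mean of the category probabilities
```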
.. Copyright (C) 2014 Jaakko Luttinen

   This file is licensed under the MIT License. See LICENSE for a text of the
   license.

.. testsetup::

   import numpy as np
   np.random.seed(1)

Linear state-space model
========================

Model
-----

.. currentmodule:: bayespy.nodes

In linear state-space models a sequence of :math:`M`-dimensional observations
:math:`\mathbf{Y}=(\mathbf{y}_1,\ldots,\mathbf{y}_N)` is assumed to be
generated from latent :math:`D`-dimensional states
:math:`\mathbf{X}=(\mathbf{x}_1,\ldots,\mathbf{x}_N)` which follow a
first-order Markov process:

.. math::

   \begin{aligned}
   \mathbf{x}_{n} &= \mathbf{A}\mathbf{x}_{n-1} + \text{noise} \,, \\
   \mathbf{y}_{n} &= \mathbf{C}\mathbf{x}_{n} + \text{noise} \,,
   \end{aligned}

where the noise is Gaussian, :math:`\mathbf{A}` is the :math:`D\times D` state
dynamics matrix and :math:`\mathbf{C}` is the :math:`M\times D` loading matrix.
Usually, the latent space dimensionality :math:`D` is assumed to be much
smaller than the observation space dimensionality :math:`M` in order to model
the dependencies of high-dimensional observations efficiently.

In order to construct the model in BayesPy, first import the relevant nodes:

>>> from bayespy.nodes import GaussianARD, GaussianMarkovChain, Gamma, Dot

The data vectors will be 30-dimensional:

>>> M = 30

There will be 400 data vectors:

>>> N = 400

Let us use a 10-dimensional latent space:

>>> D = 10

The state dynamics matrix :math:`\mathbf{A}` has an ARD prior:

>>> alpha = Gamma(1e-5,
...               1e-5,
...               plates=(D,),
...               name='alpha')
>>> A = GaussianARD(0,
...                 alpha,
...                 shape=(D,),
...                 plates=(D,),
...                 name='A')

Note that :math:`\mathbf{A}` is a :math:`D\times{}D`-dimensional matrix.
However, in BayesPy it is modelled as a collection (``plates=(D,)``) of :math:`D`-dimensional vectors (``shape=(D,)``) because this is how the variables factorize in the posterior approximation of the state dynamics matrix in :class:`GaussianMarkovChain`. The latent states are constructed as >>> X = GaussianMarkovChain(np.zeros(D), ... 1e-3*np.identity(D), ... A, ... np.ones(D), ... n=N, ... name='X') where the first two arguments are the mean and precision matrix of the initial state, the third argument is the state dynamics matrix and the fourth argument is the diagonal elements of the precision matrix of the innovation noise. The node also needs the length of the chain given as the keyword argument ``n=N``. Thus, the shape of this node is ``(N,D)``. The linear mapping from the latent space to the observation space is modelled with the loading matrix which has ARD prior: >>> gamma = Gamma(1e-5, ... 1e-5, ... plates=(D,), ... name='gamma') >>> C = GaussianARD(0, ... gamma, ... shape=(D,), ... plates=(M,1), ... name='C') Note that the plates for ``C`` are ``(M,1)``, thus the full shape of the node is ``(M,1,D)``. The unit plate axis is added so that ``C`` broadcasts with ``X`` when computing the dot product: >>> F = Dot(C, ... X, ... name='F') This dot product is computed over the :math:`D`-dimensional latent space, thus the result is a :math:`M\times{}N`-dimensional matrix which is now represented with plates ``(M,N)`` in BayesPy: >>> F.plates (30, 400) We also need to use random initialization either for ``C`` or ``X`` in order to find non-zero latent space because by default both ``C`` and ``X`` are initialized to zero because of their prior distributions. We use random initialization for ``C`` and then we must update ``X`` the first time before updating ``C``: >>> C.initialize_from_random() The precision of the observation noise is given gamma prior: >>> tau = Gamma(1e-5, ... 1e-5, ... 
name='tau')

The observations are noisy versions of the dot products:

>>> Y = GaussianARD(F,
...                 tau,
...                 name='Y')

The variational Bayesian inference engine is then constructed as:

>>> from bayespy.inference import VB
>>> Q = VB(X, C, gamma, A, alpha, tau, Y)

Note that ``X`` is given before ``C``, thus ``X`` is updated before ``C`` by
default.

Data
----

Now, let us generate some toy data for our model.  Our true latent space is
four-dimensional with two noisy oscillator components, one random walk
component and one white noise component.

>>> w = 0.3
>>> a = np.array([[np.cos(w), -np.sin(w), 0, 0],
...               [np.sin(w), np.cos(w),  0, 0],
...               [0,         0,          1, 0],
...               [0,         0,          0, 0]])

The true linear mapping is just random:

>>> c = np.random.randn(M,4)

Then, generate the latent states and the observations using the model
equations:

>>> x = np.empty((N,4))
>>> f = np.empty((M,N))
>>> y = np.empty((M,N))
>>> x[0] = 10*np.random.randn(4)
>>> f[:,0] = np.dot(c,x[0])
>>> y[:,0] = f[:,0] + 3*np.random.randn(M)
>>> for n in range(N-1):
...     x[n+1] = np.dot(a,x[n]) + [1,1,10,10]*np.random.randn(4)
...     f[:,n+1] = np.dot(c,x[n+1])
...     y[:,n+1] = f[:,n+1] + 3*np.random.randn(M)

We want to simulate missing values, thus we create a mask which randomly
removes 80% of the data:

>>> from bayespy.utils import random
>>> mask = random.mask(M, N, p=0.2)
>>> Y.observe(y, mask=mask)

Inference
---------

As we did not define plotters for our nodes when creating the model, it is done
now for some of the nodes:

>>> import bayespy.plot as bpplt
>>> X.set_plotter(bpplt.FunctionPlotter(center=True, axis=-2))
>>> A.set_plotter(bpplt.HintonPlotter())
>>> C.set_plotter(bpplt.HintonPlotter())
>>> tau.set_plotter(bpplt.PDFPlotter(np.linspace(0.02, 0.5, num=1000)))

This enables plotting of the approximate posterior distributions during VB
learning.  The inference engine can be run using the :func:`VB.update` method:

>>> Q.update(repeat=10)
Iteration 1: loglike=-1.439704e+05 (... seconds)
...
Iteration 10: loglike=-1.051441e+04 (... seconds)

The iteration progresses a bit slowly, thus we'll consider parameter expansion
to speed it up.

Parameter expansion
+++++++++++++++++++

Section :ref:`sec-parameter-expansion` discusses parameter expansion for
state-space models to speed up inference.  It is based on rotating the latent
space such that the posterior in the observation space is not affected:

.. math::

   \mathbf{y}_n = \mathbf{C}\mathbf{x}_n = (\mathbf{C}\mathbf{R}^{-1})
   (\mathbf{R}\mathbf{x}_n) \,.

Thus, the transformation is
:math:`\mathbf{C}\rightarrow\mathbf{C}\mathbf{R}^{-1}` and
:math:`\mathbf{X}\rightarrow\mathbf{R}\mathbf{X}`.  In order to keep the
dynamics of the latent states unaffected by the transformation, the state
dynamics matrix :math:`\mathbf{A}` must be transformed accordingly:

.. math::

   \mathbf{R}\mathbf{x}_n = \mathbf{R}\mathbf{A}\mathbf{R}^{-1}
   \mathbf{R}\mathbf{x}_{n-1} \,,

resulting in a transformation
:math:`\mathbf{A}\rightarrow\mathbf{R}\mathbf{A}\mathbf{R}^{-1}`.  For more
details, refer to :cite:`Luttinen:2013` and :cite:`Luttinen:2010`.

In BayesPy, the transformations are available in
:mod:`bayespy.inference.vmp.transformations`:

>>> from bayespy.inference.vmp import transformations

The rotation of the loading matrix along with the ARD parameters is defined as:

>>> rotC = transformations.RotateGaussianARD(C, gamma)

For rotating ``X``, we first need to define the rotation of the state dynamics
matrix:

>>> rotA = transformations.RotateGaussianARD(A, alpha)

Now we can define the rotation of the latent states:

>>> rotX = transformations.RotateGaussianMarkovChain(X, rotA)

The optimal rotation for all these variables is found using a rotation
optimizer:

>>> R = transformations.RotationOptimizer(rotX, rotC, D)

Set the parameter expansion to be applied after each iteration:

>>> Q.callback = R.rotate

Now, run iterations until convergence:

>>> Q.update(repeat=1000)
Iteration 11: loglike=-1.010...e+04 (... seconds)
...
Iteration 58: loglike=-8.906...e+03 (... seconds) Converged at iteration ... .. Iteration 60: loglike=-8.906259e+03 (... seconds) Converged at iteration 60. Results ------- Because we have set the plotters, we can plot those nodes as: >>> Q.plot(X, A, C, tau) .. plot:: import numpy as np np.random.seed(1) from bayespy.nodes import GaussianARD, GaussianMarkovChain, Gamma, Dot M = 30 N = 400 D = 10 alpha = Gamma(1e-5, 1e-5, plates=(D,), name='alpha') A = GaussianARD(0, alpha, shape=(D,), plates=(D,), name='A') X = GaussianMarkovChain(np.zeros(D), 1e-3*np.identity(D), A, np.ones(D), n=N, name='X') gamma = Gamma(1e-5, 1e-5, plates=(D,), name='gamma') C = GaussianARD(0, gamma, shape=(D,), plates=(M,1), name='C') F = Dot(C, X, name='F') C.initialize_from_random() tau = Gamma(1e-5, 1e-5, name='tau') Y = GaussianARD(F, tau, name='Y') from bayespy.inference import VB Q = VB(X, C, gamma, A, alpha, tau, Y) w = 0.3 a = np.array([[np.cos(w), -np.sin(w), 0, 0], [np.sin(w), np.cos(w), 0, 0], [0, 0, 1, 0], [0, 0, 0, 0]]) c = np.random.randn(M,4) x = np.empty((N,4)) f = np.empty((M,N)) y = np.empty((M,N)) x[0] = 10*np.random.randn(4) f[:,0] = np.dot(c,x[0]) y[:,0] = f[:,0] + 3*np.random.randn(M) for n in range(N-1): x[n+1] = np.dot(a,x[n]) + [1,1,10,10]*np.random.randn(4) f[:,n+1] = np.dot(c,x[n+1]) y[:,n+1] = f[:,n+1] + 3*np.random.randn(M) from bayespy.utils import random mask = random.mask(M, N, p=0.2) Y.observe(y, mask=mask) import bayespy.plot as bpplt X.set_plotter(bpplt.FunctionPlotter(center=True, axis=-2)) A.set_plotter(bpplt.HintonPlotter()) C.set_plotter(bpplt.HintonPlotter()) tau.set_plotter(bpplt.PDFPlotter(np.linspace(0.02, 0.5, num=1000))) Q.update(repeat=10) from bayespy.inference.vmp import transformations rotC = transformations.RotateGaussianARD(C, gamma) rotA = transformations.RotateGaussianARD(A, alpha) rotX = transformations.RotateGaussianMarkovChain(X, rotA) R = transformations.RotationOptimizer(rotX, rotC, D) Q.callback = R.rotate Q.update(repeat=1000) 
Q.plot(X, A, C, tau) bpplt.pyplot.show() There are clearly four effective components in ``X``: random walk (component number 1), random oscillation (7 and 10), and white noise (9). These dynamics are also visible in the state dynamics matrix Hinton diagram. Note that the white noise component does not have any dynamics. Also ``C`` shows only four effective components. The posterior of ``tau`` captures the true value :math:`3^{-2}\approx0.111` accurately. We can also plot predictions in the observation space: >>> bpplt.plot(F, center=True) .. plot:: import numpy as np np.random.seed(1) from bayespy.nodes import GaussianARD, GaussianMarkovChain, Gamma, Dot M = 30 N = 400 D = 10 alpha = Gamma(1e-5, 1e-5, plates=(D,), name='alpha') A = GaussianARD(0, alpha, shape=(D,), plates=(D,), name='A') X = GaussianMarkovChain(np.zeros(D), 1e-3*np.identity(D), A, np.ones(D), n=N, name='X') gamma = Gamma(1e-5, 1e-5, plates=(D,), name='gamma') C = GaussianARD(0, gamma, shape=(D,), plates=(M,1), name='C') F = Dot(C, X, name='F') C.initialize_from_random() tau = Gamma(1e-5, 1e-5, name='tau') Y = GaussianARD(F, tau, name='Y') from bayespy.inference import VB Q = VB(X, C, gamma, A, alpha, tau, Y) w = 0.3 a = np.array([[np.cos(w), -np.sin(w), 0, 0], [np.sin(w), np.cos(w), 0, 0], [0, 0, 1, 0], [0, 0, 0, 0]]) c = np.random.randn(M,4) x = np.empty((N,4)) f = np.empty((M,N)) y = np.empty((M,N)) x[0] = 10*np.random.randn(4) f[:,0] = np.dot(c,x[0]) y[:,0] = f[:,0] + 3*np.random.randn(M) for n in range(N-1): x[n+1] = np.dot(a,x[n]) + [1,1,10,10]*np.random.randn(4) f[:,n+1] = np.dot(c,x[n+1]) y[:,n+1] = f[:,n+1] + 3*np.random.randn(M) from bayespy.utils import random mask = random.mask(M, N, p=0.2) Y.observe(y, mask=mask) import bayespy.plot as bpplt Q.update(repeat=10) from bayespy.inference.vmp import transformations rotC = transformations.RotateGaussianARD(C, gamma) rotA = transformations.RotateGaussianARD(A, alpha) rotX = transformations.RotateGaussianMarkovChain(X, rotA) R = 
transformations.RotationOptimizer(rotX, rotC, D) Q.callback = R.rotate Q.update(repeat=1000) bpplt.plot(F, center=True) bpplt.pyplot.show()

We can also measure the performance numerically by computing the
root-mean-square error (RMSE) of the missing values:

>>> from bayespy.utils import misc
>>> misc.rmse(y[~mask], F.get_moments()[0][~mask])
5.18...

This is relatively close to the standard deviation of the noise (3), so the
predictions are quite good considering that only 20% of the data was used.

.. Copyright (C) 2014 Jaakko Luttinen

   This file is licensed under the MIT License. See LICENSE for a text of the
   license.

.. testsetup::

   import numpy
   numpy.random.seed(1)

Principal component analysis
============================

This example uses a simple principal component analysis to find a
two-dimensional latent subspace in a higher-dimensional dataset.
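Before running VB, it can be instructive to check the effective dimensionality of such data with classical PCA: the singular values of the (centered) data matrix drop sharply after the true number of components. A sketch on the same kind of synthetic data (regenerated here with its own seed; variable names mirror the example but are assumptions of this sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, D_true = 20, 100, 2

# synthetic data with a 2-dimensional latent subspace plus small noise
x = rng.standard_normal((N, D_true))
w = rng.standard_normal((M, D_true))
y = w @ x.T + 0.1 * rng.standard_normal((M, N))

# classical PCA: singular values reveal the effective dimensionality
yc = y - y.mean(axis=1, keepdims=True)
s = np.linalg.svd(yc, compute_uv=False)
```

The first two singular values dominate; the remaining ones come only from the observation noise, which is what the ARD prior in the model below discovers automatically.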
Data ---- Let us create a Gaussian dataset with latent space dimensionality two and some observation noise: >>> M = 20 >>> N = 100 >>> import numpy as np >>> x = np.random.randn(N, 2) >>> w = np.random.randn(M, 2) >>> f = np.einsum('ik,jk->ij', w, x) >>> y = f + 0.1*np.random.randn(M, N) Model ----- We will use 10-dimensional latent space in our model and let it learn the true dimensionality: >>> D = 10 Import relevant nodes: >>> from bayespy.nodes import GaussianARD, Gamma, SumMultiply The latent states: >>> X = GaussianARD(0, 1, plates=(1,N), shape=(D,)) The loading matrix with automatic relevance determination (ARD) prior: >>> alpha = Gamma(1e-5, 1e-5, plates=(D,)) >>> C = GaussianARD(0, alpha, plates=(M,1), shape=(D,)) Compute the dot product of the latent states and the loading matrix: >>> F = SumMultiply('d,d->', X, C) The observation noise: >>> tau = Gamma(1e-5, 1e-5) The observed variable: >>> Y = GaussianARD(F, tau) Inference --------- Observe the data: >>> Y.observe(y) We do not have missing data now, but they could be easily handled with ``mask`` keyword argument. Construct variational Bayesian (VB) inference engine: >>> from bayespy.inference import VB >>> Q = VB(Y, X, C, alpha, tau) Initialize the latent subspace randomly, otherwise both ``X`` and ``C`` would converge to zero: >>> C.initialize_from_random() .. currentmodule:: bayespy.inference Now we could use :func:`VB.update` to run the inference. However, let us first create a parameter expansion to speed up the inference. The expansion is based on rotating the latent subspace optimally. This is optional but will usually improve the speed of the inference significantly, especially in high-dimensional problems: >>> from bayespy.inference.vmp.transformations import RotateGaussianARD >>> rot_X = RotateGaussianARD(X) >>> rot_C = RotateGaussianARD(C, alpha) By giving ``alpha`` for ``rot_C``, the rotation will also optimize ``alpha`` jointly with ``C``. 
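The rotation trick leaves the product :math:`\mathbf{C}\mathbf{X}` (and hence the likelihood) unchanged, since :math:`\mathbf{C}\mathbf{X} = (\mathbf{C}\mathbf{R}^{-1})(\mathbf{R}\mathbf{X})`; only the factorized posterior approximation changes. The identity is easy to verify numerically (a sketch with random matrices standing in for the posterior means):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, D = 20, 100, 10

C = rng.standard_normal((M, D))
X = rng.standard_normal((D, N))
R = rng.standard_normal((D, D))   # any invertible transformation

F1 = C @ X
F2 = (C @ np.linalg.inv(R)) @ (R @ X)   # rotated factors, same product
```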
Now that we have defined the rotations for our variables, we need to construct an optimizer: >>> from bayespy.inference.vmp.transformations import RotationOptimizer >>> R = RotationOptimizer(rot_X, rot_C, D) In order to use the rotations automatically, we need to set it as a callback function: >>> Q.set_callback(R.rotate) For more information about the rotation parameter expansion, see :cite:`Luttinen:2010` and :cite:`Luttinen:2013`. Now we can run the actual inference until convergence: >>> Q.update(repeat=1000) Iteration 1: loglike=-2.33...e+03 (... seconds) ... Iteration ...: loglike=6.500...e+02 (... seconds) Converged at iteration ... Results ------- The results can be visualized, for instance, by plotting the Hinton diagram of the loading matrix: >>> import bayespy.plot as bpplt >>> bpplt.pyplot.figure() >>> bpplt.hinton(C) .. plot:: import numpy numpy.random.seed(1) M = 20 N = 100 import numpy as np x = np.random.randn(N, 2) w = np.random.randn(M, 2) f = np.einsum('ik,jk->ij', w, x) y = f + 0.1*np.random.randn(M, N) D = 10 from bayespy.nodes import GaussianARD, Gamma, SumMultiply X = GaussianARD(0, 1, plates=(1,N), shape=(D,)) alpha = Gamma(1e-5, 1e-5, plates=(D,)) C = GaussianARD(0, alpha, plates=(M,1), shape=(D,)) F = SumMultiply('d,d->', X, C) tau = Gamma(1e-5, 1e-5) Y = GaussianARD(F, tau) Y.observe(y) from bayespy.inference import VB Q = VB(Y, X, C, alpha, tau) C.initialize_from_random() from bayespy.inference.vmp.transformations import RotateGaussianARD rot_X = RotateGaussianARD(X) rot_C = RotateGaussianARD(C, alpha) from bayespy.inference.vmp.transformations import RotationOptimizer R = RotationOptimizer(rot_X, rot_C, D) Q.set_callback(R.rotate) Q.update(repeat=1000) import bayespy.plot as bpplt bpplt.hinton(C) The method has been able to prune out unnecessary latent dimensions and keep two components, which is the true number of components. >>> bpplt.pyplot.figure() >>> bpplt.plot(F) >>> bpplt.plot(f, color='r', marker='x', linestyle='None') .. 
plot::

    import numpy
    numpy.random.seed(1)
    M = 20
    N = 100
    import numpy as np
    x = np.random.randn(N, 2)
    w = np.random.randn(M, 2)
    f = np.einsum('ik,jk->ij', w, x)
    y = f + 0.1*np.random.randn(M, N)
    D = 10
    from bayespy.nodes import GaussianARD, Gamma, SumMultiply
    X = GaussianARD(0, 1, plates=(1,N), shape=(D,))
    alpha = Gamma(1e-5, 1e-5, plates=(D,))
    C = GaussianARD(0, alpha, plates=(M,1), shape=(D,))
    F = SumMultiply('d,d->', X, C)
    tau = Gamma(1e-5, 1e-5)
    Y = GaussianARD(F, tau)
    Y.observe(y)
    from bayespy.inference import VB
    Q = VB(Y, X, C, alpha, tau)
    C.initialize_from_random()
    from bayespy.inference.vmp.transformations import RotateGaussianARD
    rot_X = RotateGaussianARD(X)
    rot_C = RotateGaussianARD(C, alpha)
    from bayespy.inference.vmp.transformations import RotationOptimizer
    R = RotationOptimizer(rot_X, rot_C, D)
    Q.set_callback(R.rotate)
    Q.update(repeat=1000)
    import bayespy.plot as bpplt
    bpplt.plot(F)
    bpplt.plot(f, color='r', marker='x', linestyle='None')

.. currentmodule:: bayespy.nodes

The reconstruction of the noiseless function values is practically perfect in
this simple example.  A larger noise variance, more latent space dimensions
and missing values would make this problem more difficult.  The model
construction could also be improved by having, for instance, ``C`` and ``tau``
in the same node without factorizing between them in the posterior
approximation.  This can be achieved by using the :class:`GaussianGammaISO`
node.

.. Copyright (C) 2011,2012 Jaakko Luttinen

   This file is licensed under the MIT License.
   See LICENSE for a text of the license.

BayesPy -- Bayesian Python
==========================

..
toctree::
    :maxdepth: 2

    intro
    user_guide/user_guide
    examples/examples
    dev_guide/dev_guide
    user_api/user_api
    dev_api/dev_api
    references

* :doc:`Bibliography `
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

.. Copyright (C) 2011,2012 Jaakko Luttinen

   This file is licensed under the MIT License.
   See LICENSE for a text of the license.

Introduction
============

.. include:: ../../README.rst
    :start-line: 2

Version history
---------------

.. include:: ../../CHANGELOG.rst

.. Copyright (C) 2011,2012 Jaakko Luttinen

   This file is licensed under the MIT License.
   See LICENSE for a text of the license.

Nodes
=====

Stochastic nodes
----------------

Normal
++++++

Gaussian
++++++++

Gamma
+++++

Wishart
+++++++

Bernoulli
+++++++++

Categorical
+++++++++++

Binomial
++++++++

Multinomial
+++++++++++

Beta
++++

Dirichlet
+++++++++

Mixture
+++++++

Constant
++++++++

Deterministic nodes
-------------------

Dot product (of Gaussian variables)
+++++++++++++++++++++++++++++++++++

.. Copyright (C) 2014 Jaakko Luttinen

   This file is licensed under the MIT License.
   See LICENSE for a text of the license.

.. rubric:: Bibliography

..
bibliography:: references.bib
    :all:
    :style: plain

.. Copyright (C) 2011-2013 Jaakko Luttinen

   This file is licensed under the MIT License.
   See LICENSE for a text of the license.

.. _sec-user-api:

User API
========

.. autosummary::
    :toctree: generated/
    :template: autosummary/short_module.rst

    bayespy.nodes
    bayespy.inference
    bayespy.plot

.. Copyright (C) 2014-2015 Jaakko Luttinen

   This file is licensed under the MIT License.
   See LICENSE for a text of the license.

..
testsetup::

    import numpy as np
    np.random.seed(1)
    # This is the PCA model from the previous sections
    from bayespy.nodes import GaussianARD, Gamma, Dot
    D = 3
    X = GaussianARD(0, 1, shape=(D,), plates=(1,100), name='X')
    alpha = Gamma(1e-3, 1e-3, plates=(D,), name='alpha')
    C = GaussianARD(0, alpha, shape=(D,), plates=(10,1), name='C')
    F = Dot(C, X)
    tau = Gamma(1e-3, 1e-3, name='tau')
    Y = GaussianARD(F, tau, name='Y')
    c = np.random.randn(10, 2)
    x = np.random.randn(2, 100)
    data = np.dot(c, x) + 0.1*np.random.randn(10, 100)
    Y.observe(data)
    from bayespy.inference import VB
    import bayespy.plot as bpplt
    Q = VB(Y, C, X, alpha, tau)
    X.initialize_from_parameters(np.random.randn(1, 100, D), 10)
    from bayespy.inference.vmp import transformations
    rotX = transformations.RotateGaussianARD(X)
    rotC = transformations.RotateGaussianARD(C, alpha)
    R = transformations.RotationOptimizer(rotC, rotX, D)
    Q = VB(Y, C, X, alpha, tau)
    Q.callback = R.rotate
    Q.update(repeat=1000, tol=1e-6, verbose=False)
    import warnings
    warnings.simplefilter('error', UserWarning)

Advanced topics
===============

This section contains brief information on how to implement some advanced
methods in BayesPy.  These methods include Riemannian conjugate gradient
methods, pattern search, deterministic annealing, collapsed variational
inference and stochastic variational inference.  In order to use these methods
properly, the user should understand them to some extent.  They are also
considered experimental, thus you may encounter bugs or unimplemented
features.  In any case, these methods can easily provide huge performance
improvements compared to the standard VB-EM algorithm.

Gradient-based optimization
---------------------------

Variational Bayesian learning basically means that the parameters of the
approximate posterior distributions are optimized to maximize the lower bound
of the marginal log likelihood :cite:`Honkela:2010`.  This optimization can be
done by using gradient-based optimization methods.
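As a toy, plain-NumPy illustration of this idea (not BayesPy code; the objective, step size and number of iterations below are arbitrary illustrative choices), consider gradient ascent on a concave quadratic objective :math:`f(\theta) = -\frac{1}{2}\theta^T A \theta + b^T \theta`:

```python
import numpy as np

# Toy sketch: gradient ascent on f(theta) = -0.5 theta^T A theta + b^T theta,
# whose gradient is b - A @ theta.  A and b are arbitrary illustrative values.
A = np.array([[2.0, 0.3],
              [0.3, 1.0]])
b = np.array([1.0, -0.5])

theta = np.zeros(2)
step = 0.1
for _ in range(200):
    theta = theta + step * (b - A @ theta)  # ascend along the gradient

# theta approaches the exact maximizer A^{-1} b.
```

The fixed step size works here only because the toy objective is well conditioned; the point of the Riemannian-gradient view discussed next is to replace this naive Euclidean geometry with one adapted to the distributions being optimized.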
In order to improve the gradient-based methods, it is recommended to take the
information geometry into account by using the Riemannian (a.k.a. natural)
gradient.  In fact, the standard VB-EM algorithm is equivalent to a gradient
ascent method which uses the Riemannian gradient and step length 1.  Thus, it
is natural to try to improve this method by using non-linear conjugate
gradient methods instead of gradient ascent.  These optimization methods are
especially useful when the VB-EM update equations are not available and one
has to use a fixed-form approximation.  But it is possible that the Riemannian
conjugate gradient method improves performance even when the VB-EM update
equations are available.

.. currentmodule:: bayespy.inference

The optimization algorithm in :func:`VB.optimize` has a simple interface.
Instead of using the default Riemannian geometry, one can use the Euclidean
geometry by giving :code:`riemannian=False`.  It is also possible to choose
the optimization method from gradient ascent (:code:`method='gradient'`) or
conjugate gradient methods (only :code:`method='fletcher-reeves'` is
implemented at the moment).  For instance, we could optimize nodes ``C`` and
``X`` jointly using Euclidean gradient ascent as:

>>> Q = VB(Y, C, X, alpha, tau)
>>> Q.optimize(C, X, riemannian=False, method='gradient', maxiter=5)
Iteration ...

Note that this is a very inefficient way of updating those nodes (bad geometry
and no conjugate gradients).  Thus, one should understand the idea of these
optimization methods, otherwise one may do something extremely inefficient.
Most likely this method is useful in combination with the advanced tricks in
the following sections.

.. note::

    The Euclidean gradient has not been implemented for all nodes yet.  The
    Euclidean gradient is required by the Euclidean geometry based
    optimization but also by the conjugate gradient methods in the Riemannian
    geometry.  Thus, the Riemannian conjugate gradient may not yet work for
    all models.
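For intuition about the conjugate-gradient option, the textbook Fletcher-Reeves direction update can be sketched as follows.  This is only the generic formula for flat arrays, shown as an assumption for illustration; BayesPy's internal implementation operates on its own nested parameter objects:

```python
import numpy as np

# Textbook Fletcher-Reeves update (illustrative only; not BayesPy internals).
def fletcher_reeves_direction(grad_new, grad_old, dir_old):
    # beta weights the previous search direction by the ratio of
    # squared gradient norms.
    beta = (grad_new @ grad_new) / (grad_old @ grad_old)
    return grad_new + beta * dir_old

d = fletcher_reeves_direction(np.array([0.0, 2.0]),
                              np.array([1.0, 0.0]),
                              np.array([1.0, 0.0]))
# d mixes the new gradient with the previous direction.
```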
It is possible to construct custom optimization algorithms with the tools
provided by :class:`VB`.  For instance, :func:`VB.get_parameters` and
:func:`VB.set_parameters` can be used to handle the parameters of nodes, and
:func:`VB.get_gradients` is used for computing the gradients of nodes.  The
parameter and gradient objects are not numerical arrays but more complex
nested lists not meant to be accessed by the user.  Thus, for simple
arithmetic with the parameter and gradient objects, use the functions
:func:`VB.add` and :func:`VB.dot`.  Finally, :func:`VB.compute_lowerbound` and
:func:`VB.has_converged` can be used to monitor the lower bound.

Collapsed inference
-------------------

The optimization method can be used efficiently in such a way that some of the
variables are collapsed, that is, marginalized out :cite:`Hensman:2012`.  The
collapsed variables must be conditionally independent given the observations
and all other variables.  Typically, one also wants the marginalized variables
to be large in number and the optimized variables to be few.  For instance, in
our PCA example, we could optimize as follows:

>>> Q.optimize(C, tau, maxiter=10, collapsed=[X, alpha])
Iteration ...

The collapsed variables are given as a list.  This optimization basically does
the following: it first computes the gradients for ``C`` and ``tau`` and takes
an update step using the desired optimization method.  Then, it updates the
collapsed variables by using the standard VB-EM update equations.  These two
steps are taken in turns.  Effectively, this corresponds to collapsing the
variables ``X`` and ``alpha`` in a particular way.  The point of this method
is that the number of parameters in the optimization is reduced significantly
and the collapsed variables are updated optimally.  For more details, see
:cite:`Hensman:2012`.  It is possible to use this method in such a way that
the collapsed variables are not conditionally independent given the
observations and all other variables.
However, in that case, the method no longer corresponds to collapsing the
variables but simply to applying VB-EM updates after gradient-based updates.
The method does not check for conditional independence, so the user is free to
do this.

.. note::

    Although the Riemannian conjugate gradient method has not yet been
    implemented for all nodes, it may be possible to collapse those nodes and
    optimize the other nodes for which the Euclidean gradient is already
    implemented.

Pattern search
--------------

The pattern search method estimates the direction in which the approximate
posterior distributions are updating and performs a line search in that
direction :cite:`Honkela:2003`.  The search direction is based on the
difference in the VB parameters over successive updates (or several updates).
The idea is that the VB-EM algorithm may be slow because it just zigzags, and
this can be fixed by moving in the direction in which the VB-EM is slowly
progressing.  BayesPy offers a simple built-in pattern search method
:func:`VB.pattern_search`.  The method updates the nodes twice, measures the
difference in the parameters and performs a line search with a small number of
function evaluations:

>>> Q.pattern_search(C, X)
Iteration ...

Similarly to the collapsed optimization, it is possible to collapse some of
the variables in the pattern search.  The same rules of conditional
independence apply as above.  The collapsed variables are given as a list:

>>> Q.pattern_search(C, tau, collapsed=[X, alpha])
Iteration ...

Also, a maximum number of iterations can be set by using the ``maxiter``
keyword argument.  It is not always obvious whether a pattern search will
improve the rate of convergence, but if the convergence seems slow because of
zigzagging, it may be worth a try.  Note that the computational cost of the
pattern search is quite high, thus it is not recommended to perform it after
every VB-EM update but only every now and then, for instance, after every 10
iterations.
In addition, it is possible to write a more customized VB learning algorithm which uses pattern searches by using the different methods of :class:`VB` discussed above. Deterministic annealing ----------------------- The standard VB-EM algorithm converges to a local optimum which can often be inferior to the global optimum and many other local optima. Deterministic annealing aims at finding a better local optimum, hopefully even the global optimum :cite:`Katahira:2008`. It does this by increasing the weight on the entropy of the posterior approximation in the VB lower bound. Effectively, the annealed lower bound becomes closer to a uniform function instead of the original multimodal lower bound. The weight on the entropy is recovered slowly and the optimization is much more robust to initialization. In BayesPy, the annealing can be set by using :func:`VB.set_annealing`. The given annealing should be in range :math:`(0,1]` but this is not validated in case the user wants to do something experimental. If annealing is set to 1, the original VB lower bound is recovered. Annealing with 0 would lead to an improper uniform distribution, thus it will lead to errors. The entropy term is weighted by the inverse of this annealing term. An alternative view is that the model probability density functions are raised to the power of the annealing term. Typically, the annealing is used in such a way that the annealing is small at the beginning and increased after every convergence of the VB algorithm until value 1 is reached. After the annealing value is increased, the algorithm continues from where it had just converged. The annealing can be used for instance as: >>> beta = 0.1 >>> while beta < 1.0: ... beta = min(beta*1.5, 1.0) ... Q.set_annealing(beta) ... Q.update(repeat=100, tol=1e-4) Iteration ... Here, the ``tol`` keyword argument is used to adjust the threshold for convergence. 
In this case, it is a bit larger than the default so the algorithm does not
need to converge perfectly; a rougher convergence is sufficient for the next
iteration with a new annealing value.

Stochastic variational inference
--------------------------------

In stochastic variational inference :cite:`Hoffman:2013`, the idea is to use
mini-batches of large datasets to compute noisy gradients and to learn the VB
distributions by using stochastic gradient ascent.  In order for it to be
useful, the model must be such that it can be divided into "intermediate" and
"global" variables.  The number of intermediate variables increases with the
data but the number of global variables remains fixed.  The global variables
are learnt in the stochastic optimization.  By denoting the data as
:math:`Y=[Y_1, \ldots, Y_N]`, the intermediate variables as
:math:`Z=[Z_1, \ldots, Z_N]` and the global variables as :math:`\theta`, the
model needs to have the following structure:

.. math::

    p(Y, Z, \theta) &= p(\theta) \prod^N_{n=1} p(Y_n|Z_n,\theta) p(Z_n|\theta)

The algorithm consists of three steps which are iterated: 1) a random
mini-batch of the data is selected, 2) the corresponding intermediate
variables are updated by using the normal VB update equations, and 3) the
global variables are updated with (stochastic) gradient ascent as if there
were as many replications of the mini-batch as needed to recover the original
dataset size.  The learning rate for the gradient ascent must satisfy:

.. math::

    \sum^\infty_{i=1} \alpha_i = \infty \qquad \text{and} \qquad
    \sum^\infty_{i=1} \alpha_i^2 < \infty,

where :math:`i` is the iteration number.  An example of a valid learning
parameter is :math:`\alpha_i = (\delta + i)^{-\gamma}`, where
:math:`\delta \geq 0` is a delay and :math:`\gamma\in (0.5, 1]` is a
forgetting rate.  Stochastic variational inference is relatively easy to use
in BayesPy.
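To make these conditions concrete, the schedule :math:`\alpha_i = (\delta + i)^{-\gamma}` can be sketched in a few lines of Python; the values ``delta=1.0`` and ``gamma=0.7`` below are illustrative choices, not BayesPy defaults:

```python
# Sketch of a valid stochastic-gradient learning-rate schedule.
# delta (delay) and gamma (forgetting rate) are illustrative values.
def learning_rate(i, delta=1.0, gamma=0.7):
    """Step size for iteration i >= 1; gamma in (0.5, 1] ensures that
    the sum of the rates diverges while the sum of their squares converges."""
    return (delta + i) ** (-gamma)

rates = [learning_rate(i) for i in range(1, 6)]
# The schedule decays monotonically towards zero.
```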
The idea is that the user creates a model for the size of a mini-batch and
specifies a multiplier for those plate axes that are replicated.  For the PCA
example, the mini-batch model can be constructed as follows.  We decide to use
``X`` as an intermediate variable, and the other variables are global.  The
global variables ``alpha``, ``C`` and ``tau`` are constructed identically as
before.  The intermediate variable ``X`` is constructed as:

>>> X = GaussianARD(0, 1,
...                 shape=(D,),
...                 plates=(1,5),
...                 plates_multiplier=(1,20),
...                 name='X')

Note that the plates are ``(1,5)`` whereas they are ``(1,100)`` in the full
model.  Thus, we need to provide a plates multiplier ``(1,20)`` to define how
the plates are replicated to get the full dataset.  These multipliers do not
need to be integers; in this case, the latter plate axis is multiplied by
:math:`100/5=20`.  The remaining variables are defined as before:

>>> F = Dot(C, X)
>>> Y = GaussianARD(F, tau, name='Y')

Note that the plates of ``Y`` and ``F`` also correspond to the size of the
mini-batch, and they deduce the plate multipliers from their parents, thus we
do not need to specify the multiplier here explicitly (although it is ok to do
so).  Let us construct the inference engine for the new mini-batch model:

>>> Q = VB(Y, C, X, alpha, tau)

Use random initialization for ``C`` to break the symmetry in ``C`` and ``X``:

>>> C.initialize_from_random()

Then, the stochastic variational inference algorithm could look as follows:

>>> Q.ignore_bound_checks = True
>>> for n in range(200):
...     subset = np.random.choice(100, 5)
...     Y.observe(data[:,subset])
...     Q.update(X)
...     learning_rate = (n + 2.0) ** (-0.7)
...     Q.gradient_step(C, alpha, tau, scale=learning_rate)
Iteration ...

First, we ignore the bound checks because they are noisy.  Then, the loop
consists of three parts: 1) Draw a random mini-batch of the data (5 samples
from 100).  2) Update the intermediate variable ``X``.
3) Update the global variables with gradient ascent using a proper learning
rate.

Black-box variational inference
-------------------------------

NOT YET IMPLEMENTED.

.. Copyright (C) 2014 Jaakko Luttinen

   This file is licensed under the MIT License.
   See LICENSE for a text of the license.

.. testsetup::

    # This is the PCA model from the previous section
    import numpy as np
    np.random.seed(1)
    from bayespy.nodes import GaussianARD, Gamma, Dot
    D = 3
    X = GaussianARD(0, 1, shape=(D,), plates=(1,100), name='X')
    alpha = Gamma(1e-3, 1e-3, plates=(D,), name='alpha')
    C = GaussianARD(0, alpha, shape=(D,), plates=(10,1), name='C')
    F = Dot(C, X)
    tau = Gamma(1e-3, 1e-3, name='tau')
    Y = GaussianARD(F, tau)

Performing inference
====================

Approximation of the posterior distribution can be divided into several steps:

- Observe some nodes
- Choose the inference engine
- Initialize the posterior approximation
- Run the inference algorithm

In order to illustrate these steps, we'll be using the PCA model constructed
in the previous section.

Observing nodes
---------------

First, let us generate some toy data:

>>> c = np.random.randn(10, 2)
>>> x = np.random.randn(2, 100)
>>> data = np.dot(c, x) + 0.1*np.random.randn(10, 100)

The data is provided by simply calling the ``observe`` method of a stochastic
node:

>>> Y.observe(data)

It is important that the shape of the ``data`` array matches the plates and
shape of the node ``Y``.  For instance, if ``Y`` was a :class:`Wishart` node
for :math:`3\times 3` matrices with plates ``(5,1,10)``, the full shape of
``Y`` would be ``(5,1,10,3,3)``.  The ``data`` array should have this shape
exactly, that is, no broadcasting rules are applied.
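This shape bookkeeping can be checked with plain NumPy; the plates and variable shape below are those of the hypothetical Wishart example above, and this is only an illustration of the rule, not BayesPy API:

```python
import numpy as np

# The full shape of a node's data is its plates followed by the variable
# shape, e.g. plates (5,1,10) of 3x3 Wishart matrices.
plates = (5, 1, 10)
var_shape = (3, 3)
full_shape = plates + var_shape
data = np.zeros(full_shape)
# The data array must have exactly this shape; no broadcasting applies.
```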
Missing values
++++++++++++++

It is possible to mark missing values by providing a mask which is a boolean
array:

>>> Y.observe(data, mask=[[True], [False], [False], [True], [True],
...                       [False], [True], [True], [True], [False]])

``True`` means that the value is observed and ``False`` means that the value
is missing.  The shape of the above mask is ``(10,1)``, which broadcasts to
the plates of ``Y``, ``(10,100)``.  Thus, the above mask means that the
second, third, sixth and tenth rows of the :math:`10\times 100` data matrix
are missing.

The mask is applied to the *plates*, not to the data array directly.  This
means that it is not possible to observe a random variable partially: each
repetition defined by the plates is either fully observed or fully missing.
It is often possible to circumvent this seemingly tight restriction by adding
an observable child node which factorizes more.

The shape of the mask is broadcasted to the plates using standard NumPy
broadcasting rules.  So, if the variable has plates ``(5,1,10)``, the mask
could have a shape ``()``, ``(1,)``, ``(1,1)``, ``(1,1,1)``, ``(10,)``,
``(1,10)``, ``(1,1,10)``, ``(5,1,1)`` or ``(5,1,10)``.  In order to speed up
the inference, missing values are automatically integrated out if they are not
needed as latent variables of child nodes.  This leads to faster convergence
and more accurate approximations.

Choosing the inference method
-----------------------------

Inference methods can be found in the :mod:`bayespy.inference` package.
Currently, only variational Bayesian approximation is implemented
(:class:`bayespy.inference.VB`).  The inference engine is constructed by
giving the stochastic nodes of the model:

>>> from bayespy.inference import VB
>>> Q = VB(Y, C, X, alpha, tau)

There is no need to give any deterministic nodes.
Currently, the inference engine does not automatically search for stochastic parents and children, thus it is important that all stochastic nodes of the model are given. This should be made more robust in future versions. A node of the model can be obtained by using the name of the node as a key: >>> Q['X'] Note that the returned object is the same as the node object itself: >>> Q['X'] is X True Thus, one may use the object ``X`` when it is available. However, if the model and the inference engine are constructed in another function or module, the node object may not be available directly and this feature becomes useful. Initializing the posterior approximation ---------------------------------------- The inference engines give some initialization to the stochastic nodes by default. However, the inference algorithms can be sensitive to the initialization, thus it is sometimes necessary to have better control over the initialization. For VB, the following initialization methods are available: - ``initialize_from_prior``: Use the current states of the parent nodes to update the node. This is the default initialization. - ``initialize_from_parameters``: Use the given parameter values for the distribution. - ``initialize_from_value``: Use the given value for the variable. - ``initialize_from_random``: Draw a random value for the variable. The random sample is drawn from the current state of the node's distribution. Note that ``initialize_from_value`` and ``initialize_from_random`` initialize the distribution with a value of the variable instead of parameters of the distribution. Thus, the distribution is actually a delta distribution with a peak on the value after the initialization. This state of the distribution does not have proper natural parameter values nor normalization, thus the VB lower bound terms are ``np.nan`` for this initial state. These initialization methods can be used to perform even a bit more complex initializations. 
For instance, a Gaussian distribution could be initialized with a random mean
and variance 0.1.  In our PCA model, this can be obtained by:

>>> X.initialize_from_parameters(np.random.randn(1, 100, D), 10)

Note that the shape of the random mean is the concatenation of the plates
``(1, 100)`` and the variable shape ``(D,)``.  In addition, instead of
variance, :class:`GaussianARD` uses precision as the second parameter, thus we
initialized the variance to :math:`\frac{1}{10}`.  This random initialization
is important in our PCA model because the default initialization gives ``C``
and ``X`` zero means.  If the mean of one variable is zero when the other is
updated, the updated variable gets a zero mean too.  This would lead to an
update algorithm where both means remain zero and effectively no latent space
is found.  Thus, it is important to give a non-zero random initialization for
``X`` if ``C`` is updated before ``X`` the first time.  It is typical that at
least some nodes need to be initialized with some randomness.

By default, nodes are initialized with the method ``initialize_from_prior``.
The method is not very time consuming, but if for any reason you want to avoid
that default initialization computation, you can provide ``initialize=False``
when creating the stochastic node.  However, the node does not have a proper
state in that case, which leads to errors in VB learning unless the
distribution is initialized using the above methods.

Running the inference algorithm
-------------------------------

The approximation methods are based on iterative algorithms, which can be run
using the ``update`` method.  By default, it takes one iteration step updating
all nodes once:

>>> Q.update()
Iteration 1: loglike=-9.305259e+02 (... seconds)

The ``loglike`` shows the VB lower bound.  The order in which the nodes are
updated is the same as the order in which the nodes were given when creating
``Q``.
If you want to change the order or update only some of the nodes, you can give
as arguments the nodes you want to update, and they are updated in the given
order:

>>> Q.update(C, X)
Iteration 2: loglike=-8.818976e+02 (... seconds)

It is also possible to give the same node several times:

>>> Q.update(C, X, C, tau)
Iteration 3: loglike=-8.071222e+02 (... seconds)

Note that each call to ``update`` is counted as one iteration step although
not all variables are necessarily updated.  Instead of doing one iteration
step, the ``repeat`` keyword argument can be used to perform several iteration
steps:

>>> Q.update(repeat=10)
Iteration 4: loglike=-7.167588e+02 (... seconds)
Iteration 5: loglike=-6.827873e+02 (... seconds)
Iteration 6: loglike=-6.259477e+02 (... seconds)
Iteration 7: loglike=-4.725400e+02 (... seconds)
Iteration 8: loglike=-3.270816e+02 (... seconds)
Iteration 9: loglike=-2.208865e+02 (... seconds)
Iteration 10: loglike=-1.658761e+02 (... seconds)
Iteration 11: loglike=-1.469468e+02 (... seconds)
Iteration 12: loglike=-1.420311e+02 (... seconds)
Iteration 13: loglike=-1.405139e+02 (... seconds)

The VB algorithm stops automatically if it converges, that is, if the relative
change in the lower bound falls below some threshold:

>>> Q.update(repeat=1000)
Iteration 14: loglike=-1.396481e+02 (... seconds)
...
Iteration 488: loglike=-1.224106e+02 (... seconds)
Converged at iteration 488.

Now the algorithm stopped before taking 1000 iteration steps because it
converged.  The relative tolerance can be adjusted by providing the ``tol``
keyword argument to the ``update`` method:

>>> Q.update(repeat=10000, tol=1e-6)
Iteration 489: loglike=-1.224094e+02 (... seconds)
...
Iteration 847: loglike=-1.222506e+02 (... seconds)
Converged at iteration 847.

Making the tolerance smaller may improve the result, but it may also
significantly increase the number of iteration steps until convergence.
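The stopping rule can be sketched roughly as follows; the exact criterion used internally by ``VB.update`` is an assumption here, so this only illustrates the idea of a relative threshold on the lower bound:

```python
# Rough sketch of a relative-change stopping rule for the lower bound.
# The exact criterion inside VB.update may differ; this is illustrative.
def has_converged(loglike_prev, loglike_new, tol=1e-5):
    return abs(loglike_new - loglike_prev) <= tol * abs(loglike_prev)

# A tiny relative improvement counts as converged...
small_step = has_converged(-122.4106, -122.4094, tol=1e-5)
# ...whereas a large jump does not.
big_step = has_converged(-140.0, -122.0, tol=1e-5)
```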
Instead of using ``update`` method of the inference engine ``VB``, it is possible to use the ``update`` methods of the nodes directly as >>> C.update() or >>> Q['C'].update() However, this is not recommended, because the ``update`` method of the inference engine ``VB`` is a wrapper which, in addition to calling the nodes' ``update`` methods, checks for convergence and does a few other useful minor things. But if for any reason these direct update methods are needed, they can be used. .. _sec-parameter-expansion: Parameter expansion +++++++++++++++++++ Sometimes the VB algorithm converges very slowly. This may happen when the variables are strongly coupled in the true posterior but factorized in the approximate posterior. This coupling leads to zigzagging of the variational parameters which progresses slowly. One solution to this problem is to use parameter expansion. The idea is to add an auxiliary variable which parameterizes the posterior approximation of several variables. Then optimizing this auxiliary variable actually optimizes several posterior approximations jointly leading to faster convergence. The parameter expansion is model specific. Currently in BayesPy, only state-space models have built-in parameter expansions available. These state-space models contain a variable which is a dot product of two variables (plus some noise): .. math:: y = \mathbf{c}^T\mathbf{x} + \mathrm{noise} The parameter expansion can be motivated by noticing that we can add an auxiliary variable which rotates the variables :math:`\mathbf{c}` and :math:`\mathbf{x}` so that the dot product is unaffected: .. 
math:: y &= \mathbf{c}^T\mathbf{x} + \mathrm{noise} = \mathbf{c}^T \mathbf{R} \mathbf{R}^{-1}\mathbf{x} + \mathrm{noise} = (\mathbf{R}^T\mathbf{c})^T(\mathbf{R}^{-1}\mathbf{x}) + \mathrm{noise} Now, applying this rotation to the posterior approximations :math:`q(\mathbf{c})` and :math:`q(\mathbf{x})`, and optimizing the VB lower bound with respect to the rotation leads to parameterized joint optimization of :math:`\mathbf{c}` and :math:`\mathbf{x}`. The available parameter expansion methods are in module ``transformations``: >>> from bayespy.inference.vmp import transformations First, you create the rotation transformations for the two variables: >>> rotX = transformations.RotateGaussianARD(X) >>> rotC = transformations.RotateGaussianARD(C, alpha) .. currentmodule:: bayespy.inference.vmp.transformations Here, the rotation for ``C`` provides the ARD parameters ``alpha`` so they are updated simultaneously. In addition to :class:`RotateGaussianARD`, there are a few other built-in rotations defined, for instance, :class:`RotateGaussian` and :class:`RotateGaussianMarkovChain`. It is extremely important that the model satisfies the assumptions made by the rotation class and the user is mostly responsible for this. The optimizer for the rotations is constructed by giving the two rotations and the dimensionality of the rotated space: .. currentmodule:: bayespy.nodes >>> R = transformations.RotationOptimizer(rotC, rotX, D) Now, calling ``rotate`` method will find optimal rotation and update the relevant nodes (``X``, ``C`` and ``alpha``) accordingly: >>> R.rotate() Let us see how our iteration would have gone if we had used this parameter expansion. 
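As a quick sanity check, the rotation identity :math:`\mathbf{c}^T\mathbf{x} = (\mathbf{R}^T\mathbf{c})^T(\mathbf{R}^{-1}\mathbf{x})` can be verified numerically with plain NumPy, independently of BayesPy; the dimensionality and random matrix below are arbitrary illustrative choices:

```python
import numpy as np

# Verify c^T x == (R^T c)^T (R^-1 x) for an arbitrary invertible R.
rng = np.random.default_rng(0)
D = 3
c = rng.standard_normal(D)
x = rng.standard_normal(D)
R = rng.standard_normal((D, D)) + 3.0 * np.eye(D)  # invertible with high probability

y1 = c @ x
y2 = (R.T @ c) @ (np.linalg.inv(R) @ x)
# y1 and y2 agree up to floating-point error.
```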
First, let us re-initialize our nodes and the VB algorithm:

>>> alpha.initialize_from_prior()
>>> C.initialize_from_prior()
>>> X.initialize_from_parameters(np.random.randn(1, 100, D), 10)
>>> tau.initialize_from_prior()
>>> Q = VB(Y, C, X, alpha, tau)

Then, the rotation is set to run after each iteration step:

>>> Q.callback = R.rotate

Now the iteration converges to the relative tolerance :math:`10^{-6}` much
faster:

>>> Q.update(repeat=1000, tol=1e-6)
Iteration 1: loglike=-9.363...e+02 (... seconds)
...
Iteration 18: loglike=-1.221354e+02 (... seconds)
Converged at iteration 18.

The convergence took 18 iterations with rotations but 488 or 847 iterations
without the parameter expansion.  In addition, the lower bound is improved
slightly.  One can compare the number of iteration steps in this case because
the cost per iteration step with or without parameter expansion is
approximately the same.  Sometimes the parameter expansion can have the
drawback that it converges to a bad local optimum.  Usually, this can be
solved by updating the nodes near the observations a few times before starting
to update the hyperparameters and to use the parameter expansion.  In any
case, the parameter expansion is practically necessary when using state-space
models in order to converge to a proper solution in a reasonable time.

.. Copyright (C) 2011,2012 Jaakko Luttinen

   This file is licensed under the MIT License.
   See LICENSE for a text of the license.

.. include:: ../../../INSTALL.rst

..
Copyright (C) 2014 Jaakko Luttinen This file is licensed under the MIT License. See LICENSE for a text of the license. .. currentmodule:: bayespy.nodes Constructing the model ====================== In BayesPy, the model is constructed by creating nodes which form a directed network. There are two types of nodes: stochastic and deterministic. A stochastic node corresponds to a random variable (or a set of random variables) from a specific probability distribution. A deterministic node corresponds to a deterministic function of its parents. For a list of built-in nodes, see the :ref:`sec-user-api`. Creating nodes -------------- Creating a node is basically like writing the conditional prior distribution of the variable in Python. The node is constructed by giving the parent nodes, that is, the conditioning variables, as arguments. The number of parents and their meaning depend on the node. For instance, a :class:`Gaussian` node is created by giving the mean vector and the precision matrix. These parents can be constant numerical arrays if they are known: >>> from bayespy.nodes import Gaussian >>> X = Gaussian([2, 5], [[1.0, 0.3], [0.3, 1.0]]) or other nodes if they are unknown and given prior distributions: >>> from bayespy.nodes import Gaussian, Wishart >>> mu = Gaussian([0, 0], [[1e-6, 0],[0, 1e-6]]) >>> Lambda = Wishart(2, [[1, 0], [0, 1]]) >>> X = Gaussian(mu, Lambda) Nodes can also be named by providing the ``name`` keyword argument: >>> X = Gaussian(mu, Lambda, name='x') The name may be useful when referring to the node using an inference engine. For the parent nodes, there are two main restrictions: non-constant parent nodes must be conjugate, and the parent nodes must be mutually independent in the posterior approximation. Conjugacy of the parents ~~~~~~~~~~~~~~~~~~~~~~~~ In the Bayesian framework in general, one can give quite arbitrary probability distributions for variables. However, one often uses distributions that are easy to handle in practice. 
Quite often this means that the parents are given conjugate priors. This is also one of the limitations in BayesPy: only conjugate family prior distributions are accepted currently. Thus, although in principle one could give, for instance, a gamma prior for the mean parameter ``mu``, only Gaussian-family distributions are accepted because of the conjugacy. If the parent is not of a proper type, an error is raised. This conjugacy is checked automatically by BayesPy and ``NoConverterError`` is raised if a parent cannot be interpreted as being from a conjugate distribution. Independence of the parents ~~~~~~~~~~~~~~~~~~~~~~~~~~~ Another, more rarely encountered, limitation is that the parents must be mutually independent (in the posterior factorization). Thus, a node cannot have the same stochastic node as several parents without intermediate stochastic nodes. For instance, the following leads to an error: >>> from bayespy.nodes import Dot >>> Y = Dot(X, X) Traceback (most recent call last): ... ValueError: Parent nodes are not independent The error is raised because ``X`` is given as two parents for ``Y``, and obviously ``X`` is not independent of ``X`` in the posterior approximation. Even if ``X`` is not given several times directly but there are some intermediate deterministic nodes, an error is raised because the deterministic nodes depend on their parents and thus the parents of ``Y`` would not be independent. However, it is valid that a node is a parent of another node via several paths if all of the paths, or all except one, have intermediate stochastic nodes. This is valid because the intermediate stochastic nodes have independent posterior approximations. Thus, for instance, the following construction does not raise errors: >>> from bayespy.nodes import Dot >>> Z = Gaussian(X, [[1,0], [0,1]]) >>> Y = Dot(X, Z) This works because there is now an intermediate stochastic node ``Z`` on the other path from the ``X`` node to the ``Y`` node. 
Effects of the nodes on inference --------------------------------- When constructing the network with nodes, the stochastic nodes actually define three important aspects: 1. the prior probability distribution for the variables, 2. the factorization of the posterior approximation, 3. the functional form of the posterior approximation for the variables. Prior probability distribution ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ First, the most intuitive feature of the nodes is that they define the prior distribution. In the previous example, ``mu`` was a stochastic :class:`GaussianARD` node corresponding to :math:`\mu` from the normal distribution, ``tau`` was a stochastic :class:`Gamma` node corresponding to :math:`\tau` from the gamma distribution, and ``y`` was a stochastic :class:`GaussianARD` node corresponding to :math:`y` from the normal distribution with mean :math:`\mu` and precision :math:`\tau`. If we denote the set of all stochastic nodes by :math:`\Omega`, and by :math:`\pi_X` the set of parents of a node :math:`X`, the model is defined as .. math:: p(\Omega) = \prod_{X \in \Omega} p(X|\pi_X), where nodes correspond to the terms :math:`p(X|\pi_X)`\ . Posterior factorization ~~~~~~~~~~~~~~~~~~~~~~~ Second, the nodes define the structure of the posterior approximation. The variational Bayesian approximation factorizes with respect to nodes, that is, each node corresponds to an independent probability distribution in the posterior approximation. In the previous example, ``mu`` and ``tau`` were separate nodes, thus the posterior approximation factorizes with respect to them: :math:`q(\mu)q(\tau)`\ . Thus, the posterior approximation can be written as: .. math:: p(\tilde{\Omega}|\hat{\Omega}) \approx \prod_{X \in \tilde{\Omega}} q(X), where :math:`\tilde{\Omega}` is the set of latent stochastic nodes and :math:`\hat{\Omega}` is the set of observed stochastic nodes. Sometimes one may want to avoid the factorization between some variables. 
For this purpose, there are some nodes which model several variables jointly without factorization. For instance, :class:`GaussianGammaISO` is a joint node for :math:`\mu` and :math:`\tau` variables from the normal-gamma distribution and the posterior approximation does not factorize between :math:`\mu` and :math:`\tau`, that is, the posterior approximation is :math:`q(\mu,\tau)`. Functional form of the posterior ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Last, the nodes define the functional form of the posterior approximation. Usually, the posterior approximation has the same or similar functional form as the prior. For instance, :class:`Gamma` uses gamma distribution to also approximate the posterior distribution. Similarly, :class:`GaussianARD` uses Gaussian distribution for the posterior. However, the posterior approximation of :class:`GaussianARD` uses a full covariance matrix although the prior assumes a diagonal covariance matrix. Thus, there can be slight differences in the exact functional form of the posterior approximation but the rule of thumb is that the functional form of the posterior approximation is the same as or more general than the functional form of the prior. Using plate notation -------------------- Defining plates ~~~~~~~~~~~~~~~ Stochastic nodes take the optional parameter ``plates``, which can be used to define plates of the variable. A plate defines the number of repetitions of a set of variables. For instance, a set of random variables :math:`\mathbf{y}_{mn}` could be defined as .. math:: \mathbf{y}_{mn} \sim \mathcal{N}(\boldsymbol{\mu}, \mathbf{\Lambda}),\qquad m=0,\ldots,9, \quad n=0,\ldots,29. This can also be visualized as a graphical model: .. 
bayesnet:: \node[latent] (y) {$\mathbf{y}_{mn}$} ; \node[latent, above left=1.8 and 0.4 of y] (mu) {$\boldsymbol{\mu}$} ; \node[latent, above right=1.8 and 0.4 of y] (Lambda) {$\mathbf{\Lambda}$} ; \factor[above=of y] {y-f} {left:$\mathcal{N}$} {mu,Lambda} {y}; \plate {m-plate} {(y)(y-f)(y-f-caption)} {$m=0,\ldots,9$} ; \plate {n-plate} {(m-plate)(m-plate-caption)} {$n=0,\ldots,29$} ; The variable has two plates: one for the index :math:`m` and one for the index :math:`n`\ . In BayesPy, this random variable can be constructed as: >>> y = Gaussian(mu, Lambda, plates=(10,30)) .. note:: The plates are always given as a tuple of positive integers. Plates also define indexing for the nodes, thus you can use simple NumPy-style slice indexing to obtain a subset of the plates: >>> y_0 = y[0] >>> y_0.plates (30,) >>> y_even = y[:,::2] >>> y_even.plates (10, 15) >>> y_complex = y[:5, 10:20:5] >>> y_complex.plates (5, 2) Note that this indexing is for the plates only, not for the random variable dimensions. Sharing and broadcasting plates ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Instead of having a common mean and precision matrix for all :math:`\mathbf{y}_{mn}`\ , it is also possible to share plates with parents. For instance, the mean could be different for each index :math:`m` and the precision for each index :math:`n`\ : .. math:: \mathbf{y}_{mn} \sim \mathcal{N}(\boldsymbol{\mu}_m, \mathbf{\Lambda}_n),\qquad m=0,\ldots,9, \quad n=0,\ldots,29. which has the following graphical representation: .. 
bayesnet:: \node[latent] (y) {$\mathbf{y}_{mn}$} ; \node[latent, above left=1 and 2 of y] (mu) {$\boldsymbol{\mu}_m$} ; \node[latent, above right=1 and 1 of y] (Lambda) {$\mathbf{\Lambda}_n$} ; \factor[above=of y] {y-f} {above:$\mathcal{N}$} {mu,Lambda} {y}; \plate {m-plate} {(mu)(y)(y-f)(y-f-caption)} {$m=0,\ldots,9$} ; \plate {n-plate} {(Lambda)(y)(y-f)(y-f-caption)(m-plate-caption)(m-plate.north east)} {$n=0,\ldots,29$} ; This can be constructed in BayesPy, for instance, as: >>> from bayespy.nodes import Gaussian, Wishart >>> mu = Gaussian([0, 0], [[1e-6, 0],[0, 1e-6]], plates=(10,1)) >>> Lambda = Wishart(2, [[1, 0], [0, 1]], plates=(1,30)) >>> X = Gaussian(mu, Lambda) There are a few things to notice here. First, the plates are defined similarly as shapes in NumPy, that is, they use similar broadcasting rules. For instance, the plates ``(10,1)`` and ``(1,30)`` broadcast to ``(10,30)``. In fact, one could use plates ``(10,1)`` and ``(30,)`` to get the broadcasted plates ``(10,30)`` because broadcasting compares the plates from right to left starting from the last axis. Second, ``X`` is not given ``plates`` keyword argument because the default plates are the plates broadcasted from the parents and that was what we wanted so it was not necessary to provide the keyword argument. If we wanted, for instance, plates ``(20,10,30)`` for ``X``, then we would have needed to provide ``plates=(20,10,30)``. The validity of the plates between a child and its parents is checked as follows. The plates are compared plate-wise starting from the last axis and working the way forward. A plate of the child is compatible with a plate of the parent if either of the following conditions is met: 1. The two plates have equal size 2. 
The parent has size 1 (or no plate) The table below shows an example of compatible plates for a child node and its two parent nodes:

+---------+----------------------------+
| node    | plates                     |
+=========+===+===+===+===+===+===+====+
| parent1 |   | 3 | 1 | 1 | 1 | 8 | 10 |
+---------+---+---+---+---+---+---+----+
| parent2 |   |   | 1 | 1 | 5 | 1 | 10 |
+---------+---+---+---+---+---+---+----+
| child   | 5 | 3 | 1 | 7 | 5 | 8 | 10 |
+---------+---+---+---+---+---+---+----+

Plates in deterministic nodes ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Note that plates can be defined explicitly only for stochastic nodes. For deterministic nodes, the plates are defined implicitly by the plate broadcasting rules from the parents. Deterministic nodes do not need more plates than this because there is no randomness. The deterministic node would just have the same value over the extra plates, but it is not necessary to do this explicitly because the child nodes of the deterministic node can utilize broadcasting anyway. Thus, there is no point in having extra plates in deterministic nodes, and for this reason, deterministic nodes do not accept the ``plates`` keyword argument. Plates in constants ~~~~~~~~~~~~~~~~~~~ It is useful to understand how the plates and the shape of a random variable are connected. The shape of an array which contains all the plates of a random variable is the concatenation of the plates and the shape of the variable. For instance, consider a 2-dimensional Gaussian variable with plates ``(3,)``. If you want the value of the constant mean vector and constant precision matrix to vary between plates, they are given as ``(3,2)``-shape and ``(3,2,2)``-shape arrays, respectively: >>> import numpy as np >>> mu = [ [0,0], [1,1], [2,2] ] >>> Lambda = [ [[1.0, 0.0], ... [0.0, 1.0]], ... [[1.0, 0.9], ... [0.9, 1.0]], ... [[1.0, -0.3], ... 
[-0.3, 1.0]] ] >>> X = Gaussian(mu, Lambda) >>> np.shape(mu) (3, 2) >>> np.shape(Lambda) (3, 2, 2) >>> X.plates (3,) Thus, the leading axes of an array are the plate axes and the trailing axes are the random variable axes. In the example above, the mean vector has plates ``(3,)`` and shape ``(2,)``, and the precision matrix has plates ``(3,)`` and shape ``(2,2)``. Factorization of plates ~~~~~~~~~~~~~~~~~~~~~~~ It is important to understand the independence structure that the plates induce for the model. First, the repetitions defined by a plate are independent a priori given the parents. Second, the repetitions are independent in the posterior approximation, that is, the posterior approximation factorizes with respect to plates. Thus, the plates also have an effect on the independence structure of the posterior approximation, not only on the prior. If dependencies between a set of variables need to be handled, that set must be handled as some kind of multi-dimensional variable. .. _sec-irregular-plates: Irregular plates ~~~~~~~~~~~~~~~~ The handling of plates is not always as simple as described above. There are cases in which the plates of the parents do not map directly to the plates of the child node. The user API should mention such irregularities. For instance, the parents of a mixture distribution have a plate which contains the different parameters for each cluster, but the variable from the mixture distribution does not have that plate: >>> from bayespy.nodes import Gaussian, Wishart, Categorical, Mixture >>> mu = Gaussian([[0], [0], [0]], [ [[1]], [[1]], [[1]] ]) >>> Lambda = Wishart(1, [ [[1]], [[1]], [[1]]]) >>> Z = Categorical([1/3, 1/3, 1/3], plates=(100,)) >>> X = Mixture(Z, Gaussian, mu, Lambda) >>> mu.plates (3,) >>> Lambda.plates (3,) >>> Z.plates (100,) >>> X.plates (100,) The plates ``(3,)`` and ``(100,)`` should not broadcast according to the rules mentioned above. 
However, when validating the plates, :class:`Mixture` removes the plate which corresponds to the clusters in ``mu`` and ``Lambda``. Thus, ``X`` has plates which are the result of broadcasting plates ``()`` and ``(100,)`` which equals ``(100,)``. Also, sometimes the plates of the parents may be mapped to the variable axes. For instance, an automatic relevance determination (ARD) prior for a Gaussian variable is constructed by giving the diagonal elements of the precision matrix (or tensor). The Gaussian variable itself can be a scalar, a vector, a matrix or a tensor. A set of five :math:`4 \times 3` -dimensional Gaussian matrices with ARD prior is constructed as: >>> from bayespy.nodes import GaussianARD, Gamma >>> tau = Gamma(1, 1, plates=(5,4,3)) >>> X = GaussianARD(0, tau, shape=(4,3)) >>> tau.plates (5, 4, 3) >>> X.plates (5,) Note how the last two plate axes of ``tau`` are mapped to the variable axes of ``X`` with shape ``(4,3)`` and the plates of ``X`` are obtained by taking the remaining leading plate axes of ``tau``. Example model: Principal component analysis ------------------------------------------- Now, we'll construct a bit more complex model which will be used in the following sections. The model is a probabilistic version of principal component analysis (PCA): .. math:: \mathbf{Y} = \mathbf{C}\mathbf{X}^T + \mathrm{noise} where :math:`\mathbf{Y}` is :math:`M\times N` data matrix, :math:`\mathbf{C}` is :math:`M\times D` loading matrix, :math:`\mathbf{X}` is :math:`N\times D` state matrix, and noise is isotropic Gaussian. The dimensionality :math:`D` is usually assumed to be much smaller than :math:`M` and :math:`N`. A probabilistic formulation can be written as: .. 
math:: p(\mathbf{Y}) &= \prod^{M-1}_{m=0} \prod^{N-1}_{n=0} \mathcal{N}(y_{mn} | \mathbf{c}_m^T \mathbf{x}_n, \tau) \\ p(\mathbf{X}) &= \prod^{N-1}_{n=0} \prod^{D-1}_{d=0} \mathcal{N}(x_{nd} | 0, 1) \\ p(\mathbf{C}) &= \prod^{M-1}_{m=0} \prod^{D-1}_{d=0} \mathcal{N}(c_{md} | 0, \alpha_d) \\ p(\boldsymbol{\alpha}) &= \prod^{D-1}_{d=0} \mathcal{G} (\alpha_d | 10^{-3}, 10^{-3}) \\ p(\tau) &= \mathcal{G} (\tau | 10^{-3}, 10^{-3}) where we have given automatic relevance determination (ARD) prior for :math:`\mathbf{C}`. This can be visualized as a graphical model: .. bayesnet:: \node[latent] (y) {$\mathbf{y}_{mn}$} ; \node[det, above=of y] (dot) {dot} ; \node[latent, right=2 of dot] (tau) {$\tau$} ; \node[latent, above left=1 and 2 of dot] (C) {$c_{md}$} ; \node[latent, above=of C] (alpha) {$\alpha_d$} ; \node[latent, above right=1 and 1 of dot] (X) {$x_{nd}$} ; \factor[above=of y] {y-f} {left:$\mathcal{N}$} {dot,tau} {y}; \factor[above=of C] {C-f} {left:$\mathcal{N}$} {alpha} {C}; \factor[above=of X] {X-f} {above:$\mathcal{N}$} {} {X}; \factor[above=of alpha] {alpha-f} {above:$\mathcal{G}$} {} {alpha}; \factor[above=of tau] {tau-f} {above:$\mathcal{G}$} {} {tau}; \edge {C,X} {dot}; \tikzstyle{plate caption} += [below left=0pt and 0pt of #1.north east] ; \plate {d-plate} {(X)(X-f)(X-f-caption)(C)(C-f)(C-f-caption)(alpha)(alpha-f)(alpha-f-caption)} {$d=0,\ldots,2$} ; \tikzstyle{plate caption} += [below left=5pt and 0pt of #1.south east] ; \plate {m-plate} {(y)(y-f)(y-f-caption)(C)(C-f)(C-f-caption)(d-plate.south west)} {$m=0,\ldots,9$} ; \plate {n-plate} {(y)(y-f)(y-f-caption)(X)(X-f)(X-f-caption)(m-plate-caption)(m-plate.north east)(d-plate.south east)} {$n=0,\ldots,99$} ; Now, let us construct this model in BayesPy. First, we'll define the dimensionality of the latent space in our model: >>> D = 3 Then the prior for the latent states :math:`\mathbf{X}`: >>> X = GaussianARD(0, 1, ... shape=(D,), ... plates=(1,100), ... 
name='X') Note that the shape of ``X`` is ``(D,)``, although the latent dimensions are marked with a plate in the graphical model and they are conditionally independent in the prior. However, we want to (and need to) model the posterior dependency of the latent dimensions, thus we cannot factorize them, which would happen if we used ``plates=(1,100,D)`` and ``shape=()``. The first plate axis with size 1 is given just for clarity. The prior for the ARD parameters :math:`\boldsymbol{\alpha}` of the loading matrix: >>> alpha = Gamma(1e-3, 1e-3, ... plates=(D,), ... name='alpha') The prior for the loading matrix :math:`\mathbf{C}`: >>> C = GaussianARD(0, alpha, ... shape=(D,), ... plates=(10,1), ... name='C') Again, note that the shape is the same as for ``X`` for the same reason. Also, the plates of ``alpha``, ``(D,)``, are mapped to the full shape of the node ``C``, ``(10,1,D)``, using standard broadcasting rules. The dot product is just a deterministic node: >>> F = Dot(C, X) However, note that ``Dot`` requires that the input Gaussian nodes have the same shape and that this shape has exactly one axis, that is, the variables are vectors. This is the reason why we used shape ``(D,)`` for ``X`` and ``C``, though from a slightly different perspective. The node computes the inner product of :math:`D`-dimensional vectors resulting in plates ``(10,100)`` broadcasted from the plates ``(1,100)`` and ``(10,1)``: >>> F.plates (10, 100) The prior for the observation noise :math:`\tau`: >>> tau = Gamma(1e-3, 1e-3, name='tau') Finally, the observations are conditionally independent Gaussian scalars: >>> Y = GaussianARD(F, tau, name='Y') Now we have defined our model and the next step is to observe some data and to perform inference. .. 
Copyright (C) 2014 Jaakko Luttinen This file is licensed under the MIT License. See LICENSE for a text of the license. .. testsetup:: # This is the PCA model from the previous sections import numpy as np np.random.seed(1) from bayespy.nodes import GaussianARD, Gamma, Dot D = 3 X = GaussianARD(0, 1, shape=(D,), plates=(1,100), name='X') alpha = Gamma(1e-3, 1e-3, plates=(D,), name='alpha') C = GaussianARD(0, alpha, shape=(D,), plates=(10,1), name='C') F = Dot(C, X) tau = Gamma(1e-3, 1e-3, name='tau') Y = GaussianARD(F, tau, name='Y') c = np.random.randn(10, 2) x = np.random.randn(2, 100) data = np.dot(c, x) + 0.1*np.random.randn(10, 100) Y.observe(data) Y.observe(data, mask=[[True], [False], [False], [True], [True], [False], [True], [True], [True], [False]]) X.initialize_from_parameters(np.random.randn(1, 100, D), 10) from bayespy.inference import VB Q = VB(Y, C, X, alpha, tau) X.initialize_from_parameters(np.random.randn(1, 100, D), 10) from bayespy.inference.vmp import transformations rotX = transformations.RotateGaussianARD(X) rotC = transformations.RotateGaussianARD(C, alpha) R = transformations.RotationOptimizer(rotC, rotX, D) Q = VB(Y, C, X, alpha, tau) Q.callback = R.rotate Q.update(repeat=1000, tol=1e-6, verbose=False) from bayespy.nodes import Gaussian Examining the results ===================== After the results have been obtained, it is important to be able to examine the results easily. The results can be examined either numerically by inspecting numerical arrays or visually by plotting distributions of the nodes. In addition, the posterior distributions can be visualized during the learning algorithm and the results can be saved into a file. Plotting the results -------------------- .. currentmodule:: bayespy The module :mod:`plot` offers some basic plotting functionality: >>> import bayespy.plot as bpplt .. currentmodule:: bayespy.plot The module also contains the ``matplotlib.pyplot`` module if the user needs it. 
For instance, interactive plotting can be enabled as: >>> bpplt.pyplot.ion() The :mod:`plot` module contains some functions but it is not a very comprehensive collection, thus the user may need to write some problem- or model-specific plotting functions. The current collection is: * :func:`pdf`: show probability density function of a scalar * :func:`contour`: show probability density function of two-element vector * :func:`hinton`: show the Hinton diagram * :func:`plot`: show value as a function The probability density function of a scalar random variable can be plotted using the function :func:`pdf`: >>> bpplt.pyplot.figure() >>> bpplt.pdf(Q['tau'], np.linspace(60, 140, num=100)) [] .. plot:: # This is the PCA model from the previous sections import numpy as np np.random.seed(1) from bayespy.nodes import GaussianARD, Gamma, Dot D = 3 X = GaussianARD(0, 1, shape=(D,), plates=(1,100), name='X') alpha = Gamma(1e-3, 1e-3, plates=(D,), name='alpha') C = GaussianARD(0, alpha, shape=(D,), plates=(10,1), name='C') F = Dot(C, X) tau = Gamma(1e-3, 1e-3, name='tau') Y = GaussianARD(F, tau) c = np.random.randn(10, 2) x = np.random.randn(2, 100) data = np.dot(c, x) + 0.1*np.random.randn(10, 100) Y.observe(data) Y.observe(data, mask=[[True], [False], [False], [True], [True], [False], [True], [True], [True], [False]]) from bayespy.inference import VB Q = VB(Y, C, X, alpha, tau) X.initialize_from_parameters(np.random.randn(1, 100, D), 10) from bayespy.inference.vmp import transformations rotX = transformations.RotateGaussianARD(X) rotC = transformations.RotateGaussianARD(C, alpha) R = transformations.RotationOptimizer(rotC, rotX, D) Q = VB(Y, C, X, alpha, tau) Q.callback = R.rotate Q.update(repeat=1000, tol=1e-6) Q.update(repeat=50, tol=np.nan) import bayespy.plot as bpplt bpplt.pdf(Q['tau'], np.linspace(60, 140, num=100)) The variable ``tau`` models the inverse variance of the noise, for which the true value is :math:`0.1^{-2}=100`. 
Thus, the posterior captures the true value quite accurately. Similarly, the function :func:`contour` can be used to plot the probability density function of a 2-dimensional variable, for instance: >>> V = Gaussian([3, 5], [[4, 2], [2, 5]]) >>> bpplt.pyplot.figure() >>> bpplt.contour(V, np.linspace(1, 5, num=100), np.linspace(3, 7, num=100)) .. plot:: # This is the PCA model from the previous sections import numpy as np np.random.seed(1) from bayespy.nodes import GaussianARD, Gamma, Dot D = 3 X = GaussianARD(0, 1, shape=(D,), plates=(1,100), name='X') alpha = Gamma(1e-3, 1e-3, plates=(D,), name='alpha') C = GaussianARD(0, alpha, shape=(D,), plates=(10,1), name='C') F = Dot(C, X) tau = Gamma(1e-3, 1e-3, name='tau') Y = GaussianARD(F, tau) c = np.random.randn(10, 2) x = np.random.randn(2, 100) data = np.dot(c, x) + 0.1*np.random.randn(10, 100) Y.observe(data) Y.observe(data, mask=[[True], [False], [False], [True], [True], [False], [True], [True], [True], [False]]) from bayespy.inference import VB Q = VB(Y, C, X, alpha, tau) X.initialize_from_parameters(np.random.randn(1, 100, D), 10) from bayespy.inference.vmp import transformations rotX = transformations.RotateGaussianARD(X) rotC = transformations.RotateGaussianARD(C, alpha) R = transformations.RotationOptimizer(rotC, rotX, D) Q = VB(Y, C, X, alpha, tau) Q.callback = R.rotate Q.update(repeat=1000, tol=1e-6) Q.update(repeat=50, tol=np.nan) import bayespy.plot as bpplt from bayespy.nodes import Gaussian V = Gaussian([3, 5], [[4, 2], [2, 5]]) bpplt.contour(V, np.linspace(1, 5, num=100), np.linspace(3, 7, num=100)) Both :func:`pdf` and :func:`contour` require that the user provides the grid on which the probability density function is computed. They also support several keyword arguments for modifying the output, similarly as ``plot`` and ``contour`` in ``matplotlib.pyplot``. These functions can be used only for stochastic nodes. A few other plot types are also available as built-in functions. 
A Hinton diagram can be plotted as: >>> bpplt.pyplot.figure() >>> bpplt.hinton(C) .. plot:: # This is the PCA model from the previous sections import numpy as np np.random.seed(1) from bayespy.nodes import GaussianARD, Gamma, Dot D = 3 X = GaussianARD(0, 1, shape=(D,), plates=(1,100), name='X') alpha = Gamma(1e-3, 1e-3, plates=(D,), name='alpha') C = GaussianARD(0, alpha, shape=(D,), plates=(10,1), name='C') F = Dot(C, X) tau = Gamma(1e-3, 1e-3, name='tau') Y = GaussianARD(F, tau) c = np.random.randn(10, 2) x = np.random.randn(2, 100) data = np.dot(c, x) + 0.1*np.random.randn(10, 100) Y.observe(data) Y.observe(data, mask=[[True], [False], [False], [True], [True], [False], [True], [True], [True], [False]]) from bayespy.inference import VB Q = VB(Y, C, X, alpha, tau) X.initialize_from_parameters(np.random.randn(1, 100, D), 10) from bayespy.inference.vmp import transformations rotX = transformations.RotateGaussianARD(X) rotC = transformations.RotateGaussianARD(C, alpha) R = transformations.RotationOptimizer(rotC, rotX, D) Q = VB(Y, C, X, alpha, tau) Q.callback = R.rotate Q.update(repeat=1000, tol=1e-6) Q.update(repeat=50, tol=np.nan) import bayespy.plot as bpplt from bayespy.nodes import Gaussian bpplt.hinton(C) The diagram shows the elements of the matrix :math:`C`. The size of the filled rectangle corresponds to the absolute value of the element mean, and white and black correspond to positive and negative values, respectively. The non-filled rectangle shows standard deviation. From this diagram it is clear that the third column of :math:`C` has been pruned out and the rows that were missing in the data have zero mean and column-specific variance. The function :func:`hinton` is a simple wrapper for node-specific Hinton diagram plotters, such as :func:`gaussian_hinton` and :func:`dirichlet_hinton`. Thus, the keyword arguments depend on the node which is plotted. 
Another plotting function is :func:`plot`, which just plots the values of the node over one axis as a function: >>> bpplt.pyplot.figure() >>> bpplt.plot(X, axis=-2) .. plot:: # This is the PCA model from the previous sections import numpy as np np.random.seed(1) from bayespy.nodes import GaussianARD, Gamma, Dot D = 3 X = GaussianARD(0, 1, shape=(D,), plates=(1,100), name='X') alpha = Gamma(1e-3, 1e-3, plates=(D,), name='alpha') C = GaussianARD(0, alpha, shape=(D,), plates=(10,1), name='C') F = Dot(C, X) tau = Gamma(1e-3, 1e-3, name='tau') Y = GaussianARD(F, tau) c = np.random.randn(10, 2) x = np.random.randn(2, 100) data = np.dot(c, x) + 0.1*np.random.randn(10, 100) Y.observe(data) Y.observe(data, mask=[[True], [False], [False], [True], [True], [False], [True], [True], [True], [False]]) from bayespy.inference import VB Q = VB(Y, C, X, alpha, tau) X.initialize_from_parameters(np.random.randn(1, 100, D), 10) from bayespy.inference.vmp import transformations rotX = transformations.RotateGaussianARD(X) rotC = transformations.RotateGaussianARD(C, alpha) R = transformations.RotationOptimizer(rotC, rotX, D) Q = VB(Y, C, X, alpha, tau) Q.callback = R.rotate Q.update(repeat=1000, tol=1e-6) Q.update(repeat=50, tol=np.nan) import bayespy.plot as bpplt from bayespy.nodes import Gaussian bpplt.plot(X, axis=-2) Now, the ``axis`` is the second last axis which corresponds to :math:`n=0,\ldots,N-1`. As :math:`D=3`, there are three subplots. For Gaussian variables, the function shows the mean and two standard deviations. The plot shows that the third component has been pruned out, thus the method has been able to recover the true dimensionality of the latent space. It also has similar keyword arguments to ``plot`` function in ``matplotlib.pyplot``. Again, :func:`plot` is a simple wrapper over node-specific plotting functions, thus it supports only some node classes. 
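For node classes that have no built-in plotter, the moments of a node can be fed to ``matplotlib.pyplot`` directly (the ``get_moments`` method is discussed in the next sections). The following is a minimal sketch using a hypothetical ``ToyGaussianNode`` stand-in that mimics the moment interface of a Gaussian node, so the snippet runs without a fitted model:

```python
import numpy as np

# Hypothetical stand-in that mimics the get_moments() interface of a
# Gaussian node: the first moment <x> and the second moment <x x^T>.
class ToyGaussianNode:
    def get_moments(self):
        mean = np.array([1.0, 2.0, 3.0])
        second = np.outer(mean, mean) + 0.25 * np.eye(3)
        return [mean, second]

node = ToyGaussianNode()
mean, second = node.get_moments()

# Covariance is <x x^T> - <x><x>^T; its diagonal gives the variances,
# so error bars of two standard deviations could be drawn with, e.g.,
# bpplt.pyplot.errorbar(range(len(mean)), mean, yerr=2*std).
cov = second - np.outer(mean, mean)
std = np.sqrt(np.diag(cov))
print(mean, std)
```

With a real stochastic node, the same two lines of moment arithmetic apply; only the source of the moments changes.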
Monitoring during the inference algorithm ----------------------------------------- It is possible to plot the distribution of the nodes during the learning algorithm. This is useful when the user is interested to see how the distributions evolve during learning and what is happening to the distributions. In order to utilize monitoring, the user must set plotters for the nodes that he or she wishes to monitor. This can be done either when creating the node or later at any time. The plotters are set by creating a plotter object and providing this object to the node. The plotter is a wrapper of one of the plotting functions mentioned above: :class:`PDFPlotter`, :class:`ContourPlotter`, :class:`HintonPlotter` or :class:`FunctionPlotter`. Thus, our example model could use the following plotters: >>> tau.set_plotter(bpplt.PDFPlotter(np.linspace(60, 140, num=100))) >>> C.set_plotter(bpplt.HintonPlotter()) >>> X.set_plotter(bpplt.FunctionPlotter(axis=-2)) These could have been given at node creation as a keyword argument ``plotter``: >>> V = Gaussian([3, 5], [[4, 2], [2, 5]], ... plotter=bpplt.ContourPlotter(np.linspace(1, 5, num=100), ... np.linspace(3, 7, num=100))) When the plotter is set, one can use the ``plot`` method of the node to perform plotting: >>> V.plot() Nodes can also be plotted using the ``plot`` method of the inference engine: >>> Q.plot('C') This method remembers the figure in which a node has been plotted and uses that every time it plots the same node. In order to monitor the nodes during learning, it is possible to use the keyword argument ``plot``: >>> Q.update(repeat=5, plot=True, tol=np.nan) Iteration 19: loglike=-1.221354e+02 (... seconds) Iteration 20: loglike=-1.221354e+02 (... seconds) Iteration 21: loglike=-1.221354e+02 (... seconds) Iteration 22: loglike=-1.221354e+02 (... seconds) Iteration 23: loglike=-1.221354e+02 (... seconds) Each node which has a plotter set will be plotted after it is updated. 
Note that this may slow down the inference significantly if the plotting operation is time consuming. Posterior parameters and moments -------------------------------- If the built-in plotting functions are not sufficient, it is possible to use ``matplotlib.pyplot`` for custom plotting. Each node has a ``get_moments`` method which returns the moments, and these can be used for plotting. Stochastic exponential family nodes have natural parameter vectors which can also be used. In addition to plotting, it is also possible to just print the moments or parameters in the console. Saving and loading results -------------------------- .. currentmodule:: bayespy.inference The results of the inference engine can be easily saved and loaded using the :func:`VB.save` and :func:`VB.load` methods: >>> import tempfile >>> filename = tempfile.mkstemp(suffix='.hdf5')[1] >>> Q.save(filename=filename) >>> Q.load(filename=filename) The results are stored in an HDF5 file. The user may set an autosave file in which the results are automatically saved regularly. The autosave filename can be set at creation time with the ``autosave_filename`` keyword argument or later using the :func:`VB.set_autosave` method. If an autosave file has been set, the :func:`VB.save` and :func:`VB.load` methods use that file by default. In order for the saving to work, all stochastic nodes must have been given (unique) names. However, note that these methods do *not* save or load the node definitions. This means that the user must create the nodes and the inference engine and then use :func:`VB.load` to set the state of the nodes and the inference engine. If there are any differences between the model that was saved and the one into which the results are loaded, then loading does not work. Thus, the user should keep the model construction unmodified in a Python file in order to be able to load the results later. 
Similarly, if the user wishes to share the results, the model construction Python file must be shared along with the HDF5 results file. bayespy-0.6.2/doc/source/user_guide/quickstart.rst .. Copyright (C) 2014 Jaakko Luttinen This file is licensed under the MIT License. See LICENSE for a text of the license. .. testsetup:: from matplotlib import pyplot pyplot.ion() import numpy numpy.random.seed(1) Quick start guide ================= This short guide shows the key steps in using BayesPy for variational Bayesian inference by applying BayesPy to a simple problem. The key steps in using BayesPy are the following: - Construct the model - Observe some of the variables by providing the data in a proper format - Run variational Bayesian inference - Examine the resulting posterior approximation To demonstrate BayesPy, we'll consider a very simple problem: we have a set of observations from a Gaussian distribution with unknown mean and variance, and we want to learn these parameters. In this case, we do not use any real-world data but generate some artificial data. The dataset consists of ten samples from a Gaussian distribution with mean 5 and standard deviation 10. This dataset can be generated with NumPy as follows: >>> import numpy as np >>> data = np.random.normal(5, 10, size=(10,)) Constructing the model ---------------------- Now, given this data, we would like to estimate the mean and the standard deviation as if we didn't know their values. The model can be defined as follows: ..
math:: \begin{split} p(\mathbf{y}|\mu,\tau) &= \prod^{9}_{n=0} \mathcal{N}(y_n|\mu,\tau) \\ p(\mu) &= \mathcal{N}(\mu|0,10^{-6}) \\ p(\tau) &= \mathcal{G}(\tau|10^{-6},10^{-6}) \end{split} where :math:`\mathcal{N}` is the Gaussian distribution parameterized by its mean and precision (i.e., inverse variance), and :math:`\mathcal{G}` is the gamma distribution parameterized by its shape and rate parameters. Note that we have given quite uninformative priors for the variables :math:`\mu` and :math:`\tau`\ . This simple model can also be shown as a directed factor graph: .. bayesnet:: Directed factor graph of the example model. \node[obs] (y) {$y_n$} ; \node[latent, above left=1.5 and 0.5 of y] (mu) {$\mu$} ; \node[latent, above right=1.5 and 0.5 of y] (tau) {$\tau$} ; \factor[above=of mu] {mu-f} {left:$\mathcal{N}$} {} {mu} ; \factor[above=of tau] {tau-f} {left:$\mathcal{G}$} {} {tau} ; \factor[above=of y] {y-f} {left:$\mathcal{N}$} {mu,tau} {y}; \plate {} {(y)(y-f)(y-f-caption)} {$n=0,\ldots,9$} ; This model can be constructed in BayesPy as follows: >>> from bayespy.nodes import GaussianARD, Gamma >>> mu = GaussianARD(0, 1e-6) >>> tau = Gamma(1e-6, 1e-6) >>> y = GaussianARD(mu, tau, plates=(10,)) .. currentmodule:: bayespy.nodes This is quite self-explanatory given the model definitions above. We have used two types of nodes, :class:`GaussianARD` and :class:`Gamma`, to represent Gaussian and gamma distributions, respectively. There are many more distributions in :mod:`bayespy.nodes`, so you can construct quite complex conjugate exponential family models. The node :code:`y` uses the keyword argument :code:`plates` to define the plates :math:`n=0,\ldots,9`. Performing inference -------------------- Now that we have created the model, we can provide our data by setting ``y`` as observed: >>> y.observe(data) Next we want to estimate the posterior distribution.
In principle, we could use different inference engines (e.g., MCMC or EP), but currently only the variational Bayesian (VB) engine is implemented. The engine is initialized by giving it all the nodes of the model: >>> from bayespy.inference import VB >>> Q = VB(mu, tau, y) The inference algorithm can be run for as long as desired (at most 20 iterations in this case): >>> Q.update(repeat=20) Iteration 1: loglike=-6.020956e+01 (... seconds) Iteration 2: loglike=-5.820527e+01 (... seconds) Iteration 3: loglike=-5.820290e+01 (... seconds) Iteration 4: loglike=-5.820288e+01 (... seconds) Converged at iteration 4. The algorithm converged after four iterations, before reaching the requested 20 iterations. VB approximates the true posterior :math:`p(\mu,\tau|\mathbf{y})` with a distribution which factorizes with respect to the nodes: :math:`q(\mu)q(\tau)`\ . Examining posterior approximation --------------------------------- The resulting approximate posterior distributions :math:`q(\mu)` and :math:`q(\tau)` can be examined, for instance, by plotting their marginal probability density functions: >>> import bayespy.plot as bpplt >>> bpplt.pyplot.subplot(2, 1, 1) >>> bpplt.pdf(mu, np.linspace(-10, 20, num=100), color='k', name=r'\mu') [] >>> bpplt.pyplot.subplot(2, 1, 2) >>> bpplt.pdf(tau, np.linspace(1e-6, 0.08, num=100), color='k', name=r'\tau') [] >>> bpplt.pyplot.tight_layout() >>> bpplt.pyplot.show() ..
plot:: import numpy as np np.random.seed(1) data = np.random.normal(5, 10, size=(10,)) from bayespy.nodes import GaussianARD, Gamma mu = GaussianARD(0, 1e-6) tau = Gamma(1e-6, 1e-6) y = GaussianARD(mu, tau, plates=(10,)) y.observe(data) from bayespy.inference import VB Q = VB(mu, tau, y) Q.update(repeat=20) import bayespy.plot as bpplt bpplt.pyplot.subplot(2, 1, 1) bpplt.pdf(mu, np.linspace(-10, 20, num=100), color='k', name=r'\mu') bpplt.pyplot.subplot(2, 1, 2) bpplt.pdf(tau, np.linspace(1e-6, 0.08, num=100), color='k', name=r'\tau') bpplt.pyplot.tight_layout() bpplt.pyplot.show() This example was a very simple introduction to using BayesPy. The model can be much more complex and each phase contains more options to give the user more control over the inference. The following sections give more details about the phases. bayespy-0.6.2/doc/source/user_guide/quickstartbackup.py # coding: utf-8 ## Quick start guide # This short guide shows the key steps in using BayesPy for variational Bayesian inference by applying BayesPy to a simple problem. The key steps in using BayesPy are the following: # # * Construct the model # # * Observe some of the variables by providing the data in a proper format # # * Run variational Bayesian inference # # * Examine the resulting posterior approximation # # To demonstrate BayesPy, we'll consider a very simple problem: we have a set of observations from a Gaussian distribution with unknown mean and variance, and we want to learn these parameters. In this case, we do not use any real-world data but generate some artificial data. The dataset consists of ten samples from a Gaussian distribution with mean 5 and standard deviation 10.
This dataset can be generated with NumPy as follows: # In[1]: import numpy as np data = np.random.normal(5, 10, size=(10,)) # Now, given this data we would like to estimate the mean and the standard deviation as if we didn't know their values. The model can be defined as follows: # # $$ # \begin{split} # p(\mathbf{y}|\mu,\tau) &= \prod^{9}_{n=0} \mathcal{N}(y_n|\mu,\tau) \\ # p(\mu) &= \mathcal{N}(\mu|0,10^{-6}) \\ # p(\tau) &= \mathcal{G}(\tau|10^{-6},10^{-6}) # \end{split} # $$ # # where $\mathcal{N}$ is the Gaussian distribution parameterized by its mean and precision (i.e., inverse variance), and $\mathcal{G}$ is the gamma distribution parameterized by its shape and rate parameters. Note that we have given quite uninformative priors for the variables $\mu$ and $\tau$. This simple model can also be shown as a directed factor graph: # This model can be constructed in BayesPy as follows: # In[2]: from bayespy.nodes import GaussianARD, Gamma mu = GaussianARD(0, 1e-6) tau = Gamma(1e-6, 1e-6) y = GaussianARD(mu, tau, plates=(10,)) # In[3]: y.observe(data) # Next we want to estimate the posterior distribution. In principle, we could use different inference engines (e.g., MCMC or EP) but currently only variational Bayesian (VB) engine is implemented. The engine is initialized by giving all the nodes of the model: # In[4]: from bayespy.inference import VB Q = VB(mu, tau, y) # The inference algorithm can be run as long as wanted (max. 20 iterations in this case): # In[5]: Q.update(repeat=20) # Now the algorithm converged after four iterations, before the requested 20 iterations. # # VB approximates the true posterior $p(\mu,\tau|\mathbf{y})$ with a distribution which factorizes with respect to the nodes: $q(\mu)q(\tau)$. 
The resulting approximate posterior distributions $q(\mu)$ and $q(\tau)$ can be examined, for instance, by plotting the marginal probability density functions: # In[6]: import bayespy.plot as bpplt # The following two lines are just for enabling matplotlib plotting in notebooks get_ipython().magic('matplotlib inline') bpplt.pyplot.plot([]) bpplt.pyplot.subplot(2, 1, 1) bpplt.pdf(mu, np.linspace(-10, 20, num=100), color='k', name=r'\mu') bpplt.pyplot.subplot(2, 1, 2) bpplt.pdf(tau, np.linspace(1e-6, 0.08, num=100), color='k', name=r'\tau'); # This example was a very simple introduction to using BayesPy. The model can be much more complex and each phase contains more options to give the user more control over the inference. The following sections give more details about the phases. bayespy-0.6.2/doc/source/user_guide/quickstartbackup.rst .. Copyright (C) 2014 Jaakko Luttinen This file is licensed under the MIT License. See LICENSE for a text of the license. Quick start guide ================= This short guide shows the key steps in using BayesPy for variational Bayesian inference by applying BayesPy to a simple problem. The key steps in using BayesPy are the following: - Construct the model - Observe some of the variables by providing the data in a proper format - Run variational Bayesian inference - Examine the resulting posterior approximation To demonstrate BayesPy, we'll consider a very simple problem: we have a set of observations from a Gaussian distribution with unknown mean and variance, and we want to learn these parameters. In this case, we do not use any real-world data but generate some artificial data. The dataset consists of ten samples from a Gaussian distribution with mean 5 and standard deviation 10. This dataset can be generated with NumPy as follows: ..
code:: python import numpy as np data = np.random.normal(5, 10, size=(10,)) Now, given this data, we would like to estimate the mean and the standard deviation as if we didn't know their values. The model can be defined as follows: .. math:: \begin{split} p(\mathbf{y}|\mu,\tau) &= \prod^{9}_{n=0} \mathcal{N}(y_n|\mu,\tau) \\ p(\mu) &= \mathcal{N}(\mu|0,10^{-6}) \\ p(\tau) &= \mathcal{G}(\tau|10^{-6},10^{-6}) \end{split} where :math:`\mathcal{N}` is the Gaussian distribution parameterized by its mean and precision (i.e., inverse variance), and :math:`\mathcal{G}` is the gamma distribution parameterized by its shape and rate parameters. Note that we have given quite uninformative priors for the variables :math:`\mu` and :math:`\tau`\ . This simple model can also be shown as a directed factor graph: .. bayesnet:: Directed factor graph of the example model. \node[obs] (y) {$y_n$} ; \node[latent, above left=1.5 and 0.5 of y] (mu) {$\mu$} ; \node[latent, above right=1.5 and 0.5 of y] (tau) {$\tau$} ; \factor[above=of mu] {mu-f} {left:$\mathcal{N}$} {} {mu} ; \factor[above=of tau] {tau-f} {left:$\mathcal{G}$} {} {tau} ; \factor[above=of y] {y-f} {left:$\mathcal{N}$} {mu,tau} {y}; \plate {} {(y)(y-f)(y-f-caption)} {$n=0,\ldots,9$} ; This model can be constructed in BayesPy as follows: .. code:: python from bayespy.nodes import GaussianARD, Gamma mu = GaussianARD(0, 1e-6) tau = Gamma(1e-6, 1e-6) y = GaussianARD(mu, tau, plates=(10,)) .. currentmodule:: bayespy.nodes This is quite self-explanatory given the model definitions above. We have used two types of nodes, :class:`GaussianARD` and :class:`Gamma`, to represent Gaussian and gamma distributions, respectively. There are many more distributions in :mod:`bayespy.nodes`, so you can construct quite complex conjugate exponential family models. The node :code:`y` uses the keyword argument :code:`plates` to define the plates :math:`n=0,\ldots,9`.
Now that we have created the model, we can provide our data by setting :code:`y` as observed: .. code:: python y.observe(data) Next we want to estimate the posterior distribution. In principle, we could use different inference engines (e.g., MCMC or EP), but currently only the variational Bayesian (VB) engine is implemented. The engine is initialized by giving it all the nodes of the model: .. code:: python from bayespy.inference import VB Q = VB(mu, tau, y) The inference algorithm can be run for as long as desired (at most 20 iterations in this case): .. code:: python Q.update(repeat=20) .. parsed-literal:: Iteration 1: loglike=-5.789562e+01 (0.010 seconds) Iteration 2: loglike=-5.612083e+01 (0.000 seconds) Iteration 3: loglike=-5.611848e+01 (0.010 seconds) Iteration 4: loglike=-5.611846e+01 (0.000 seconds) Converged. The algorithm converged after four iterations, before reaching the requested 20 iterations. VB approximates the true posterior :math:`p(\mu,\tau|\mathbf{y})` with a distribution which factorizes with respect to the nodes: :math:`q(\mu)q(\tau)`\ . The resulting approximate posterior distributions :math:`q(\mu)` and :math:`q(\tau)` can be examined, for instance, by plotting the marginal probability density functions: .. code:: python import bayespy.plot as bpplt # The following two lines are just for enabling matplotlib plotting in notebooks %matplotlib inline bpplt.pyplot.plot([]) bpplt.pyplot.subplot(2, 1, 1) bpplt.pdf(mu, np.linspace(-10, 20, num=100), color='k', name=r'\mu') bpplt.pyplot.subplot(2, 1, 2) bpplt.pdf(tau, np.linspace(1e-6, 0.08, num=100), color='k', name=r'\tau'); .. image:: quickstartbackup_files/quickstartbackup_14_0.png This example was a very simple introduction to using BayesPy. The model can be much more complex and each phase contains more options to give the user more control over the inference. The following sections give more details about the phases.
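With the quite uninformative priors used above, the VB posterior mean of :math:`\mu` should land near the sample mean of the data, and the posterior mean of the precision :math:`\tau` near the inverse sample variance, which is a handy sanity check on the plotted densities. A plain-NumPy cross-check (assuming the same seed and data-generation code as in the guide; BayesPy is not required):

```python
import numpy as np

# Regenerate the tutorial's data: ten draws from N(mean=5, std=10).
np.random.seed(1)
data = np.random.normal(5, 10, size=(10,))

# With nearly flat priors, the VB posterior for mu concentrates near the
# sample mean, and the posterior mean of tau (the precision) is close to
# the inverse of the sample variance -- compare against the pdf plots.
sample_mean = np.mean(data)
sample_precision = 1.0 / np.var(data)

print("sample mean:", sample_mean)
print("sample precision:", sample_precision)
```

Both values should fall inside the ranges used for the pdf grids above (roughly -10..20 for :math:`\mu` and 0..0.08 for :math:`\tau`).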
bayespy-0.6.2/doc/source/user_guide/user_guide.rst .. Copyright (C) 2011,2012 Jaakko Luttinen This file is licensed under the MIT License. See LICENSE for a text of the license. .. _sec-user-guide: User guide ********** .. toctree:: install quickstart modelconstruct inference plot advanced bayespy-0.6.2/setup.cfg [egg_info] tag_build = tag_date = 0 [aliases] release = egg_info -Db '' release_pypi = release register sdist upload [nosetests] with-doctest = 1 doctest-options = +ELLIPSIS with-coverage = 1 cover-erase = 1 verbose = 1 detailed-errors = 1 [versioneer] vcs = git style = pep440 versionfile_source = bayespy/_version.py versionfile_build = bayespy/_version.py tag_prefix = bayespy-0.6.2/setup.py #!/usr/bin/env python ################################################################################ # Copyright (C) 2011-2015 Jaakko Luttinen # # This file is licensed under the MIT License.
################################################################################ import os import versioneer meta = {} base_dir = os.path.dirname(os.path.abspath(__file__)) with open(os.path.join(base_dir, 'bayespy', '_meta.py')) as fp: exec(fp.read(), meta) NAME = 'bayespy' DESCRIPTION = 'Variational Bayesian inference tools for Python' AUTHOR = meta['__author__'] AUTHOR_EMAIL = meta['__contact__'] URL = 'http://bayespy.org' VERSION = versioneer.get_version() COPYRIGHT = meta['__copyright__'] if __name__ == "__main__": import os import sys python_version = int(sys.version.split('.')[0]) if python_version < 3: raise RuntimeError("BayesPy requires Python 3. You are running Python " "{0}.".format(python_version)) install_requires = [ 'numpy>=1.10.0', # 1.10 implements broadcast_to # 1.8 implements broadcasting in numpy.linalg 'scipy>=0.13.0', # <0.13 have a bug in special.multigammaln 'h5py', 'truncnorm', ] # Utility function to read the README file. # Used for the long_description. It's nice, because now 1) we have a top level # README file and 2) it's easier to type in the README file than to put a raw # string in below ... 
def read(fname): return open(os.path.join(os.path.dirname(__file__), fname)).read() from setuptools import setup, find_packages # Setup for BayesPy setup( install_requires = install_requires, extras_require = { 'doc': [ 'sphinx>=1.4.0', # 1.4.0 adds imgmath extension 'sphinxcontrib-tikz>=0.4.2', 'sphinxcontrib-bayesnet', 'sphinxcontrib-bibtex', 'nbsphinx', 'matplotlib', ], 'dev': [ 'nose', 'nosebook', ] }, packages = find_packages(), package_data = { NAME: ["tests/baseline_images/test_plot/*.png"] }, name = NAME, version = VERSION, author = AUTHOR, author_email = AUTHOR_EMAIL, description = DESCRIPTION, url = URL, long_description = read('README.rst'), cmdclass = versioneer.get_cmdclass(), keywords = [ 'variational Bayes', 'probabilistic programming', 'Bayesian networks', 'graphical models', 'variational message passing' ], classifiers = [ 'Programming Language :: Python :: 3 :: Only', 'Programming Language :: Python :: 3.3', 'Programming Language :: Python :: 3.4', 'Development Status :: 4 - Beta', 'Environment :: Console', 'Intended Audience :: Developers', 'Intended Audience :: Science/Research', 'License :: OSI Approved :: {0}'.format(meta['__license__']), 'Operating System :: OS Independent', 'Topic :: Scientific/Engineering', 'Topic :: Scientific/Engineering :: Information Analysis' ], entry_points = { 'nose.plugins': [ 'warnaserror = bayespy.testing:WarnAsError', ] }, ) bayespy-0.6.2/versioneer.py # Version: 0.29 """The Versioneer - like a rocketeer, but for versions. The Versioneer ============== * like a rocketeer, but for versions!
* https://github.com/python-versioneer/python-versioneer * Brian Warner * License: Public Domain (Unlicense) * Compatible with: Python 3.7, 3.8, 3.9, 3.10, 3.11 and pypy3 * [![Latest Version][pypi-image]][pypi-url] * [![Build Status][travis-image]][travis-url] This is a tool for managing a recorded version number in setuptools-based python projects. The goal is to remove the tedious and error-prone "update the embedded version string" step from your release process. Making a new release should be as easy as recording a new tag in your version-control system, and maybe making new tarballs. ## Quick Install Versioneer provides two installation modes. The "classic" vendored mode installs a copy of versioneer into your repository. The experimental build-time dependency mode is intended to allow you to skip this step and simplify the process of upgrading. ### Vendored mode * `pip install versioneer` to somewhere in your $PATH * A [conda-forge recipe](https://github.com/conda-forge/versioneer-feedstock) is available, so you can also use `conda install -c conda-forge versioneer` * add a `[tool.versioneer]` section to your `pyproject.toml` or a `[versioneer]` section to your `setup.cfg` (see [Install](INSTALL.md)) * Note that you will need to add `tomli; python_version < "3.11"` to your build-time dependencies if you use `pyproject.toml` * run `versioneer install --vendor` in your source tree, commit the results * verify version information with `python setup.py version` ### Build-time dependency mode * `pip install versioneer` to somewhere in your $PATH * A [conda-forge recipe](https://github.com/conda-forge/versioneer-feedstock) is available, so you can also use `conda install -c conda-forge versioneer` * add a `[tool.versioneer]` section to your `pyproject.toml` or a `[versioneer]` section to your `setup.cfg` (see [Install](INSTALL.md)) * add `versioneer` (with `[toml]` extra, if configuring in `pyproject.toml`) to the `requires` key of the `build-system` table in 
`pyproject.toml`: ```toml [build-system] requires = ["setuptools", "versioneer[toml]"] build-backend = "setuptools.build_meta" ``` * run `versioneer install --no-vendor` in your source tree, commit the results * verify version information with `python setup.py version` ## Version Identifiers Source trees come from a variety of places: * a version-control system checkout (mostly used by developers) * a nightly tarball, produced by build automation * a snapshot tarball, produced by a web-based VCS browser, like github's "tarball from tag" feature * a release tarball, produced by "setup.py sdist", distributed through PyPI Within each source tree, the version identifier (either a string or a number, this tool is format-agnostic) can come from a variety of places: * ask the VCS tool itself, e.g. "git describe" (for checkouts), which knows about recent "tags" and an absolute revision-id * the name of the directory into which the tarball was unpacked * an expanded VCS keyword ($Id$, etc) * a `_version.py` created by some earlier build step For released software, the version identifier is closely related to a VCS tag. Some projects use tag names that include more than just the version string (e.g. "myproject-1.2" instead of just "1.2"), in which case the tool needs to strip the tag prefix to extract the version identifier. For unreleased software (between tags), the version identifier should provide enough information to help developers recreate the same tree, while also giving them an idea of roughly how old the tree is (after version 1.2, before version 1.3). Many VCS systems can report a description that captures this, for example `git describe --tags --dirty --always` reports things like "0.7-1-g574ab98-dirty" to indicate that the checkout is one revision past the 0.7 tag, has a unique revision id of "574ab98", and is "dirty" (it has uncommitted changes). 
The version identifier is used for multiple purposes: * to allow the module to self-identify its version: `myproject.__version__` * to choose a name and prefix for a 'setup.py sdist' tarball ## Theory of Operation Versioneer works by adding a special `_version.py` file into your source tree, where your `__init__.py` can import it. This `_version.py` knows how to dynamically ask the VCS tool for version information at import time. `_version.py` also contains `$Revision$` markers, and the installation process marks `_version.py` to have this marker rewritten with a tag name during the `git archive` command. As a result, generated tarballs will contain enough information to get the proper version. To allow `setup.py` to compute a version too, a `versioneer.py` is added to the top level of your source tree, next to `setup.py` and the `setup.cfg` that configures it. This overrides several distutils/setuptools commands to compute the version when invoked, and changes `setup.py build` and `setup.py sdist` to replace `_version.py` with a small static file that contains just the generated version data. ## Installation See [INSTALL.md](./INSTALL.md) for detailed installation instructions. ## Version-String Flavors Code which uses Versioneer can learn about its version string at runtime by importing `_version` from your main `__init__.py` file and running the `get_versions()` function. From the "outside" (e.g. in `setup.py`), you can import the top-level `versioneer.py` and run `get_versions()`. Both functions return a dictionary with different flavors of version information: * `['version']`: A condensed version string, rendered using the selected style. This is the most commonly used value for the project's version string. The default "pep440" style yields strings like `0.11`, `0.11+2.g1076c97`, or `0.11+2.g1076c97.dirty`. See the "Styles" section below for alternative styles. * `['full-revisionid']`: detailed revision identifier. 
For Git, this is the full SHA1 commit id, e.g. "1076c978a8d3cfc70f408fe5974aa6c092c949ac". * `['date']`: Date and time of the latest `HEAD` commit. For Git, it is the commit date in ISO 8601 format. This will be None if the date is not available. * `['dirty']`: a boolean, True if the tree has uncommitted changes. Note that this is only accurate if run in a VCS checkout, otherwise it is likely to be False or None * `['error']`: if the version string could not be computed, this will be set to a string describing the problem, otherwise it will be None. It may be useful to throw an exception in setup.py if this is set, to avoid e.g. creating tarballs with a version string of "unknown". Some variants are more useful than others. Including `full-revisionid` in a bug report should allow developers to reconstruct the exact code being tested (or indicate the presence of local changes that should be shared with the developers). `version` is suitable for display in an "about" box or a CLI `--version` output: it can be easily compared against release notes and lists of bugs fixed in various releases. The installer adds the following text to your `__init__.py` to place a basic version in `YOURPROJECT.__version__`: from ._version import get_versions __version__ = get_versions()['version'] del get_versions ## Styles The setup.cfg `style=` configuration controls how the VCS information is rendered into a version string. The default style, "pep440", produces a PEP440-compliant string, equal to the un-prefixed tag name for actual releases, and containing an additional "local version" section with more detail for in-between builds. For Git, this is TAG[+DISTANCE.gHEX[.dirty]] , using information from `git describe --tags --dirty --always`. For example "0.11+2.g1076c97.dirty" indicates that the tree is like the "1076c97" commit but has uncommitted changes (".dirty"), and that this commit is two revisions ("+2") beyond the "0.11" tag. 
For released software (exactly equal to a known tag), the identifier will only contain the stripped tag, e.g. "0.11". Other styles are available. See [details.md](details.md) in the Versioneer source tree for descriptions. ## Debugging Versioneer tries to avoid fatal errors: if something goes wrong, it will tend to return a version of "0+unknown". To investigate the problem, run `setup.py version`, which will run the version-lookup code in a verbose mode, and will display the full contents of `get_versions()` (including the `error` string, which may help identify what went wrong). ## Known Limitations Some situations are known to cause problems for Versioneer. This details the most significant ones. More can be found on Github [issues page](https://github.com/python-versioneer/python-versioneer/issues). ### Subprojects Versioneer has limited support for source trees in which `setup.py` is not in the root directory (e.g. `setup.py` and `.git/` are *not* siblings). The are two common reasons why `setup.py` might not be in the root: * Source trees which contain multiple subprojects, such as [Buildbot](https://github.com/buildbot/buildbot), which contains both "master" and "slave" subprojects, each with their own `setup.py`, `setup.cfg`, and `tox.ini`. Projects like these produce multiple PyPI distributions (and upload multiple independently-installable tarballs). * Source trees whose main purpose is to contain a C library, but which also provide bindings to Python (and perhaps other languages) in subdirectories. Versioneer will look for `.git` in parent directories, and most operations should get the right version string. However `pip` and `setuptools` have bugs and implementation details which frequently cause `pip install .` from a subproject directory to fail to find a correct version string (so it usually defaults to `0+unknown`). `pip install --editable .` should work correctly. `setup.py install` might work too. 
Pip-8.1.1 is known to have this problem, but hopefully it will get fixed in some later version. [Bug #38](https://github.com/python-versioneer/python-versioneer/issues/38) is tracking this issue. The discussion in [PR #61](https://github.com/python-versioneer/python-versioneer/pull/61) describes the issue from the Versioneer side in more detail. [pip PR#3176](https://github.com/pypa/pip/pull/3176) and [pip PR#3615](https://github.com/pypa/pip/pull/3615) contain work to improve pip to let Versioneer work correctly. Versioneer-0.16 and earlier only looked for a `.git` directory next to the `setup.cfg`, so subprojects were completely unsupported with those releases. ### Editable installs with setuptools <= 18.5 `setup.py develop` and `pip install --editable .` allow you to install a project into a virtualenv once, then continue editing the source code (and test) without re-installing after every change. "Entry-point scripts" (`setup(entry_points={"console_scripts": ..})`) are a convenient way to specify executable scripts that should be installed along with the python package. These both work as expected when using modern setuptools. When using setuptools-18.5 or earlier, however, certain operations will cause `pkg_resources.DistributionNotFound` errors when running the entrypoint script, which must be resolved by re-installing the package. This happens when the install happens with one version, then the egg_info data is regenerated while a different version is checked out. Many setup.py commands cause egg_info to be rebuilt (including `sdist`, `wheel`, and installing into a different virtualenv), so this can be surprising. [Bug #83](https://github.com/python-versioneer/python-versioneer/issues/83) describes this one, but upgrading to a newer version of setuptools should probably resolve it. 
## Updating Versioneer To upgrade your project to a new release of Versioneer, do the following: * install the new Versioneer (`pip install -U versioneer` or equivalent) * edit `setup.cfg` and `pyproject.toml`, if necessary, to include any new configuration settings indicated by the release notes. See [UPGRADING](./UPGRADING.md) for details. * re-run `versioneer install --[no-]vendor` in your source tree, to replace `SRC/_version.py` * commit any changed files ## Future Directions This tool is designed to make it easily extended to other version-control systems: all VCS-specific components are in separate directories like src/git/ . The top-level `versioneer.py` script is assembled from these components by running make-versioneer.py . In the future, make-versioneer.py will take a VCS name as an argument, and will construct a version of `versioneer.py` that is specific to the given VCS. It might also take the configuration arguments that are currently provided manually during installation by editing setup.py . Alternatively, it might go the other direction and include code from all supported VCS systems, reducing the number of intermediate scripts. ## Similar projects * [setuptools_scm](https://github.com/pypa/setuptools_scm/) - a non-vendored build-time dependency * [minver](https://github.com/jbweston/miniver) - a lightweight reimplementation of versioneer * [versioningit](https://github.com/jwodder/versioningit) - a PEP 518-based setuptools plugin ## License To make Versioneer easier to embed, all its code is dedicated to the public domain. The `_version.py` that it creates is also in the public domain. Specifically, both are released under the "Unlicense", as described in https://unlicense.org/. 
[pypi-image]: https://img.shields.io/pypi/v/versioneer.svg [pypi-url]: https://pypi.python.org/pypi/versioneer/ [travis-image]: https://img.shields.io/travis/com/python-versioneer/python-versioneer.svg [travis-url]: https://travis-ci.com/github/python-versioneer/python-versioneer """ # pylint:disable=invalid-name,import-outside-toplevel,missing-function-docstring # pylint:disable=missing-class-docstring,too-many-branches,too-many-statements # pylint:disable=raise-missing-from,too-many-lines,too-many-locals,import-error # pylint:disable=too-few-public-methods,redefined-outer-name,consider-using-with # pylint:disable=attribute-defined-outside-init,too-many-arguments import configparser import errno import json import os import re import subprocess import sys from pathlib import Path from typing import Any, Callable, cast, Dict, List, Optional, Tuple, Union from typing import NoReturn import functools have_tomllib = True if sys.version_info >= (3, 11): import tomllib else: try: import tomli as tomllib except ImportError: have_tomllib = False class VersioneerConfig: """Container for Versioneer configuration parameters.""" VCS: str style: str tag_prefix: str versionfile_source: str versionfile_build: Optional[str] parentdir_prefix: Optional[str] verbose: Optional[bool] def get_root() -> str: """Get the project root directory. We require that all commands are run from the project root, i.e. the directory that contains setup.py, setup.cfg, and versioneer.py . 
""" root = os.path.realpath(os.path.abspath(os.getcwd())) setup_py = os.path.join(root, "setup.py") pyproject_toml = os.path.join(root, "pyproject.toml") versioneer_py = os.path.join(root, "versioneer.py") if not ( os.path.exists(setup_py) or os.path.exists(pyproject_toml) or os.path.exists(versioneer_py) ): # allow 'python path/to/setup.py COMMAND' root = os.path.dirname(os.path.realpath(os.path.abspath(sys.argv[0]))) setup_py = os.path.join(root, "setup.py") pyproject_toml = os.path.join(root, "pyproject.toml") versioneer_py = os.path.join(root, "versioneer.py") if not ( os.path.exists(setup_py) or os.path.exists(pyproject_toml) or os.path.exists(versioneer_py) ): err = ("Versioneer was unable to run the project root directory. " "Versioneer requires setup.py to be executed from " "its immediate directory (like 'python setup.py COMMAND'), " "or in a way that lets it use sys.argv[0] to find the root " "(like 'python path/to/setup.py COMMAND').") raise VersioneerBadRootError(err) try: # Certain runtime workflows (setup.py install/develop in a setuptools # tree) execute all dependencies in a single python process, so # "versioneer" may be imported multiple times, and python's shared # module-import table will cache the first one. So we can't use # os.path.dirname(__file__), as that will find whichever # versioneer.py was first imported, even in later projects. 
my_path = os.path.realpath(os.path.abspath(__file__)) me_dir = os.path.normcase(os.path.splitext(my_path)[0]) vsr_dir = os.path.normcase(os.path.splitext(versioneer_py)[0]) if me_dir != vsr_dir and "VERSIONEER_PEP518" not in globals(): print("Warning: build in %s is using versioneer.py from %s" % (os.path.dirname(my_path), versioneer_py)) except NameError: pass return root def get_config_from_root(root: str) -> VersioneerConfig: """Read the project setup.cfg file to determine Versioneer config.""" # This might raise OSError (if setup.cfg is missing), or # configparser.NoSectionError (if it lacks a [versioneer] section), or # configparser.NoOptionError (if it lacks "VCS="). See the docstring at # the top of versioneer.py for instructions on writing your setup.cfg . root_pth = Path(root) pyproject_toml = root_pth / "pyproject.toml" setup_cfg = root_pth / "setup.cfg" section: Union[Dict[str, Any], configparser.SectionProxy, None] = None if pyproject_toml.exists() and have_tomllib: try: with open(pyproject_toml, 'rb') as fobj: pp = tomllib.load(fobj) section = pp['tool']['versioneer'] except (tomllib.TOMLDecodeError, KeyError) as e: print(f"Failed to load config from {pyproject_toml}: {e}") print("Trying to load it from setup.cfg instead") if not section: parser = configparser.ConfigParser() with open(setup_cfg) as cfg_file: parser.read_file(cfg_file) parser.get("versioneer", "VCS") # raise error if missing section = parser["versioneer"] # `cast` really shouldn't be used, but it's simplest for the # common VersioneerConfig users at the moment. 
We verify against # `None` values elsewhere where it matters cfg = VersioneerConfig() cfg.VCS = section['VCS'] cfg.style = section.get("style", "") cfg.versionfile_source = cast(str, section.get("versionfile_source")) cfg.versionfile_build = section.get("versionfile_build") cfg.tag_prefix = cast(str, section.get("tag_prefix")) if cfg.tag_prefix in ("''", '""', None): cfg.tag_prefix = "" cfg.parentdir_prefix = section.get("parentdir_prefix") if isinstance(section, configparser.SectionProxy): # Make sure configparser translates to bool cfg.verbose = section.getboolean("verbose") else: cfg.verbose = section.get("verbose") return cfg class NotThisMethod(Exception): """Exception raised if a method is not valid for the current scenario.""" # these dictionaries contain VCS-specific tools LONG_VERSION_PY: Dict[str, str] = {} HANDLERS: Dict[str, Dict[str, Callable]] = {} def register_vcs_handler(vcs: str, method: str) -> Callable: # decorator """Create decorator to mark a method as the handler of a VCS.""" def decorate(f: Callable) -> Callable: """Store f in HANDLERS[vcs][method].""" HANDLERS.setdefault(vcs, {})[method] = f return f return decorate def run_command( commands: List[str], args: List[str], cwd: Optional[str] = None, verbose: bool = False, hide_stderr: bool = False, env: Optional[Dict[str, str]] = None, ) -> Tuple[Optional[str], Optional[int]]: """Call the given command(s).""" assert isinstance(commands, list) process = None popen_kwargs: Dict[str, Any] = {} if sys.platform == "win32": # This hides the console window if pythonw.exe is used startupinfo = subprocess.STARTUPINFO() startupinfo.dwFlags |= subprocess.STARTF_USESHOWWINDOW popen_kwargs["startupinfo"] = startupinfo for command in commands: try: dispcmd = str([command] + args) # remember shell=False, so use git.cmd on windows, not just git process = subprocess.Popen([command] + args, cwd=cwd, env=env, stdout=subprocess.PIPE, stderr=(subprocess.PIPE if hide_stderr else None), **popen_kwargs) break except 
OSError as e: if e.errno == errno.ENOENT: continue if verbose: print("unable to run %s" % dispcmd) print(e) return None, None else: if verbose: print("unable to find command, tried %s" % (commands,)) return None, None stdout = process.communicate()[0].strip().decode() if process.returncode != 0: if verbose: print("unable to run %s (error)" % dispcmd) print("stdout was %s" % stdout) return None, process.returncode return stdout, process.returncode LONG_VERSION_PY['git'] = r''' # This file helps to compute a version number in source trees obtained from # git-archive tarballs (such as those provided by GitHub's download-from-tag # feature). Distribution tarballs (built by setup.py sdist) and build # directories (produced by setup.py build) will contain a much shorter file # that just contains the computed version number. # This file is released into the public domain. # Generated by versioneer-0.29 # https://github.com/python-versioneer/python-versioneer """Git implementation of _version.py.""" import errno import os import re import subprocess import sys from typing import Any, Callable, Dict, List, Optional, Tuple import functools def get_keywords() -> Dict[str, str]: """Get the keywords needed to look up the version information.""" # these strings will be replaced by git during git-archive. # setup.py/versioneer.py will grep for the variable names, so they must # each be defined on a line of their own. _version.py will just call # get_keywords(). 
git_refnames = "%(DOLLAR)sFormat:%%d%(DOLLAR)s" git_full = "%(DOLLAR)sFormat:%%H%(DOLLAR)s" git_date = "%(DOLLAR)sFormat:%%ci%(DOLLAR)s" keywords = {"refnames": git_refnames, "full": git_full, "date": git_date} return keywords class VersioneerConfig: """Container for Versioneer configuration parameters.""" VCS: str style: str tag_prefix: str parentdir_prefix: str versionfile_source: str verbose: bool def get_config() -> VersioneerConfig: """Create, populate and return the VersioneerConfig() object.""" # these strings are filled in when 'setup.py versioneer' creates # _version.py cfg = VersioneerConfig() cfg.VCS = "git" cfg.style = "%(STYLE)s" cfg.tag_prefix = "%(TAG_PREFIX)s" cfg.parentdir_prefix = "%(PARENTDIR_PREFIX)s" cfg.versionfile_source = "%(VERSIONFILE_SOURCE)s" cfg.verbose = False return cfg class NotThisMethod(Exception): """Exception raised if a method is not valid for the current scenario.""" LONG_VERSION_PY: Dict[str, str] = {} HANDLERS: Dict[str, Dict[str, Callable]] = {} def register_vcs_handler(vcs: str, method: str) -> Callable: # decorator """Create decorator to mark a method as the handler of a VCS.""" def decorate(f: Callable) -> Callable: """Store f in HANDLERS[vcs][method].""" if vcs not in HANDLERS: HANDLERS[vcs] = {} HANDLERS[vcs][method] = f return f return decorate def run_command( commands: List[str], args: List[str], cwd: Optional[str] = None, verbose: bool = False, hide_stderr: bool = False, env: Optional[Dict[str, str]] = None, ) -> Tuple[Optional[str], Optional[int]]: """Call the given command(s).""" assert isinstance(commands, list) process = None popen_kwargs: Dict[str, Any] = {} if sys.platform == "win32": # This hides the console window if pythonw.exe is used startupinfo = subprocess.STARTUPINFO() startupinfo.dwFlags |= subprocess.STARTF_USESHOWWINDOW popen_kwargs["startupinfo"] = startupinfo for command in commands: try: dispcmd = str([command] + args) # remember shell=False, so use git.cmd on windows, not just git process = 
subprocess.Popen([command] + args, cwd=cwd, env=env, stdout=subprocess.PIPE, stderr=(subprocess.PIPE if hide_stderr else None), **popen_kwargs) break except OSError as e: if e.errno == errno.ENOENT: continue if verbose: print("unable to run %%s" %% dispcmd) print(e) return None, None else: if verbose: print("unable to find command, tried %%s" %% (commands,)) return None, None stdout = process.communicate()[0].strip().decode() if process.returncode != 0: if verbose: print("unable to run %%s (error)" %% dispcmd) print("stdout was %%s" %% stdout) return None, process.returncode return stdout, process.returncode def versions_from_parentdir( parentdir_prefix: str, root: str, verbose: bool, ) -> Dict[str, Any]: """Try to determine the version from the parent directory name. Source tarballs conventionally unpack into a directory that includes both the project name and a version string. We will also support searching up two directory levels for an appropriately named parent directory """ rootdirs = [] for _ in range(3): dirname = os.path.basename(root) if dirname.startswith(parentdir_prefix): return {"version": dirname[len(parentdir_prefix):], "full-revisionid": None, "dirty": False, "error": None, "date": None} rootdirs.append(root) root = os.path.dirname(root) # up a level if verbose: print("Tried directories %%s but none started with prefix %%s" %% (str(rootdirs), parentdir_prefix)) raise NotThisMethod("rootdir doesn't start with parentdir_prefix") @register_vcs_handler("git", "get_keywords") def git_get_keywords(versionfile_abs: str) -> Dict[str, str]: """Extract version information from the given file.""" # the code embedded in _version.py can just fetch the value of these # keywords. When used from setup.py, we don't want to import _version.py, # so we do it with a regexp instead. This function is not used from # _version.py. 
keywords: Dict[str, str] = {} try: with open(versionfile_abs, "r") as fobj: for line in fobj: if line.strip().startswith("git_refnames ="): mo = re.search(r'=\s*"(.*)"', line) if mo: keywords["refnames"] = mo.group(1) if line.strip().startswith("git_full ="): mo = re.search(r'=\s*"(.*)"', line) if mo: keywords["full"] = mo.group(1) if line.strip().startswith("git_date ="): mo = re.search(r'=\s*"(.*)"', line) if mo: keywords["date"] = mo.group(1) except OSError: pass return keywords @register_vcs_handler("git", "keywords") def git_versions_from_keywords( keywords: Dict[str, str], tag_prefix: str, verbose: bool, ) -> Dict[str, Any]: """Get version information from git keywords.""" if "refnames" not in keywords: raise NotThisMethod("Short version file found") date = keywords.get("date") if date is not None: # Use only the last line. Previous lines may contain GPG signature # information. date = date.splitlines()[-1] # git-2.2.0 added "%%cI", which expands to an ISO-8601 -compliant # datestamp. However we prefer "%%ci" (which expands to an "ISO-8601 # -like" string, which we must then edit to make compliant), because # it's been around since git-1.5.3, and it's too difficult to # discover which version we're using, or to work around using an # older one. date = date.strip().replace(" ", "T", 1).replace(" ", "", 1) refnames = keywords["refnames"].strip() if refnames.startswith("$Format"): if verbose: print("keywords are unexpanded, not using") raise NotThisMethod("unexpanded keywords, not a git-archive tarball") refs = {r.strip() for r in refnames.strip("()").split(",")} # starting in git-1.8.3, tags are listed as "tag: foo-1.0" instead of # just "foo-1.0". If we see a "tag: " prefix, prefer those. TAG = "tag: " tags = {r[len(TAG):] for r in refs if r.startswith(TAG)} if not tags: # Either we're using git < 1.8.3, or there really are no tags. We use # a heuristic: assume all version tags have a digit. 
The old git %%d # expansion behaves like git log --decorate=short and strips out the # refs/heads/ and refs/tags/ prefixes that would let us distinguish # between branches and tags. By ignoring refnames without digits, we # filter out many common branch names like "release" and # "stabilization", as well as "HEAD" and "master". tags = {r for r in refs if re.search(r'\d', r)} if verbose: print("discarding '%%s', no digits" %% ",".join(refs - tags)) if verbose: print("likely tags: %%s" %% ",".join(sorted(tags))) for ref in sorted(tags): # sorting will prefer e.g. "2.0" over "2.0rc1" if ref.startswith(tag_prefix): r = ref[len(tag_prefix):] # Filter out refs that exactly match prefix or that don't start # with a number once the prefix is stripped (mostly a concern # when prefix is '') if not re.match(r'\d', r): continue if verbose: print("picking %%s" %% r) return {"version": r, "full-revisionid": keywords["full"].strip(), "dirty": False, "error": None, "date": date} # no suitable tags, so version is "0+unknown", but full hex is still there if verbose: print("no suitable tags, using unknown + full revision id") return {"version": "0+unknown", "full-revisionid": keywords["full"].strip(), "dirty": False, "error": "no suitable tags", "date": None} @register_vcs_handler("git", "pieces_from_vcs") def git_pieces_from_vcs( tag_prefix: str, root: str, verbose: bool, runner: Callable = run_command ) -> Dict[str, Any]: """Get version from 'git describe' in the root of the source tree. This only gets called if the git-archive 'subst' keywords were *not* expanded, and _version.py hasn't already been rewritten with a short version string, meaning we're inside a checked out source tree. """ GITS = ["git"] if sys.platform == "win32": GITS = ["git.cmd", "git.exe"] # GIT_DIR can interfere with correct operation of Versioneer. # It may be intended to be passed to the Versioneer-versioned project, # but that should not change where we get our version from. 
env = os.environ.copy() env.pop("GIT_DIR", None) runner = functools.partial(runner, env=env) _, rc = runner(GITS, ["rev-parse", "--git-dir"], cwd=root, hide_stderr=not verbose) if rc != 0: if verbose: print("Directory %%s not under git control" %% root) raise NotThisMethod("'git rev-parse --git-dir' returned error") # if there is a tag matching tag_prefix, this yields TAG-NUM-gHEX[-dirty] # if there isn't one, this yields HEX[-dirty] (no NUM) describe_out, rc = runner(GITS, [ "describe", "--tags", "--dirty", "--always", "--long", "--match", f"{tag_prefix}[[:digit:]]*" ], cwd=root) # --long was added in git-1.5.5 if describe_out is None: raise NotThisMethod("'git describe' failed") describe_out = describe_out.strip() full_out, rc = runner(GITS, ["rev-parse", "HEAD"], cwd=root) if full_out is None: raise NotThisMethod("'git rev-parse' failed") full_out = full_out.strip() pieces: Dict[str, Any] = {} pieces["long"] = full_out pieces["short"] = full_out[:7] # maybe improved later pieces["error"] = None branch_name, rc = runner(GITS, ["rev-parse", "--abbrev-ref", "HEAD"], cwd=root) # --abbrev-ref was added in git-1.6.3 if rc != 0 or branch_name is None: raise NotThisMethod("'git rev-parse --abbrev-ref' returned error") branch_name = branch_name.strip() if branch_name == "HEAD": # If we aren't exactly on a branch, pick a branch which represents # the current commit. If all else fails, we are on a branchless # commit. branches, rc = runner(GITS, ["branch", "--contains"], cwd=root) # --contains was added in git-1.5.4 if rc != 0 or branches is None: raise NotThisMethod("'git branch --contains' returned error") branches = branches.split("\n") # Remove the first line if we're running detached if "(" in branches[0]: branches.pop(0) # Strip off the leading "* " from the list of branches. branches = [branch[2:] for branch in branches] if "master" in branches: branch_name = "master" elif not branches: branch_name = None else: # Pick the first branch that is returned. Good or bad. 
branch_name = branches[0] pieces["branch"] = branch_name # parse describe_out. It will be like TAG-NUM-gHEX[-dirty] or HEX[-dirty] # TAG might have hyphens. git_describe = describe_out # look for -dirty suffix dirty = git_describe.endswith("-dirty") pieces["dirty"] = dirty if dirty: git_describe = git_describe[:git_describe.rindex("-dirty")] # now we have TAG-NUM-gHEX or HEX if "-" in git_describe: # TAG-NUM-gHEX mo = re.search(r'^(.+)-(\d+)-g([0-9a-f]+)$', git_describe) if not mo: # unparsable. Maybe git-describe is misbehaving? pieces["error"] = ("unable to parse git-describe output: '%%s'" %% describe_out) return pieces # tag full_tag = mo.group(1) if not full_tag.startswith(tag_prefix): if verbose: fmt = "tag '%%s' doesn't start with prefix '%%s'" print(fmt %% (full_tag, tag_prefix)) pieces["error"] = ("tag '%%s' doesn't start with prefix '%%s'" %% (full_tag, tag_prefix)) return pieces pieces["closest-tag"] = full_tag[len(tag_prefix):] # distance: number of commits since tag pieces["distance"] = int(mo.group(2)) # commit: short hex revision ID pieces["short"] = mo.group(3) else: # HEX: no tags pieces["closest-tag"] = None out, rc = runner(GITS, ["rev-list", "HEAD", "--left-right"], cwd=root) pieces["distance"] = len(out.split()) # total number of commits # commit date: see ISO-8601 comment in git_versions_from_keywords() date = runner(GITS, ["show", "-s", "--format=%%ci", "HEAD"], cwd=root)[0].strip() # Use only the last line. Previous lines may contain GPG signature # information. date = date.splitlines()[-1] pieces["date"] = date.strip().replace(" ", "T", 1).replace(" ", "", 1) return pieces def plus_or_dot(pieces: Dict[str, Any]) -> str: """Return a + if we don't already have one, else return a .""" if "+" in pieces.get("closest-tag", ""): return "." return "+" def render_pep440(pieces: Dict[str, Any]) -> str: """Build up version string, with post-release "local version identifier". Our goal: TAG[+DISTANCE.gHEX[.dirty]] . 
Note that if you get a tagged build and then dirty it, you'll get TAG+0.gHEX.dirty Exceptions: 1: no tags. git_describe was just HEX. 0+untagged.DISTANCE.gHEX[.dirty] """ if pieces["closest-tag"]: rendered = pieces["closest-tag"] if pieces["distance"] or pieces["dirty"]: rendered += plus_or_dot(pieces) rendered += "%%d.g%%s" %% (pieces["distance"], pieces["short"]) if pieces["dirty"]: rendered += ".dirty" else: # exception #1 rendered = "0+untagged.%%d.g%%s" %% (pieces["distance"], pieces["short"]) if pieces["dirty"]: rendered += ".dirty" return rendered def render_pep440_branch(pieces: Dict[str, Any]) -> str: """TAG[[.dev0]+DISTANCE.gHEX[.dirty]] . The ".dev0" means not master branch. Note that .dev0 sorts backwards (a feature branch will appear "older" than the master branch). Exceptions: 1: no tags. 0[.dev0]+untagged.DISTANCE.gHEX[.dirty] """ if pieces["closest-tag"]: rendered = pieces["closest-tag"] if pieces["distance"] or pieces["dirty"]: if pieces["branch"] != "master": rendered += ".dev0" rendered += plus_or_dot(pieces) rendered += "%%d.g%%s" %% (pieces["distance"], pieces["short"]) if pieces["dirty"]: rendered += ".dirty" else: # exception #1 rendered = "0" if pieces["branch"] != "master": rendered += ".dev0" rendered += "+untagged.%%d.g%%s" %% (pieces["distance"], pieces["short"]) if pieces["dirty"]: rendered += ".dirty" return rendered def pep440_split_post(ver: str) -> Tuple[str, Optional[int]]: """Split pep440 version string at the post-release segment. Returns the release segments before the post-release and the post-release version number (or -1 if no post-release segment is present). """ vc = str.split(ver, ".post") return vc[0], int(vc[1] or 0) if len(vc) == 2 else None def render_pep440_pre(pieces: Dict[str, Any]) -> str: """TAG[.postN.devDISTANCE] -- No -dirty. Exceptions: 1: no tags. 
0.post0.devDISTANCE """ if pieces["closest-tag"]: if pieces["distance"]: # update the post release segment tag_version, post_version = pep440_split_post(pieces["closest-tag"]) rendered = tag_version if post_version is not None: rendered += ".post%%d.dev%%d" %% (post_version + 1, pieces["distance"]) else: rendered += ".post0.dev%%d" %% (pieces["distance"]) else: # no commits, use the tag as the version rendered = pieces["closest-tag"] else: # exception #1 rendered = "0.post0.dev%%d" %% pieces["distance"] return rendered def render_pep440_post(pieces: Dict[str, Any]) -> str: """TAG[.postDISTANCE[.dev0]+gHEX] . The ".dev0" means dirty. Note that .dev0 sorts backwards (a dirty tree will appear "older" than the corresponding clean one), but you shouldn't be releasing software with -dirty anyways. Exceptions: 1: no tags. 0.postDISTANCE[.dev0] """ if pieces["closest-tag"]: rendered = pieces["closest-tag"] if pieces["distance"] or pieces["dirty"]: rendered += ".post%%d" %% pieces["distance"] if pieces["dirty"]: rendered += ".dev0" rendered += plus_or_dot(pieces) rendered += "g%%s" %% pieces["short"] else: # exception #1 rendered = "0.post%%d" %% pieces["distance"] if pieces["dirty"]: rendered += ".dev0" rendered += "+g%%s" %% pieces["short"] return rendered def render_pep440_post_branch(pieces: Dict[str, Any]) -> str: """TAG[.postDISTANCE[.dev0]+gHEX[.dirty]] . The ".dev0" means not master branch. Exceptions: 1: no tags. 
0.postDISTANCE[.dev0]+gHEX[.dirty] """ if pieces["closest-tag"]: rendered = pieces["closest-tag"] if pieces["distance"] or pieces["dirty"]: rendered += ".post%%d" %% pieces["distance"] if pieces["branch"] != "master": rendered += ".dev0" rendered += plus_or_dot(pieces) rendered += "g%%s" %% pieces["short"] if pieces["dirty"]: rendered += ".dirty" else: # exception #1 rendered = "0.post%%d" %% pieces["distance"] if pieces["branch"] != "master": rendered += ".dev0" rendered += "+g%%s" %% pieces["short"] if pieces["dirty"]: rendered += ".dirty" return rendered def render_pep440_old(pieces: Dict[str, Any]) -> str: """TAG[.postDISTANCE[.dev0]] . The ".dev0" means dirty. Exceptions: 1: no tags. 0.postDISTANCE[.dev0] """ if pieces["closest-tag"]: rendered = pieces["closest-tag"] if pieces["distance"] or pieces["dirty"]: rendered += ".post%%d" %% pieces["distance"] if pieces["dirty"]: rendered += ".dev0" else: # exception #1 rendered = "0.post%%d" %% pieces["distance"] if pieces["dirty"]: rendered += ".dev0" return rendered def render_git_describe(pieces: Dict[str, Any]) -> str: """TAG[-DISTANCE-gHEX][-dirty]. Like 'git describe --tags --dirty --always'. Exceptions: 1: no tags. HEX[-dirty] (note: no 'g' prefix) """ if pieces["closest-tag"]: rendered = pieces["closest-tag"] if pieces["distance"]: rendered += "-%%d-g%%s" %% (pieces["distance"], pieces["short"]) else: # exception #1 rendered = pieces["short"] if pieces["dirty"]: rendered += "-dirty" return rendered def render_git_describe_long(pieces: Dict[str, Any]) -> str: """TAG-DISTANCE-gHEX[-dirty]. Like 'git describe --tags --dirty --always --long'. The distance/hash is unconditional. Exceptions: 1: no tags. 
HEX[-dirty] (note: no 'g' prefix) """ if pieces["closest-tag"]: rendered = pieces["closest-tag"] rendered += "-%%d-g%%s" %% (pieces["distance"], pieces["short"]) else: # exception #1 rendered = pieces["short"] if pieces["dirty"]: rendered += "-dirty" return rendered def render(pieces: Dict[str, Any], style: str) -> Dict[str, Any]: """Render the given version pieces into the requested style.""" if pieces["error"]: return {"version": "unknown", "full-revisionid": pieces.get("long"), "dirty": None, "error": pieces["error"], "date": None} if not style or style == "default": style = "pep440" # the default if style == "pep440": rendered = render_pep440(pieces) elif style == "pep440-branch": rendered = render_pep440_branch(pieces) elif style == "pep440-pre": rendered = render_pep440_pre(pieces) elif style == "pep440-post": rendered = render_pep440_post(pieces) elif style == "pep440-post-branch": rendered = render_pep440_post_branch(pieces) elif style == "pep440-old": rendered = render_pep440_old(pieces) elif style == "git-describe": rendered = render_git_describe(pieces) elif style == "git-describe-long": rendered = render_git_describe_long(pieces) else: raise ValueError("unknown style '%%s'" %% style) return {"version": rendered, "full-revisionid": pieces["long"], "dirty": pieces["dirty"], "error": None, "date": pieces.get("date")} def get_versions() -> Dict[str, Any]: """Get version information or return default if unable to do so.""" # I am in _version.py, which lives at ROOT/VERSIONFILE_SOURCE. If we have # __file__, we can work backwards from there to the root. Some # py2exe/bbfreeze/non-CPython implementations don't do __file__, in which # case we can only use expanded keywords. 
cfg = get_config() verbose = cfg.verbose try: return git_versions_from_keywords(get_keywords(), cfg.tag_prefix, verbose) except NotThisMethod: pass try: root = os.path.realpath(__file__) # versionfile_source is the relative path from the top of the source # tree (where the .git directory might live) to this file. Invert # this to find the root from __file__. for _ in cfg.versionfile_source.split('/'): root = os.path.dirname(root) except NameError: return {"version": "0+unknown", "full-revisionid": None, "dirty": None, "error": "unable to find root of source tree", "date": None} try: pieces = git_pieces_from_vcs(cfg.tag_prefix, root, verbose) return render(pieces, cfg.style) except NotThisMethod: pass try: if cfg.parentdir_prefix: return versions_from_parentdir(cfg.parentdir_prefix, root, verbose) except NotThisMethod: pass return {"version": "0+unknown", "full-revisionid": None, "dirty": None, "error": "unable to compute version", "date": None} ''' @register_vcs_handler("git", "get_keywords") def git_get_keywords(versionfile_abs: str) -> Dict[str, str]: """Extract version information from the given file.""" # the code embedded in _version.py can just fetch the value of these # keywords. When used from setup.py, we don't want to import _version.py, # so we do it with a regexp instead. This function is not used from # _version.py. 
keywords: Dict[str, str] = {} try: with open(versionfile_abs, "r") as fobj: for line in fobj: if line.strip().startswith("git_refnames ="): mo = re.search(r'=\s*"(.*)"', line) if mo: keywords["refnames"] = mo.group(1) if line.strip().startswith("git_full ="): mo = re.search(r'=\s*"(.*)"', line) if mo: keywords["full"] = mo.group(1) if line.strip().startswith("git_date ="): mo = re.search(r'=\s*"(.*)"', line) if mo: keywords["date"] = mo.group(1) except OSError: pass return keywords @register_vcs_handler("git", "keywords") def git_versions_from_keywords( keywords: Dict[str, str], tag_prefix: str, verbose: bool, ) -> Dict[str, Any]: """Get version information from git keywords.""" if "refnames" not in keywords: raise NotThisMethod("Short version file found") date = keywords.get("date") if date is not None: # Use only the last line. Previous lines may contain GPG signature # information. date = date.splitlines()[-1] # git-2.2.0 added "%cI", which expands to an ISO-8601 -compliant # datestamp. However we prefer "%ci" (which expands to an "ISO-8601 # -like" string, which we must then edit to make compliant), because # it's been around since git-1.5.3, and it's too difficult to # discover which version we're using, or to work around using an # older one. date = date.strip().replace(" ", "T", 1).replace(" ", "", 1) refnames = keywords["refnames"].strip() if refnames.startswith("$Format"): if verbose: print("keywords are unexpanded, not using") raise NotThisMethod("unexpanded keywords, not a git-archive tarball") refs = {r.strip() for r in refnames.strip("()").split(",")} # starting in git-1.8.3, tags are listed as "tag: foo-1.0" instead of # just "foo-1.0". If we see a "tag: " prefix, prefer those. TAG = "tag: " tags = {r[len(TAG):] for r in refs if r.startswith(TAG)} if not tags: # Either we're using git < 1.8.3, or there really are no tags. We use # a heuristic: assume all version tags have a digit. 
The old git %d # expansion behaves like git log --decorate=short and strips out the # refs/heads/ and refs/tags/ prefixes that would let us distinguish # between branches and tags. By ignoring refnames without digits, we # filter out many common branch names like "release" and # "stabilization", as well as "HEAD" and "master". tags = {r for r in refs if re.search(r'\d', r)} if verbose: print("discarding '%s', no digits" % ",".join(refs - tags)) if verbose: print("likely tags: %s" % ",".join(sorted(tags))) for ref in sorted(tags): # sorting will prefer e.g. "2.0" over "2.0rc1" if ref.startswith(tag_prefix): r = ref[len(tag_prefix):] # Filter out refs that exactly match prefix or that don't start # with a number once the prefix is stripped (mostly a concern # when prefix is '') if not re.match(r'\d', r): continue if verbose: print("picking %s" % r) return {"version": r, "full-revisionid": keywords["full"].strip(), "dirty": False, "error": None, "date": date} # no suitable tags, so version is "0+unknown", but full hex is still there if verbose: print("no suitable tags, using unknown + full revision id") return {"version": "0+unknown", "full-revisionid": keywords["full"].strip(), "dirty": False, "error": "no suitable tags", "date": None} @register_vcs_handler("git", "pieces_from_vcs") def git_pieces_from_vcs( tag_prefix: str, root: str, verbose: bool, runner: Callable = run_command ) -> Dict[str, Any]: """Get version from 'git describe' in the root of the source tree. This only gets called if the git-archive 'subst' keywords were *not* expanded, and _version.py hasn't already been rewritten with a short version string, meaning we're inside a checked out source tree. """ GITS = ["git"] if sys.platform == "win32": GITS = ["git.cmd", "git.exe"] # GIT_DIR can interfere with correct operation of Versioneer. # It may be intended to be passed to the Versioneer-versioned project, # but that should not change where we get our version from. 
env = os.environ.copy() env.pop("GIT_DIR", None) runner = functools.partial(runner, env=env) _, rc = runner(GITS, ["rev-parse", "--git-dir"], cwd=root, hide_stderr=not verbose) if rc != 0: if verbose: print("Directory %s not under git control" % root) raise NotThisMethod("'git rev-parse --git-dir' returned error") # if there is a tag matching tag_prefix, this yields TAG-NUM-gHEX[-dirty] # if there isn't one, this yields HEX[-dirty] (no NUM) describe_out, rc = runner(GITS, [ "describe", "--tags", "--dirty", "--always", "--long", "--match", f"{tag_prefix}[[:digit:]]*" ], cwd=root) # --long was added in git-1.5.5 if describe_out is None: raise NotThisMethod("'git describe' failed") describe_out = describe_out.strip() full_out, rc = runner(GITS, ["rev-parse", "HEAD"], cwd=root) if full_out is None: raise NotThisMethod("'git rev-parse' failed") full_out = full_out.strip() pieces: Dict[str, Any] = {} pieces["long"] = full_out pieces["short"] = full_out[:7] # maybe improved later pieces["error"] = None branch_name, rc = runner(GITS, ["rev-parse", "--abbrev-ref", "HEAD"], cwd=root) # --abbrev-ref was added in git-1.6.3 if rc != 0 or branch_name is None: raise NotThisMethod("'git rev-parse --abbrev-ref' returned error") branch_name = branch_name.strip() if branch_name == "HEAD": # If we aren't exactly on a branch, pick a branch which represents # the current commit. If all else fails, we are on a branchless # commit. branches, rc = runner(GITS, ["branch", "--contains"], cwd=root) # --contains was added in git-1.5.4 if rc != 0 or branches is None: raise NotThisMethod("'git branch --contains' returned error") branches = branches.split("\n") # Remove the first line if we're running detached if "(" in branches[0]: branches.pop(0) # Strip off the leading "* " from the list of branches. branches = [branch[2:] for branch in branches] if "master" in branches: branch_name = "master" elif not branches: branch_name = None else: # Pick the first branch that is returned. Good or bad. 
            branch_name = branches[0]

    pieces["branch"] = branch_name

    # parse describe_out. It will be like TAG-NUM-gHEX[-dirty] or HEX[-dirty]
    # TAG might have hyphens.
    git_describe = describe_out

    # look for -dirty suffix
    dirty = git_describe.endswith("-dirty")
    pieces["dirty"] = dirty
    if dirty:
        git_describe = git_describe[:git_describe.rindex("-dirty")]

    # now we have TAG-NUM-gHEX or HEX

    if "-" in git_describe:
        # TAG-NUM-gHEX
        mo = re.search(r'^(.+)-(\d+)-g([0-9a-f]+)$', git_describe)
        if not mo:
            # unparsable. Maybe git-describe is misbehaving?
            pieces["error"] = ("unable to parse git-describe output: '%s'"
                               % describe_out)
            return pieces

        # tag
        full_tag = mo.group(1)
        if not full_tag.startswith(tag_prefix):
            if verbose:
                fmt = "tag '%s' doesn't start with prefix '%s'"
                print(fmt % (full_tag, tag_prefix))
            pieces["error"] = ("tag '%s' doesn't start with prefix '%s'"
                               % (full_tag, tag_prefix))
            return pieces
        pieces["closest-tag"] = full_tag[len(tag_prefix):]

        # distance: number of commits since tag
        pieces["distance"] = int(mo.group(2))

        # commit: short hex revision ID
        pieces["short"] = mo.group(3)
    else:
        # HEX: no tags
        pieces["closest-tag"] = None
        out, rc = runner(GITS, ["rev-list", "HEAD", "--left-right"], cwd=root)
        pieces["distance"] = len(out.split())  # total number of commits

    # commit date: see ISO-8601 comment in git_versions_from_keywords()
    date = runner(GITS, ["show", "-s", "--format=%ci", "HEAD"],
                  cwd=root)[0].strip()
    # Use only the last line. Previous lines may contain GPG signature
    # information.
    date = date.splitlines()[-1]
    pieces["date"] = date.strip().replace(" ", "T", 1).replace(" ", "", 1)

    return pieces


def do_vcs_install(versionfile_source: str, ipy: Optional[str]) -> None:
    """Git-specific installation logic for Versioneer.

    For Git, this means creating/changing .gitattributes to mark _version.py
    for export-subst keyword substitution.
""" GITS = ["git"] if sys.platform == "win32": GITS = ["git.cmd", "git.exe"] files = [versionfile_source] if ipy: files.append(ipy) if "VERSIONEER_PEP518" not in globals(): try: my_path = __file__ if my_path.endswith((".pyc", ".pyo")): my_path = os.path.splitext(my_path)[0] + ".py" versioneer_file = os.path.relpath(my_path) except NameError: versioneer_file = "versioneer.py" files.append(versioneer_file) present = False try: with open(".gitattributes", "r") as fobj: for line in fobj: if line.strip().startswith(versionfile_source): if "export-subst" in line.strip().split()[1:]: present = True break except OSError: pass if not present: with open(".gitattributes", "a+") as fobj: fobj.write(f"{versionfile_source} export-subst\n") files.append(".gitattributes") run_command(GITS, ["add", "--"] + files) def versions_from_parentdir( parentdir_prefix: str, root: str, verbose: bool, ) -> Dict[str, Any]: """Try to determine the version from the parent directory name. Source tarballs conventionally unpack into a directory that includes both the project name and a version string. We will also support searching up two directory levels for an appropriately named parent directory """ rootdirs = [] for _ in range(3): dirname = os.path.basename(root) if dirname.startswith(parentdir_prefix): return {"version": dirname[len(parentdir_prefix):], "full-revisionid": None, "dirty": False, "error": None, "date": None} rootdirs.append(root) root = os.path.dirname(root) # up a level if verbose: print("Tried directories %s but none started with prefix %s" % (str(rootdirs), parentdir_prefix)) raise NotThisMethod("rootdir doesn't start with parentdir_prefix") SHORT_VERSION_PY = """ # This file was generated by 'versioneer.py' (0.29) from # revision-control system data, or from the parent directory name of an # unpacked source archive. Distribution tarballs contain a pre-generated copy # of this file. 

import json

version_json = '''
%s
'''  # END VERSION_JSON


def get_versions():
    return json.loads(version_json)
"""


def versions_from_file(filename: str) -> Dict[str, Any]:
    """Try to determine the version from _version.py if present."""
    try:
        with open(filename) as f:
            contents = f.read()
    except OSError:
        raise NotThisMethod("unable to read _version.py")
    mo = re.search(r"version_json = '''\n(.*)'''  # END VERSION_JSON",
                   contents, re.M | re.S)
    if not mo:
        mo = re.search(r"version_json = '''\r\n(.*)'''  # END VERSION_JSON",
                       contents, re.M | re.S)
    if not mo:
        raise NotThisMethod("no version_json in _version.py")
    return json.loads(mo.group(1))


def write_to_version_file(filename: str, versions: Dict[str, Any]) -> None:
    """Write the given version number to the given _version.py file."""
    contents = json.dumps(versions, sort_keys=True,
                          indent=1, separators=(",", ": "))
    with open(filename, "w") as f:
        f.write(SHORT_VERSION_PY % contents)

    print("set %s to '%s'" % (filename, versions["version"]))


def plus_or_dot(pieces: Dict[str, Any]) -> str:
    """Return a + if we don't already have one, else return a ."""
    if "+" in pieces.get("closest-tag", ""):
        return "."
    return "+"


def render_pep440(pieces: Dict[str, Any]) -> str:
    """Build up version string, with post-release "local version identifier".

    Our goal: TAG[+DISTANCE.gHEX[.dirty]] . Note that if you
    get a tagged build and then dirty it, you'll get TAG+0.gHEX.dirty

    Exceptions:
    1: no tags. git_describe was just HEX. 0+untagged.DISTANCE.gHEX[.dirty]
    """
    if pieces["closest-tag"]:
        rendered = pieces["closest-tag"]
        if pieces["distance"] or pieces["dirty"]:
            rendered += plus_or_dot(pieces)
            rendered += "%d.g%s" % (pieces["distance"], pieces["short"])
            if pieces["dirty"]:
                rendered += ".dirty"
    else:
        # exception #1
        rendered = "0+untagged.%d.g%s" % (pieces["distance"],
                                          pieces["short"])
        if pieces["dirty"]:
            rendered += ".dirty"
    return rendered


def render_pep440_branch(pieces: Dict[str, Any]) -> str:
    """TAG[[.dev0]+DISTANCE.gHEX[.dirty]] .
The ".dev0" means not master branch. Note that .dev0 sorts backwards (a feature branch will appear "older" than the master branch). Exceptions: 1: no tags. 0[.dev0]+untagged.DISTANCE.gHEX[.dirty] """ if pieces["closest-tag"]: rendered = pieces["closest-tag"] if pieces["distance"] or pieces["dirty"]: if pieces["branch"] != "master": rendered += ".dev0" rendered += plus_or_dot(pieces) rendered += "%d.g%s" % (pieces["distance"], pieces["short"]) if pieces["dirty"]: rendered += ".dirty" else: # exception #1 rendered = "0" if pieces["branch"] != "master": rendered += ".dev0" rendered += "+untagged.%d.g%s" % (pieces["distance"], pieces["short"]) if pieces["dirty"]: rendered += ".dirty" return rendered def pep440_split_post(ver: str) -> Tuple[str, Optional[int]]: """Split pep440 version string at the post-release segment. Returns the release segments before the post-release and the post-release version number (or -1 if no post-release segment is present). """ vc = str.split(ver, ".post") return vc[0], int(vc[1] or 0) if len(vc) == 2 else None def render_pep440_pre(pieces: Dict[str, Any]) -> str: """TAG[.postN.devDISTANCE] -- No -dirty. Exceptions: 1: no tags. 0.post0.devDISTANCE """ if pieces["closest-tag"]: if pieces["distance"]: # update the post release segment tag_version, post_version = pep440_split_post(pieces["closest-tag"]) rendered = tag_version if post_version is not None: rendered += ".post%d.dev%d" % (post_version + 1, pieces["distance"]) else: rendered += ".post0.dev%d" % (pieces["distance"]) else: # no commits, use the tag as the version rendered = pieces["closest-tag"] else: # exception #1 rendered = "0.post0.dev%d" % pieces["distance"] return rendered def render_pep440_post(pieces: Dict[str, Any]) -> str: """TAG[.postDISTANCE[.dev0]+gHEX] . The ".dev0" means dirty. Note that .dev0 sorts backwards (a dirty tree will appear "older" than the corresponding clean one), but you shouldn't be releasing software with -dirty anyways. Exceptions: 1: no tags. 
    0.postDISTANCE[.dev0]
    """
    if pieces["closest-tag"]:
        rendered = pieces["closest-tag"]
        if pieces["distance"] or pieces["dirty"]:
            rendered += ".post%d" % pieces["distance"]
            if pieces["dirty"]:
                rendered += ".dev0"
            rendered += plus_or_dot(pieces)
            rendered += "g%s" % pieces["short"]
    else:
        # exception #1
        rendered = "0.post%d" % pieces["distance"]
        if pieces["dirty"]:
            rendered += ".dev0"
        rendered += "+g%s" % pieces["short"]
    return rendered


def render_pep440_post_branch(pieces: Dict[str, Any]) -> str:
    """TAG[.postDISTANCE[.dev0]+gHEX[.dirty]] .

    The ".dev0" means not master branch.

    Exceptions:
    1: no tags. 0.postDISTANCE[.dev0]+gHEX[.dirty]
    """
    if pieces["closest-tag"]:
        rendered = pieces["closest-tag"]
        if pieces["distance"] or pieces["dirty"]:
            rendered += ".post%d" % pieces["distance"]
            if pieces["branch"] != "master":
                rendered += ".dev0"
            rendered += plus_or_dot(pieces)
            rendered += "g%s" % pieces["short"]
            if pieces["dirty"]:
                rendered += ".dirty"
    else:
        # exception #1
        rendered = "0.post%d" % pieces["distance"]
        if pieces["branch"] != "master":
            rendered += ".dev0"
        rendered += "+g%s" % pieces["short"]
        if pieces["dirty"]:
            rendered += ".dirty"
    return rendered


def render_pep440_old(pieces: Dict[str, Any]) -> str:
    """TAG[.postDISTANCE[.dev0]] .

    The ".dev0" means dirty.

    Exceptions:
    1: no tags. 0.postDISTANCE[.dev0]
    """
    if pieces["closest-tag"]:
        rendered = pieces["closest-tag"]
        if pieces["distance"] or pieces["dirty"]:
            rendered += ".post%d" % pieces["distance"]
            if pieces["dirty"]:
                rendered += ".dev0"
    else:
        # exception #1
        rendered = "0.post%d" % pieces["distance"]
        if pieces["dirty"]:
            rendered += ".dev0"
    return rendered


def render_git_describe(pieces: Dict[str, Any]) -> str:
    """TAG[-DISTANCE-gHEX][-dirty].

    Like 'git describe --tags --dirty --always'.

    Exceptions:
    1: no tags.
    HEX[-dirty] (note: no 'g' prefix)
    """
    if pieces["closest-tag"]:
        rendered = pieces["closest-tag"]
        if pieces["distance"]:
            rendered += "-%d-g%s" % (pieces["distance"], pieces["short"])
    else:
        # exception #1
        rendered = pieces["short"]
    if pieces["dirty"]:
        rendered += "-dirty"
    return rendered


def render_git_describe_long(pieces: Dict[str, Any]) -> str:
    """TAG-DISTANCE-gHEX[-dirty].

    Like 'git describe --tags --dirty --always --long'.
    The distance/hash is unconditional.

    Exceptions:
    1: no tags. HEX[-dirty] (note: no 'g' prefix)
    """
    if pieces["closest-tag"]:
        rendered = pieces["closest-tag"]
        rendered += "-%d-g%s" % (pieces["distance"], pieces["short"])
    else:
        # exception #1
        rendered = pieces["short"]
    if pieces["dirty"]:
        rendered += "-dirty"
    return rendered


def render(pieces: Dict[str, Any], style: str) -> Dict[str, Any]:
    """Render the given version pieces into the requested style."""
    if pieces["error"]:
        return {"version": "unknown",
                "full-revisionid": pieces.get("long"),
                "dirty": None,
                "error": pieces["error"],
                "date": None}

    if not style or style == "default":
        style = "pep440"  # the default

    if style == "pep440":
        rendered = render_pep440(pieces)
    elif style == "pep440-branch":
        rendered = render_pep440_branch(pieces)
    elif style == "pep440-pre":
        rendered = render_pep440_pre(pieces)
    elif style == "pep440-post":
        rendered = render_pep440_post(pieces)
    elif style == "pep440-post-branch":
        rendered = render_pep440_post_branch(pieces)
    elif style == "pep440-old":
        rendered = render_pep440_old(pieces)
    elif style == "git-describe":
        rendered = render_git_describe(pieces)
    elif style == "git-describe-long":
        rendered = render_git_describe_long(pieces)
    else:
        raise ValueError("unknown style '%s'" % style)

    return {"version": rendered, "full-revisionid": pieces["long"],
            "dirty": pieces["dirty"], "error": None,
            "date": pieces.get("date")}


class VersioneerBadRootError(Exception):
    """The project root directory is unknown or missing key files."""


def get_versions(verbose: bool = False) -> Dict[str, Any]:
"""Get the project version from whatever source is available. Returns dict with two keys: 'version' and 'full'. """ if "versioneer" in sys.modules: # see the discussion in cmdclass.py:get_cmdclass() del sys.modules["versioneer"] root = get_root() cfg = get_config_from_root(root) assert cfg.VCS is not None, "please set [versioneer]VCS= in setup.cfg" handlers = HANDLERS.get(cfg.VCS) assert handlers, "unrecognized VCS '%s'" % cfg.VCS verbose = verbose or bool(cfg.verbose) # `bool()` used to avoid `None` assert cfg.versionfile_source is not None, \ "please set versioneer.versionfile_source" assert cfg.tag_prefix is not None, "please set versioneer.tag_prefix" versionfile_abs = os.path.join(root, cfg.versionfile_source) # extract version from first of: _version.py, VCS command (e.g. 'git # describe'), parentdir. This is meant to work for developers using a # source checkout, for users of a tarball created by 'setup.py sdist', # and for users of a tarball/zipball created by 'git archive' or github's # download-from-tag feature or the equivalent in other VCSes. 
    get_keywords_f = handlers.get("get_keywords")
    from_keywords_f = handlers.get("keywords")
    if get_keywords_f and from_keywords_f:
        try:
            keywords = get_keywords_f(versionfile_abs)
            ver = from_keywords_f(keywords, cfg.tag_prefix, verbose)
            if verbose:
                print("got version from expanded keyword %s" % ver)
            return ver
        except NotThisMethod:
            pass

    try:
        ver = versions_from_file(versionfile_abs)
        if verbose:
            print("got version from file %s %s" % (versionfile_abs, ver))
        return ver
    except NotThisMethod:
        pass

    from_vcs_f = handlers.get("pieces_from_vcs")
    if from_vcs_f:
        try:
            pieces = from_vcs_f(cfg.tag_prefix, root, verbose)
            ver = render(pieces, cfg.style)
            if verbose:
                print("got version from VCS %s" % ver)
            return ver
        except NotThisMethod:
            pass

    try:
        if cfg.parentdir_prefix:
            ver = versions_from_parentdir(cfg.parentdir_prefix, root, verbose)
            if verbose:
                print("got version from parentdir %s" % ver)
            return ver
    except NotThisMethod:
        pass

    if verbose:
        print("unable to compute version")

    return {"version": "0+unknown", "full-revisionid": None,
            "dirty": None, "error": "unable to compute version",
            "date": None}


def get_version() -> str:
    """Get the short version string for this project."""
    return get_versions()["version"]


def get_cmdclass(cmdclass: Optional[Dict[str, Any]] = None):
    """Get the custom setuptools subclasses used by Versioneer.

    If the package uses a different cmdclass (e.g. one from numpy), it
    should be provided as an argument.
    """
    if "versioneer" in sys.modules:
        del sys.modules["versioneer"]
        # this fixes the "python setup.py develop" case (also 'install' and
        # 'easy_install .'), in which subdependencies of the main project are
        # built (using setup.py bdist_egg) in the same python process. Assume
        # a main project A and a dependency B, which use different versions
        # of Versioneer. A's setup.py imports A's Versioneer, leaving it in
        # sys.modules by the time B's setup.py is executed, causing B to run
        # with the wrong versioneer. Setuptools wraps the sub-dep builds in a
        # sandbox that restores sys.modules to its pre-build state, so the
        # parent is protected against the child's "import versioneer". By
        # removing ourselves from sys.modules here, before the child build
        # happens, we protect the child from the parent's versioneer too.
        # Also see
        # https://github.com/python-versioneer/python-versioneer/issues/52

    cmds = {} if cmdclass is None else cmdclass.copy()

    # we add "version" to setuptools
    from setuptools import Command

    class cmd_version(Command):
        description = "report generated version string"
        user_options: List[Tuple[str, str, str]] = []
        boolean_options: List[str] = []

        def initialize_options(self) -> None:
            pass

        def finalize_options(self) -> None:
            pass

        def run(self) -> None:
            vers = get_versions(verbose=True)
            print("Version: %s" % vers["version"])
            print(" full-revisionid: %s" % vers.get("full-revisionid"))
            print(" dirty: %s" % vers.get("dirty"))
            print(" date: %s" % vers.get("date"))
            if vers["error"]:
                print(" error: %s" % vers["error"])
    cmds["version"] = cmd_version

    # we override "build_py" in setuptools
    #
    # most invocation pathways end up running build_py:
    #  distutils/build -> build_py
    #  distutils/install -> distutils/build ->..
    #  setuptools/bdist_wheel -> distutils/install ->..
    #  setuptools/bdist_egg -> distutils/install_lib -> build_py
    #  setuptools/install -> bdist_egg ->..
    #  setuptools/develop -> ?
    #  pip install:
    #   copies source tree to a tempdir before running egg_info/etc
    #   if .git isn't copied too, 'git describe' will fail
    #   then does setup.py bdist_wheel, or sometimes setup.py install
    #  setup.py egg_info -> ?
    #  pip install -e . and setuptools/editable_wheel will invoke build_py
    #  but the build_py command is not expected to copy any files.

    # we override different "build_py" commands for both environments
    if 'build_py' in cmds:
        _build_py: Any = cmds['build_py']
    else:
        from setuptools.command.build_py import build_py as _build_py

    class cmd_build_py(_build_py):
        def run(self) -> None:
            root = get_root()
            cfg = get_config_from_root(root)
            versions = get_versions()
            _build_py.run(self)
            if getattr(self, "editable_mode", False):
                # During editable installs `.py` and data files are
                # not copied to build_lib
                return
            # now locate _version.py in the new build/ directory and replace
            # it with an updated value
            if cfg.versionfile_build:
                target_versionfile = os.path.join(self.build_lib,
                                                  cfg.versionfile_build)
                print("UPDATING %s" % target_versionfile)
                write_to_version_file(target_versionfile, versions)
    cmds["build_py"] = cmd_build_py

    if 'build_ext' in cmds:
        _build_ext: Any = cmds['build_ext']
    else:
        from setuptools.command.build_ext import build_ext as _build_ext

    class cmd_build_ext(_build_ext):
        def run(self) -> None:
            root = get_root()
            cfg = get_config_from_root(root)
            versions = get_versions()
            _build_ext.run(self)
            if self.inplace:
                # build_ext --inplace will only build extensions in
                # build/lib<..> dir with no _version.py to write to.
                # As in place builds will already have a _version.py
                # in the module dir, we do not need to write one.
                return
            # now locate _version.py in the new build/ directory and replace
            # it with an updated value
            if not cfg.versionfile_build:
                return
            target_versionfile = os.path.join(self.build_lib,
                                              cfg.versionfile_build)
            if not os.path.exists(target_versionfile):
                print(f"Warning: {target_versionfile} does not exist, "
                      "skipping version update. This can happen if you are "
                      "running build_ext without first running build_py.")
                return
            print("UPDATING %s" % target_versionfile)
            write_to_version_file(target_versionfile, versions)
    cmds["build_ext"] = cmd_build_ext

    if "cx_Freeze" in sys.modules:  # cx_freeze enabled?
        from cx_Freeze.dist import build_exe as _build_exe  # type: ignore
        # nczeczulin reports that py2exe won't like the pep440-style string
        # as FILEVERSION, but it can be used for PRODUCTVERSION, e.g.
        # setup(console=[{
        #   "version": versioneer.get_version().split("+", 1)[0],  # FILEVERSION
        #   "product_version": versioneer.get_version(),
        #   ...

        class cmd_build_exe(_build_exe):
            def run(self) -> None:
                root = get_root()
                cfg = get_config_from_root(root)
                versions = get_versions()
                target_versionfile = cfg.versionfile_source
                print("UPDATING %s" % target_versionfile)
                write_to_version_file(target_versionfile, versions)

                _build_exe.run(self)
                os.unlink(target_versionfile)
                with open(cfg.versionfile_source, "w") as f:
                    LONG = LONG_VERSION_PY[cfg.VCS]
                    f.write(LONG %
                            {"DOLLAR": "$",
                             "STYLE": cfg.style,
                             "TAG_PREFIX": cfg.tag_prefix,
                             "PARENTDIR_PREFIX": cfg.parentdir_prefix,
                             "VERSIONFILE_SOURCE": cfg.versionfile_source,
                             })
        cmds["build_exe"] = cmd_build_exe
        del cmds["build_py"]

    if 'py2exe' in sys.modules:  # py2exe enabled?
        try:
            from py2exe.setuptools_buildexe import py2exe as _py2exe  # type: ignore
        except ImportError:
            from py2exe.distutils_buildexe import py2exe as _py2exe  # type: ignore

        class cmd_py2exe(_py2exe):
            def run(self) -> None:
                root = get_root()
                cfg = get_config_from_root(root)
                versions = get_versions()
                target_versionfile = cfg.versionfile_source
                print("UPDATING %s" % target_versionfile)
                write_to_version_file(target_versionfile, versions)

                _py2exe.run(self)
                os.unlink(target_versionfile)
                with open(cfg.versionfile_source, "w") as f:
                    LONG = LONG_VERSION_PY[cfg.VCS]
                    f.write(LONG %
                            {"DOLLAR": "$",
                             "STYLE": cfg.style,
                             "TAG_PREFIX": cfg.tag_prefix,
                             "PARENTDIR_PREFIX": cfg.parentdir_prefix,
                             "VERSIONFILE_SOURCE": cfg.versionfile_source,
                             })
        cmds["py2exe"] = cmd_py2exe

    # sdist farms its file list building out to egg_info
    if 'egg_info' in cmds:
        _egg_info: Any = cmds['egg_info']
    else:
        from setuptools.command.egg_info import egg_info as _egg_info

    class cmd_egg_info(_egg_info):
        def find_sources(self) -> None:
            # egg_info.find_sources builds the manifest list and writes it
            # in one shot
            super().find_sources()

            # Modify the filelist and normalize it
            root = get_root()
            cfg = get_config_from_root(root)
            self.filelist.append('versioneer.py')
            if cfg.versionfile_source:
                # There are rare cases where versionfile_source might not be
                # included by default, so we must be explicit
                self.filelist.append(cfg.versionfile_source)
            self.filelist.sort()
            self.filelist.remove_duplicates()

            # The write method is hidden in the manifest_maker instance that
            # generated the filelist and was thrown away
            # We will instead replicate their final normalization (to unicode,
            # and POSIX-style paths)
            from setuptools import unicode_utils
            normalized = [unicode_utils.filesys_decode(f).replace(os.sep, '/')
                          for f in self.filelist.files]

            manifest_filename = os.path.join(self.egg_info, 'SOURCES.txt')
            with open(manifest_filename, 'w') as fobj:
                fobj.write('\n'.join(normalized))

    cmds['egg_info'] = cmd_egg_info

    # we override different "sdist" commands for both environments
    if 'sdist' in cmds:
        _sdist: Any = cmds['sdist']
    else:
        from setuptools.command.sdist import sdist as _sdist

    class cmd_sdist(_sdist):
        def run(self) -> None:
            versions = get_versions()
            self._versioneer_generated_versions = versions
            # unless we update this, the command will keep using the old
            # version
            self.distribution.metadata.version = versions["version"]
            return _sdist.run(self)

        def make_release_tree(self, base_dir: str, files: List[str]) -> None:
            root = get_root()
            cfg = get_config_from_root(root)
            _sdist.make_release_tree(self, base_dir, files)
            # now locate _version.py in the new base_dir directory
            # (remembering that it may be a hardlink) and replace it with an
            # updated value
            target_versionfile = os.path.join(base_dir,
                                              cfg.versionfile_source)
            print("UPDATING %s" % target_versionfile)
            write_to_version_file(target_versionfile,
                                  self._versioneer_generated_versions)
    cmds["sdist"] = cmd_sdist

    return cmds


CONFIG_ERROR = """
setup.cfg is missing the necessary Versioneer configuration. You need
a section like:

 [versioneer]
 VCS = git
 style = pep440
 versionfile_source = src/myproject/_version.py
 versionfile_build = myproject/_version.py
 tag_prefix =
 parentdir_prefix = myproject-

You will also need to edit your setup.py to use the results:

 import versioneer
 setup(version=versioneer.get_version(),
       cmdclass=versioneer.get_cmdclass(), ...)

Please read the docstring in ./versioneer.py for configuration instructions,
edit setup.cfg, and re-run the installer or 'python versioneer.py setup'.
"""

SAMPLE_CONFIG = """
# See the docstring in versioneer.py for instructions. Note that you must
# re-run 'versioneer.py setup' after changing this section, and commit the
# resulting files.

[versioneer]
#VCS = git
#style = pep440
#versionfile_source =
#versionfile_build =
#tag_prefix =
#parentdir_prefix =
"""

OLD_SNIPPET = """
from ._version import get_versions
__version__ = get_versions()['version']
del get_versions
"""

INIT_PY_SNIPPET = """
from . import {0}
__version__ = {0}.get_versions()['version']
"""


def do_setup() -> int:
    """Do main VCS-independent setup function for installing Versioneer."""
    root = get_root()
    try:
        cfg = get_config_from_root(root)
    except (OSError, configparser.NoSectionError,
            configparser.NoOptionError) as e:
        if isinstance(e, (OSError, configparser.NoSectionError)):
            print("Adding sample versioneer config to setup.cfg",
                  file=sys.stderr)
            with open(os.path.join(root, "setup.cfg"), "a") as f:
                f.write(SAMPLE_CONFIG)
        print(CONFIG_ERROR, file=sys.stderr)
        return 1

    print(" creating %s" % cfg.versionfile_source)
    with open(cfg.versionfile_source, "w") as f:
        LONG = LONG_VERSION_PY[cfg.VCS]
        f.write(LONG % {"DOLLAR": "$",
                        "STYLE": cfg.style,
                        "TAG_PREFIX": cfg.tag_prefix,
                        "PARENTDIR_PREFIX": cfg.parentdir_prefix,
                        "VERSIONFILE_SOURCE": cfg.versionfile_source,
                        })

    ipy = os.path.join(os.path.dirname(cfg.versionfile_source),
                       "__init__.py")
    maybe_ipy: Optional[str] = ipy
    if os.path.exists(ipy):
        try:
            with open(ipy, "r") as f:
                old = f.read()
        except OSError:
            old = ""
        module = os.path.splitext(os.path.basename(cfg.versionfile_source))[0]
        snippet = INIT_PY_SNIPPET.format(module)
        if OLD_SNIPPET in old:
            print(" replacing boilerplate in %s" % ipy)
            with open(ipy, "w") as f:
                f.write(old.replace(OLD_SNIPPET, snippet))
        elif snippet not in old:
            print(" appending to %s" % ipy)
            with open(ipy, "a") as f:
                f.write(snippet)
        else:
            print(" %s unmodified" % ipy)
    else:
        print(" %s doesn't exist, ok" % ipy)
        maybe_ipy = None

    # Make VCS-specific changes. For git, this means creating/changing
    # .gitattributes to mark _version.py for export-subst keyword
    # substitution.
    do_vcs_install(cfg.versionfile_source, maybe_ipy)
    return 0


def scan_setup_py() -> int:
    """Validate the contents of setup.py against Versioneer's expectations."""
    found = set()
    setters = False
    errors = 0
    with open("setup.py", "r") as f:
        for line in f.readlines():
            if "import versioneer" in line:
                found.add("import")
            if "versioneer.get_cmdclass()" in line:
                found.add("cmdclass")
            if "versioneer.get_version()" in line:
                found.add("get_version")
            if "versioneer.VCS" in line:
                setters = True
            if "versioneer.versionfile_source" in line:
                setters = True
    if len(found) != 3:
        print("")
        print("Your setup.py appears to be missing some important items")
        print("(but I might be wrong). Please make sure it has something")
        print("roughly like the following:")
        print("")
        print(" import versioneer")
        print(" setup( version=versioneer.get_version(),")
        print("        cmdclass=versioneer.get_cmdclass(),  ...)")
        print("")
        errors += 1
    if setters:
        print("You should remove lines like 'versioneer.VCS = ' and")
        print("'versioneer.versionfile_source = ' . This configuration")
        print("now lives in setup.cfg, and should be removed from setup.py")
        print("")
        errors += 1
    return errors


def setup_command() -> NoReturn:
    """Set up Versioneer and exit with appropriate error code."""
    errors = do_setup()
    errors += scan_setup_py()
    sys.exit(1 if errors else 0)


if __name__ == "__main__":
    cmd = sys.argv[1]
    if cmd == "setup":
        setup_command()
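The "pep440" style implemented by `render_pep440()` above maps `git describe` output to versions like `TAG[+DISTANCE.gHEX[.dirty]]`. The sketch below exercises that rule standalone; `sketch_render_pep440` is a hypothetical helper written for illustration only, not part of Versioneer's API.

```python
# Illustrative sketch of the "pep440" rendering rule used by
# render_pep440() above. The helper name is hypothetical and is not
# part of Versioneer's API.

def sketch_render_pep440(closest_tag, distance, short, dirty):
    """Render TAG[+DISTANCE.gHEX[.dirty]]; fall back to 0+untagged... """
    if closest_tag:
        rendered = closest_tag
        if distance or dirty:
            # local version segment: "." continues an existing one,
            # "+" starts a new one (mirrors plus_or_dot())
            rendered += "." if "+" in closest_tag else "+"
            rendered += "%d.g%s" % (distance, short)
            if dirty:
                rendered += ".dirty"
    else:
        # no tag found by 'git describe': exception #1
        rendered = "0+untagged.%d.g%s" % (distance, short)
        if dirty:
            rendered += ".dirty"
    return rendered


print(sketch_render_pep440("1.2.1", 0, "abc1234", False))  # 1.2.1
print(sketch_render_pep440("1.2.1", 3, "abc1234", True))   # 1.2.1+3.gabc1234.dirty
print(sketch_render_pep440(None, 5, "abc1234", False))     # 0+untagged.5.gabc1234
```

A clean checkout sitting exactly on a tag renders as the bare tag; any commits past the tag or uncommitted changes push the describe details into the PEP 440 local version segment, so such builds are clearly distinguishable from releases.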