kfj-vspline-6e66cf7a7926/LICENSE

vspline - generic C++ code for creation and evaluation of uniform b-splines

Copyright 2015, 2016 by Kay F. Jahnke

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

kfj-vspline-6e66cf7a7926/README.rst

===================================================================
vspline - generic C++ code to create and evaluate uniform B-splines
===================================================================

------------
Introduction
------------

vspline aims to provide a free, comprehensive and fast library for uniform
B-splines. Uniform B-splines are a method to provide a 'smooth' interpolation
over a set of uniformly sampled data points. They are commonly used in signal
processing as they have several 'nice' qualities - an in-depth treatment and
comparison to other interpolation methods can be found in the paper
'Interpolation Revisited' [CIT2000]_ by Philippe Thévenaz, Member, IEEE,
Thierry Blu, Member, IEEE, and Michael Unser, Fellow, IEEE.

While there are several freely available packages of B-spline code, I failed
to find one which is comprehensive, efficient and generic at once. vspline
attempts to be all that, making use of generic programming in C++11, and of
common, but often underused hardware features in modern processors. Overall,
there is an emphasis on speed, even if this makes the code more complex. I
tried to eke as much performance out of the hardware at my disposal as
possible, only compromising where the other design goals would have suffered.

While some of the code is quite low-level, there are reasonably high-level
mechanisms to interface with vspline, allowing easy access to its
functionality without requiring users to familiarize themselves with the
internal workings. High-level access is provided via class 'bspline' defined
in bspline.h, 'evaluator' objects from eval.h and via the remap functions
defined in remap.h.

While I made an attempt to write code which is portable, vspline is only
tested with g++ and clang++ on Linux. It may work in other environments, but
it's unlikely it will do so without modification.
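To give a first impression of the high-level interface mentioned above, here
is a minimal sketch of its intended use. This is not a complete program -
'data' stands for a vigra::MultiArrayView holding the knot point data, and
only functionality from class bspline is shown; please refer to the headers
and to the programs in the examples folder for precise signatures and
complete programs::

  #include "bspline.h"

  // set up a b-spline object over 2D float data; with no further
  // arguments this creates a braced cubic spline with MIRROR
  // boundary conditions
  vspline::bspline < float , 2 > bsp ( data.shape() ) ;

  // fill the spline's 'core' area with the knot point data, then
  // convert the data to b-spline coefficients in-place
  bsp.core = data ;
  bsp.prefilter() ;

  // evaluator objects (see eval.h) are constructed from the bspline
  // object and then used to evaluate the spline at real coordinates;
  // the remap functions (remap.h) operate on whole arrays of data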
An installation of Vigra_ is needed to compile; installation of Vc_ is
optional but recommended. Some Linux distros offer both vigra and Vc versions
suitable for vspline; alternatively, you can build either from source.
Packaging vspline for debian is in progress_; hopefully the package will be
available soon.

vspline is relatively new; the current version might qualify as late beta. I
have made efforts to cover 'reasonable' use cases, but I'm sure there are
corner cases and unexpected scenarios where my code fails. The code is not
well shielded against inappropriate parameters. The intended audience is
developers rather than end users; if the code is used as the 'engine' in a
well-defined way, parametrization can be tailored by the calling code.
Parameter checking is avoided where it gets in the way of speedy operation.

-----
Scope
-----

There are (at least) two different approaches to tackle B-splines as a
mathematical problem. The first one is to look at them as a linear algebra
problem. Calculating the B-spline coefficients is done by solving a set of
equations, which can be codified as banded diagonal matrices with slight
disturbances at the top and bottom, resulting from boundary conditions. The
mathematics are reasonably straightforward and can be efficiently coded (at
least for lower-degree splines), but I found it hard to generalize to
B-splines of higher order.

The second approach to B-splines comes from signal processing, and it's the
one which I found most commonly used in the other implementations I studied.
It generates the B-spline coefficients by applying a forward-backward
recursive digital filter to the data and usually implements boundary
conditions by picking appropriate initial causal and anticausal coefficients.
Once I had understood the process, I found it elegant and beautiful - and
perfectly general, lending itself to the implementation of a body of generic
code with the scope I envisioned.

I have made an attempt to generalize the code so that it can handle

- real data types and their aggregates [1]_
- coming in strided nD memory
- a reasonable selection of boundary conditions
- used in either an implicit or an explicit scheme of extrapolation
- arbitrary spline orders
- arbitrary dimensions of the spline
- in multithreaded code
- using the CPU's vector units if possible

On the evaluation side I provide

- evaluation of the spline at point locations in the defined range
- evaluation of the spline's derivatives
- mapping of arbitrary coordinates into the defined range
- evaluation of nD arrays of coordinates ('remap' function)
- coordinate-fed remap function ('index_remap')
- functor-based remap, aka 'transform' function
- functor-based 'apply' function

The code at the very core of my B-spline coefficient generation evolved from
the code by Philippe Thévenaz which he published here_, with some of the
boundary condition treatment code derived from formulae which Philippe
Thévenaz communicated to me.

Next I needed code to handle multidimensional arrays in a generic fashion in
C++. I chose to use Vigra_. Since my work has a focus on signal (and, more
specifically, image) processing, it's an excellent choice, as it provides a
large body of code with precisely this focus and has well-thought-out,
reliable support for multidimensional arrays.

Once I had a prototype up and running, I looked for further ways to improve
its speed. While using GPUs is tempting, I chose not to go down this path.
Instead I chose to stick with CPU code and use vectorization.
Again, I did some research and finally found Vc_. Vc allowed me to write
generic code for vectorization for a reasonably large set of targets, and at
the same time fall back to the scalar implementation if the target system
doesn't have a vector unit, or if the vector code doesn't perform better than
the scalar code. Using Vc is a bit of a tricky choice, as it makes it hard to
provide binary code which runs on a variety of platforms - it has to be
compiled for a specific target like AVX or SSE. Still, I found the performance
gain for some data types so enticing that I chose to make my code use
vectorization, which was not an easy task, considering its scope. The use of
the vectorized code is optional; it can be excluded from compilation by a
compile-time flag.

I did all my programming on a Kubuntu_ system, running on an intel(R) Core (TM)
i5-4570 CPU, and used GNU gcc_ and clang_ to compile the code in C++11
dialect. While I am confident that the code runs on other CPUs, I have not
tested it with other compilers or operating systems (yet).

Note: in November 2017, with help from Bernd Gaucke, vspline's companion
program pv, which uses vspline heavily, was successfully compiled with
'Visual Studio Platform toolset V141'. While no further tests have been done,
I hope that I can soon extend the list of supported platforms.

.. _here: http://bigwww.epfl.ch/thevenaz/interpolation/
.. _Vigra: http://ukoethe.github.io/vigra/
.. _Vc: https://github.com/VcDevel/Vc
.. _Kubuntu: http://kubuntu.org/
.. _gcc: https://gcc.gnu.org/
.. _clang: http://clang.llvm.org/
.. _progress: https://ftp-master.debian.org/new/vspline_0.2.0-1.html

.. [CIT2000] Interpolation Revisited, by Philippe Thévenaz, Member, IEEE,
   Thierry Blu, Member, IEEE, and Michael Unser, Fellow, IEEE, in IEEE
   TRANSACTIONS ON MEDICAL IMAGING, VOL. 19, NO. 7, JULY 2000, available
   `online here`_

.. _online here: http://bigwww.epfl.ch/publications/thevenaz0002.pdf

.. [1] I use 'aggregate' here to mean a collection of identical elements, in
   contrast to what C++ defines as an aggregate type. So aggregates would be
   pixels, vigra TinyVectors, and, also, complex types.

-------------
Documentation
-------------

vspline uses doxygen to create documentation. You can access the
documentation online here: https://kfj.bitbucket.io

-----
Speed
-----

While performance will vary widely from system to system and between
different compiles, I'll quote some measurements from my own system. I
include benchmarking code (roundtrip.cc in the examples folder). Here are
some measurements done with "roundtrip", working on a full HD (1920*1080)
RGB image, using single precision floats internally - the figures are
averages of several runs:

::

  testing bc code MIRROR spline degree 3
  avg 32 x prefilter:........................ 13.093750 ms
  avg 32 x remap from unsplit coordinates:... 59.218750 ms
  avg 32 x remap with internal spline:....... 75.125000 ms
  avg 32 x index_remap ...................... 57.781250 ms

  testing bc code MIRROR spline degree 3 using Vc
  avg 32 x prefilter:........................ 9.562500 ms
  avg 32 x remap from unsplit coordinates:... 22.406250 ms
  avg 32 x remap with internal spline:....... 35.687500 ms
  avg 32 x index_remap ...................... 21.656250 ms

As can be seen from these test results, using Vc on my system speeds
evaluation up a good deal. When it comes to prefiltering, a lot of time is
spent buffering data to make them available for fast vector processing. The
time spent on actual calculations is much less.
Therefore prefiltering for higher-degree splines doesn't take much more time (when using Vc): :: testing bc code MIRROR spline degree 5 using Vc avg 32 x prefilter:........................ 10.687500 ms testing bc code MIRROR spline degree 7 using Vc avg 32 x prefilter:........................ 13.656250 ms Using double precision arithmetics, vectorization doesn't help so much, and prefiltering is actually slower on my system when using Vc. Doing a complete roundtrip run on your system should give you an idea about which mode of operation best suits your needs. ---------- History ---------- Some years ago, I needed uniform B-splines for a project in python. I looked for code in C which I could easily wrap with cffi_, as I intended to use it with pypy_, and found K. P. Esler's libeinspline_. I proceeded to code the wrapper, which also contained a layer to process Numpy_ nD-arrays, but at the time I did not touch the C code in libeinspline. The result of my efforts is still available from the repository_ I set up at the time. I did not use the code much and occupied myself with other projects, until my interest in B-splines was rekindled sometime in late 2015. I had a few ideas I wanted to try out which would require working with the B-spline C code itself. I started out modifying code in libeinspline, but after some initial progress I felt constricted by the code base in C, the linear algebra approach, the limitation to cubic splines, etc. Being more of a C++ programmer with a fondness for generic programming, I first re-coded the core of libeinspline in C++ (to continue using my python interface), but then, finally, I decided to start afresh in C++, abandon the link to libeinspline and not have a python interface (for now). This is the result of my work. I have chosen the name vspline, because I rely heavily on two libraries staring with 'V', namely Vigra_ and Vc_. .. _cffi: https://cffi.readthedocs.org/en/latest/ .. _pypy: http://pypy.org/ .. _libeinspline: http://einspline.sourceforge.net/ .. _Numpy: http://www.numpy.org/ .. _repository: https://bitbucket.org/kfj/python-bspline kfj-vspline-6e66cf7a7926/basis.h000066400000000000000000000272461320375670700163620ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015 - 2017 by Kay F. Jahnke */ /* */ /* The git repository for this software is at */ /* */ /* https://bitbucket.org/kfj/vspline */ /* */ /* Please direct questions, bug reports, and contributions to */ /* */ /* kfjahnke+vspline@gmail.com */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /*! \file basis.h \brief Code to calculate the value B-spline basis function and it's derivatives. There are several variants in here. First, there is a perfectly general routine, using the Cox-de Boor recursion. While this is 'nice to have', vspline does not actually use it (except as a reference in unit testing). vspline only needs evaluation of the B-spline basis function at multiples of 0.5. With these values it can construct it's evaluators which in turn are capable of evaluating the spline at real coordinates. So next is a specialized routine using an adapted version of the recursion to calculate the basis function's value for integral operands. This isn't used in vspline either - instead vspline uses a third version which abbreviates the recursion by relying on precomputed values for the basis function with derivative 0, which the recursion reaches after as many levels as the requested derivative, so seldom deeper than 2. That makes it very fast. For comparison there is also a routine calculating an approximation of the basis function's value (only derivative 0) by means of a gaussian. This routine isn't currently used in vspline. for a discussion of the b-spline basis function, have a look at http://www.cs.mtu.edu/~shene/COURSES/cs3621/NOTES/spline/B-spline/bspline-basis.html */ #ifndef VSPLINE_BASIS_H #define VSPLINE_BASIS_H // poles.h has precomputed basis function values sampled at n * 1/2. // These values were calculated with the general routine (gen_bspline_basis) // using long doubles for maximum precision. If desired, the process can be // repeated by running prefilter_poles.cc, which first calculates the basis // function values and then the prefilter polesbased on the basis function // values. #include namespace vspline { /// Implementation of the Cox-de Boor recursion formula to calculate /// the value of the bspline basis function. This code is taken from vigra /// but modified to take the spline degree as a parameter. /// /// This code is quite expensive for higer spline orders /// because the routine calls itself twice recursively, so the performance is /// N*N with the spline's degree. Luckily there are ways around using this routine /// at all - whenever we need the b-spline basis function value in vspline, it is /// at multiples of 1/2, and poles.h has precomputed values for all spline /// degrees covered by vspline. I leave the code in here for reference purposes, /// and also for bootstrapping when precalculating the basis function values /// and prefilter poles from scratch, like in prefiler_poles.cc, but even there /// I now use an alternative. Still gen_bspline_basis yields the value of the /// b-spline basis function at arbitrary x, so it should not miss in a b-spline /// library. template < class real_type > real_type gen_bspline_basis ( real_type x , int degree , int derivative ) { if ( degree == 0 ) { if ( derivative == 0 ) return ( x < real_type(0.5) && real_type(-0.5) <= x ) ? 
real_type(1.0) : real_type(0.0) ; else return real_type(0.0); } if ( derivative == 0 ) { real_type n12 = real_type((degree + 1.0) / 2.0); return ( ( n12 + x ) * gen_bspline_basis ( x + real_type(0.5) , degree - 1 , 0 ) + ( n12 - x ) * gen_bspline_basis ( x - real_type(0.5) , degree - 1 , 0 ) ) / degree; } else { --derivative; return gen_bspline_basis ( x + real_type(0.5) , degree - 1 , derivative ) - gen_bspline_basis ( x - real_type(0.5) , degree - 1 , derivative ) ; } } /// this routine is a helper routine to cdb_bspline_basis (below), the /// modified Cox-de Boor recursion formula to calculate the b-spline basis function /// for integral operands, operating in int as long as possible. This is achieved by /// working with 'x2', the doubled x value. Since in the 'real' recursion, the next /// iteration is called with x +/- 1/2, we can call the 'doubled' version with x +/- 1. /// This routine recurses 'all the way down to degree 0, So the result is, disregarding /// arithmetic errors, the same as the result obtained with the general routine. /// This routine is used in prefiler_poles.cc to obtain the b-spline basis function /// values at half unit steps, the result is identical to using gen_bspline_basis /// with long float arguments. template < class real_type > real_type cdb_bspline_basis_2 ( int x2 , int degree , int derivative ) { if ( degree == 0 ) { if ( derivative == 0 ) return ( x2 < 1 && -1 <= x2 ) ? real_type(1.0) : real_type(0.0) ; else return real_type(0.0); } if ( derivative == 0 ) { int n122 = degree + 1 ; return ( ( n122 + x2 ) * cdb_bspline_basis_2 ( x2 + 1 , degree - 1 , 0 ) + ( n122 - x2 ) * cdb_bspline_basis_2 ( x2 - 1 , degree - 1 , 0 ) ) / ( 2 * degree ) ; } else { --derivative; return cdb_bspline_basis_2 ( x2 + 1 , degree - 1 , derivative ) - cdb_bspline_basis_2 ( x2 - 1 , degree - 1 , derivative ) ; } } /// modified Cox-de Boor recursion formula to calculate the b-spline basis function /// for integral operands, delegates to the 'doubled' routine above template < class real_type > real_type cdb_bspline_basis ( int x , int degree , int derivative = 0 ) { return cdb_bspline_basis_2 ( x + x , degree , derivative ) ; } /// see bspline_basis() below! /// this helper routine works with the doubled value of x, so it can serve for calls /// equivalent to basis ( x + .5 ) or basis ( x - .5 ) as basis2 ( 2 * x + 1 ) and /// basis2 ( 2 * x - 1 ). Having precalculated the basis function at .5 steps, we can /// therefore avoid using the general recursion formula. This is a big time-saver /// for high degrees. Note, though, that calculating the basis function for a /// spline's derivatives still needs recursion, with two branches per level. /// So calculating the basis function's value for high derivatives still consumes /// a fair amount of time. template < class real_type > real_type bspline_basis_2 ( int x2 , int degree , int derivative ) { if ( degree == 0 ) { if ( derivative == 0 ) return ( x2 < 1 && -1 <= x2 ) ? 
real_type(1.0) : real_type(0.0) ; else return real_type(0.0); } if ( derivative == 0 ) { if ( abs ( x2 ) > degree ) return real_type ( 0 ) ; // for derivative 0 we have precomputed values: const long double * pk = vspline_constants::precomputed_basis_function_values [ degree ] ; return pk [ abs ( x2 ) ] ; } else { --derivative; return bspline_basis_2 ( x2 + 1 , degree - 1 , derivative ) - bspline_basis_2 ( x2 - 1 , degree - 1 , derivative ) ; } } /// bspline_basis produces the value of the b-spline basis function for /// integral operands, the given degree 'degree' and the desired derivative. /// It turns out that this is all we ever need inside vspline, the calculation /// of the basis function at arbitrary points is performed via the matrix /// multiplication in the weight generating functor, and this functor sets /// it's internal matrix up with bspline basis function values at integral /// locations. /// /// bspline_basis delegates to bspline_basis_2 above, which picks precomputed /// values as soon as derivative becomes 0. This abbreviates the recursion /// a lot, since usually the derivative requested is 0 or a small integer. /// all internal calculations in vspline accessing b-spline basis function /// values are currently using this routine, not the general routine. template < class real_type > real_type bspline_basis ( int x , int degree , int derivative = 0 ) { return bspline_basis_2 ( x + x , degree , derivative ) ; } /// Gaussian approximation to B-spline basis function. This routine /// approximates the basis function of degree spline_degree for real x. /// I checked for all degrees up to 20. The partition of unity quality of the /// resulting reconstruction filter is okay for larger degrees, the cumulated /// error over the covered interval is quite low. Still, as the basis function /// is never actually evaluated in vspline (whenever it's needed, it is needed /// at n * 1/2 and we have precomputed values for that) there is not much point /// in having this function around. I leave the code in for now. template < typename real_type > real_type gaussian_bspline_basis_approximation ( real_type x , int degree ) { real_type sigma = ( degree + 1 ) / 12.0 ; return real_type(1.0) / sqrt ( real_type(2.0 * M_PI) * sigma ) * exp ( - ( x * x ) / ( real_type(2.0) * sigma ) ) ; } } ; // end of namespace vspline #endif // #define VSPLINE_BASIS_H kfj-vspline-6e66cf7a7926/brace.h000066400000000000000000000455361320375670700163370ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015 - 2017 by Kay F. 
Jahnke */ /* */ /* The git repository for this software is at */ /* */ /* https://bitbucket.org/kfj/vspline */ /* */ /* Please direct questions, bug reports, and contributions to */ /* */ /* kfjahnke+vspline@gmail.com */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /*! \file brace.h \brief This file provides code for 'bracing' the spline's coefficient array. Note that this isn't really user code, it's code used by class vspline::bspline. Inspired by libeinspline, I wrote code to 'brace' the spline coefficients. The concept is this: while the IIR filter used to calculate the coefficients has infinite support (though arithmetic precision limits this in real-world applications), the evaluation of the spline at a specific location only looks at a small window of coefficients (compact, finite support). This fact can be exploited by taking note of how large the support area is and providing a few more coefficients in a frame around the 'core' coefficients to allow the evaluation to proceed without having to check for boundary conditions. While the difference is not excessive (the main computational cost is the actual evaluation itself), it's still nice to be able to code the evaluation without boundary checking, which makes the code very straightforward and legible. There is another aspect to bracing: In my implementation of vectorized evaluation, the window into the coefficient array used to pick out coefficients to evaluate at a specific location is coded as a set of offsets from it's 'low' corner. This way, several such windows can be processed in parallel. This mechanism can only function efficiently in a braced coefficient array, since it would otherwise have to give up if any of the windows accessed by the vector of coordinates had members outside the (unbraced) coefficient array and submit the coordinate vector to individual processing. I consider the logic to code this and the loss in performance too much of a bother to go down this path; all my evaluation code uses braced coefficient arrays. Of course the user is free to omit bracing, but then they have to use their own evaluation code. What's in the brace? Of course this depends on the boundary conditions chosen. 
In vspline, I offer code for several boundary conditions, but most have something in common: the original, finite sequence is extrapolated into an infinite periodic signal. With straight PERIODIC boundary conditions, the initial sequence is immediately followed and preceded by copies of itself. The other boundary conditions mirror the signal in some way and then repeat the mirrored signal periodically. Using boundary conditions like these, both the extrapolated signal and the coefficients share the same periodicity and mirroring. There are two ways of arriving at a braced coeffcient array: We can start from the extrapolated signal, pick a section large enough to make margin effects vanish (due to limited arithmetic precision), prefilter it and pick out a subsection containing the 'core' coefficients and their support. Alternatively, we can work only on the core coefficients, calculate suitable initial causal and anticausal coeffcients (where the calculation considers the extrapolated signal, which remains implicit), apply the filter and *then* surround the core coefficient array with more coeffcients (the brace) following the same extrapolation pattern as we imposed on the signal, but now on the coefficients rather than on the initial knot point values. The bracing can be performed without any solver-related maths by simply copying (possibly trivially modified) slices of the core coefficients to the margin area. Following the 'implicit' scheme, my default modus operandi braces after the prefiltering. Doing so, it is posible to calculate the inital causal and anticausal coefficient for the prefilter exactly. But this exactness is still, eventually, subject to discretization and can only be represented after quantization. If, instead, we prefilter a suitably extrapolated signal, now with arbitrary boundary conditions, the margin effects will vanish towards the center (due to the characteristics of the filter), and the 'core' coefficients will end up the same as in the first approach. So we might as well extrapolate the signal 'far enough', pick any boundary conditions we like (even zero padding), prefilter, and discard the margin outside the area which is unaffected by margin effects. The result is, within arithmetic precision, the same. Both approaches have advantages and disadvantages: Implicit extrapolation needs less memory - we only need to provide storage for the core coeffcients, which is just as much as we need for the original signal, so we can operate in-place. The disadvantage of the implicit scheme is that we have to capture the implicit extrapolation in code to calculate the initial causal/anticausal coefficients, which is non-trivial and requires separate routines for each case, as can be seen in my prefiltering code. And if, after prefiltering, we want to brace the core coeffcients for efficient evaluation, we still need additional memory, which, if it hasn't been allocated around the core before prefiltering, even requires us to copy the data out into a larger memory area. Explicit extrapolation needs more memory. A typical scheme would be to anticipate the space needed for the explicit extrapolation, allocate enough memory for the extrapolated signal, place the same into the center of the allocated memory, perform the extrapolation and then prefilter. The advantage is that we can run the prefilter with arbitrary initial causal/anticausal coefficients. No matter what the extrapolation looks like, we can always use the same code. 
And we can extrapolate in any way we see fit, without having to produce code to deal with our choice. If we pick the frame of extrapolated values large enough, we can even pick out the 'braced' coefficient array from the result of the filter. Obviously, there is no one 'right' way of doing this. Offered several choices of implicit extrapolation, the user can choose between the implicit and explicit scheme. The code in this file is useful for both choices: for the implicit scheme, bracing is applied after prefiltering to enable evaluation with vspline. For the explicit scheme, bracing may be used on the original data before prefiltering with arbitrary boundary conditions, if the user's extrapolation scheme is covered by the code given here. When using the higher-level access methods (via bspline objects), using the explicit or implicit scheme becomes a matter of passing in the right flag, so at this level, a deep understanding of the extrapolation mechanism isn't needed at all. I use the implicit scheme as the default. Since the bracing mainly requires copying data or trivial maths we can do the operations on higher-dimensional objects, like slices of a volume. To efficiently code these operations we make use of vigra's multi-math facility and it's bindAt array method, which makes these subarrays easily available. TODO: while this is convenient, it's not too fast, as it's neither multithreaded nor vectorized. Still in most 'normal' scenarios the execution time is negligible... TODO: there are 'pathological' cases where one brace is larger than the other brace and the width of the core together. These cases can't be handled for all bracing modes and will result in an exception. */ #ifndef VSPLINE_BRACE_H #define VSPLINE_BRACE_H #include #include #include "common.h" namespace vspline { /// class bracer encodes the entire bracing process. Note that contrary /// to my initial implementation, class bracer is now used exclusively /// for populating the frame around a core area of data. It has no code /// to determine which size a brace/frame should have. This is now /// determined in class bspline, see especially class bspline's methods /// get_left_brace_size(), get_right_brace_size() and setup_metrics(). template < class view_type > struct bracer { typedef typename view_type::value_type value_type ; enum { dimension = view_type::actual_dimension } ; /// for spherical images, we require special treatment for two-dimensional /// input data, because we need to shift the values by 180 degrees, or half /// the margin's width. But to compile, we also have to give a procedure /// for the other cases (not 2D), so this is first: template < typename value_type > void shift_assign ( value_type target , value_type source ) { // should not ever get used, really... } /// specialized routine for the 2D case (the slice itself is 1D) template < typename value_type > void shift_assign ( MultiArrayView < 1 , value_type > target , MultiArrayView < 1 , value_type > source ) { // bit sloppy here, with pathological data (very small source.size()) this will // be imprecise for odd sizes, for even sizes it's always fine. But then full // sphericals always have size 2N * N, so odd sizes should not occur at all for dim 0 // TODO what if it's not a full spherical? 
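      // NOTE: the plain assignment and early return directly below
      // short-circuit this routine; the shifting loop further down
      // is currently not reached.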
target = source ; return ; auto si = source.begin() + source.size() / 2 ; auto se = source.end() ; for ( auto& ti : target ) { ti = *si ; ++si ; if ( si >= se ) si = source.begin() ; } } /// apply the bracing to the array, performing the required copy/arithmetic operations /// to the 'frame' around the core. This routine performs the operation along axis dim. /// This is also the routine to be used for explicitly extrapolating a signal: /// you place the data into the center of a larger array, and pass in the sizes of the /// 'empty' space which is to be filled with the extrapolated data. /// /// the bracing is done one-left-one-right, to avoid corner cases as best as posible. void apply ( view_type & a , // containing array bc_code bc , // boundary condition code int lsz , // space to the left which needs to be filled int rsz , // ditto, to the right int axis ) // axis along which to apply bracing { int w = a.shape ( axis ) ; // width of containing array along axis 'axis' int m = w - ( lsz + rsz ) ; // width of 'core' array if ( m < 1 ) // has to be at least 1 throw shape_mismatch ( "combined brace sizes must be at least one less than container size" ) ; if ( ( lsz > m + rsz ) || ( rsz > m + lsz ) ) { // not enough data to fill brace if ( bc == PERIODIC || bc == NATURAL || bc == MIRROR || bc == REFLECT ) throw std::out_of_range ( "each brace must be smaller than the sum of it's opposite brace and the core's width" ) ; } int l0 = lsz - 1 ; // index of innermost empty slice on the left; like begin() int r0 = lsz + m ; // ditto, on the right int lp = l0 + 1 ; // index of leftmost occupied slice (p for pivot) int rp = r0 - 1 ; // index of rightmost occupied slice int l1 = -1 ; // index one before outermost empty slice to the left int r1 = w ; // index one after outermost empty slice on the right; like end() int lt = l0 ; // index to left target slice int rt = r0 ; // index to right target slice ; int ls , rs ; // indices to left and right source slice, will be set below int ds = 1 ; // step for source index, +1 == forẃard, used for all mirroring modes // for periodic bracing, it's set to -1. switch ( bc ) { case PERIODIC : { ls = l0 + m ; rs = r0 - m ; ds = -1 ; // step through source in reverse direction break ; } case NATURAL : case MIRROR : { ls = l0 + 2 ; rs = r0 - 2 ; break ; } case CONSTANT : case SPHERICAL : case REFLECT : { ls = l0 + 1 ; rs = r0 - 1 ; break ; } case ZEROPAD : { break ; } case IDENTITY : { // these modes perform no bracing, return prematurely return ; } default: { cerr << "bracing for BC code " << bc_name[bc] << " is not supported" << endl ; break ; } } for ( int i = max ( lsz , rsz ) ; i > 0 ; --i ) { if ( lt > l1 ) { switch ( bc ) { case PERIODIC : case MIRROR : case REFLECT : { // with these three bracing modes, we simply copy from source to target a.bindAt ( axis , lt ) = a.bindAt ( axis , ls ) ; break ; } case NATURAL : { // here, we subtract the source slice from twice the 'pivot' // easiest would be: // a.bindAt ( axis , lt ) = a.bindAt ( axis , lp ) * value_type(2) - a.bindAt ( axis , ls ) ; // but this fails in 1D TODO: why? 
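            // as a workaround, the calculation is done in separate
            // steps on a bound view: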
auto target = a.bindAt ( axis , lt ) ; // get a view to left target slice target = a.bindAt ( axis , lp ) ; // assign value of left pivot slice target *= value_type(2) ; // double that target -= a.bindAt ( axis , ls ) ; // subtract left source slice break ; } case CONSTANT : { // here, we repeat the 'pivot' slice a.bindAt ( axis , lt ) = a.bindAt ( axis , lp ) ; break ; } case ZEROPAD : { // fill with 0 a.bindAt ( axis , lt ) = value_type() ; break ; } case SPHERICAL : // needs special treatment { shift_assign ( a.bindAt ( axis , lt ) , a.bindAt ( axis , ls ) ) ; break ; } default : // default: leave untouched break ; } --lt ; ls += ds ; } if ( rt < r1 ) { // essentially the same, but with rs instead of ls, etc. switch ( bc ) { case PERIODIC : case MIRROR : case REFLECT : { // with these three bracing modes, we simply copy from source to target a.bindAt ( axis , rt ) = a.bindAt ( axis , rs ) ; break ; } case NATURAL : { // here, we subtract the source slice from twice the 'pivot' // the easiest would be: // a.bindAt ( axis , rt ) = a.bindAt ( axis , rp ) * value_type(2) - a.bindAt ( axis , rs ) ; // but this fails in 1D TODO: why? auto target = a.bindAt ( axis , rt ) ; // get a view to right targte slice target = a.bindAt ( axis , rp ) ; // assign value of pivot slice target *= value_type(2) ; // double that target -= a.bindAt ( axis , rs ) ; // subtract source slice break ; } case CONSTANT : { // here, we repeat the 'pivot' slice a.bindAt ( axis , rt ) = a.bindAt ( axis , rp ) ; break ; } case ZEROPAD : { // fill with 0 a.bindAt ( axis , rt ) = value_type() ; break ; } case SPHERICAL : // needs special treatment { shift_assign ( a.bindAt ( axis , rt ) , a.bindAt ( axis , rs ) ) ; break ; } default : // default: leave untouched break ; } ++rt ; rs -= ds ; } } } /// This variant of apply braces along all axes in one go. static void apply ( view_type& a , ///< target array, containing the core and (empty) frame vigra::TinyVector < bc_code , dimension > bcv , ///< boundary condition codes vigra::TinyVector < int , dimension > left_corner , ///< sizes of left braces vigra::TinyVector < int , dimension > right_corner ) ///< sizes of right braces { for ( int dim = 0 ; dim < dimension ; dim++ ) apply ( a , bcv[dim] , left_corner[dim] , right_corner[dim] , dim ) ; } } ; } ; // end of namespace vspline #endif // VSPLINE_BRACE_H kfj-vspline-6e66cf7a7926/bspline.h000066400000000000000000001347671320375670700167240ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015 - 2017 by Kay F. Jahnke */ /* */ /* The git repository for this software is at */ /* */ /* https://bitbucket.org/kfj/vspline */ /* */ /* Please direct questions, bug reports, and contributions to */ /* */ /* kfjahnke+vspline@gmail.com */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. 
*/ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /*! \file bspline.h \brief defines class bspline class bspline is an object to contain a b-spline's coefficients and some metadata in one handy package. It also provides easy access to b-spline prefiltering. The idea is that user code establishes a bspline object representing the data at hand and then proceeds to create 'evaluators' to evaluate the spline. You may be reminded of SciPy's bisplrep object, and I admit that SciPy's bspline code has been one of my inspirations. It attempts to do 'the right thing' by automatically creating suitable helper objects and parametrization so that the spline does what it's supposed to do. Most users will not need anything else, and using class bspline is quite straightforward. It's quite possible to have a b-spline up and running with a few lines of code without even having to make choices concerning it's parametrization, since there are sensible defaults for everything. At the same time, pretty much everything *can* be parametrized even at this level. bspline objects can be used without any knowledge of their internals, e.g. as parameters to the remap functions. Note that class bspline does not provide evaluation of the spline. To evaluate, objects of class evaluator (see eval.h) are used, which construct from a bspline object with additional parameters, like, whether to calculate the spline's value or it's derivative(s) or whether to use optimizations for special cases. While using 'raw' coefficient arrays with an evaluation scheme which applies boundary conditions is feasible and most memory-efficient, it's not so well suited for very fast evaluation, since the boundary treatment needs conditionals, and 'breaks' uniform access, which is especially detrimental when using vectorization. So vspline uses coefficient arrays with a few extra coefficients 'framing' or 'bracing' the 'core' coefficients. Since evaluation of the spline looks at a certain small section of coefficients (the evaluator's 'support'), the bracing is chosen so that this lookup will always succeed without having to consider boundary conditions: the brace is set up to make the boundary conditions explicit, and the evaluation can proceed blindly without bounds checking. With large coefficient arrays, the little extra space needed for the brace becomes negligible, but the code for evaluation becomes much faster and simpler. In effect, 'bracing' is taken a little bit further than merely providing enough extra coefficients to cover the support: additional coefficients are produced to allow for the spline to be evaluated without bounds checking - at the lower and upper limit of the spline's defined range - and even slightly beyond those limits This makes the code more robust: being very strict about the ends of the defined range can easily result in quantization errors producing out-of-bounds access to the coefficient array, so the slightly wider brace acts as a safeguard. 
class bspline handles two views to the coefficients it operates on, these are realized as vigra::MultiArrayViews, and they share the same storage: - the 'core', which is a view to an array of data precisely the same shape as the knot point data over which the spline is calculated. - 'container', which surrounds the view above with an additional frame of coefficients used for 'bracing', for additional coefficints needed for using the 'explicit' scheme of extrapolation before prefiltering, and as extra 'headroom' if 'shifting' the spline is intended. Using class bspline, there is a choice of 'strategy'. The simplest strategy is 'UNBRACED'. With this strategy, after putting the knot point data into the bspline's 'core' area and calling prefilter(), the core area will contain the b-spline coefficients. The resulting b-spline object can't be safely evaluated with the code in eval.h. this mode of operation is intended for users who want to do their own processing of the coefficients and don't need the code in eval.h. prefiltering is done using an implicit scheme as far as the boundary conditions are concerned. The 'standard' strategy is 'BRACED'. Here, after prefiltering, the container in the bspline object will contain the b-spline coefficients, surrounded by a 'brace' of coefficients which allows code in eval.h to process them without special treatment for values near the border (the brace covers what support is needed by marginal coefficients). Again, an implicit scheme is used. The third strategy, 'EXPLICIT', extrapolates the knot point data in the 'core' area sufficiently far to suppress margin effects when the prefiltering is performed without initial coefficient calculation. If the 'frame' of extrapolated data is large enough, the result is just the same. The inner part of the frame is taken as the brace, so no bracing needs to be performed explicitly. The resulting b-spline object will work with vspline's evaluation code. Note that the explicit scheme uses 'GUESS' boundary conditions on the (framed) array, which tries to minimize margin effects further. Also note that the additional memory needed for the 'frame' will be held throughout the bspline object's life, the only way to 'shrink' the coefficient array to the size of the braced or core coefficients is by copying them out to a smaller array. The fourth strategy, 'MANUAL', is identical to 'EXPLICIT', except that automatic extrapolation of the core data to the frame is not performed. Instead, this strategy relies on the user to fill the frame with extrapolated data. This is to allow for the user to apply custom extrapolation schemes. The procedure would be to create the bspline object, fill the core, apply the extrapolation, then call prefilter. Probably the most common scenario is that the source data for the spline are available from someplace like a file. Instead of reading the file's contents into memory first and passing the memory to class bspline, there is a more efficient way: a bspline object is set up first, with the specification of the size of the incoming data and the intended mode of operation. The bspline object allocates the memory it will need for the purpose, but doesn't do anything else. The 'empty' bspline object is then 'filled' by the user by putting data into it's 'core' area. Subsequently, prefilter() is called, which converts the data to b-spline coefficients. This way, only one block of memory is used throughout, the initial data are overwritten by the coefficients, operation is in-place and most efficient. 
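    In code, this most common pattern boils down to something like the
    following sketch. my_shape and my_data are placeholders here; since only
    the shape is passed to the constructor, the defaults apply and the result
    is a braced cubic spline with MIRROR boundary conditions:

        // create the bspline object; it allocates storage for the coefficients
        vspline::bspline < float , 2 > bsp ( my_shape ) ;

        // fill the 'core' area with the knot point data (e.g. from a file)
        bsp.core = my_data ;

        // convert the knot point data to b-spline coefficients, in-place
        bsp.prefilter() ;

        // now evaluator objects (see eval.h) can be constructed from bsp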
If this pattern can't be followed, there are alternatives: - if a view to an array at least the size of the container array is passed into bspline's constructor, this view is 'adopted' and all operations will use the data it refers to. The caller is responsible for keeping these data alive while the bspline object exists, and relinquishes control over the data, which may be changed by the bspline object. - if data are passed to prefilter(), they will be taken as containing the knot point data, rather than expecting the knot point data to be in the bspline oject's memory already. This can also be used to reuse a bspline object with new data. The data passed in will not be modified. This is most efficient when using an implicit scheme; when used together with EXPLICIT, the data are (automatically) copied into the core area before prefiltering, which is unnecessary with the implicit schemes - they can 'pull in' data in the course of their operation. While there is no explicit code to create a 'smoothing spline' - a b-spline evaluating the source data without prefiltering them - this can be achieved simply by creating a b-spline object with spline degree 0 and 'shifting' it to the desired degree for evaluation. Note that you'll need the EXPLICIT strategy for the purpose, or specify extra 'headroom', because otherwise the spline won't have enough 'headroom' for shifting. If stronger smoothing is needed, this can be achieved with the code in filter.h, passing in appropriate pole values. a single-pole filter with a positive pole in ] 0 , 1 [ will do the trick - the larger the pole, the stronger the smoothing. Note that smoothing with large pole values will need a large 'horizon' as well to handle the margins properly. This is what's used when class bspline's creator is called with a 'smoothing' paramater greater than 0. With shifting, you can also create a 'poor man's pyramid'. While using no additional storage, you can extract smoothed data from the spline by shifting it up. This only goes so far, though, because even a degree-20 b-spline reconstruction kernel's equivalent gaussian doesn't have a very large standard deviation, and evaluation times become very long. From the gaussian approximation of the b-spline basis function, it can be seen that the equivalent gaussian's standard deviation is ( degree + 1 ) / 12.0, so a quintic spline will have a standard deviation of 0.5 only. */ #ifndef VSPLINE_BSPLINE_H #define VSPLINE_BSPLINE_H #include "prefilter.h" #include "brace.h" namespace vspline { /// This enumeration is used to determine the prefiltering scheme to be used. typedef enum { UNBRACED , ///< implicit scheme, no bracing applied BRACED , ///< implicit scheme, bracing will be applied EXPLICIT , ///< explicit scheme, frame with extrapolated signal, brace MANUAL ///< like explicit, but don't frame before filtering } prefilter_strategy ; const std::string pfs_name[] = { "UNBRACED" , "BRACED " , "EXPLICIT" , "MANUAL " } ; /// struct bspline is a convenience class which bundles a coefficient array (and it's creation) /// with a set of metadata describing the parameters used to create the coefficients and the /// resulting data. I have chosen to implement class bspline so that there is only a minimal /// set of template arguments, namely the spline's data type (like pixels etc.) and it's dimension. /// All other parameters relevant to the spline's creation are passed in at construction time. 
/// This way, if explicit specialization becomes necessary (like, to interface to code which /// can't use templates) the number of specializations remains manageable. This design decision /// pertains specifically to the spline's degree, which could also be implemented as a template /// argument, allowing for some optimization by making some members static. Yet going down this /// path requires explicit specialization for every spline degree used and the performance gain /// I found doing so was hardly measurable, while automatic testing became difficult and compilation /// times grew. /// /// I chose making bspline a struct for now, but messing with the data inside is probably /// not a good idea. /// /// class bspline may or may not 'own' the coefficient data it refers to - this /// depends on the specific initialization used, but is handled privately by /// class b-spline, using a shared_ptr to the data if they are owned, which /// makes bspline objects trivially copyable. /// /// There are two views in class bspline: the view 'core' corresponds with the /// knot point values (aka the original data the spline was constructed over). /// The view 'container' contains 'core' and an additional 'frame' around it /// which allows for evaluating in-bounds coordinates without bounds checking. /// For splines constructed with EXPLICIT prefilter strategy and/or additional /// 'headroom' this frame may be even wider. We have a simple relationship: /// core.shape() + left_frame + right_frame == container.shape() template < class _value_type , int _dimension > struct bspline { typedef _value_type value_type ; /// pull the template arg into an enum enum { dimension = _dimension } ; /// if the coefficients are owned, this array holds the data typedef vigra::MultiArray < dimension , value_type > array_type ; /// data are read and written to vigra MultiArrayViews typedef vigra::MultiArrayView < dimension , value_type > view_type ; /// multidimensional index type typedef typename view_type::difference_type shape_type ; /// nD type for one boundary condition per axis typedef typename vigra::TinyVector < bc_code , dimension > bcv_type ; /// elementary type of value_type, like float or double typedef typename ExpandElementResult < value_type >::type real_type ; enum { channels = ExpandElementResult < value_type >::size } ; typedef bspline < real_type , dimension > channel_view_type ; // for internal calculations in the filter, we use the elementary type of value_type. // Note how in class bspline we make very specific choices about the // source data type, the target data type and the type used for arithmetics: we use // the same value_type for source and target array and perform the arithmetics with // this value_type's elementary type. The filter itself is much more flexible; all of // the three types can be different, the only requirements are that the value_types // must be vigra element-expandable types with an elementary type that can be cast // to and from math_type, and math_type must be a real data type, with the additional // requirement that it can be vectorized by VC if Vc is used. typedef real_type math_type ; // arbitrary, can use float or double here. private: // _p_coeffs points to a vigra::MultiArray, which is either default-initialized // and contains no data, or holds data which is used by 'container'. Using a // std::shared_ptr here has the pleasant side-effect that class bspline objects // can use the default copy and assignment operators. 
std::shared_ptr < array_type > _p_coeffs ; prefilter_strategy strategy ; public: const bcv_type bcv ; ///< boundary conditions, see common.h int spline_degree ; ///< degree of the spline (3 == cubic spline) double tolerance ; ///< acceptable error double smoothing ; ///< E ] 0 : 1 [; apply smoothing to data before prefiltering bool braced ; ///< whether coefficient array is 'braced' or not int horizon ; ///< additional frame size for explicit scheme view_type container ; ///< view to container array view_type core ; ///< view to the core part of the coefficient array shape_type left_frame ; ///< total width(s) of the left handside frame shape_type right_frame ; ///< total width(s) of the right handside frame /// lower_limit returns the lower bound of the spline's defined range. /// This is usually 0.0, but with REFLECT boundary condition it's -0.5, /// the lower point of reflection. The lowest coordinate at which the /// spline can be accessed may be lower: even splines have wider support, /// and splines with extra headroom add even more room to manoevre. // TODO might introduce code to provide the 'technical limits' long double lower_limit ( const int & axis ) const { long double limit = 0.0L ; if ( bcv [ axis ] == vspline::REFLECT ) limit = -0.5L ; else if ( bcv [ axis ] == vspline::SPHERICAL ) limit = -0.5L ; return limit ; } vigra::TinyVector < long double , dimension > lower_limit() const { vigra::TinyVector < double , dimension > limit ; for ( int d = 0 ; d < dimension ; d++ ) limit[d] = lower_limit ( d ) ; return limit ; } /// upper_limit returns the upper bound of the spline's defined range. /// This is normally M - 1 if the shape for this axis is M. Splines with /// REFLECT boundary condition use M - 0.5, the upper point of reflection, /// and periodic splines use M. The highest coordinate at which the spline /// may be accessed safely may be higher. long double upper_limit ( const int & axis ) const { long double limit = core.shape ( axis ) - 1 ; if ( bcv [ axis ] == vspline::REFLECT ) limit += 0.5L ; else if ( bcv [ axis ] == vspline::SPHERICAL ) limit += 0.5L ; else if ( bcv [ axis ] == vspline::PERIODIC ) limit += 1.0L ; return limit ; } vigra::TinyVector < long double , dimension > upper_limit() const { vigra::TinyVector < long double , dimension > limit ; for ( int d = 0 ; d < dimension ; d++ ) limit[d] = upper_limit ( d ) ; return limit ; } /// get_left_brace_size and get_right_brace_size calculate the size of /// the brace vspline puts around the 'core' coefficients to allow evaluation /// inside the defined range (and even slightly beyond) without bounds /// checking. These routines are static to allow user code to establish /// vspline's bracing requirements without creating a bspline object. /// user code might use this information to generate coefficient arrays /// suitable for use with vspline evaluation code, sidestepping use of /// a bspline object. static shape_type get_left_brace_size ( int spline_degree , bcv_type bcv ) { int support = spline_degree / 2 ; // we start out with left_brace as large as the support // of the reconstruction kernel shape_type left_brace ( support ) ; // for some situations, we extend the array further along a specific axis for ( int d = 0 ; d < dimension ; d++ ) { // If the spline is done with REFLECT or SPHERICAL // boundary conditions, the lower and upper limits are // between bounds. // the lower limit in this case is -0.5. When using // floor() or round() on this value, we receive -1, // so we need to extend the left brace. 
// if rounding could be done so that -0.5 is rounded // towards zero, this brace increase could be omitted // for even splines, but this would also bring operation // close to dangerous terrain: an infinitesimal undershoot // would already produce an out-of-bounds access. if ( bcv[d] == vspline::REFLECT || bcv[d] == vspline::SPHERICAL ) { left_brace[d] ++ ; } // for other boundary conditions, the lower limit is 0. // for odd splines, // as long as evaluation is at positions >= 0 this is // okay, but as soon as evaluation is tried with a value // even just below 0, we'd have an out-of-bounds access, // with potential memory fault. Rather than requiring // evaluation to never undershoot, we err on the side // of caution and extend the left brace, so that // quantization errors won't result in a crash. // This is debatable and could be omitted, if it can // be made certain that evaluation will never be tried // at values even infinitesimally below zero. // for even splines, this problem does not exist, since // coordinate splitting is done with std::round. else if ( spline_degree & 1 ) { left_brace[d]++ ; } } return left_brace ; } static shape_type get_right_brace_size ( int spline_degree , bcv_type bcv ) { int support = spline_degree / 2 ; // we start out with right_brace as large as the support // of the reconstruction kernel shape_type right_brace ( support ) ; // for some situations, we extend the array further along a specific axis for ( int d = 0 ; d < dimension ; d++ ) { // If the spline is done with REFLECT or SPHERICAL // boundary conditions, the lower and upper limits are // between bounds. // So the upper limit is Z + 0.5 where Z is integer. // using floor on this value lands at Z, which fine, // but using round() (as is done for even splines) // lands at Z+1, so for this case we need to extend // the right brace. If we could use a rounding mode // rounding towards zero, we could omit this extension, // but we'd also be cutting it very fine. if ( bcv[d] == vspline::REFLECT || bcv[d] == vspline::SPHERICAL ) { if ( ! ( spline_degree & 1 ) ) right_brace[d] ++ ; } // The upper limit is M-1 for most splines, and M-1+0.5 for // splines with REFLECT and SPHERICAL BCs. When accessing // the spline at this value, we'd be out of bounds. // For splines done with REFLECT and SPHERICAL BCs, we have // to extend the right brace to allow access to coordinates // in [M-1,M-1+0.5], there is no other option. // For other splines, We could require the evaluation code // to check and split incoming values of M-1 to M-2, 1.0, but this // would require additional inner-loop code. So we add another // coefficient on the upper side for these as well. // This is debatable, but with the current implementation of the // evaluation it's necessary. // So, erring on the side of caution, we add the extra coefficient // for all odd splines. if ( spline_degree & 1 ) { right_brace[d]++ ; } // periodic splines need an extra coefficient on the upper // side, to allow evaluation in [M-1,M]. This interval is // implicit in the original data since the value at M is // equal to the value at 0, but if we want to process without // bounds checking and index manipulations, we must provide // an extra coefficient. if ( bcv[d] == vspline::PERIODIC ) { right_brace[d]++ ; } } return right_brace ; } /// construct a bspline object with appropriate storage space to contain and process an array /// of knot point data with shape core.shape. 
Depending on the strategy chosen and the other
  /// parameters passed, more space than core.shape may be allocated. Once the bspline object
  /// is ready, usually it is filled with the knot point data and then the prefiltering needs
  /// to be done. This sequence assures that the knot point data are present in memory only once;
  /// the prefiltering is done in-place. So the user can create the bspline, fill in data (like,
  /// from a file), prefilter, and then evaluate.
  ///
  /// Alternatively, if the knot point data are already manifest elsewhere, they can be passed
  /// to prefilter(). With this mode of operation, they are 'pulled in' during prefiltering.
  ///
  /// It's possible to pass in a view to an array providing space for the coefficients,
  /// or even the coefficients themselves. This is done via the parameter _space. This has
  /// to be an array of the same or larger shape than the container array would end up having,
  /// given all the other parameters. This view is then 'adopted' and subsequent processing
  /// will operate on its data.
  ///
  /// With the EXPLICIT scheme, the horizon is set by default to a value which is
  /// deemed to be 'sufficiently large' to keep the error 'low enough'. The expression
  /// used here produces a frame which is roughly the size needed to make any margin
  /// effects vanish by the time the prefilter hits the core, but it's a bit 'rule of thumb'.
  ///
  /// The additional parameter 'headroom' is used to make the 'frame' even wider. This is
  /// needed if the spline is to be 'shifted' up (evaluated as if it had been prefiltered
  /// with a higher-degree prefilter) - see shift(). If extreme precision is not an issue,
  /// when using the explicit extrapolation scheme, extra headroom for shifting is not
  /// necessary, since the data in the 'frame' are 'good enough' for the purpose.
  ///
  /// While bspline objects allow very specific parametrization, most use cases won't use
  /// parameters beyond the first few. The only mandatory parameter is, obviously, the
  /// shape of the knot point data, the original data which the spline is built over.
  /// This shape 'returns' as the bspline object's 'core' shape. If this is the only
  /// parameter passed to the constructor, the resulting bspline object will be a
  /// cubic b-spline with mirror boundary conditions, generated with an implicit
  /// extrapolation scheme to a 'good' quality, without smoothing, allocating its own
  /// storage for the coefficients. The resulting bspline object will be suitable for
  /// use with vspline's evaluation code.

  // TODO: when bracing/framing is applied, we might widen the array size to a
  // multiple of the Vc::Vector's Size for the given data type to have better-aligned
  // access. This may or may not help, has to be tested. We might also want to position
  // the origin of the brace at an aligned position, since evaluation is based there.

  // TODO: while the choice to keep the value_types and math_type closely related makes
  // for simple code, with the more flexible formulation of the prefiltering code we might
  // widen class bspline's scope to accept input of other types and/or use a different
  // math_type.
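  // A quick usage sketch (illustrative only, not part of the original documentation;
  // the shape and the name 'my_data' are made up): create a 2D spline over float
  // data with all defaults, fill the core, then prefilter:
  //
  //   vspline::bspline < float , 2 > bspl ( vigra::Shape2 ( 100 , 200 ) ) ;
  //   bspl.core = my_data ; // my_data: a MultiArrayView with the same shape
  //   bspl.prefilter() ;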
bspline ( shape_type _core_shape , ///< shape of knot point data int _spline_degree = 3 , ///< spline degree with reasonable default bcv_type _bcv = bcv_type ( MIRROR ) , ///< boundary conditions and common default prefilter_strategy _strategy = BRACED , ///< default strategy is the 'implicit' scheme int _horizon = -1 , ///< width of frame for explicit scheme (-1: auto) double _tolerance = -1.0 , ///< acceptable error (relative to unit pulse) (-1: auto) double _smoothing = 0.0 , ///< apply smoothing to data before prefiltering int headroom = 0 , ///< additional headroom, for 'shifting' view_type _space = view_type() ///< coefficient storage to 'adopt' ) : spline_degree ( _spline_degree ) , bcv ( _bcv ) , smoothing ( _smoothing ) , strategy ( _strategy ) , tolerance ( _tolerance ) { if ( _tolerance < 0.0 ) { // heuristic: 'reasonable' defaults // TODO derive from if ( std::is_same < real_type , float > :: value ) tolerance = .000001 ; else if ( std::is_same < real_type , double > :: value ) tolerance = .0000000000001 ; else tolerance = 0.0000000000000000001 ; } // heuristic: horizon for reasonable precision - we assume that no one in their right // minds would want a negative horizon ;) real_type max_pole = .1 ; if ( spline_degree > 1 ) max_pole = fabs ( vspline_constants::precomputed_poles [ spline_degree ] [ 0 ] ) ; if ( smoothing > max_pole ) max_pole = smoothing ; if ( _horizon < 0 ) horizon = ceil ( log ( tolerance ) / log ( max_pole ) ) ; // TODO what if tolerance == 0.0? else horizon = _horizon ; // whatever the user specifies // first, calculate the shapes and sizes used internally shape_type container_shape = _core_shape ; left_frame = right_frame = shape_type() ; switch ( strategy ) { case UNBRACED: case MANUAL: braced = false ; break ; case EXPLICIT: // for EXPLICIT extrapolation, we extend the coefficients // by 'horizon' so the filter can 'run up to precision'. // apart from that, everything is the same as for BRACED // prefilter strategy, so we fall through to that case. left_frame = horizon ; right_frame = horizon ; case BRACED: // next we add the size of the 'brace' which is needed to // perform evaluation without bounds checking, and any // additional headroom the caller has requested. braced = true ; left_frame += get_left_brace_size ( spline_degree , bcv ) ; left_frame += headroom ; right_frame += get_right_brace_size ( spline_degree , bcv ) ; right_frame += headroom ; // the container shape is the core shape plus the frame container_shape += left_frame + right_frame ; break ; } // now either adopt external memory or allocate memory for the coefficients if ( _space.hasData() ) { // caller has provided space for the coefficient array. This space // has to be at least as large as the container_shape we have // determined to make sure it's compatible with the other parameters. // With the array having been provided by the caller, it's the caller's // responsibility to keep the data alive as long as the bspline object // is used to access them. if ( ! ( allGreaterEqual ( _space.shape() , container_shape ) ) ) throw shape_mismatch ( "the intended container shape does not fit into the shape of the storage space passed in" ) ; // if the shape matches, we adopt the data in _space. // We take a view to the container_shape-sized subarray only. container = _space.subarray ( shape_type() , container_shape ) ; // _p_coeffs is made to point to a default-constructed MultiArray, // which holds no data. 
_p_coeffs = std::make_shared < array_type >() ; } else { // _space was default-constructed and has no data. // in this case we allocate a container array and hold a shared_ptr // to it. so we can copy bspline objects without having to worry about // dangling pointers, or who destroys the array. _p_coeffs = std::make_shared < array_type > ( container_shape ) ; // container is made to refer to a view to this array. container = *_p_coeffs ; } // finally we set the view to the core area core = container.subarray ( left_frame , left_frame + _core_shape ) ; } ; /// get a bspline object for a single channel of the data. This is lightweight /// and requires the viewed data to remain present as long as the channel view is used. /// the channel view inherits all metrics from it's parent, only the MultiArrayViews /// to the data are different. channel_view_type get_channel_view ( const int & channel ) { assert ( channel < channels ) ; real_type * base = (real_type*) ( container.data() ) ; base += channel ; auto stride = container.stride() ; stride *= channels ; MultiArrayView < dimension , real_type > channel_container ( container.shape() , stride , base ) ; return channel_view_type ( core.shape() , spline_degree , bcv , strategy , horizon , tolerance , smoothing , 0 , channel_container // coefficient storage to 'adopt' ) ; } ; /// prefilter converts the knot point data in the 'core' area into b-spline /// coefficients. Depending on the strategy chosen in the b-spline object's /// constructor, bracing/framing may be applied. Even if the degree of the /// spline is zero or one, prefilter() should be called because it also /// performs the bracing, if needed. /// /// If data are passed in, they have to have precisely the shape /// we have set up in core (_core_shape passed into the constructor). /// These data will then be used in place of any data present in the /// bspline object to calculate the coefficients. They won't be looked at /// after prefilter() terminates, so it's safe to pass in some MultiArrayView /// which is destroyed after the call to prefilter() returns. void prefilter ( view_type data = view_type() ) ///< view to knot point data to use instead of 'core' { if ( data.hasData() ) { // if the user has passed in data, they have to have precisely the shape // we have set up in core (_core_shape passed into the constructor). // This can have surprising effects if the container array isn't owned by the // spline but constitutes a view to data kept elsewhere (by passing _space the // to constructor): the data held by whatever constructed the bspline object // will be overwritten with the (prefiltered) data passed in via 'data'. if ( data.shape() != core.shape() ) throw shape_mismatch ( "when passing data to prefilter, they have to have precisely the core's shape" ) ; if ( strategy == EXPLICIT ) { // the explicit scheme requires the data and frame to be together in the // containing array, so we have to copy the data into the core. core = data ; } // the other strategies can move the data from 'data' into the spline's memory // during coefficient generation, so we needn't copy them in first. } else { // otherwise, we assume data are already in 'core' and we operate in-place // note, again, the semantics of the assignment here: since 'data' has no data, // the assignment results in 'adopting' the data in core rather than copying them data = core ; } // per default the output will be braced. This does require the output // array to be sufficiently larger than the input. 
class bracer handles // filling of the brace/frame bracer br ; // for the explicit scheme, we use boundary condition 'guess' which tries to // provide a good guess for the initial coefficients with a small computational // cost. using zero-padding instead introduces a sharp discontinuity at the // margins, which we want to avoid. bcv_type explicit_bcv ( GUESS ) ; switch ( strategy ) { case UNBRACED: // only call the prefilter, don't do any bracing. If necessary, bracing can be // applied later by a call to brace() - provided the bspline object has space // for the brace. solve < view_type , view_type , math_type > ( data , core , bcv , spline_degree , tolerance , smoothing ) ; break ; case BRACED: // prefilter first, passing in BC codes to pick out the appropriate functions to // calculate the initial causal and anticausal coefficient, then brace result. // note how, just as in brace(), the whole frame is filled, which may be more // than is strictly needed by the evaluator. solve < view_type , view_type , math_type > ( data , core , bcv , spline_degree , tolerance , smoothing ) ; // fill the frame for ( int d = 0 ; d < dimension ; d++ ) br.apply ( container , bcv[d] , left_frame[d] , right_frame[d] , d ) ; break ; case EXPLICIT: // first fill frame using BC codes passed in, then solve with BC code GUESS // this automatically fills the brace as well, since it's part of the frame. // TODO: the values in the frame will not come out precisely the same as they // would by filling the brace after the coefficients have been calculated. // The difference will be larger towards the margin of the frame; we assume // that due to the small support of the evaluation the differences near the // margin of the core data will be negligible, having picked a sufficiently // large frame size. This is debatable. If it's a problem, a call to brace() // after prefilter() will brace again, now with coefficients from the core. for ( int d = 0 ; d < dimension ; d++ ) br.apply ( container , bcv[d] , left_frame[d] , right_frame[d] , d ) ; solve < view_type , view_type , math_type > ( container , container , explicit_bcv , spline_degree , tolerance , smoothing ) ; break ; case MANUAL: // like EXPLICIT, but don't apply a frame, assume a frame was applied // by external code. process whole container with GUESS BC. For cases // where the frame can't be constructed by applying any of the stock bracing // modes. Note that if any data were passed into this routine, in this case // they will be silently ignored (makes no sense overwriting the core after // having manually framed it in some way) solve < view_type , view_type , math_type > ( container , container , explicit_bcv , spline_degree , tolerance , smoothing ) ; break ; } } /// if the spline coefficients are already known, they obviously don't need to be /// prefiltered. But in order to be used by vspline's evaluation code, they need to /// be 'braced' - the 'core' coefficients have to be surrounded by more coeffcients /// covering the support the evaluator needs to operate without bounds checking /// inside the spline's defined range. brace() performs this operation. brace() /// assumes the bspline object has been set up with the desired initial parameters, /// so that the boundary conditions and metrics are already known and storage is /// available. If brace() is called with an empty view (or without parameters), /// it assumes the coefficients are in the spline's core already and simply /// fills in the 'empty' space around them. 
If data are passed to brace(), they
  /// have to be the same size as the spline's core and are copied into the core
  /// before the bracing is applied - the same behaviour as prefilter exhibits.

  void brace ( view_type data = view_type() ) ///< view to knot point data to use instead of 'core'
  {
    if ( data.hasData() )
    {
      // if the user has passed in data, they have to have precisely the shape
      // we have set up in core

      if ( data.shape() != core.shape() )
        throw shape_mismatch ( "when passing data to brace(), they have to have precisely the core's shape" ) ;

      // we copy the data into the core

      core = data ;
    }

    // we use class bracer to do the work, filling the brace for all axes in turn

    bracer br ;

    for ( int d = 0 ; d < dimension ; d++ )
      br.apply ( container , bcv[d] , left_frame[d] , right_frame[d] , d ) ;
  }

  /// overloaded constructor for 1D splines. This is useful because if we didn't
  /// provide it, the caller would have to pass TinyVector < T , 1 > instead of T
  /// for the shape and the boundary condition.

  bspline ( long _core_shape ,                      ///< shape of knot point data
            int _spline_degree = 3 ,                ///< spline degree with reasonable default
            bc_code _bc = MIRROR ,                  ///< boundary conditions and common default
            prefilter_strategy _strategy = BRACED , ///< default strategy is the 'implicit' scheme
            int _horizon = -1 ,                     ///< width of frame for explicit scheme
            double _tolerance = -1.0 ,              ///< acceptable error (relative to unit pulse)
            double _smoothing = 0.0 ,               ///< apply smoothing to data before prefiltering
            int headroom = 0 ,                      ///< additional headroom, for 'shifting'
            view_type _space = view_type()          ///< coefficient storage to 'adopt'
          )
  : bspline ( TinyVector < long , 1 > ( _core_shape ) ,
              _spline_degree ,
              bcv_type ( _bc ) ,
              _strategy ,
              _horizon ,
              _tolerance ,
              _smoothing ,
              headroom ,
              _space )
  {
    static_assert ( _dimension == 1 ,
                    "bspline: 1D constructor only usable for 1D splines" ) ;
  } ;

  /// shift will change the interpretation of the data in a bspline object.
  /// d is taken as a difference to add to the current spline degree. The coefficients
  /// remain the same, but creating an evaluator from the shifted spline will make
  /// the evaluator produce data *as if* the coefficients were those of a spline
  /// of the changed order. Shifting with positive d will effectively blur the
  /// interpolated signal, shifting with negative d will sharpen it.
  /// For shifting to work, the spline has to have enough 'headroom', meaning that
  /// spline_degree + d, the new spline degree, has to be greater than or equal to 0
  /// and smaller than the largest supported spline degree (lower twenties) -
  /// and, additionally, there has to be a wide enough brace to allow evaluation
  /// with the wider kernel of the higher-degree spline's reconstruction filter.
  /// So if a spline is set up with degree 0 and shifted to degree 5, it has to be
  /// constructed with an additional headroom of 3 (see the constructor).
  ///
  /// shiftable() is called with a desired change of spline_degree. If it
  /// returns true, interpreting the data in the container array as coefficients
  /// of a spline with the changed degree is safe. If not, the frame size is
  /// not sufficient or the resulting degree is invalid, and shiftable()
  /// returns false. Note how the decision is merely technical: if the new
  /// degree is okay and the *frame* is large enough, the shift will be
  /// considered permissible, even if the frame only holds data used for
  /// the EXPLICIT extrapolation scheme, which aren't really 'brace quality'.
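  // A shift() usage sketch (illustrative only, not from the original documentation;
  // 'shape' stands for some suitable shape_type value, and bcv_type is the spline's
  // member typedef): set up a degree-0 spline with an extra headroom of 3, prefilter,
  // then reinterpret it as a degree-5 spline:
  //
  //   vspline::bspline < float , 2 > bspl ( shape , 0 , bcv_type ( vspline::MIRROR ) ,
  //                                         vspline::BRACED , -1 , -1.0 , 0.0 , 3 ) ;
  //   bspl.prefilter() ;
  //   if ( bspl.shift ( 5 ) )
  //     ... // evaluators created from bspl now see a degree-5 spline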
bool shiftable ( int d ) const { int new_degree = spline_degree + d ; if ( new_degree < 0 || new_degree > 24 ) return false ; shape_type new_left_brace = get_left_brace_size ( new_degree , bcv ) ; shape_type new_right_brace = get_right_brace_size ( new_degree , bcv ) ; if ( allLessEqual ( new_left_brace , left_frame ) && allLessEqual ( new_right_brace , right_frame ) ) { return true ; } return false ; } /// shift() actually changes the interpretation of the data. The data /// will be taken to be coefficients of a spline with degree /// spline_degree + d, and the original degree is lost. This operation /// is only performed if it is technically safe (see shiftable()). /// If the shift was performed successfully, this function returns true, /// false otherwise. bool shift ( int d ) { if ( shiftable ( d ) ) { spline_degree += d ; return true ; } return false ; } /// helper function to << a bspline object to an ostream friend ostream& operator<< ( ostream& osr , const bspline& bsp ) { osr << "dimension:................... " << bsp.dimension << endl ; osr << "degree:...................... " << bsp.spline_degree << endl ; osr << "boundary conditions:......... " ; for ( auto bc : bsp.bcv ) osr << " " << bc_name [ bc ] ; osr << endl ; osr << endl ; osr << "shape of container array:.... " << bsp.container.shape() << endl ; osr << "shape of core:............... " << bsp.core.shape() << endl ; osr << "braced:...................... " << ( bsp.braced ? std::string ( "yes" ) : std::string ( "no" ) ) << endl ; osr << "horizon:..................... " << bsp.horizon << endl ; osr << "left frame:.................. " << bsp.left_frame << endl ; osr << "right frame:................. " << bsp.right_frame << endl ; osr << ( bsp._p_coeffs->hasData() ? "bspline object owns data" : "data are owned externally" ) << endl ; osr << "container base adress:....... " << bsp.container.data() << endl ; osr << "core base adress:............ " << bsp.core.data() << endl ; return osr ; } } ; /// using declaration for a coordinate suitable for bspline, given /// elementary type rc_type. This produces the elementary type itself /// if the spline is 1D, a TinyVector of rc_type otherwise. template < class spline_type , typename rc_type > using bspl_coordinate_type = typename std::conditional < spline_type::dimension == 1 , rc_type , vigra::TinyVector < rc_type , spline_type::dimension > > :: type ; /// using declaration for a bspline's value type template < class spline_type > using bspl_value_type = typename spline_type::value_type ; } ; // end of namespace vspline #endif // VSPLINE_BSPLINE_H kfj-vspline-6e66cf7a7926/common.h000066400000000000000000000367751320375670700165600ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015 - 2017 by Kay F. 
Jahnke */ /* */ /* The git repository for this software is at */ /* */ /* https://bitbucket.org/kfj/vspline */ /* */ /* Please direct questions, bug reports, and contributions to */ /* */ /* kfjahnke+vspline@gmail.com */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /*! \file common.h \brief definitions common to all files in this project, utility code This file contains - some common enums and strings - a traits class fixing the simdized types used for vectorized code and some additional type inference used mainly for unary functors - exceptions used throughout vspline */ #ifndef VSPLINE_COMMON #define VSPLINE_COMMON #include #include #ifdef USE_VC #include #define VECTOR_TYPE Vc::SimdArray #define DEFAULT_RSIZE Vc::Vector < ET > :: Size #define DEFAULT_VSIZE 2 * DEFAULT_RSIZE #else #define DEFAULT_RSIZE 1 #define DEFAULT_VSIZE 1 #endif // #ifdef USE_VC namespace vspline { // this enum will hold true or false, depending on whether the // translation unit including this header was compiled with USE_VC // defined or not. enum { vc_in_use = #ifdef USE_VC true #else false #endif } ; /// This enumeration is used for codes connected to boundary conditions. There are /// two aspects to boundary conditions: During prefiltering, if the implicit scheme is used, /// the initial causal and anticausal coefficients have to be calculated in a way specific to /// the chosen boundary conditions. Bracing, both before prefiltering when using the explicit /// scheme, and after prefiltering when using the implicit scheme, also needs these codes to /// pick the appropriate extrapolation code to extend the knot point data/coefficients beyond /// the core array. typedef enum { MIRROR , ///< mirror on the bounds, so that f(-x) == f(x) PERIODIC, ///< periodic boundary conditions REFLECT , ///< reflect, so that f(-1) == f(0) (mirror between bounds) NATURAL, ///< natural boundary conditions, f(-x) + f(x) == 2 * f(0) CONSTANT , ///< clamp. 
used for framing, with explicit prefilter scheme ZEROPAD , ///< used for boundary condition, bracing IDENTITY , ///< used as solver argument, mostly internal use GUESS , ///< used with EXPLICIT scheme to keep margin errors low SPHERICAL , ///< use for spherical panoramas, y axis } bc_code; /// bc_name is for diagnostic output of bc codes const std::string bc_name[] = { "MIRROR " , "PERIODIC ", "REFLECT " , "NATURAL ", "CONSTANT " , "ZEROPAD " , "IDENTITY " , "GUESS " , "SPHERICAL" , } ; /// using definition for the 'elementary type' of a type via vigra's /// ExpandElementResult mechanism. template < class T > using ET = typename vigra::ExpandElementResult < T > :: type ; /// unwrapping 'anything' produces the argument unchanged template < class in_type > in_type unwrap ( const in_type & in ) { return in ; } /// but unwrapping a TinyVector with just one element produces the /// contained object template < class T > T unwrap ( const vigra::TinyVector < T , 1 > & in ) { return in[0] ; } /// wrapping 'anything' packages 'in' in a TinyVector with one element template < class T > vigra::TinyVector < T , 1 > wrap ( const T & in ) { return vigra::TinyVector < T , 1 > ( in ) ; } /// but 'wrapping' a TinyVector produces the TinyVector itself, since /// it is 'already wrapped'. template < class T , int N > vigra::TinyVector < T , N > wrap ( const vigra::TinyVector < T , N > & in ) { return in ; } // if Vc isn't used, we nevertheless define is_vectorizable and vector_traits // to provide a common interface for enquiry. This way, we have a uniform // interface for inquiry about vectorization, which simply collapses to // providing the unvectorized types when queried without Vc in use. And, // additionally, in code using Vc we can switch to fallback code on // inspection of vsize, if we find it it 1: in these cases, we can route // the code so that vector code is avoided. template < typename T > class is_vectorizable : public std::false_type {} ; template < typename VT > class is_simd_type : public std::false_type {} ; #ifdef USE_VC #ifdef HAVE_IS_SIMD_VECTOR // this test yields std::true_type for any T which can be vectorized by Vc, // std::false_type for other T. TODO: not commonly available! template < class T > using is_vectorizable = typename Vc::is_simd_vector < Vc::Vector < T > > :: type ; #else // is_simd_vector hopefully comes with future Vc versions, // but for the time being, we have to be explicit. // in Vc ML discussion mkretz states that the set of types Vc can vectorize // (with 1.3) is consistent throughout all ABIs, so we can just // list the acceptable types without having to take the ABI int account. template<> class is_vectorizable : public std::true_type {} ; template<> class is_vectorizable : public std::true_type {} ; template<> class is_vectorizable : public std::true_type {} ; template<> class is_vectorizable : public std::true_type {} ; template<> class is_vectorizable : public std::true_type {} ; template<> class is_vectorizable : public std::true_type {} ; template < class candidate > using is_vcable = typename std::conditional < is_vectorizable < candidate > :: value , std::true_type , std::false_type > :: type ; #endif // HAVE_IS_SIMD_VECTOR // test if a given type VT is a Vc::Vector or Vc::SimdArray template < class T > class is_simd_type < Vc::Vector < T > > : public std::true_type {} ; template < class T , int _vsize > class is_simd_type < Vc::SimdArray < T , _vsize > > : public std::true_type {} ; // to code vector_traits, we need a few helper types. 
// note how I pass 'vectorizable', and SZ, below, as types, even though // it would be natural to pass a bool/an int. This is due to a bug in // g++, which produces a failure to specialize in these cases. // TODO not really for public use. hide? // default vectorized type. This is where an actual Vc type is used // to produce a SIMD type for vectorizable T. This SIMD type is not // actally used, we only use it's size template < typename T , typename vectorizable > struct dvtype { typedef T type ; enum { size = 1 } ; } ; template < typename T > struct dvtype < T , std::true_type > { typedef Vc::SimdArray < T , 2 * Vc::Vector < T > :: size() > type ; enum { size = type::size() } ; } ; // ditto, but inferring rsize, which is used in filtering code. template < typename T , typename vectorizable = std::false_type > struct drtype { typedef T type ; enum { size = 1 } ; } ; template < typename T > struct drtype < T , std::true_type > { typedef Vc::Vector < T > type ; enum { size = type::size() } ; } ; // with a given vector width, we construct the appropriate SIMD type. // with vsize==1 (below) this collopses to T itself template < typename T , typename SZ > struct vtype { enum { size = SZ::value } ; typedef Vc::SimdArray < T , size > type ; typedef typename type::IndexType index_type ; static index_type IndexesFromZero() { return index_type::IndexesFromZero() ; } } ; template < typename T > struct vtype < T , std::integral_constant > { typedef T type ; enum { size = 1 } ; typedef int index_type ; static int IndexesFromZero() { return 0 ; } } ; /// struct vector_traits is a traits class fixing the types used for /// vectorized code in vspline. template < typename T , int _vsize = 0 , int _rsize = 0 > struct vector_traits { // first, analyze T: how many channels/dimensions does it have, // and what is it's elementary type? We rely on vigra's ExpandElementResult // mechanism here, which allows us to uniformly handle all types known to // this mechanism - and since it is a traits class, it can be extended to // handle more types if necessary. enum { dimension = vigra::ExpandElementResult < T > :: size } ; typedef typename vigra::ExpandElementResult < T > :: type ele_type ; // now we take a look at the elementary type. Can it be vectorized? // is_vectorizable yields a type directly inheriting from either // std::true_type or std::false_type. We want a type here, since we // need one to specialize dvtype and drtype with a type rather than // with a boolean, which would be more natural - but g++ fails to // perform the specialization if we use a non-type template argument // here, so I'm working around a compiler bug. typedef is_vcable < ele_type > isv ; enum { size = isv::value ? _vsize == 0 ? dvtype < ele_type , isv > :: size : _vsize : 1 , rsize = isv::value ? _vsize == 0 ? drtype < ele_type , isv > :: size : _rsize : 1 } ; // the same compiler bug keeps me from specializing vtype with an int // as second template argument. Again, I wrap the argument in a // class type template argument to assure correct specialization: typedef typename std::integral_constant < int , size > sz_t ; // now the type 'ele_v', the simdized elementary type, can be produced. // This may 'collapse' to ele_type itself, if size if 1 typedef typename vtype < ele_type , sz_t > :: type ele_v ; // The next two types are TinyVectors of ele_type and ele_v, // respectively. This is 'nice to have' in case code needs TinyVectors // even if the data are 1D. 
typedef vigra::TinyVector < ele_type , dimension > nd_ele_type ; typedef vigra::TinyVector < ele_v , dimension > nd_ele_v ; // syn_type, which isn't (currently) used outside, is the 'synthetic' // vectorized type. It is derived from ele_v or nd_ele_v, which are // built from the elementary type - hence the term 'synthetic'. typedef typename std::conditional < std::is_same < T , ele_type > :: value , ele_v , nd_ele_v > :: type syn_type ; // finally, we look at a special case, namely size==1. This occurs if // the elementary type can't be vectorized, or if _vsize is passed as // 1 in the first place. If this is the case, we want to use T itself // as the 'final' type, rather than the synthetic type which we have // built above. Hence this last step: typedef typename std::conditional < size == 1 , T , syn_type > :: type type ; typedef typename std::conditional < size == 1 , nd_ele_type , nd_ele_v > :: type tv_type ; // Some code in vspline requires indexing of either single or SIMD values. // So we provide a type for indexing and a static function providing // canonical indices for this type. typedef typename vtype < ele_type , sz_t > :: index_type index_type ; typedef typename vigra::TinyVector < index_type , dimension > nd_index_type ; static index_type IndexesFromZero() { return vtype < ele_type , sz_t > :: IndexesFromZero() ; } } ; #else // #ifdef USE_VC template < typename T , int _vsize = 0 > struct vector_traits { enum { size = 1 , rsize = 1 } ; enum { dimension = vigra::ExpandElementResult < T > :: size } ; typedef typename vigra::ExpandElementResult < T > :: type ele_type ; typedef vigra::TinyVector < ele_type , dimension > nd_ele_type ; typedef ele_type ele_v ; typedef nd_ele_type nd_ele_v ; typedef T type ; } ; #endif // TODO The exceptions need some work. My use of exceptions is a bit sketchy... /// for interfaces which need specific implementations we use: struct not_implemented : std::invalid_argument { not_implemented ( const char * msg ) : std::invalid_argument ( msg ) { } ; } ; /// dimension-mismatch is thrown if two arrays have different dimensions /// which should have the same dimensions. struct dimension_mismatch : std::invalid_argument { dimension_mismatch ( const char * msg ) : std::invalid_argument ( msg ) { } ; } ; /// shape mismatch is the exception which is thrown if the shapes of /// an input array and an output array do not match. struct shape_mismatch : std::invalid_argument { shape_mismatch ( const char * msg ) : std::invalid_argument ( msg ) { } ; } ; /// exception which is thrown if an opertion is requested which vspline /// does not support struct not_supported : std::invalid_argument { not_supported ( const char * msg ) : std::invalid_argument ( msg ) { } ; } ; /// out_of_bounds is thrown by mapping mode REJECT for out-of-bounds coordinates /// this exception is left without a message, it only has a very specific application, /// and there it may be thrown often, so we don't want anything slowing it down. 
struct out_of_bounds { } ; /// exception which is thrown when an assiging an rvalue which is larger than /// what the lvalue can hold struct numeric_overflow : std::invalid_argument { numeric_overflow ( const char * msg ) : std::invalid_argument ( msg ) { } ; } ; } ; // end of namespace vspline #endif // VSPLINE_COMMON kfj-vspline-6e66cf7a7926/domain.h000066400000000000000000000307121320375670700165200ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015 - 2017 by Kay F. Jahnke */ /* */ /* The git repository for this software is at */ /* */ /* https://bitbucket.org/kfj/vspline */ /* */ /* Please direct questions, bug reports, and contributions to */ /* */ /* kfjahnke+vspline@gmail.com */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /*! \file domain.h \brief code to perform combined scaling and translation on coordinates A common requirement is to map coordinates in one range to another range, effectively performing a combined scaling and translation. Given incoming coordinates in a range [ in_low , in_high ] and a desired range for outgoing coordinates of [ out_low , out_high ], and an incoming coordinate c, a vspline::domain performs this operation: c' = ( c - in_low ) * scale + out_low where scale = ( out_high - out_low ) / ( in_high - in_low ) The code can handle arbitrary dimensions, float and double coordinate elementary types, and, optionally, it can perform vectorized operations on vectorized coordinates. vspline::domain is derived from vspline::unary_functor and can be used like any other vspline::unary_functor. A common use case would be to access a vspline::evaluator with a different coordinate range than the spline's 'natural' coordinates (assuming a 1D spline of floats): auto _ev = vspline::make_safe_evaluator ( bspl ) ; auto ev = vspline::domain ( bspl , 0 , 100 ) + _ev ; ev.eval ( coordinate , result ) ; Here, the domain is built over the spline with an incoming range of [ 0 , 100 ], so evaluating at 100 will be equivalent to evaluating _ev at bspl.upper_limit(). */ #ifndef VSPLINE_DOMAIN_H #define VSPLINE_DOMAIN_H #include #include namespace vspline { /// class domain is a coordinate transformation functor. 
It provides
/// a handy way to translate an arbitrary range of incoming coordinates
/// to an arbitrary range of outgoing coordinates. This is done with a
/// linear translation function. If the source range is [s0,s1] and the
/// target range is [t0,t1], the translation function s->t is:
///
/// t = ( s - s0 ) * ( t1 - t0 ) / ( s1 - s0 ) + t0
///
/// In words: the target coordinate's distance from the target range's
/// lower bound is proportional to the source coordinate's distance
/// from the source range's lower bound. Note that this is *not* a
/// gate function: the domain will accept any incoming value and
/// perform the shift/scale operation on it; incoming values outside
/// [ in_low , in_high ] will produce outgoing values outside
/// [ out_low , out_high ].
///
/// The first constructor takes s0, s1, t0 and t1. With this functor,
/// arbitrary mappings of the form given above can be achieved.
/// The second constructor takes a vspline::bspline object to obtain
/// t0 and t1. These are taken as the spline's 'true' range, depending
/// on its boundary conditions: for periodic splines, this is [0...M],
/// for REFLECT BCs it's [-0.5,M-0.5], and for 'normal' splines it's
/// [0,M-1]. s0 and s1, the start and end of the domain's coordinate range,
/// can be passed in and default to 0 and 1, which constitutes 'normalized'
/// spline coordinates, where 0 is mapped to the lower end of the 'true'
/// range and 1 to the upper.
///
/// class domain is especially useful for situations where several b-splines
/// cover the same data in different resolutions, like in image pyramids.
/// If these different splines are all evaluated with a domain chained to the
/// evaluator which uses a common domain range, they can all be accessed with
/// identical coordinates, even if the spline shapes don't match isotropically.
///
/// The evaluation routine in class domain_type makes sure that incoming
/// values in [ in_low , in_high ] will never produce outgoing values
/// outside [ out_low , out_high ]. If this guarantee is not needed, the
/// 'raw' evaluation routine _eval can be used instead. With _eval, output
/// may overshoot out_high slightly.
///
/// I should mention libeinspline here, which has this facility as a fixed
/// feature in its spline types. I decided to keep it separate and create
/// class domain instead for those cases where the functionality is needed.
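///
/// A short worked example of the mapping (illustrative figures, not from the
/// original documentation): with an incoming range [ 0 , 100 ] and an outgoing
/// range [ -1 , 1 ], scale is ( 1 - (-1) ) / ( 100 - 0 ) = 0.02, so an incoming
/// coordinate of 75 maps to ( 75 - 0 ) * 0.02 + (-1) = 0.5:
///
/// ~~~~~~~~~~~~
/// auto d = vspline::domain ( 0.0f , 100.0f , -1.0f , 1.0f ) ;
/// float out ;
/// d.eval ( 75.0f , out ) ; // out == 0.5f
/// ~~~~~~~~~~~~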
template < typename coordinate_type , int _vsize = vspline::vector_traits::size > struct domain_type : public vspline::unary_functor < coordinate_type , coordinate_type , _vsize > { typedef vspline::unary_functor < coordinate_type , coordinate_type , _vsize > base_type ; using base_type::dim_in ; using base_type::vsize ; using typename base_type::in_type ; using typename base_type::out_type ; typedef typename base_type::in_ele_type rc_type ; typedef typename vigra::TinyVector < rc_type , dim_in > limit_type ; // internally, we work with definite TinyVectors: const limit_type out_low , in_low , out_high , in_high ; const limit_type scale ; /// constructor taking the lower and upper fix points /// for incoming and outgoing values domain_type ( const coordinate_type & _in_low , const coordinate_type & _in_high , const coordinate_type & _out_low , const coordinate_type & _out_high ) : in_low ( _in_low ) , out_low ( _out_low ) , in_high ( _in_high ) , out_high ( _out_high ) , scale ( ( _out_high - _out_low ) / ( _in_high - _in_low ) ) { assert ( in_low != in_high && out_low != out_high ) ; } /// constructor taking the fix points for outgoing values /// from a bspline object, and the incoming lower and upper /// fix points explicitly template < class bspl_type > domain_type ( const bspl_type & bspl , const coordinate_type & _in_low = coordinate_type ( 0 ) , const coordinate_type & _in_high = coordinate_type ( 1 ) ) : in_low ( _in_low ) , in_high ( _in_high ) , out_low ( bspl.lower_limit() ) , out_high ( bspl.upper_limit() ) , scale ( ( bspl.upper_limit() - bspl.lower_limit() ) / ( _in_high - _in_low ) ) { static_assert ( dim_in == bspl_type::dimension , "can only create domain from spline if dimensions match" ) ; assert ( in_low != in_high ) ; } /// _eval only performs the domain functor's artithmetics. for many use /// cases this will be sufficient, but the 'official' eval routine /// (below) adds code to make sure that input inside [ in_low , in_high ] /// will produce output inside [ out_low , out_high ]. _eval may /// produce a value > out_high for _in == in_high due to quantization /// errors. template < class crd_type > void _eval ( const crd_type & _in , crd_type & _out ) const { auto in = wrap ( _in ) ; typedef decltype ( in ) nd_crd_t ; typedef typename nd_crd_t::value_type component_type ; component_type * p_out = (component_type*) ( &_out ) ; for ( int d = 0 ; d < dim_in ; d++ ) p_out[d] = ( in[d] - in_low[d] ) * scale[d] + out_low[d] ; } private: /// polish sets 'out' to 'subst' where 'patch' indicates. 
/// the unvectorized case will be caught by this overload: void polish ( rc_type & out , const rc_type & subst , bool patch ) const { if ( patch ) out = subst ; } #ifdef USE_VC /// whereas this template will match vectorized operation template < class c_type , class mask_type > void polish ( c_type & out , const rc_type & subst , mask_type patch ) const { out ( patch ) = subst ; } #endif public: /// eval repeats the code of _eval, adding the invocation /// of 'polish' which makes sure that, when in == in_high, template < class crd_type > void eval ( const crd_type & _in , crd_type & _out ) const { auto in = wrap ( _in ) ; typedef decltype ( in ) nd_crd_t ; typedef typename nd_crd_t::value_type component_type ; component_type * p_out = (component_type*) ( &_out ) ; for ( int d = 0 ; d < dim_in ; d++ ) { p_out[d] = ( in[d] - in_low[d] ) * scale[d] + out_low[d] ; polish ( p_out[d] , out_high[d] , in[d] == in_high[d] ) ; } } } ; /// factory function to create a domain_type type object from the /// desired lower and upper fix point for incoming coordinates and /// the lower and upper fix point for outgoing coordinates. /// the resulting functor maps incoming coordinates in the range of /// [in_low,in_high] to coordinates in the range of [out_low,out_high] template < class coordinate_type , int _vsize = vspline::vector_traits::size > vspline::domain_type < coordinate_type , _vsize > domain ( const coordinate_type & in_low , const coordinate_type & in_high , const coordinate_type & out_low , const coordinate_type & out_high ) { return vspline::domain_type < coordinate_type , _vsize > ( in_low , in_high , out_low , out_high ) ; } /// factory function to create a domain_type type object /// from the desired lower and upper reference point for incoming /// coordinates and a vspline::bspline object providing the lower /// and upper reference for outgoing coordinates /// the resulting functor maps incoming coordinates in the range of /// [ in_low , in_high ] to coordinates in the range of /// [ bspl.lower_limit() , bspl.upper_limit() ] template < class coordinate_type , class spline_type , int _vsize = vspline::vector_traits::size > vspline::domain_type < coordinate_type , _vsize > domain ( const spline_type & bspl , const coordinate_type & in_low = coordinate_type ( 0 ) , const coordinate_type & in_high = coordinate_type ( 1 ) ) { return vspline::domain_type < coordinate_type , _vsize > ( bspl , in_low , in_high ) ; } } ; // namespace vspline #endif // #ifndef VSPLINE_DOMAIN_H kfj-vspline-6e66cf7a7926/doxy.h000066400000000000000000000674521320375670700162470ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015 - 2017 by Kay F. 
Jahnke */ /* */ /* The git repository for this software is at */ /* */ /* https://bitbucket.org/kfj/vspline */ /* */ /* Please direct questions, bug reports, and contributions to */ /* */ /* kfjahnke+vspline@gmail.com */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ // This header doesn't contain any code, only the text for the main page of the documentation. /*! \mainpage \section intro_sec Introduction vspline is a header-only generic C++ library for the creation and use of uniform B-splines. It aims to be as comprehensive as feasibly possible, yet at the same time producing code which performs well, so that it can be used in production. vspline was developed on a Linux system using clang++ and g++. It has not been tested much with other systems or compilers, and as of this writing I am aware that the code probably isn't fully portable. The code uses the C++11 standard. Note: in November 2017, with help from Bernd Gaucke, vspline's companion program pv, which uses vspline heavily, was successfully compiled with 'Visual Studio Platform toolset V141'. While no further tests have been done, I hope that I can soon extend the list of supported platforms. vspline's main focus is bulk data processing. It was developed to be used for image processing software. In image processing, oftentimes large amounts of pixels need to be submitted to identical operations, suggesting a functional approach. vspline offers functional programming elements to implement such programs. vspline relies heavily on two other libraries: - VIGRA, mainly for handling of multidimensional arrays and general signal processing - Vc, for the use of the CPU's vector units I find VIGRA indispensible, omitting it from vspline is not really an option. Use of Vc is optional, though, and has to be activated by defining 'USE_VC'. This should be done by passing -DUSE_VC to the compiler; defining USE_VC only for parts of a project may or may not work. Please note that vspline uses Vc's 1.3 branch, not the master branch. 1.3 is what you are likely to find in your distro's packet repositories; if you check out Vc from github, make sure you pick the 1.3 branch. 
I have made an attempt to generalize the code so that it can handle - arbitrary real data types and their aggregates - a reasonable selection of boundary conditions - prefiltering with implicit and explicit extrapolation schemes - arbitrary spline orders - arbitrary dimensions of the spline - in multithreaded code - using the CPU's vector units if possible On the evaluation side I provide - evaluation of the spline at point locations in the defined range - evaluation of the spline's derivatives - mapping of arbitrary coordinates into the defined range - evaluation of nD arrays of coordinates (remap function) - generalized 'transform' and 'apply' functions On top you get a unary functor type and some functional constructs to go with it. \section install_sec Installation vspline is header-only, so it's sufficient to place the headers where your code can access them. VIGRA and Vc are supposed to be installed in a location where they can be found so that includes along the lines of #include succeed. \section compile_sec Compilation While your distro's packages may be sufficient to get vspline up and running, you may need newer versions of VIGRA and Vc. At the time of this writing the latest versions commonly available were Vc 1.3.0 and VIGRA 1.11.0; I compiled Vc and VIGRA from source, using up-to-date pulls from their respective repositories. Vc 0.x.x will not work with vspline. update: ubuntu 17.04 has vigra and Vc packages which are sufficiently up-to-date. To compile software using vspline, I use clang++: ~~~~~~~~~~~~~~ clang++ -D USE_VC -pthread -O3 -march=native --std=c++11 your_code.cc -lVc -lvigraimpex ~~~~~~~~~~~~~~ where the -lvigraimpex can be omitted if vigraimpex (VIGRA's image import/export library) is not used, and linking libVc.a in statically is a good option; on my system the resulting code is faster. Please note that an executable using Vc produced on your system may likely not work on a machine with another CPU. It's best to compile on the intended target. Alternatively, the target architecture can be passed explicitly to the compiler (-march...). 'Not work' in this context means that it may as well crash due to an illegal instruction or wrong alignment. If you can't use Vc, the code can be made to compile without Vc by omitting -D USE_VC and other flags relevant for Vc: ~~~~~~~~~~~~~~ clang++ -pthread -O3 --std=c++11 your_code.cc -lvigraimpex ~~~~~~~~~~~~~~ IF you don't want to use clang++, g++ will also work. All access to Vc in the code is inside #ifdef USE_VC .... #endif statements, so not defining USE_VC will effectively prevent it's use. \section license_sec License vspline is free software, licensed under this license: ~~~~~~~~~~~~ vspline - a set of generic tools for creation and evaluation of uniform b-splines Copyright 2015 - 2017 by Kay F. Jahnke Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. 
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ~~~~~~~~~~~~ \section quickstart_sec Quickstart vspline uses vigra to handle data. There are two vigra data types which are used throughout: vigra::MultiArrayView is used for multidimensional arrays. It's a thin wrapper around the three parameters needed to describe arbitrary n-dimensional arrays of data in memory: a pointer to some 'base' location coinciding with the coordinate origin, a shape and a stride. If your code does not use vigra MultiArrayViews, it's easy to create them for the data you have, vigra offers a constructor for MultiArrayViews taking these three parameters. The other vigra data type used throughout vspline is vigra::TinyVector, a small fixed-size container type used to hold things like multidimensional coordinates or pixels. This type is also just a wrapper around a small 1D C array. It's zero overhead and contains nothing else, but offers lots of functionality like arithmetic operations. I recommend looking into vigra's documentation to get an idea about these data types, even if you only wrap your extant data in them to interface with vspline. vspline follows vigra's default axis ordering scheme: the fastest-varying index is first, so coordinates are (x,y,z...). Coordinates, strides and shapes are given relative to the MultiArrayView's value_type. If you stick with the high-level code, using class bspline or the transform functions, most of the parametrization is easy. Here are a few quick examples what you can do. This is really just to give you a first idea - there is example code covering most features of vspline where things are covered in more detail, with plenty of comments. the code in this text is also there, see quickstart.cc. Let's suppose you have data in a 2D vigra MultiArray 'a'. vspline can handle real data like float and double, and also their 'aggregates', meaning data types like pixels or vigra's TinyVector. But for now, let's assume you have plain float data. Creating the bspline object is easy: ~~~~~~~~~~~~~~ #include // given a vigra::MultiArray of data (initialization omitted) vigra::MultiArray < 2 , float > a ( 10 , 20 ) ; // let's initialize the whole array with 42 a = 42 ; // fix the type of the corresponding b-spline typedef vspline::bspline < float , 2 > spline_type ; // create bspline object 'bspl' fitting the shape of your data spline_type bspl ( a.shape() ) ; // copy the source data into the bspline object's 'core' area bspl.core = a ; // run prefilter() to convert original data to b-spline coefficients bspl.prefilter() ; ~~~~~~~~~~~~~~ The memory needed to hold the coefficients is allocated when the bspline object is constructed. Obviously many things have been done by default here: The default spline degree was used - it's 3, for a cubic spline. Also, boundary treatment mode 'MIRROR' was used per default. Further default parameters cause the spline to be 'braced' so that it can be evaluated with vspline's evaluation routines, Vc (if compiled in) was used for prefiltering, and the process is automatically partitioned and run in parallel by a thread pool. 
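If you need different settings - say, a quintic spline with periodic boundary conditions - the degree and the per-axis boundary condition codes can be passed explicitly. This is only a sketch (not part of the original quickstart), reusing 'a' and 'spline_type' from above; bcv_type is the spline's member typedef holding one bc_code per axis:

~~~~~~~~~~~~~~
// hypothetical variant: degree-5 spline with PERIODIC boundary conditions

spline_type::bcv_type bcv ( vspline::PERIODIC ) ;

spline_type bspl_p ( a.shape() , 5 , bcv ) ;

bspl_p.core = a ;

bspl_p.prefilter() ;
~~~~~~~~~~~~~~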
The only mandatory template arguments are the value type and the number of dimensions, which have to be known at compile time. While the sequence of operations indicated here looks a bit verbose (why not create the bspline object by a call like bspl(a) ?), in 'real' code you would use bspl.core straight away as the space to contain your data - you might get the data from a file or by some other process or do something like this where the bspline object provides the array and you interface it via a view to it's 'core': ~~~~~~~~~~~~~~ vspline::bspline < double , 1 > bsp ( 10001 , degree , vspline::MIRROR ) ; auto v1 = bsp.core ; // get a view to the bspline's 'core' for ( auto & r : v1 ) r = ... ; // assign some values bsp.prefilter() ; // perform the prefiltering ~~~~~~~~~~~~~~ This is a common idiom, because it reflects a common mode of operation where you don't need the original, unfiltered data any more after creating the spline, so the prefiltering is done in-place overwriting the original data. If you do need the original data later, you can also use a third idiom: ~~~~~~~~~~~~~~ vigra::MultiArrayView < 3 , double > my_data ( vigra::Shape3 ( 5 , 6 , 7 ) ) ; vspline::bspline < double , 3 > bsp ( my_data.shape() ) ; bsp.prefilter ( my_data ) ; ~~~~~~~~~~~~~~ Here, the bspline object is first created with the appropriate 'core' size, and prefilter() is called with an array matching the bspline's core. This results in my_data being read into the bspline object during the first pass of the prefiltering process. There are more ways of setting up a bspline object, please refer to class bspline's constructor. Of course you are also free to directly use vspline's lower-level routines to create a set of coefficients. The lowest level of filtering routine is simply a forward-backward recursive filter with a set of arbitrary poles. This code is in filter.h. Next you may want to evaluate the spline from the first example at some pair of coordinates x, y. Evaluation of the spline can be done using vspline's 'evaluator' objects. Using the highest level of access, these objects are set up with a bspline object and, after being set up, provide methods to evaluate the spline at given cordinates. Technically, evaluator objects are functors which don't have mutable state (all state is created at creation time and constant afterwards), so they are thread-safe and 'pure' in a functional-programming sense. The evaluation is done by calling the evaluator's eval() member function, which takes it's first argument (the coordinate) as a const reference and writes the result to it's second argument, which is a reference to a variable capable of holding the result. ~~~~~~~~~~~~~~ // for a 2D spline, we want 2D coordinates typedef vigra::TinyVector < float ,2 > coordinate_type ; // get the appropriate evaluator type typedef vspline::evaluator < coordinate_type , float > eval_type ; // create the evaluator eval_type ev ( bspl ) ; // create variables for input and output, coordinate_type coordinate ( 3 , 4 ) ; float result ; // use the evaluator to evaluate the spline at ( 3 , 4 ) // storing the result in 'result' ev.eval ( coordinate , result ) ; ~~~~~~~~~~~~~~ Again, some things have happened by default. The evaluator was constructed from a bspline object, making sure that the evaluator is compatible. You may ask why an evaluator doesn't provide operator(). This has technical reasons - if you're interested in the details, please refer to the documentation for vspline::unary_functor. 
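Derivatives of the spline can be obtained in much the same way. When an evaluator is constructed from a bspline object, it optionally accepts a per-axis derivative specification - here is a sketch reusing the types from the example above; please check the evaluator constructors in eval.h for the precise signature:

~~~~~~~~~~~~~~
// request the first derivative along the x axis (and none along y)
vigra::TinyVector < int , 2 > dspec ( 1 , 0 ) ;

// hypothetical: an evaluator yielding the spline's partial derivative along x
eval_type dx_ev ( bspl , dspec ) ;

float dx_result ;
dx_ev.eval ( coordinate , dx_result ) ;
~~~~~~~~~~~~~~

Apart from the different construction, such an evaluator is used just like the one above.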
If you need function call syntax for a vspline::unary_functor, vspline offers vspline::callable:

~~~~~~~~~~~~~
// wrap the evaluator in a vspline::callable
auto f = vspline::callable ( ev ) ;

// the resulting object can be called as a function
float r = f ( coordinate ) ;
assert ( r == result ) ;
~~~~~~~~~~~~~

What about the remap function? The little introduction demonstrated how you can evaluate the spline at a single location. Most of the time, though, you'll require evaluation at many coordinates. This is what remap does. Instead of a single coordinate, you pass a whole vigra::MultiArrayView full of coordinates to it - and another MultiArrayView of the same dimension and shape to accept the results of evaluating the spline at every coordinate in the first array. Here's a simple example, using the same array 'a' as above:

~~~~~~~~~~~~
// create a 1D array containing (2D) coordinates into 'a'
vigra::MultiArray < 1 , coordinate_type > coordinate_array ( 3 ) ;

// we initialize the coordinate array by hand...
coordinate_array = coordinate ;

// create an array to accommodate the result of the remap operation
vigra::MultiArray < 1 , float > target_array ( 3 ) ;

// perform the remap
vspline::remap ( a , coordinate_array , target_array ) ;

auto ic = coordinate_array.begin() ;
for ( auto k : target_array )
  assert ( k == f ( *(ic++) ) ) ;
~~~~~~~~~~~~

This is an 'ad-hoc' remap, passing source data as an array. You can also set up a bspline object and perform a 'transform' using an evaluator for this bspline object, with the same effect:

~~~~~~~~~~~~
// instead of the remap, we can use transform, passing the evaluator for
// the b-spline over 'a' instead of 'a' itself. the result is the same.
vspline::transform ( ev , coordinate_array , target_array ) ;
~~~~~~~~~~~~

This routine has wider scope: while in this example, ev is a b-spline evaluator, ev's type can be any functor capable of yielding a value of the type held in 'target_array' for a value held in 'coordinate_array'. Here, you'd typically use an object derived from class vspline::unary_functor, and vspline::evaluator is in fact derived from this base class. A unary_functor's input and output can be any data type suitable for processing with vspline (elementary types and their uniform aggregates); you're not limited to things which can be thought of as 'coordinates' etc. This generalization of remap is named 'transform' and is similar to vigra's point operator code, but uses vspline's automatic multithreading and vectorization to make it very efficient. There's a variation of it where the 'coordinate array' and the 'target array' are the same, effectively performing an in-place transformation, which is useful for things like coordinate transformations or colour space manipulations. This variation is called vspline::apply. There is one more variation of transform(). This overload doesn't take a 'coordinate array', but instead feeds the unary_functor with discrete coordinates of the target location that is being filled in. It's probably easiest to understand this variant if you start out thinking of feeding the previous transform() with an array which contains discrete indices. In 2D, this array would contain

(0,0) , (1,0) , (2,0) ...
(0,1) , (1,1) , (2,1) ...
...

So why would you set up such an array, if it merely contains the coordinates of every cell? You might as well create these values on-the-fly and omit the coordinate array.
This is precisely what the second variant of transform does:

~~~~~~~~~~~~~
// create a 2D array for the index-based transform operation
vigra::MultiArray < 2 , float > target_array_2d ( 3 , 4 ) ;

// use transform to evaluate the spline for the coordinates of
// all values in this array
vspline::transform ( ev , target_array_2d ) ;

// verify
for ( int x = 0 ; x < 3 ; x++ )
{
  for ( int y = 0 ; y < 4 ; y++ )
  {
    coordinate_type c ( x , y ) ;
    assert ( target_array_2d [ c ] == f ( c ) ) ;
  }
}
~~~~~~~~~~~~~

If you use this variant of transform directly with a vspline::evaluator, it will reproduce your original data - within arithmetic precision of the evaluation:

~~~~~~~~~~~~~
vigra::MultiArray < 2 , float > b ( a.shape() ) ;
vspline::transform ( ev , b ) ;

auto ia = a.begin() ;
for ( auto r : b )
  assert ( vigra::closeAtTolerance ( *(ia++) , r , .00001 ) ) ;
~~~~~~~~~~~~~

Class vspline::unary_functor is coded to make it easy to implement functors for things like image processing pipelines. For more complex operations, you'd code a functor representing your processing pipeline - often by delegating to 'inner' objects also derived from vspline::unary_functor - and finally use transform() to bulk-process your data with this functor. This is about as efficient as it gets, since the data are only accessed once, and vspline's transform code does the tedious work of multithreading, deinterleaving and interleaving for you, while you are left to concentrate on the interesting bit, writing the processing pipeline code. vspline::unary_functors are reasonably straightforward to set up; for prototyping you can get away without writing vectorized code (by using broadcasting, see vspline::grok), and you'll see that writing vectorized code with Vc isn't too hard either - if your code doesn't need conditionals, you can often even get away with using the same code for vectorized and unvectorized operation. Please refer to the examples.

vspline offers some functional programming constructs for functor combination, like feeding one functor's output as input to the next (vspline::chain) or translating coordinates to a different range (vspline::domain).

And that's about it - vspline aims to provide all possible variants of b-splines, code to create and evaluate them, and to do so for arbitrarily shaped and strided nD arrays of data. If you dig deeper into the code base, you'll find that you can stray off the default path, but there should rarely be any need not to use the high-level object 'bspline' or the transform functions. While one might argue that the remap/transform routines I present shouldn't be lumped together with the 'proper' b-spline code, I feel that only by tightly coupling them with the b-spline code can I make them really fast. And only by processing several values at once (by multithreading and vectorization) can the hardware be exploited fully.

\section speed_sec Speed

While performance will vary from system to system and between different compiles, I'll quote some measurements from my own system. I include benchmarking code (roundtrip.cc in the examples folder). Here are some measurements done with "roundtrip", working on a full HD (1920*1080) RGB image, using single precision floats internally - the figures are averages of 32 runs:

~~~~~~~~~~~~~~~~~~~~~
testing bc code MIRROR spline degree 3
avg 32 x prefilter:............................ 13.093750 ms
avg 32 x transform from unsplit coordinates:... 59.218750 ms
avg 32 x remap with internal spline:...........
75.125000 ms avg 32 x transform from indices ............... 57.781250 ms testing bc code MIRROR spline degree 3 using Vc avg 32 x prefilter:............................ 9.562500 ms avg 32 x transform from unsplit coordinates:... 22.406250 ms avg 32 x remap with internal spline:........... 35.687500 ms avg 32 x transform from indices ............... 21.656250 ms ~~~~~~~~~~~~~~~~~~~~~ As can be seen from these test results, using Vc on my system speeds evaluation up a good deal. When it comes to prefiltering, a lot of time is spent buffering data to make them available for fast vector processing. The time spent on actual calculations is much less. Therefore prefiltering for higher-degree splines doesn't take much more time (when using Vc): ~~~~~~~~~~~~~~~~~~~~~ testing bc code MIRROR spline degree 5 using Vc avg 32 x prefilter:........................ 10.687500 ms testing bc code MIRROR spline degree 7 using Vc avg 32 x prefilter:........................ 13.656250 ms ~~~~~~~~~~~~~~~~~~~~~ Using double precision arithmetics, vectorization doesn't help so much, and prefiltering is actually slower on my system when using Vc. Doing a complete roundtrip run on your system should give you an idea about which mode of operation best suits your needs. \section design_sec Design You can probably do everything vspline does with other software - there are several freely available implementations of b-spline interpolation and remap/transform routines. What I wanted to create was an implementation which was as general as possible and at the same time as fast as possible, and, on top of that, comprehensive. These demands are not easy to satisfy at the same time, but I feel that my design comes close. While generality is achieved by generic programming, speed needs exploitation of hardware features, and merely relying on the compiler is not enough. The largest speedup I saw was from multithreading the code. This may seem like a trivial observation, but my design is influenced by it: in order to efficiently multithread, the problem has to be partitioned so that it can be processed by independent threads. You can see the partitioning both in prefiltering and later in the transform routines, in fact, both even share code to do so. Another speedup method is data-parallel processing. This is often thought to be the domain of GPUs, but modern CPUs also offer it in the form of vector units. I chose implementing data-parallel processing in the CPU, as it offers tight integration with unvectorized CPU code. It's almost familiar terrain, and the way from writing conventional CPU code to vector unit code is not too far, when using tools like Vc, which abstract the hardware away. Using horizontal vectorization does require some rethinking, though - mainly a conceptual shift from an AoS to an SoA approach. vspline doesn't use vertical vectorization at all, so the code may look odd to someone looking for vector representations of, say, pixels: instead of finding SIMD vectors with three elements, there are structures of three SIMD vectors of vsize elements. To use vectorized evaluation efficiently, incoming data have to be presented to the evaluation code in vectorized form, but usually they will come from interleaved memory. Keeping the data in interleaved memory is even desirable, because it preserves locality, and usually processing accesses all parts of a value (i.e. all three channels of an RGB value) at once. After the evaluation is complete, data have to be stored again to interleaved memory. 
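To make this shift from AoS to SoA concrete, here is a small sketch of the types involved. The fixed vector width and the direct use of Vc::SimdArray are illustrative assumptions - in vspline, suitable vectorized types are obtained via the vector_traits mechanism:

~~~~~~~~~~~~~~
// interleaved (AoS) pixels, as they sit in an image in memory:
typedef vigra::TinyVector < float , 3 > pixel_type ;

// hypothetical SoA equivalent used during vectorized processing:
// one SIMD vector per channel, each holding 'vsize' lanes
const int vsize = 8 ;
typedef Vc::SimdArray < float , vsize > float_v ;
typedef vigra::TinyVector < float_v , 3 > pixel_v ;
~~~~~~~~~~~~~~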
The deinterleaving and interleaving operations take time and the best strategy is to load once from interleaved memory, perform all necessary operations on deinterleaved, vectorized data and finally store the result back to interleaved memory. The sequence of operations performed on the vectorized data constitute a processing pipeline, and some data access code will feed the pipeline and dispose of it's result. vspline's unary_functor class is designed to occupy the niche of pipeline code, while remap, apply and transform provide the feed-and-dispose code. So with the framework of these routines, setting up vectorized processing pipelines becomes easy, since all the boilerplate code is there already, and only the point operations/operations on single vectors need to be provided by deriving from unary_functor. Using all these techniques together makes vspline fast. The target I was roughly aiming at was to achieve frame rates of ca. 50 fps in RGB and full HD, producing the images via transform from a precalculated warp array. On my system, I have almost reached that goal - my transform times are around 25 msec (for a cubic spline), and with memory access etc. I come up to frame rates over half of what I was aiming at. My main tesing ground is pv, my panorama viewer. Here I can often take the spline degree up to two (a quadratic spline) and still have smooth animation in full resolution. Note that class evaluator has specializations for degree-1 and degree-0 splines (aka linear and nearest-neighbour interpolation) which use optimizations making the specialized evaluator even faster than the general-purpose code. Even without using vectorization, the code is certainly fast enough for casual use and may suffice for some production scenarios. This way, vigra becomes the only dependency, and the same binary will work on a wide range of hardware. \section Literature There is a large amount of literature on b-splines available online. Here's a pick: http://bigwww.epfl.ch/thevenaz/interpolation/ http://soliton.ae.gatech.edu/people/jcraig/classes/ae4375/notes/b-splines-04.pdf http://www.cs.mtu.edu/~shene/COURSES/cs3621/NOTES/spline/B-spline/bspline-basis.html http://www.cs.mtu.edu/~shene/COURSES/cs3621/NOTES/spline/B-spline/bspline-ex-1.html */ kfj-vspline-6e66cf7a7926/eval.h000066400000000000000000002167751320375670700162170ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015 - 2017 by Kay F. Jahnke */ /* */ /* The git repository for this software is at */ /* */ /* https://bitbucket.org/kfj/vspline */ /* */ /* Please direct questions, bug reports, and contributions to */ /* */ /* kfjahnke+vspline@gmail.com */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. 
*/ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /*! \file eval.h \brief code to evaluate uniform b-splines This body of code contains class evaluator and auxilliary classes which are needed for it's smooth operation. The evaluation is a reasonably straightforward process: A subset of the coefficient array, containing coefficients 'near' the point of interest, is picked out, and a weighted summation over this subset produces the result of the evaluation. The complex bit is to have the right coefficients in the first place (this is what prefiltering does), and to use the appropriate weights on the coefficient window. For b-splines, there is an efficient method to calculate the weights by means of a matrix multiplication, which is easily extended to handle b-spline derivatives as well. Since this code lends itself to a generic implementation, and it can be parametrized by the spline's order, and since the method performs well, I use it here in preference to the code which Thevenaz uses (which is, for the orders of splines it encompasses, the matrix multiplication written out with a few optimizations, like omitting multiplications with zero, and slightly more concise calculation of powers) Evaluation of a b-spline is, compared to prefiltering, more computationally intensive and less memory-bound, so the profit from vectorization, especially for float data, is more pronounced here than for the prefiltering. On my system, I found single-precision operation was about three to four times as fast as unvectorized code (AVX2). The central class of this file is class evaluator. evaluator objects are set up to provide evaluation of a specific b-spline. Once they are set up they don't change and effectively become pure functors with several overloaded evaluation methods for different constellations of parameters. The evaluation methods typically take their arguments per reference. The details of the evaluation variants, together with explanations of specializations used for extra speed, can be found with the individual evaluation routines. What do I mean by the term 'pure' functor? It's a concept from functional programming. It means that calling the functor will not have any effect on the functor itself - it can't change once it has been constructed. This has several nice effects: it can potentially be optimized very well, it is thread-safe, and it will play well with functional programming concepts - and it's conceptionally appealing. Code using class evaluator will probably use it at some core place where it is part of some processing pipeline. An example would be an image processing program: one might have some outer loop generating arguments (typically simdized types) which are processed one after the other to yield a result. 
The processing will typically have several stages, like coordinate generation and transformations, then use class evaluator to pick an interpolated intermediate result, which is further processed by, say, colour or data type manipulations before finally being stored in some target container. The whole processing pipeline can be coded to become a single functor, with one of class evaluator's eval-type routines embedded somewhere in the middle, and all that's left is code to efficiently handle the source and destination to provide arguments to the pipeline - like the code in remap.h. And since the code in remap.h is made to provide the data feeding and storing, the only coding needed is for the pipeline, which is where the 'interesting' stuff happens. The code in this file concludes by providing factory functions to obtain evaluators for vspline::bspline objects. These factory functions produce objects which are type-erased (see vspline::grok_type) wrappers around the evaluators, which hide what's inside and simply provide evaluation of the spline at given coordinates. These objects also provide operator(), so that they can be used like functions. If Vc is used and the spline's data type is vectorizable, evaluation of vectorized coordinates is also supported and will produce vectorized values. Since the objects produced by the factory functions are derived from vspline::unary_functor, they can be fed to the functions in transform.h, like any other vspline::unary_functor. */ #ifndef VSPLINE_EVAL_H #define VSPLINE_EVAL_H #include "bspline.h" #include "unary_functor.h" #include "map.h" namespace vspline { /// is_singular tests if a type is either a plain fundamental or a Vc SIMD type. /// The second possibility is only considered if Vc is used at all. /// This test serves to differentiate between nD values like vigra TinyVectors /// which fail the test and singular values, which pass. Note that this test /// fails vigra::TinyVector < T , 1 > even though one might consider it 'singular'. template < class T > using is_singular = typename std::conditional < std::is_fundamental < T > :: value #ifdef USE_VC || Vc::is_simd_vector < T > :: value #endif , std::true_type , std::false_type > :: type ; /// next we have coordinate splitting functions. For odd splines, coordinates /// are split into an integral part and a remainder, which is used for weight /// generation. /// we have two variants for odd_split and the dispatch below them. /// Note how the initial call to std::floor produces a real type, which /// is used to subtract from 'v', yielding the remainder in 'fv'. Only after /// having used this real representation of the integral part, it is cast /// to an integral type by assigning it to 'iv'. This is the most efficient /// route, better than producing an integral-typed integral part directly /// and subtracting that from 'v', which would require another conversion. /// Technically, one might consider this 'split' as a remainder division by 1. 
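/// As a concrete (purely illustrative) example: splitting the coordinate 2.75
/// for an odd-degree spline (e.g. a cubic) yields the integral part 2 and the
/// remainder 0.75, while the even-degree variant (even_split, below) rounds to
/// the nearest integer, yielding 3 and -0.25:
///
///   int iv ; float fv ;
///   odd_split ( 2.75f , iv , fv ) ;  // iv == 2 , fv == 0.75
///   even_split ( 2.75f , iv , fv ) ; // iv == 3 , fv == -0.25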
template < typename ic_t , typename rc_t > void odd_split ( rc_t v , ic_t& iv , rc_t& fv , std::true_type ) { rc_t fl_i = std::floor ( v ) ; fv = v - fl_i ; iv = ic_t ( fl_i ) ; } template < typename ic_t , typename rc_t > void odd_split ( rc_t v , ic_t& iv , rc_t& fv , std::false_type ) { for ( int d = 0 ; d < vigra::ExpandElementResult < rc_t > :: size ; d++ ) odd_split ( v[d] , iv[d] , fv[d] , std::true_type() ) ; } template < typename ic_t , typename rc_t > void odd_split ( rc_t v , ic_t& iv , rc_t& fv ) { odd_split ( v , iv , fv , is_singular() ) ; } /// for even splines, the integral part is obtained by rounding. when the /// result of rounding is subtracted from the original coordinate, a value /// between -0.5 and 0.5 is obtained which is used for weight generation. /// we have two variants for even_split and the dispatch below. // TODO: there is a problem here: the lower limit for an even spline // is -0.5, which should be rounded *towards* zero, but std::round rounds // away from zero. The same applies to the upper limit, which should // also be rounded towards zero, not away from it. Currently I am working // around the issue by increasing the spline's headroom by 1 for even splines, // but I'd like to be able to use rounding towards zero. It might be argued, // though, that potentially producing out-of-range access by values which // are only just outside the range is cutting it a bit fine and the extra // headroom for even splines makes the code more robust, so accepting the // extra headroom would be just as acceptable as the widened right brace for // some splines which saves checking the incoming coordinate against // the upper limit. The code increasing the headroom is in bspline.h, // in bspline's constructor just befor the call to setup_metrics. template < typename ic_t , typename rc_t > void even_split ( rc_t v , ic_t& iv , rc_t& fv , std::true_type ) { rc_t fl_i = std::round ( v ) ; fv = v - fl_i ; iv = ic_t ( fl_i ) ; } template < typename ic_t , typename rc_t > void even_split ( rc_t v , ic_t& iv , rc_t& fv , std::false_type ) { for ( int d = 0 ; d < vigra::ExpandElementResult < rc_t > :: size ; d++ ) even_split ( v[d] , iv[d] , fv[d] , std::true_type() ) ; } template < typename ic_t , typename rc_t > void even_split ( rc_t v , ic_t& iv , rc_t& fv ) { even_split ( v , iv , fv , is_singular() ) ; } #ifdef USE_VC /// for TinyVectors of Vc SIMD vector types (Vc::Vector, Vc::SimdArray), /// we enable operator+= and operator*= /// This allows us to write concise code which does not need /// to iterate over the TinyVector's elements and bypasses /// vigra's trait system, which we'd otherwise have to supply /// with NumericTraits and PromoteTraits for the SIMD types. template < typename vector_type , int N > TinyVector < typename std::enable_if < Vc::is_simd_vector < vector_type > :: value , vector_type > :: type , N > & operator+= ( TinyVector < vector_type , N > & tv , const TinyVector < vector_type , N > & other ) { for ( int i = 0 ; i < N ; i++ ) tv[i] += other[i] ; return tv ; } template < typename vector_type , int N > TinyVector < typename std::enable_if < Vc::is_simd_vector < vector_type > :: value , vector_type > :: type , N > & operator*= ( TinyVector < vector_type , N > & tv , const TinyVector < vector_type , N > & other ) { for ( int i = 0 ; i < N ; i++ ) tv[i] *= other[i] ; return tv ; } /// now we also need the broadcast operation of TinyVector *= T /// for it all to pan out. 
We refrain from defining the reverse operation /// and the plain op types for now, but we widen the spec a little to /// allow multiplication with another vectorized type, so that mixed /// float/double operations become possible. template < typename vector_type , typename vt2 , int N > TinyVector < typename std::enable_if < Vc::is_simd_vector < vector_type > :: value , vector_type > :: type , N > & operator*= ( TinyVector < vector_type , N > & tv , const vt2 & other ) { for ( int i = 0 ; i < N ; i++ ) tv[i] *= other ; return tv ; } #endif /// a 'weight matrix' is used to calculate a set of weights for a given remainder /// part of a real coordinate. The 'weight matrix' is multipled with a vector /// containing the power series of the given remainder, yielding a set of weights /// to apply to a window of b-spline coefficients. /// /// The routine 'calculate_weight_matrix' originates from vigra. I took the original /// routine BSplineBase::calculateWeightMatrix() from vigra and /// changed it in several ways: /// /// - the spline degree is now a runtime parameter, not a template argument /// /// - the derivative degree is passed in as an additional parameter, directly /// yielding the appropriate weight matrix needed to calculate a b-spline's derivative /// with only a slight modification to the original code /// /// - the code uses my modified bspline basis function which takes the degree as a /// run time parameter instead of a template argument and works with integral /// operands and precalculated values, which makes it very fast, even for high /// spline degrees. bspline_basis() is in basis.h. template < class target_type > MultiArray < 2 , target_type > calculate_weight_matrix ( int degree , int derivative ) { const int order = degree + 1 ; // guard against impossible parameters if ( derivative >= order ) return MultiArray < 2 , target_type >() ; // allocate space for the weight matrix MultiArray < 2 , target_type > res ( order , order - derivative ) ; long double faculty = 1.0 ; for ( int row = 0 ; row < order - derivative ; row++ ) { if ( row > 1 ) faculty *= row ; int x = degree / 2 ; // (note: integer division) // we store to a MultiArray, which is row-major, so storing as we do // places the results in memory in the precise order in which we want to // use them in the weight calculation. // note how we pass x to bspline_basis() as an integer. This way, we pick // a very efficient version of the basis function which only evaluates at // whole numbers. This basis function version does hardly any calculations // but instead relies on precalculated values. see bspline_basis in prefilter.h // note: with large degrees (20+), this still takes a fair amount of time, but // rather seconds than minutes with the standard routine. for ( int column = 0 ; column < order ; ++column , --x ) res ( column , row ) = bspline_basis ( x , degree , row + derivative ) / faculty; } return res; } /// this functor calculates weights for a b-spline or it's derivatives. /// with d == 0, the weights are calculated for plain evaluation. /// Initially I implemented weight_matrix as a static member, hoping the code /// would perform better, but I could not detect significant benefits. Passing /// the derivative to the constructor gives more flexibility and less type /// proliferation. 
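/// To illustrate what the weight generation produces (this is standard b-spline
/// math, merely spelt out for the cubic case): for degree 3 and derivative 0,
/// multiplying the weight matrix with the power series ( 1 , t , t^2 , t^3 ) of
/// a remainder t in [0,1) yields the familiar cubic b-spline weights
///
///   w0 = ( 1 - t )^3 / 6
///   w1 = ( 3*t^3 - 6*t^2 + 4 ) / 6
///   w2 = ( -3*t^3 + 3*t^2 + 3*t + 1 ) / 6
///   w3 = t^3 / 6
///
/// which are subsequently applied to the window of four coefficients, w0
/// belonging to the 'leftmost' coefficient.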
template < typename ele_type > struct bspline_derivative_weights { MultiArray < 2 , ele_type > weight_matrix ; int degree ; int derivative ; int columns ; int rows ; // we need a default constructor to create TinyVectors of this type // note that a default-constructed functor must not be called! bspline_derivative_weights() { } ; /// bspline_derivative_weights' constructor takes the desired b-spline degree /// and the desired derivative, defaulting to 0. bspline_derivative_weights ( int _degree , int _derivative = 0 ) : weight_matrix ( calculate_weight_matrix < ele_type > ( _degree , _derivative ) ) , degree ( _degree ) , derivative ( _derivative ) , columns ( _degree + 1 ) , rows ( _degree + 1 - _derivative ) { } ; /// operator() deposits weights for the given delta at the location 'result' /// points to. target_type and delta_type may be fundamentals or simdized types. template < class target_type , class delta_type > void operator() ( target_type* result , const delta_type & delta ) const { target_type power ( delta ) ; auto factor_it = weight_matrix.begin() ; auto end = weight_matrix.end() ; // the result is initialized with the first row of the 'weight matrix'. // We save ourselves multiplying it with delta^0. for ( int c = 0 ; c < columns ; c++ ) { result[c] = *factor_it ; ++factor_it ; } if ( degree ) { for ( ; ; ) { for ( int c = 0 ; c < columns ; c++ ) { result[c] += power * *factor_it ; ++factor_it ; } if ( factor_it == end ) { // avoid next multiplication if exhausted, break now break ; } // otherwise produce next power(s) of delta(s) power *= target_type ( delta ) ; } } } } ; /// class evaluator encodes evaluation of a B-spline. Technically, a vspline::evaluator /// is a vspline::unary_functor, which has the specific capability of evaluating a /// specific uniform b-spline. This makes it a candidate to be passed to the functions /// in transform.h, like remap() and transform(), and it also makes it suitable for /// vspline's functional constructs like chaining, mapping, etc. /// /// While creating and using vspline::evaluators is simple enough, especially from /// vspline::bspline objects, there are also factory functions producing objects capable /// of evaluating a b-spline. These objects are wrappers around a vspline::evaluator, /// please see the factory functions make_evaluator() and make_safe_evaluator() at the /// end of this file. /// /// If you don't want to concern yourself with the details, the easiest way is to /// have a bspline object handy and use one of the factory functions, assigning the /// resulting functor to an auto variable: /// /// // given a vspline::bspline object 'bspl' /// // create an object which can evaluate the spline /// auto ev = vspline::make_safe_evaluator ( bspl ) ; /// // which can be used like this: /// auto value = ev ( some_real_coordinate ) ; /// /// The evaluation relies on 'braced' coefficients, as they are normally provided by /// a vspline::bspline object (the exception being bspline objects created with /// UNBRACED or MANUAL strategy). While the most general constructor will accept /// a raw pointer to coefficients (enclosed in the necessary 'brace'), this will rarely /// be used, and an evaluator will be constructed from a bspline object. In the most /// trivial case there are only two things which need to be done: /// /// The specific type of evaluator has to be established by providing the relevant template /// arguments. Here, we need two types: the 'coordinate type' and the 'value type'. 
/// /// - The coordinate type is encoded as a vigra::TinyVector of some real data type - if you're /// doing image processing, the typical type would be a vigra::TinyVector < float , 2 >. /// fundamental real types are also accepted (for 1D splines) /// /// - The value type has to be either an elementary real data type such as 'float' or 'double', /// or a vigra::TinyVector of such an elementary type. Other data types which can be handled /// by vigra's ExpandElementResult mechanism should also work. When processing colour images, /// your value type would typically be a vigra::TinyVector < float , 3 >. /// /// Note that class evaluator operates with 'native' spline coordinates, which run with /// the coefficient array's core shape, so typically from 0 to M-1 for a 1D spline over /// M values. Access with different coordinates is most easily done by 'chaining' class /// evaluator objects to other vspline::unary_functor objects providing coordinate /// translation, see unary_functor.h, map.h and domain.h. /// /// While the template arguments specify coordinate and value type for unvectorized /// operation, the types for vectorized operation are inferred from them, using vspline's /// vector_traits mechanism. The width of SIMD vectors to be used has to be established. /// This is not mandatory - if omitted, a default value is picked. /// /// With the evaluator's type established, an evaluator of this type can be constructed by /// passing a vspline::bspline object to the constructor. Naturally, the bspline object has /// to contain data of the same value type, and the spline has to have the same number of /// dimensions as the coordinate type. Alternatively, coefficients can be passed in as a /// pointer into a field of suitably braced coefficients. /// /// I have already hinted at the evaluation process used, but here it is again in a nutshell: /// The coordinate at which the spline is to be evaluated is split into it's integral part /// and a remaining fraction. The integral part defines the location where a window from the /// coefficient array is taken, and the fractional part defines the weights to use in calculating /// a weighted sum over this window. This weighted sum represents the result of the evaluation. /// Coordinate splitting is done with the method split(), which picks the appropriate variant /// (different code is used for odd and even splines) /// /// The generation of the weights to be applied to the window of coefficients is performed /// by employing the weight functors above. What's left to do is to bring all the components /// together, which happens in class evaluator. The workhorse code in the subclass _eval /// takes care of performing the necessary operations recursively over the dimensions of the /// spline. /// /// a vspline::evaluator is technically a vspline::unary_functor. This way, it can be directly /// used by constructs like vspline::chain and has a consistent interface which allows code /// using evaluators to query it's specifics. Since evaluation uses no conditionals on the /// data path, the whole process can be formulated as a set of templated member functions /// using vectorized types or unvectorized types, so the code itself is vector-agnostic. /// This makes for a nicely compact body of code inside class evaluator, at the cost of /// having to provide a bit of collateral code to make data access syntactically uniform. /// /// There is a variety of overloads of class evaluator's eval() method available. 
Some of /// these overloads go beyond the 'usual' eval routines, like evaluation with known weights, /// sidestepping the weight generation from coordinates. /// /// The evaluation strategy is to have all dependencies of the evaluation except for the actual /// coordinates taken care of by the constructor - and immutable for the evaluator's lifetime. /// The resulting object has no state which is modified after construction, making it thread-safe. /// It also constitutes a 'pure' functor in a functional-programming sense, because it has /// no mutable state and no side-effects, as can be seen by the fact that the eval methods /// are all marked const. /// /// The eval() overloads form a hierarchy, as evaluation progresses from accepting unsplit real /// coordinates to offsets and weights. This allows calling code to handle parts of the /// delegation hierarchy itself, only using an evaluator at a specific level. /// /// By providing the evaluation in this way, it becomes easy for calling code to integrate /// the evaluation into more complex functors. Consider, for example, code /// which generates coordinates with a functor, then evaluates a b-spline at these coordinates, /// and finally subjects the resultant values to some postprocessing. All these processing /// steps can be bound into a single functor, and the calling code can be reduced to polling /// this functor until it has provided the desired number of output values. /// /// While the 'unspecialized' evaluator will try and do 'the right thing' by using general /// purpose code fit for all eventualities, for time-critical operation there is a /// specialization which can be used to make the code faster: /// /// - template argument 'specialize' can be set to 0 to forcibly use (more efficient) nearest /// neighbour interpolation, which has the same effect as simply running with degree 0 but avoids /// code which isn't needed for nearest neighbour interpolation (like the application of weights, /// which is futile under the circumstances, the weight always being 1.0). /// specialize can also be set to 1 for explicit n-linear interpolation. Any other value will /// result in the general-purpose code being used. /// /// Note that, contrary to my initial implementation, all forms of coordinate mapping were /// removed from class evaluator. The 'mapping' which is left is, more aptly, called /// 'splitting', since the numeric value of the incoming coordinate is never modified. /// Folding arbitrary coordinates into the spline's defined range now has to be done /// externally, typically by wrapping class evaluator together with some coordinate /// modification code into a combined vspline::unary_functor. map.h provides code for /// common mappings, see there. Arbitrary manipulation of incoming and outgoing data /// can be realized by using vspline::chain in unary_functor.h, see there. /// /// Note how the default number of vector elements is fixed by picking the number of ele_type /// which vspline::vector_traits considers appropriate. There should rarely be a need to /// choose a different number of vector elements: evaluation will often be the most /// computationally intensive part of a processing chain, and therefore this choice is /// sensible. But it's not mandatory. Just keep in mind that when building processing /// pipelines, all their elements must use the same vectorization width. // TODO: evaluator uses a great many typedefs which is a bit over the top, even though // it was helpful in the development process. 
template < typename _coordinate_type , // nD real coordinate typename _value_type , // type of coefficient/result // nr. of vector elements int _vsize = vspline::vector_traits < _value_type > :: size , // per default, don't specialize for degree 0 or 1 spline int specialize = -1 > class evaluator : public unary_functor < _coordinate_type , _value_type , _vsize > { public: // we want to access facilites of the base class (vspline::unary_functor<...>) // so we use a typedef for the base class. typedef unary_functor < _coordinate_type , _value_type , _vsize > base_type ; typedef _value_type value_type ; // === base_type::out_type typedef _coordinate_type coordinate_type ; // === base_type::in_type // we are relying on base_type to provide several types: typedef typename base_type::in_ele_type rc_type ; // elementary type of a coordinate typedef typename base_type::out_ele_type ele_type ; // elementary type of value_type using base_type::dim_in ; using base_type::dim_out ; using base_type::vsize ; enum { dimension = base_type::dim_in } ; enum { level = dimension - 1 } ; enum { channels = base_type::dim_out } ; // types for nD integral indices, these are not in base_type typedef int ic_type ; // we're only using int for indices typedef vigra::TinyVector < ic_type , dimension > nd_ic_type ; typedef vigra::TinyVector < rc_type , dimension > nd_rc_type ; /// view_type is used for a view to the coefficient array typedef MultiArrayView < dimension , value_type > view_type ; /// type used for nD array coordinates, array shapes typedef typename view_type::difference_type shape_type ; typedef vigra::TinyVector < int , dimension > derivative_spec_type ; typedef typename MultiArrayView < 1 , ic_type > :: const_iterator offset_iterator ; #ifdef USE_VC // types for vectorized evaluation. using typename base_type::in_v ; using typename base_type::in_ele_v ; using typename base_type::out_v ; using typename base_type::out_ele_v ; typedef typename base_type::out_ele_v ele_v ; typedef typename vector_traits < ic_type , vsize > :: ele_v ic_v ; typedef typename vector_traits < nd_ic_type , vsize > :: type nd_ic_v ; typedef typename vector_traits < nd_rc_type , vsize > :: type nd_rc_v ; #endif // USE_VC private: // Initially I was using a template argument for this flag, but it turned out // that using a const bool set at construction time performed just as well. // Since this makes using class evaluator easier, I have chosen to go this way. const bool even_spline_degree ; ///< flag containing the 'evenness' of the spline const ele_type * const cf_ebase ; ///< coefficient base adress in terms of elementary type const shape_type cf_estride ; ///< strides in terms of elementary type MultiArray < 1 , ic_type > eoffsets ; ///< offsets in terms of elementary type vigra::TinyVector < bspline_derivative_weights < ele_type > , dimension > wgt ; const int spline_degree ; const int spline_order ; const int window_size ; int min_offset , max_offset ; bool have_min_max_offset ; public: /// split function. This function is used to split incoming real coordinates /// into an integral and a remainder part, which are used at the core of the /// evaluation. selection of even or odd splitting is done via the const bool /// flag 'even_spline_degree' My initial implementation had this flag as a /// template argument, but this way it's more flexible and there seems to /// be no runtime penalty. This method delegates to the free function templates /// even_split and odd_split, respectively, which are defined above class evaluator. 
template < class IT , class RT > void split ( const RT& input , IT& select , RT& tune ) const { if ( even_spline_degree ) even_split < IT , RT > ( input , select , tune ) ; else odd_split < IT , RT > ( input , select , tune ) ; } const int & get_order() const { return spline_order ; } const int & get_degree() const { return spline_degree ; } const shape_type & get_estride() const { return cf_estride ; } /// this constructor is the most flexible variant and will ultimately be called /// by all other constructor overloads. This constructor will not usually be /// called directly - most code will use the overloads taking vspline::bspline /// objects. This constructor takes four arguments: /// /// - A pointer into the coefficient array, expressed as a pointer to the elementary /// type of the spline's value_type. So if the spline contains float RGB pixels, /// the elementary type is plain float. The pointer is expected to point to the /// coefficient which coincides with the origin of the data the spline was built /// over. Note that usually this is *not* simply the beginning of the coefficient /// array, since the coeffcicent array is 'braced' - surrounded by additional /// coefficients to allow evaluation without bounds checking. When generating /// this pointer from a bspline object, what's passed here is the base address /// of the coefficient array's 'core', which is the part of the coeffcicent /// array corresponding 1:1 with the data the spline was built over. /// /// - The stride of the data, expressed in units of the elementary type. /// If the data are float RGB pixels and held in a MultiArray, this will be /// thrice the MultiArray's stride. /// /// - The degree of the spline (3 == cubic spline) /// /// - Specification of the desired derivatives of the spline, /// defaults to 0 (plain evaluation). /// /// The calling code must not(!) deallocate the coefficients while the evaluator is /// in use. The coefficients are accessed via the pointer passed in, they are not /// copied. /// /// Note that if you have an array of coefficients which is not braced, you can still /// construct an evaluator from it, but it will be your responsibility to avoid /// out-of-bounds access to coefficients: there will be no safeguard, the evaluation /// code will not check incoming coordinates for validity, and even seemingly safe /// operations near the margins may fail because the evaluation code makes assumptions /// about the size of the brace which may not be obvious. If you want to go down this /// path, please refer to the bracing-related code in class bspline. There are two /// functions, get_left_brace_size() and get_right_brace_size(), which calculate /// the sizes of the braces the evaluation code expects. The easiest path to take /// would be to obtain the brace sized for your specific spline type and extract /// a MultiArrayView to the central part of your coefficient array which leaves /// a margin of just this size: /// /// // given a coefficient array cf in a MultiArrayView /// // and the left and right brace size lbsz and rbsz /// /// auto core = cf.subarray ( lbsz , cf.shape() - rbsz ) ; /// /// then you can pass core.data() to this constructor and evaluate with coordinates /// modified to reflect the smaller size of the coefficient subarray you're using /// instead of the 'full' coefficient array. 
There are situations where such a course /// of action is sensible, for example: you might have image data in memory which you /// want to display quickly using bilinear interpolation, not caring if a bit of the /// margin is not shown. Since you are using bilinear interpolation, you need no /// prefiltering and can access the image data directly. /// /// Note how this constructor can be used to create an evaluator for /// a 'shifted' spline by simply passing a spline_degree which is different /// to what was used for prefiltering. This is, of course, only safe if a /// lower degree is used or the coefficient array has sufficient headroom /// to avoid out-of-bounds access. See the discussion of shifting in the /// documentation of the constructor overload taking a bspline object. /// /// So, in a nutshell, if you use this constructor variant, you are supposed to /// know what you're doing. Most user code will use the 'safe' path and come from /// a bspline object. evaluator ( const ele_type * const _cf_ebase , const shape_type & _cf_estride , int _spline_degree , derivative_spec_type _derivative = derivative_spec_type ( 0 ) ) : cf_ebase ( _cf_ebase ) , cf_estride ( _cf_estride ) , spline_degree ( _spline_degree ) , even_spline_degree ( ! ( _spline_degree & 1 ) ) , spline_order ( _spline_degree + 1 ) , wgt { bspline_derivative_weights < ele_type > ( _spline_degree , 0 ) } , window_size ( std::pow ( _spline_degree + 1 , int(dimension) ) ) { // if any derivatives are used, reinitalize the weight functors where // a derivative other than 0 is requested. TODO potentially slightly wasteful, // if no degree-0 evaluation is done or duplicate weight functors are produced. // But setting up the weight functors is only expensive for high-degree splines, // so I consider this a corner case and ignore it for now. for ( int d = 0 ; d < dimension ; d++ ) { if ( _derivative[d] ) { wgt[d] = bspline_derivative_weights < ele_type > ( _spline_degree , _derivative[d] ) ; } } // The evaluation forms a weighted sum over a window into the coeffcicent array. // The sequence of offsets we calculate here is the set of pointer differences // from the first element in that window to all elements in the window. It's // another way of coding this window, where all index calculations have already // been done beforehand rather than performing them during the traversal of the // window by means of stride/shape arithmetic. Coding the window in this fashion // also makes it easy to vectorize the code. eoffsets = MultiArray < 1 , ptrdiff_t > ( window_size ) ; // we want to iterate over all nD indexes in a window which has equal extent // of spline_order in all directions (like the reconstruction filter's kernel), // relative to a point which is spline_degree/2 away from the origin along // every axis. This sounds complicated but is really quite simple: For a cubic // b-spline over 2D data we'd get // (-1,-1) , (0,-1), (1,-1), (2,-1) // (-1, 0) , (0, 0), (1, 0), (2, 0) // (-1, 1) , (0, 1), (1, 1), (2, 1) // (-1, 2) , (0, 2), (1, 2), (2, 2) // for the indexes, which are subsequently multiplied with the strides and // summed up to obtain 1D offsets instead of nD coordinates. So if the coefficient // array has strides (10,100) and the coefficients are single-channel, the sequence // of offsets generated is // -110, -100, -90, -80, // which is -1 * 10 + -1 * 100, 0 * 10 + -1 * 100 ... 
// -10, 0, 10, 20, // 90, 100, 110, 120, // 190, 200, 210, 220 shape_type window_shape ( spline_order ) ; vigra::MultiCoordinateIterator mci ( window_shape ) ; // we want to save the offsets in 'eoffsets' auto target = eoffsets.begin() ; for ( int i = 0 ; i < window_size ; i++ ) { // offsets are calculated by multiplying indexes with the coefficient array's // strides and summing up. So now we have offsets instead of nD indices, and // using these offsets saves us index calculations during evaluation. // Note how we subtract spline_degree/2 to obtain indexes which are relative // to the window's center. *target = sum ( ( *mci - spline_degree / 2 ) * cf_estride ) ; ++mci ; ++target ; } } ; /// construct an evaluator from a bspline object /// /// when using the higher-level interface to vspline's facilities - via class /// bspline - the bspline object provides the coefficients. The derivative /// specification is passed in just as for the general constructor. Note that /// the derivative specification can be individually chosen for each axis. /// This constructor delegates to the one above, 'unpacking' the information /// in the bspline object to provide the 'raw' pointer and stride it needs. /// /// Note how the pointer to the bspline object's 'core' is cast to ele_type, /// the elementary type of the spline's value_type, and how the stride is also /// modified to refer to ele_type for class evaluator's internal use. /// /// This constructor optionally takes a third argument 'shift', which /// indicates that the coefficient data in the spline should be interpreted /// as if they were coefficients of a different-degree spline, where the /// value of 'shift' gives the value of this difference. This may or may /// not be technically possible. User code is advised to check the /// feasability of a shift by calling the spline's shiftable() method /// before constructing the evaluator - this constructor will throw an /// exception if the spline can't be used with the requested shift. /// creating an evaluator with shift has the same effect as shifting /// the spline and then creating en evaluator from it, but it does not /// change the bspline object, which is often desirable. evaluator ( const bspline < value_type , dimension > & bspl , derivative_spec_type _derivative = derivative_spec_type ( 0 ) , int shift = 0 ) : evaluator ( (ele_type*) ( bspl.core.data() ) , channels * bspl.core.stride() , bspl.spline_degree + shift , _derivative ) { if ( bspl.spline_degree > 1 && ! bspl.braced ) throw not_supported ( "for spline degree > 1: evaluation needs braced coefficients" ) ; // while the general constructor above has already been called, // we haven't yet made certain that a requested shift has resulted // in a valid evaluator. We check this now and throw an exception // if the shift was illegal. if ( ! bspl.shiftable ( shift ) ) throw not_supported ( "insufficient frame size. the requested shift can not be performed." ) ; // #define ASSURE_IN_BOUNDS // should not be defined in production code // as we are creating the evaluator from a bspline object, // we can make sure that evalation will not access the spline // outside it's specification. The code we use to calculate // the bracing makes sure we are on the safe side even if // access is just out of bounds. 
The following bit of code // doublechecks that this is the case by looking at the offsets // at which the spline would be accessed when doing an evaluation // just outside the bounds, emitting a message if this would // land outside the coefficient array. // The warning criterion is even narrower, it warns also if // there are superfluous coefficients which will not be used // even if the spline is evaluated at these extreme positions, // hence the test for !=, instead of >, or <, respectively #ifdef ASSURE_IN_BOUNDS min_offset = (ele_type*) ( bspl.container.data() ) - cf_ebase ; max_offset = (ele_type*) ( bspl.container.data() + sum ( bspl.container.stride() * ( bspl.container.shape() - shape_type(1) ) ) ) - cf_ebase ; have_min_max_offset = true ; auto it_first_ofs = eoffsets.begin() ; auto first_ofs = *it_first_ofs ; auto low = bspl.lower_limit() - .0001 ; shape_type ls ; split ( low , ls , low ) ; int lofs = sum ( ls * bspl.core.stride() ) ; lofs *= channels ; lofs += first_ofs ; if ( lofs < min_offset ) std::cerr << "WARNING: min_ofs " << min_offset << " low ofs " << lofs << " " << bc_name[bspl.bcv[0]] << " " << bspl.spline_degree << std::endl ; auto it_last_ofs = eoffsets.end() ; it_last_ofs-- ; auto last_ofs = *it_last_ofs ; auto high = bspl.upper_limit() + .0001 ; shape_type hs ; split ( high , hs , high ) ; int hofs = sum ( hs * bspl.core.stride() ) ; hofs *= channels ; hofs += last_ofs ; if ( hofs > max_offset ) std::cerr << "WARNING: max_ofs " << max_offset << " high ofs " << hofs << " " << bc_name[bspl.bcv[0]] << " " << bspl.spline_degree << " last_ofs " << last_ofs << " high " << high << std::endl ; #endif } ; /// obtain_weights calculates the weights to be applied to a section /// of the coefficients from the fractional parts of the split coordinates. /// What is calculated here is the evaluation of the spline's basis function /// at dx, dx+1 , dx+2..., but doing it naively is computationally expensive, /// as the evaluation of the spline's basis function at arbitrary values has /// to look at the value, find out the right interval, and then calculate /// the value with the appropriate function. But we always have to calculate /// the basis function for *all* intervals anyway, and the method used here /// performs this tasks efficiently using a vector/matrix multiplication. /// If the spline is more than 1-dimensional, we need a set of weights for /// every dimension. The weights are accessed through a 2D MultiArrayView. /// For every dimension, there are spline_order weights. template < typename nd_rc_type , typename weight_type > void obtain_weights ( const MultiArrayView < 2 , weight_type > & weight , const nd_rc_type& c ) const { auto ci = c.cbegin() ; for ( int axis = 0 ; axis < dimension ; ++ci , ++axis ) wgt[axis] ( weight.data() + axis * spline_order , *ci ) ; } /// obtain weight for a single axis template < typename rc_type , typename weight_type > void obtain_weights ( weight_type * p_weight , const int & axis , const rc_type& c ) const { wgt[axis] ( p_weight , c ) ; } private: // next we have collateral code which we keep private in class evaluator. // TODO some of the above could also be private. // to be able to use the same code to access the coefficients, no matter if // the operation is vectorized or not, we provide a set of 'load' functions // which encapsulate the memory access. This allows us to uniformly handle // vectorized and unvectorized data. 
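// As an illustrative example: with value_type == TinyVector < float , 3 > and an
// offset o (expressed in floats), the unvectorized load below reads the three
// consecutive floats at mem [ o ] , mem [ o + 1 ] , mem [ o + 2 ] into the target
// TinyVector, while the vectorized overloads further down perform a SIMD gather
// of vsize such values, using a whole vector of offsets at once.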
/// load function for single T template < typename T > static inline void load ( T & target , const T * const mem , int index ) { target = mem [ index ] ; } /// load function for TinyVectors of T template < typename T , int N > static inline void load ( TinyVector < T , N > & target , const T * const mem , int index ) { for ( int i = 0 ; i < N ; i++ ) target[i] = mem [ index + i ] ; } /// load function for std::complex template < typename T > static inline void load ( std::complex & target , const T * const mem , int index ) { target = std::complex ( mem [ index ] , mem [ index + 1 ] ) ; } // TODO: can we generalize load for aggregates? #ifdef USE_VC /// load function for single SIMD types. Here, instead of the single index /// in the load function above, we accept gather indexes for a SIMD gather /// operation. template < typename vector_type > typename std::enable_if < Vc::is_simd_vector < vector_type > :: value > :: type static inline load ( vector_type & target , const typename vector_type::value_type * mem , const typename vector_type::IndexType & indexes ) { target.gather ( mem , indexes ) ; } /// load function for TinyVectors of SIMD types template < typename vector_type , int N > typename std::enable_if < Vc::is_simd_vector < vector_type > :: value > :: type static inline load ( TinyVector < vector_type , N > & target , const typename vector_type::value_type * mem , const typename vector_type::IndexType & indexes ) { for ( int e = 0 ; e < N ; e++ ) target[e].gather ( mem + e , indexes ) ; } #endif /// _eval is the workhorse routine and implements the recursive arithmetic /// needed to evaluate the spline. First the weights for the current dimension /// are obtained from the weights object passed in. Once the weights are known, /// they are successively multiplied with the results of recursively calling /// _eval for the next lower dimension and the products are summed up to produce /// the return value. The scheme of using a recursive evaluation has several /// benefits: - it needs no explicit intermediate storage of partial sums /// (uses stack instead) and it makes the process dimension-agnostic in an /// elegant way. therefore, the code is also thread-safe. note that this routine /// is used for operation on braced splines, with the sequence of offsets to be /// visited fixed at the evaluator's construction. template < class dtype , int level , class wtype , class otype > struct _eval { dtype operator() ( const ele_type * const & ebase , const otype & origin , offset_iterator & ofs , const MultiArrayView < 2 , wtype > & weight ) const { dtype sum = dtype() ; ///< to accumulate the result dtype subsum ; ///< to pick up the result of the recursive call for ( int i = 0 ; i < weight.shape ( 0 ) ; i++ ) { subsum = _eval < dtype , level - 1 , wtype , otype >() ( ebase , origin , ofs , weight ) ; subsum *= weight [ vigra::Shape2 ( i , level ) ] ; sum += subsum ; } return sum ; } } ; /// at level 0 the recursion ends, now we finally apply the weights for axis 0 /// to the window of coefficients. Note how ofs is passed in per reference. This /// looks wrong, but it's necessary: When, in the course of the recursion, the /// level 0 routine is called again, it needs to access the next bunch of /// spline_order coefficients. /// /// Just incrementing the reference saves us incrementing higher up. /// This is the point where we access the spline's coefficients. 
Since _eval works /// for vectorized and unvectorized code alike, this access is coded as a call to /// 'load' which provides a uniform syntax for the memory access. The implementation /// of the load routines used here is just above. template < class dtype , class wtype , class otype > struct _eval < dtype , 0 , wtype , otype > { dtype operator() ( const ele_type * const & ebase , const otype & origin , offset_iterator & ofs , const MultiArrayView < 2 , wtype > & weight ) const { dtype sum = dtype() ; dtype help ; for ( int i = 0 ; i < weight.shape ( 0 ) ; i++ ) { load ( help , ebase , origin + *ofs ) ; help *= weight [ vigra::Shape2 ( i , 0 ) ] ; sum += help ; ++ofs ; } return sum ; } } ; /// specialized code for degree-1 b-splines, aka linear interpolation /// here, there is no gain to be had from working with precomputed per-axis weights, /// the weight generation is trivial. So the specialization given here is faster: template < class dtype , int level , class wtype , class otype > struct _eval_linear { dtype operator() ( const ele_type * const & ebase , ///< base adresses of components const otype& origin , // offsets to evaluation window origins offset_iterator & ofs , // offsets to coefficients inside this window const wtype& tune ) const // weights to apply { dtype sum ; ///< to accumulate the result dtype subsum ; sum = _eval_linear < dtype , level - 1 , wtype , otype >() ( ebase , origin , ofs , tune ) ; sum *= ( 1 - tune [ level ] ) ; subsum = _eval_linear < dtype , level - 1 , wtype , otype >() ( ebase , origin , ofs , tune ); subsum *= tune [ level ] ; sum += subsum ; return sum ; } } ; /// again, level 0 terminates the recursion, again accessing the spline's /// coefficients with the 'load' function defined above. template < class dtype , class wtype , class otype > struct _eval_linear < dtype , 0 , wtype , otype > { dtype operator() ( const ele_type * const & ebase , ///< base adresses of components const otype& origin , // offsets to evaluation window origins offset_iterator & ofs , // offsets to coefficients inside this window const wtype& tune ) const // weights to apply { dtype sum , help ; auto o1 = *ofs ; ++ofs ; auto o2 = *ofs ; ++ofs ; load ( sum , ebase , origin + o1 ) ; sum *= ( 1 - tune [ 0 ] ) ; load ( help , ebase , origin + o2 ) ; help *= tune [ 0 ] ; sum += help ; return sum ; } } ; public: /// the 'innermost' eval routine is called with offset(s) and weights. This /// routine is separate because it is used from outside (namely by grid_eval). /// In this final delegate we call the workhorse code in class _eval // TODO: what if the spline is degree 0 or 1? for these cases, grid_eval // should not pick this general-purpose routine template < class IC , class ELE , class OUT > void eval ( const IC& select , // offsets to lower corners of the subarrays const MultiArrayView < 2 , ELE > & weight , // vectorized weights OUT & result ) const { // we need an instance of this iterator because it's passed into _v_eval by reference // and manipulated by the code there: offset_iterator ofs = eoffsets.begin() ; // now we can call the recursive _v_eval routine yielding the result result = _eval < OUT , level , ELE , IC >() ( cf_ebase , select , ofs , weight ) ; } /// 'penultimate' eval starts from (an) offset(s) to (a) coefficient window(s); here /// the nD integral index(es) to the coefficient window(s) has/have already been /// 'condensed' into (a) 1D offset(s) into the coefficient array's memory. 
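/// As a sketch of this 'condensation': for an nD integral coordinate
/// 'select', the offset is obtained (see the unsplit-coordinate eval
/// further down) as
///
///   origin = select[0] * cf_estride[0] + select[1] * cf_estride[1] + ...
///
/// where cf_estride holds the coefficient array's strides expressed in
/// units of the elementary type, so that, for example, with three-channel
/// float pixels a step of one along the fastest axis advances by three
/// floats in memory.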
/// Here we have the specializations affected by the template argument 'specialize' /// which activates more efficient code for degree 0 (nearest neighbour) and degree 1 /// (linear interpolation) splines. I draw the line here; one might add further /// specializations, but from degree 2 onwards the weights are reused several times /// so looking them up in a small table (as the general-purpose code for unspecialized /// operation does) should be more efficient (TODO test). /// /// we have three variants, depending on 'specialize'. first is the specialization /// for nearest-neighbour interpolation, which doesn't delegate further, since the /// result can be obtained directly by gathering from the coefficients: // dispatch for neraest-neighbour interpolation (degree 0) template < class IC , class NDRC , class OUT > void eval ( const IC& select , // offsets to coefficient windows const NDRC& tune , // fractional parts of the coordinates OUT & result , // target std::integral_constant < int , 0 > ) const { load ( result , cf_ebase , select ) ; } /// eval dispatch for linear interpolation (degree 1) template < class IC , class NDRC , class OUT > void eval ( const IC& select , // offsets to coefficient windows const NDRC& tune , // fractional parts of the coordinates OUT & result , // target std::integral_constant < int , 1 > ) const { offset_iterator ofs = eoffsets.begin() ; // we move here from coordinates to weights. First we derive // component_t, which is either OUT itself for single channel // value_types or OUT's value_type if it is a TinyVector, as // would be the case for multichannel data. typedef typename std::conditional < is_singular < OUT > :: value , vigra::TinyVector < OUT , 1 > , OUT > :: type :: value_type component_t ; // for further processing, we wrap component_t in a TinyVector, // even if we're using 1D coordinates. The resulting type, weight_t, // has the same structure as 'tune', but it has the value_type's // fundametal type. typedef vigra::TinyVector < component_t , dim_in > weight_t ; result = _eval_linear < OUT , level , weight_t , IC >() ( cf_ebase , select , ofs , weight_t(tune) ) ; } // eval dispatch for arbitrary spline degrees template < int arbitrary_spline_degree , class IC , class NDRC , class OUT > void eval ( const IC& select , // offset(s) to coefficient window(s) const NDRC& tune , // fractional parts of the coordinates OUT & result , // target std::integral_constant < int , arbitrary_spline_degree > ) const { // we move here from coordinates to weights. First we derive // component_t, which is either OUT itself for single channel // value_types or OUT's value_type if it is a TinyVector, as // would be the case for multichannel data. typedef typename std::conditional < is_singular < OUT > :: value , vigra::TinyVector < OUT , 1 > , OUT > :: type :: value_type component_t ; // now 'weight' is a 2D vigra::MultiArray of this singular type: MultiArray < 2 , component_t > weight ( vigra::Shape2 ( spline_order , dimension ) ) ; // get a set of weights for the real remainder of the coordinate obtain_weights ( weight , tune ) ; // and delegate to the variant taking the weights eval ( select , weight , result ) ; } /// This variant of eval() takes unsplit coordinates. This is the eval variant which /// implements what is 'expected' of a vspline::unary_functor instantiated with the /// given input and output type, while the variants above, to which this one /// delegates, are additional entry points specific to class evaluator. 
Of course, /// the way class unary_functor is set up, the routines that a class derived from /// unary_functor provides is arbitrary, but the 'expectation' is conventional. /// /// depending on 'specialize' we dispatch to one of the three specializations above. /// /// typewise, we have a wide scope here: IN may be a singular fundamental, a singular /// simdized type or a TinyVector of either. Same holds true for OUT. To be able to /// handle the call sequence consistently, we transform singular types to TinyVectors /// of one element. Doing so, a 'coordinate' is always recognizable as such by the /// fact that it is a TinyVector. 'condensed' coordinates (aka offsets), on the other /// hand, are not, so we can tell even a 1D coordinate from an offset. template < class IN , class OUT > void eval ( const IN & input , // number of dimensions * coordinate vectors OUT & result ) const // number of channels * value vectors { // type IN may be singular or a TinyVector, we want a TinyVector, // possibly with only one element, so we 'wrap' input, so that we have // a definite TinyVector, containing 'input', the coordinate where we // want to evaluate the spline. auto _input = wrap ( input ) ; typedef decltype ( _input ) nd_rc_t ; typedef typename nd_rc_t::value_type rc_t ; #ifdef USE_VC // deduce an SIMD index type with vsize vector elements typedef typename vspline::vector_traits < ic_type , vsize > :: type ic_v ; // produce nD index type, depending on incoming coordinate // being a simdized type or not typedef typename std::conditional < Vc::is_simd_vector < rc_t > :: value , vigra::TinyVector < ic_v , dimension > , vigra::TinyVector < ic_type , dimension > > :: type nd_ic_t ; #else // without Vc, there are no alternatives. typedef vigra::TinyVector < ic_type , dimension > nd_ic_t ; #endif // two variables of these two types take the result of splitting 'input' // into it's integral and remainder parts nd_ic_t select ; nd_rc_t tune ; split ( _input , select , tune ) ; // now we define the type for a single component of the integral type. // This is the target for converting the nD index(es) to (an) offset(s) typedef typename nd_ic_t::value_type IC ; // condense nD index(es) into offset(s) by multiplying with the per-axis // strides and summing up. Note how we use 'expanded' strides which express // the strides in terms of the elementary type rather than value_type itself, // which may be a TinyVector of the elementary type. IC origin = select[0] * ic_type ( cf_estride [ 0 ] ) ; for ( int d = 1 ; d < dimension ; d++ ) origin += select[d] * ic_type ( cf_estride [ d ] ) ; // pass on to eval overload taking the offset, dispatching on 'specialize' eval ( origin , tune , result , std::integral_constant < int , specialize > () ) ; } } ; namespace detail { /// helper object to create a type-erased vspline::evaluator for /// a given bspline object. The evaluator is specialized to the /// spline's degree, so that degree-0 splines are evaluated with /// nearest neighbour interpolation, degree-1 splines with linear /// interpolation, and all other splines with general b-spline /// evaluation. The resulting vspline::evaluator is 'grokked' to /// erase it's type to make it easier to handle on the receiving /// side. 
template < typename spline_type , typename rc_type , int _vsize > struct build_ev { vspline::grok_type < bspl_coordinate_type < spline_type , rc_type > , bspl_value_type < spline_type > , _vsize > operator() ( const spline_type & bspl ) { typedef bspl_coordinate_type < spline_type , rc_type > crd_t ; typedef bspl_value_type < spline_type > value_t ; if ( bspl.spline_degree == 0 ) return vspline::grok ( vspline::evaluator < crd_t , value_t , _vsize , 0 > ( bspl ) ) ; else if ( bspl.spline_degree == 1 ) return vspline::grok ( vspline::evaluator < crd_t , value_t , _vsize , 1 > ( bspl ) ) ; else return vspline::grok ( vspline::evaluator < crd_t , value_t , _vsize , -1 > ( bspl ) ) ; } } ; /// helper object to create a vspline::mapper object with gate types /// matching a bspline's boundary conditions and extents matching the /// spline's lower and upper limits. Please note that these limits /// depend on the boundary conditions and are not always simply /// 0 and N-1, as they are for, say, mirror boundary conditions. /// see lower_limit() and upper_limit() in vspline::bspline. /// /// gate types are inferred from boundary conditions like this: /// PERIODIC -> periodic_gate /// MIRROR, REFLECT, SPHERICAL -> mirror_gate /// all other boundary conditions -> clamp_gate /// /// The mapper object is chained to an evaluator, resulting in /// a functor providing safe access to the evaluator. The functor /// is subsequently 'grokked' to produce a uniform return type. template < int level , typename spline_type , typename rc_type , int _vsize , class ... gate_types > struct build_safe_ev { vspline::grok_type < bspl_coordinate_type < spline_type , rc_type > , bspl_value_type < spline_type > , _vsize > operator() ( const spline_type & bspl , gate_types ... gates ) { // find out the spline's lower and upper limit for the current level rc_type lower ( bspl.lower_limit ( level ) ) ; rc_type upper ( bspl.upper_limit ( level ) ) ; // depending on the spline's boundary condition for the current // level, construct an appropriate gate object and recurse to // the next level switch ( bspl.bcv [ level ] ) { case vspline::PERIODIC: { auto gt = vspline::periodic < rc_type , _vsize > ( lower , upper ) ; return build_safe_ev < level - 1 , spline_type , rc_type , _vsize , decltype ( gt ) , gate_types ... >() ( bspl , gt , gates ... ) ; break ; } case vspline::MIRROR: case vspline::REFLECT: case vspline::SPHERICAL: { auto gt = vspline::mirror < rc_type , _vsize > ( lower , upper ) ; return build_safe_ev < level - 1 , spline_type , rc_type , _vsize , decltype ( gt ) , gate_types ... >() ( bspl , gt , gates ... ) ; break ; } default: { auto gt = vspline::clamp < rc_type , _vsize > ( lower , upper , lower , upper ) ; return build_safe_ev < level - 1 , spline_type , rc_type , _vsize , decltype ( gt ) , gate_types ... >() ( bspl , gt , gates ... ) ; break ; } } } } ; /// at level -1, there are no more axes to deal with, here the recursion /// ends and the actual mapper object is created. Specializing on the /// spline's degree (0, 1, or indeterminate), an evaluator is created /// and chained to the mapper object. The resulting functor is grokked /// to produce a uniform return type, which is returned to the caller. template < typename spline_type , typename rc_type , int _vsize , class ... gate_types > struct build_safe_ev < -1 , spline_type , rc_type , _vsize , gate_types ... 
> { vspline::grok_type < bspl_coordinate_type < spline_type , rc_type > , bspl_value_type < spline_type > , _vsize > operator() ( const spline_type & bspl , gate_types ... gates ) { typedef bspl_coordinate_type < spline_type , rc_type > crd_t ; typedef bspl_value_type < spline_type > value_t ; if ( bspl.spline_degree == 0 ) return vspline::grok ( vspline::mapper < crd_t , _vsize , gate_types... > ( gates ... ) + vspline::evaluator < crd_t , value_t , _vsize , 0 > ( bspl ) ) ; else if ( bspl.spline_degree == 1 ) return vspline::grok ( vspline::mapper < crd_t , _vsize , gate_types... > ( gates ... ) + vspline::evaluator < crd_t , value_t , _vsize , 1 > ( bspl ) ) ; else return vspline::grok ( vspline::mapper < crd_t , _vsize , gate_types... > ( gates ... ) + vspline::evaluator < crd_t , value_t , _vsize , -1 > ( bspl ) ) ; } } ; } ; // namespace detail /// make_evaluator is a factory function, producing a functor /// which provides access to an evaluator object. Evaluation /// using the resulting object is *not* intrinsically safe, /// it's the user's responsibility not to pass coordinates /// which are outside the spline's defined range. If you need /// safe access, see 'make_safe_evaluator' below. Not safe in /// this context means that evaluation at out-of-bounds /// locations will result in a memory fault or produce /// undefined results. Note that vspline's bspline objects /// are set up as to allow evaluation at the lower and upper /// limit of the spline and will also tolerate values 'just /// outside' the bounds to guard against quantization errors. /// see vspline::bspline for details. /// /// The evaluator will be specialized to the spline's degree: /// degree 0 splines will be evaluated with nearest neighbour, /// degree 1 splines with linear interpolation, all other splines /// will use general b-spline evaluation. /// /// This function returns the evaluator wrapped in an object which /// hides it's type. This object only 'knows' what coordinates it /// can take and what values it will produce. The extra level of /// indirection costs a bit of performance, but having a common type /// simplifies handling. The wrapped evaluator also provides operator(). /// /// So, if you have a vspline::bspline object 'bspl', you can use this /// factory function like this: /// /// auto ev = make_evaluator ( bspl ) ; /// typedef typename decltype(ev)::in_type coordinate_type ; /// coordinate_type c ; /// auto result = ev ( c ) ; /// /// make_evaluator requires one template argument: spline_type, the /// type of the vspline::bspline object you want to have evaluated. /// Optionally, you can specify the elementary type for coordinates /// (use either float or double) and the vectorization width. The /// latter will only have an effect if Vc is used (-DUSE_VC) and /// the spline's data type can be vectorized. Otherwise evaluation /// will 'collapse' to using unvectorized code. Per default, the /// vectorization width will be inferred from the spline's value_type /// by querying vspline::vector_traits, which tries to provide a /// 'good' choice. Note that a specific evaluator will only be capable /// of processing vectorized coordinates of precisely the _vsize it /// has been created with. template < class spline_type , typename rc_type = float , // watch out, default not very precise! 
int _vsize = vspline::vector_traits < typename spline_type::value_type > :: size > vspline::grok_type < bspl_coordinate_type < spline_type , rc_type > , typename spline_type::value_type , _vsize > make_evaluator ( const spline_type & bspl ) { // first we figure out if the spline's data type can be vectorized. // If so, we'll accept any _vsize template argument, but if not, we // only accept _vsize == 1. If there is a mismatch, we fail a // static_assert to provide a reasonably meaningful message. typedef typename spline_type::value_type value_type ; typedef typename vigra::ExpandElementResult < value_type > :: type ele_type ; enum { vsize = is_vectorizable < ele_type > :: value ? _vsize : 1 } ; static_assert ( vsize == _vsize , "only _vsize 1 acceptable: can't provide vectorization for the spline's data type" ) ; // if all is well, we build the evaluator return detail::build_ev < spline_type , rc_type , _vsize >() ( bspl ) ; } /// make_safe_evaluator is a factory function, producing a functor /// which provides safe access to an evaluator object. This functor /// will map incoming coordinates into the spline's defined range, /// as given by the spline with it's lower_limit and upper_limit /// functions, honoring the bspline objects's boundary conditions. /// So if, for example, the spline is periodic, all incoming /// coordinates are valid and will be mapped to the first period. /// Note the use of lower_limit and upper_limit. These values /// also depend on the spline's boundary conditions, please see /// class vspline::bspline for details. If there is no way to /// meaningfully fold the coordinate into the defined range, the /// coordinate is clamped to the nearest limit. /// /// The evaluator will be specialized to the spline's degree: /// degree 0 splines will be evaluated with nearest neighbour, /// degree 1 splines with linear interpolation, all other splines /// will use general b-spline evaluation. /// /// This function returns the functor wrapped in an object which /// hides it's type. This object only 'knows' what coordinates it /// can take and what values it will produce. The extra level of /// indirection costs a bit of performance, but having a common type /// simplifies handling: the type returned by this function only /// depends on the spline's data type, the coordinate type and, /// optionally, the vectorization width. template < class spline_type , typename rc_type = float , // watch out, default not very precise! int _vsize = vspline::vector_traits < typename spline_type::value_type > :: size > vspline::grok_type < bspl_coordinate_type < spline_type , rc_type > , typename spline_type::value_type , _vsize > make_safe_evaluator ( const spline_type & bspl ) { // first we figure out if the spline's data type can be vectorized. // If so, we'll accept any _vsize template argument, but if not, we // only accept _vsize == 1. If there is a mismatch, we fail a // static_assert to provide a reasonably meaningful message. typedef typename spline_type::value_type value_type ; typedef typename vigra::ExpandElementResult < value_type > :: type ele_type ; enum { vsize = is_vectorizable < ele_type > :: value ? 
_vsize : 1 } ; static_assert ( vsize == _vsize , "only _vsize 1 acceptable: can't provide vectorization for the spline's data type" ) ; // if all is well, we build the evaluator return detail::build_safe_ev < spline_type::dimension - 1 , spline_type , rc_type , _vsize >() ( bspl ) ; } ; } ; // end of namespace vspline #endif // VSPLINE_EVAL_H kfj-vspline-6e66cf7a7926/example/000077500000000000000000000000001320375670700165305ustar00rootroot00000000000000kfj-vspline-6e66cf7a7926/example/ca_correct.cc000066400000000000000000000467611320375670700211610ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /// ca_correct.cc /// /// Perform correction of chromatic aberration using a cubic polynomial /// This uses panotools-compatible parameters. Currently only processes /// 8bit RGB, processing is done on the sRGB data without conversion /// to linear and back, iamges with alpha channel won't compute. /// To see how the panotools lens correction model functions, /// please refer to https://wiki.panotools.org/Lens_correction_model /// /// compile with: /// clang++ -std=c++11 -march=native -o ca_correct -O3 -pthread -DUSE_VC ca_correct.cc -lvigraimpex -lVc /// /// you can also use g++ instead of clang++. If you don't have Vc on /// your system, omit '-DUSE_VC' to obtain a working program which does /// not use vectorization - it will produce the same result, only slower. /// /// invoke with ca_correct ar br cr dr ag bg cg dg ab bb cb db d e /// where 'ar' stands for 'parameter a for red channel' etc., and the /// trailing d and e are center shift in x and y direction in pixels. 
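/// As an illustration (the numbers are made up and 'image.tif' is just a
/// placeholder), a call might look like
///
///   ./ca_correct image.tif  0 0 0 1.0002  0 0 0 1  0 0 0 0.9998  0 0
///
/// which applies a small purely linear rescaling (only the 'd' coefficients
/// differ from 1) to the red and blue channels, leaves green untouched and
/// keeps the lens center at the image center.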
/// /// The purpose here is more to demonstrate how to implement the maths /// using vspline, but the program could easily be fleshed out to /// become really usable: /// /// -add alpha channel treatment /// -add processing of 16bit data /// -add colour space management and internal linear processing // TODO: handle incoming alpha channel // TODO: differentiate between 8/16 bit // TODO: operate in linear RGB // TODO: may produce transparent output where coordinate is out-of-range // TODO might allow to parametrize normalization (currently using PT's way) // TODO: while emulating the panotools way of ca correction is nice // to have, try implementing shift-curve based correction: with // pixel's radial distance r to origin (normalized to [0,1]) // access a 1D b-spline over M data points, so that // res = ev ( r * ( M - 1 ) ) // and picking up from source at res intead of r. // the coefficients of the shift curve can be generated by sampling the // 'normal' method or any other model, it's methodically neutral, within // the fidelity of the spline to the approximated signal, which can be // taken arbitrarily high. // TODO: consider two more shift curves Cx and Cy defined over [-1,1] // then, if the incoming coordinate is(x,y), let // x' = Cx ( x ) and y' = Cy ( y ). This would introduce a tensor- // based component, usable for sensor tilt compensation (check!) // pulls in all of vspline's functionality #include // in addition to and , // which are necessarily included by vspline, we want to use vigra's // import/export facilites to load and store images: #include #include #include // we'll be working with float data, so we set up a program-wide // constant for the vector width appropriate for float data const int VSIZE = vspline::vector_traits < float > :: size ; // we silently assume we have a colour image typedef vigra::RGBValue pixel_type; // coordinate_type is a 2D coordinate typedef vigra::TinyVector < float , 2 > coordinate_type ; // type of b-spline object used as interpolator typedef vspline::bspline < pixel_type , 2 > spline_type ; // target_type is a 2D array of pixels typedef vigra::MultiArray < 2 , pixel_type > target_type ; // type of b-spline evaluator producing single floats typedef vspline::evaluator < coordinate_type , // incoming coordinate's type float // processing single channel data > ev_type ; // gate type to force singular coordinates to a range // we are using mirror boundary conditions, which may not be a good // choice for 'real' work. typedef vspline::mirror_gate < float > gate_type ; // mapper uses two gates, for x and y component typedef vspline::map_functor < coordinate_type , VSIZE , gate_type , gate_type > mapper_type ; // we start out by coding the functor implementing // the coordinate transformation for a single channel. // we inherit from vspline::unary_functor so that our coordinate // transformation fits well into vspline's functional processing scheme struct ev_radial_correction : public vspline::unary_functor < float , float > { // incoming coordinates are shifted by dx and dy. These values are expected // in image coordinates. The shift values should be so that a pixel which // is located at the intersection of the sensor with the optical axis comes // out as (0,0). If the optical system is perfectly centered, dx and dy will // be the coordinates of the image center, so if the image is size X * Y, // dx = ( X - 1 ) / 2 // dy = ( Y - 1 ) / 2 const float dx , dy ; // next we have a scaling factor. 
Once the coordinates are shifted to // coordinates relative to the optical axis, we apply a scaling factor // which 'normalizes' the pixel's distance from the optical axis. // A typical scaling factor would be the distance of the image center // from the top left corner at (-0.5,-0.5). With dx and dy as above // dxc = dx - -0.5 ; // dyc = dy - -0.5 ; // scale = 1 / sqrt ( dx * dx + dy * dy ) ; // Here we use a different choice to be compatible with panotools: // we use the vertical distance from image center to top/bottom // margin. // Since we'll be using a polynomial over the radial distance, picking // scale values larger or smaller than this 'typical' value can be used // to affect the precise effect of the radial function. // rscale is simply the reciprocal value for faster computation. const float scale ; const float rscale ; // After applying the scale, we have a normalized coordinate. The functor will // use this normalized coordinate to calculate the normalized distance from // the optical axis. The resulting distance is the argument to the radial // correction function. For the radial correction function, we use a cubic // polynomial, which needs four coefficients: const float a , b , c , d ; // finally we have the PT d and e values, which we label x_shift and y_shift // to avoid confusion with the fourth coefficient of the polynomial. const float x_shift , y_shift ; // we use two static functions to concisely initialize some of the // constant values above static double d_from_extent ( double d ) { return ( d - 1.0 ) / 2.0 ; } static double rscale_from_wh ( double w , double h ) { double dx = d_from_extent ( w ) ; double dy = d_from_extent ( h ) ; // I'd normalize to the corner, but to be compatible with panotools, // I use normalization to top margin center instead. // return sqrt ( dx * dx + dy * dy ) ; return sqrt ( dy * dy ) ; } // her's the constructor for the radial correction functor, taking all // the values passed from main() and initalizing the constants ev_radial_correction ( const double & _width , const double & _height , const double & _x_shift , const double & _y_shift , const double & _a , const double & _b , const double & _c , const double & _d ) : dx ( d_from_extent ( _width ) ) , dy ( d_from_extent ( _height ) ) , x_shift ( _x_shift ) , y_shift ( _y_shift ) , rscale ( rscale_from_wh ( _width , _height ) ) , scale ( 1.0 / rscale_from_wh ( _width , _height ) ) , a ( _a ) , b ( _b ) , c ( _c ) , d ( _d ) { // we echo the internal state std::cout << "dx: " << dx << std::endl ; std::cout << "dy: " << dy << std::endl ; std::cout << "scale: " << scale << std::endl ; std::cout << "rscale: " << rscale << std::endl ; std::cout << "a: " << a << std::endl ; std::cout << "b: " << b << std::endl ; std::cout << "c: " << c << std::endl ; std::cout << "d: " << d << std::endl ; } ; // now we provide evaluation code for the functor. // since the code is the same for vectorized and unvectorized // operation, we can write a template, In words: // eval is a function template with the coordinate type as it's // template argument. eval receives it's argument as a const // reference to a coordinate and deposits it's result to a reference // to a coordinate. This function will not change the state of the // functor (hence the const) - the functor does not have mutable state // anyway. Note how CRD can be a single coordinate_type, or it's // vectorized equivalent. 
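  // In formula form (just a restatement of the eval code below, with 'center'
  // standing for (dx + x_shift, dy + y_shift)): for a target coordinate (x,y)
  // the functor computes
  //
  //   (cx,cy) = ( (x,y) - center ) * scale       // centered, normalized
  //   r       = sqrt ( cx*cx + cy*cy )           // radial distance
  //   rr      = a*r^3 + b*r^2 + c*r + d          // radial scaling factor
  //   (x',y') = (cx,cy) * rr * rscale + center   // back to image coordinates
  //
  // and (x',y') is where the corrected channel is picked up in the source.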
template < class CRD > void eval ( const CRD & in , CRD & result ) const { // set up coordinate-type variable to work on, copy input to it. CRD cc ( in ) ; // shift and scale // TODO: is it right to add the shift here, or should I subtract cc[0] -= ( dx + x_shift ) ; cc[0] *= scale ; cc[1] -= ( dy + y_shift ) ; cc[1] *= scale ; // calculate distance from center (this is normalized due to scaled cc) auto r = sqrt ( cc[0] * cc[0] + cc[1] * cc[1] ) ; // apply polynomial to obtain the scaling factor. auto rr = a * r * r * r + b * r * r + c * r + d ; // use rr to scale cc - this is the radial correction cc[0] *= rr ; cc[1] *= rr ; // apply rscale to revert to centered image coordinates cc[0] *= rscale ; cc[1] *= rscale ; // reverse initial shift to arrive at UL-based image coordinates cc[0] += ( dx + x_shift ) ; cc[1] += ( dy + y_shift ) ; // assign to result result = cc ; } } ; // next we set up the functor processing all three channels. This functor // receives three ev_radial_correction functors and three channel views: struct ev_ca_correct : public vspline::unary_functor < coordinate_type , pixel_type > { // these three functors hold the radial corrections for the three // colour channels ev_radial_correction rc_red ; ev_radial_correction rc_green ; ev_radial_correction rc_blue ; // these three functors hold interpolators for the colour channels ev_type ev_red ; ev_type ev_green ; ev_type ev_blue ; // and this object deals with out-of-bounds coordinates mapper_type m ; // the constructor receives all the functors we'll use. Note how we // can simply copy-construct the functors. ev_ca_correct ( const ev_radial_correction & _rc_red , const ev_radial_correction & _rc_green , const ev_radial_correction & _rc_blue , const ev_type & _ev_red , const ev_type & _ev_green , const ev_type & _ev_blue , const mapper_type & _m ) : rc_red ( _rc_red ) , rc_green ( _rc_green ) , rc_blue ( _rc_blue ) , ev_red ( _ev_red ) , ev_green ( _ev_green ) , ev_blue ( _ev_blue ) , m ( _m ) { } ; // the eval routine is simple, it simply applies the coordinate // transformation, applies the mapper to force the transformed // coordinate into the range, an then picks the interpolated value // using the interpolator for the channel. This is done for all // channels in turn. // since the code is the same for vectorized and unvectorized // operation, we can again write a template: template < class IN , class OUT > void eval ( const IN & c , OUT & result ) const { // work variable containg a (possibly vectorized) 2D coordinate IN cc ; // apply the radial correction to the incoming coordinate in c, // storing result to cc. 
Note that c contains the 'target' coordinate:
    // The coordinate of the pixel in the target which we want to compute
    rc_red.eval ( c , cc ) ;
    // force coordinate into the defined range (here we use mirroring)
    m.eval ( cc , cc ) ;
    // evaluate channel view at corrected coordinate, storing result
    // to the red channel of 'result'
    ev_red.eval ( cc , result[0] ) ;
    // ditto, for the remaining channels
    rc_green.eval ( c , cc ) ;
    m.eval ( cc , cc ) ;
    ev_green.eval ( cc , result[1] ) ;
    rc_blue.eval ( c , cc ) ;
    m.eval ( cc , cc ) ;
    ev_blue.eval ( cc , result[2] ) ;
  }
} ;

int main ( int argc , char * argv[] )
{
  // we read argv[1] (the image file) and argv[2] ... argv[15] (the fourteen
  // correction parameters), so we need at least 16 arguments
  if ( argc < 16 )
  {
    std::cerr << "pass a colour image file as first argument" << std::endl ;
    std::cerr << "followed by a, b, c, d for red, green, blue" << std::endl ;
    std::cerr << "and the horizontal and vertical center shift" << std::endl ;
    std::cerr << "like ca_correct <image> 0 0 0 .001 0 0 0 0 0 0 0 -.001 100 50" << std::endl ;
    exit( -1 ) ;
  }
  double ar = atof ( argv[2] ) ;
  double br = atof ( argv[3] ) ;
  double cr = atof ( argv[4] ) ;
  double dr = atof ( argv[5] ) ;
  double ag = atof ( argv[6] ) ;
  double bg = atof ( argv[7] ) ;
  double cg = atof ( argv[8] ) ;
  double dg = atof ( argv[9] ) ;
  double ab = atof ( argv[10] ) ;
  double bb = atof ( argv[11] ) ;
  double cb = atof ( argv[12] ) ;
  double db = atof ( argv[13] ) ;
  double x_shift = atof ( argv[14] ) ; // aka panotools 'd'
  double y_shift = atof ( argv[15] ) ; // aka panotools 'e'

  // get the image file name
  vigra::ImageImportInfo imageInfo ( argv[1] ) ;

  // we want a b-spline with natural boundary conditions
  vigra::TinyVector < vspline::bc_code , 2 > bcv ( vspline::NATURAL ) ;

  // create quintic 2D b-spline object containing the image data
  // TODO allow passing in arbitrary spline order
  spline_type bspl ( imageInfo.shape() , // the shape of the data for the spline
                     5 ,                 // degree 5 == quintic spline
                     bcv                 // specifies natural BCs along both axes
                   ) ;

  // load the image data into the b-spline's core. This is a common idiom:
  // the spline's 'core' is a MultiArrayView to that part of the spline's
  // data container which precisely corresponds with the input data.
  // This saves loading the image to some memory first and then transferring
  // the data into the spline. Since the core is a vigra::MultiArrayView,
  // we can pass it to importImage as the desired target for loading the
  // image from disk.
  std::cout << "reading image " << argv[1] << " from disk" << std::endl ;
  vigra::importImage ( imageInfo , bspl.core ) ;

  // prefilter the b-spline
  std::cout << "setting up b-spline interpolator for image data" << std::endl ;
  bspl.prefilter() ;

  // this is where the result should go:
  target_type target ( imageInfo.shape() ) ;

  // process the image metrics
  float width = imageInfo.width() ;
  float height = imageInfo.height() ;

  // set up the radial transformation functors
  std::cout << "setting up radial correction for red channel:" << std::endl ;
  ev_radial_correction ca_red ( width , height , x_shift , y_shift , ar , br , cr , dr ) ;
  std::cout << "setting up radial correction for green channel:" << std::endl ;
  ev_radial_correction ca_green ( width , height , x_shift , y_shift , ag , bg , cg , dg ) ;
  std::cout << "setting up radial correction for blue channel:" << std::endl ;
  ev_radial_correction ca_blue ( width , height , x_shift , y_shift , ab , bb , cb , db ) ;

  // here we create the channel views.
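  // A channel view behaves like a single-channel bspline object over the
  // same coefficient memory, so no data are copied; this lets us build one
  // float-valued evaluator per colour channel (see channels.cc, which
  // asserts that evaluating a channel view yields the same result as
  // evaluating the 'mother' spline and picking the corresponding channel).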
auto red_channel = bspl.get_channel_view ( 0 ) ; auto green_channel = bspl.get_channel_view ( 1 ) ; auto blue_channel = bspl.get_channel_view ( 2 ) ; // and set up the per-channel interpolators ev_type red_ev ( red_channel ) ; ev_type green_ev ( green_channel ) ; ev_type blue_ev ( blue_channel ) ; // next we set up coordinate mapping to the defined range gate_type g_x ( 0.0 , width - 1.0 ) ; gate_type g_y ( 0.0 , height - 1.0 ) ; mapper_type m ( g_x , g_y ) ; // using vspline's factory functions to create the 'gates' and the // 'mapper' applying them, we could instead create m like this: // (Note how we have to be explicit about using 'float' // for the arguments to the gates - using double arguments would not // work here unless we'd also specify the vector width.) // auto m = vspline::mapper < coordinate_type > // ( vspline::mirror ( 0.0f , width - 1.0f ) , // vspline::mirror ( 0.0f , height - 1.0f ) ) ; // finally, we create the top-level functor, passing in the three // radial correction functors, the channel-wise evaluators and the // mapper object ev_ca_correct correct ( ca_red , ca_green , ca_blue , red_ev , green_ev , blue_ev , m ) ; // now we obtain the result by performing a vspline::transform. this transform // successively passes discrete coordinates into the target to the functor // it's invoked with, storing the result of the functor's evaluation // at the self-same coordinates in it's target, so for each coordinate // (X,Y), target[(X,Y)] = correct(X,Y) std::cout << "rendering the target image" << std::endl ; vspline::transform ( correct , target ) ; // store the result with vigra impex std::cout << "storing the target image as 'ca_correct.tif'" << std::endl ; vigra::ImageExportInfo eximageInfo ( "ca_correct.tif" ); vigra::exportImage ( target , eximageInfo .setPixelType("UINT8") .setForcedRangeMapping ( 0 , 255 , 0 , 255 ) ) ; std::cout << "done" << std::endl ; } kfj-vspline-6e66cf7a7926/example/channels.cc000066400000000000000000000142041320375670700206330ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. 
*/ /* */ /************************************************************************/ /// \file channels.cc /// /// \brief demonstrates the use of 'channel views' /// /// This example is derived from 'slice.cc', we use the same volume /// as source data. But instead of producing an image output, we create /// three separate colour channels of the bspline object and assert that /// the evaluation of the channel views is identical with the evaluation /// of the 'mother' spline. /// For a more involved example using channel views, see ca_correct.cc /// /// compile with: /// clang++ -std=c++11 -march=native -o channels -O3 -pthread -DUSE_VC channels.cc -lvigraimpex -lVc /// g++ also works, but only up to -O1. #include #include #include #include int main ( int argc , char * argv[] ) { // pixel_type is the result type, an RGB float pixel typedef vigra::TinyVector < float , 3 > pixel_type ; // voxel_type is the source data type typedef vigra::TinyVector < float , 3 > voxel_type ; // coordinate_type has a 3D coordinate typedef vigra::TinyVector < float , 3 > coordinate_type ; // warp_type is a 2D array of coordinates typedef vigra::MultiArray < 2 , coordinate_type > warp_type ; // target_type is a 2D array of pixels typedef vigra::MultiArray < 2 , pixel_type > target_type ; // we want a b-spline with natural boundary conditions vigra::TinyVector < vspline::bc_code , 3 > bcv ( vspline::NATURAL ) ; // create quintic 3D b-spline object containing voxels vspline::bspline < voxel_type , 3 > space ( vigra::Shape3 ( 10 , 10 , 10 ) , 5 , bcv ) ; // here we create the channel view. Since these are merely views // to the same data, no data will be copied, and it doesn't matter // whether we create these views before or after prefiltering. auto red_channel = space.get_channel_view ( 0 ) ; auto green_channel = space.get_channel_view ( 1 ) ; auto blue_channel = space.get_channel_view ( 2 ) ; // fill the b-spline's core with a three-way gradient for ( int z = 0 ; z < 10 ; z++ ) { for ( int y = 0 ; y < 10 ; y++ ) { for ( int x = 0 ; x < 10 ; x++ ) { voxel_type & c ( space.core [ vigra::Shape3 ( x , y , z ) ] ) ; c[0] = 25.5 * x ; c[1] = 25.5 * y ; c[2] = 25.5 * z ; } } } // prefilter the b-spline space.prefilter() ; // now make a warp array with 1920X1080 3D coordinates warp_type warp ( vigra::Shape2 ( 1920 , 1080 ) ) ; // we want the coordinates to follow this scheme: // warp(x,y) = (x,1-x,y) // scaled appropriately for ( int y = 0 ; y < 1080 ; y++ ) { for ( int x = 0 ; x < 1920 ; x++ ) { coordinate_type & c ( warp [ vigra::Shape2 ( x , y ) ] ) ; c[0] = float ( x ) / 192.0 ; c[1] = 10.0 - c[0] ; c[2] = float ( y ) / 108.0 ; } } // get an evaluator for the b-spline typedef vspline::evaluator < coordinate_type , voxel_type > ev_type ; ev_type _ev ( space ) ; auto ev = vspline::callable ( _ev ) ; // the evaluators of the channel views have their own type: typedef vspline::evaluator < coordinate_type , float > ch_ev_type ; // we create the three evaluators for the three channel views ch_ev_type _red_ev ( red_channel ) ; ch_ev_type _green_ev ( green_channel ) ; ch_ev_type _blue_ev ( blue_channel ) ; auto red_ev = vspline::callable ( _red_ev ) ; auto green_ev = vspline::callable ( _green_ev ) ; auto blue_ev = vspline::callable ( _blue_ev ) ; // and make sure the evaluation results match for ( int y = 0 ; y < 1080 ; y++ ) { for ( int x = 0 ; x < 1920 ; x++ ) { coordinate_type & c ( warp [ vigra::Shape2 ( x , y ) ] ) ; // TODO: with g++ and optimization better than -O1 the assertions fail assert ( ev ( c ) [ 0 ] == 
red_ev ( c ) ) ; assert ( ev ( c ) [ 1 ] == green_ev ( c ) ) ; assert ( ev ( c ) [ 2 ] == blue_ev ( c ) ) ; } } std::cout << "success" << std::endl ; exit ( 0 ) ; } kfj-vspline-6e66cf7a7926/example/complex.cc000066400000000000000000000120221320375670700205030ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /// \file complex.cc /// /// \brief demonstrate use of b-spline over std::complex data /// /// vspline handles std::complex data like pairs of the complex /// type's value_type, and uses a vigra::TinyVector of two /// simdized value_types as the vectorized type. /// /// compile: clang++ -std=c++11 -march=native -o complex -O3 -pthread -DUSE_VC complex.cc -lvigraimpex -lVc #include #include #include #include #include int main ( int argc , char * argv[] ) { // nicely formatted output std::cout << std::fixed << std::showpoint << std::showpos << std::setprecision(6) ; // create default b-spline over 100 values vspline::bspline < std::complex < float > , 1 > bsp ( 100 ) ; // get a vigra::MultiArrayView to the spline's 'core'. This is the // area corresponding with the original data and is filled with zero. auto v1 = bsp.core ; // we only set one single value in the middle of the array // g++ does not like this: // v1 [ 50 ] = std::complex ( 3.3 + 2.2i ) ; v1 [ 50 ] = std::complex ( 3.3 , 2.2 ) ; // now we convert the original data to b-spline coefficients // by calling prefilter() bsp.prefilter() ; // and create an evaluator over the spline. 
Here we pass three // template arguments to the evaluator's type declaration: // - float for the 'icoming type' (coordinates) // - std::complex for the 'outgoing type' (values) // - 4 for the vector width, just for this example typedef vspline::evaluator < float , std::complex , 4 > ev_type ; // for convenience, we wrap the evalator in a vspline::callable // object, which we can call like a function auto ev = vspline::callable ( ev_type ( bsp ) ) ; // now we evaluate the spline in the region around the single // nonzero value in the input data and print argument and result for ( int k = -12 ; k < 13 ; k++ ) { std::cout << "ev(" << 50.0 + k * 0.1 << ") = " << ev ( 50.0 + k * 0.1 ) << std::endl ; } #ifdef USE_VC // if this example was built with Vc support, repeat the example // evaluation with vector data for ( int k = -12 ; k < 13 ; k++ ) { // feed the evaluator with vectors. Note how we obtain the // type of vectorized data the evaluator will accept from // the evaluator, by querying for it's 'in_v' type. Here, // the vectorized input type is a Vc::SimdArray of four floats. // The result appears as a vigra::TinyVector of two // Vc::SimdArrays of four floats, the vectorized type which // vspline deems appropriate for complex data: one SimdArray // for the real parts, one SimdArray for the imaginary parts. typename ev_type::in_v vk ( 50.0 + k * 0.1 ) ; std::cout << "ev(" << vk << ") = " << ev(vk) << std::endl ; } #endif } kfj-vspline-6e66cf7a7926/example/eval.cc000066400000000000000000000134171320375670700177740ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /// \file eval.cc /// /// \brief simple demonstration of creation and evaluation of a b-spline /// /// takes a set of knot point values from cin, calculates a 1D b-spline /// over them, and evaluates it at coordinates taken from cin. /// The output shows how the coordinate is split into integral and real /// part and the result of evaluating the spline at this point. /// Note how the coordinate is not checked for validity. 
when accessing /// the spline outside the defined range, the coefficient array will /// receive an out-of-bounds access, which may result in a seg fault. /// eval warns you when receiving invalid arguments, but proceeds regardless. /// To see how out-of-range incoming coordinates can be handled to avoid /// out-of-bounds access to the coefficients, see use_map.cc /// /// compile: clang++ -std=c++11 -o eval -pthread eval.cc #include #include using namespace std ; using namespace vigra ; using namespace vspline ; int main ( int argc , char * argv[] ) { // get the spline degree and boundary conditions from the console cout << "enter spline degree: " ; int spline_degree ; cin >> spline_degree ; int bci = -1 ; bc_code bc ; while ( bci < 1 || bci > 4 ) { cout << "choose boundary condition" << endl ; cout << "1) MIRROR" << endl ; cout << "2) PERIODIC" << endl ; cout << "3) REFLECT" << endl ; cout << "4) NATURAL" << endl ; cin >> bci ; } switch ( bci ) { case 1 : bc = MIRROR ; break ; case 2 : bc = PERIODIC ; break ; case 3 : bc = REFLECT ; break ; case 4 : bc = NATURAL ; break ; } // put the BC code into a TinyVector TinyVector < bc_code , 1 > bcv ( bc ) ; TinyVector < int , 1 > deriv_spec ( 0 ) ; // obtain knot point values double v ; std::vector dv ; cout << "enter knot point values (end with EOF)" << endl ; while ( cin >> v ) dv.push_back ( v ) ; cin.clear() ; // put the size into a TinyVector TinyVector < int , 1 > shape ( dv.size() ) ; // fix the type for the bspline object typedef bspline < double , 1 > spline_type ; spline_type bsp ( shape , spline_degree , bcv ) ; // , EXPLICIT ) ; cout << "created bspline object:" << endl << bsp << endl ; // fill the data into the spline's 'core' area for ( size_t i = 0 ; i < dv.size() ; i++ ) bsp.core[i] = dv[i] ; // prefilter the data bsp.prefilter() ; cout << fixed << showpoint << setprecision(12) ; cout << "spline coefficients (with frame)" << endl ; for ( auto& coeff : bsp.container ) cout << " " << coeff << endl ; auto ev = vspline::make_safe_evaluator < spline_type , double > ( bsp ) ; int ic ; double rc ; double res ; cout << "enter coordinates to evaluate (end with EOF)" << endl ; while ( ! cin.eof() ) { // get a coordinate cin >> v ; if ( v < bsp.lower_limit(0) || v > bsp.upper_limit(0) ) { std::cout << "warning: " << v << " is outside the spline's defined range." << std::endl ; std::cout << "using automatic folding to process this value." << std::endl ; } // evaluate the spline at this location ev.eval ( v , res ) ; cout << v << " -> " << res << endl ; #ifdef USE_VC auto vev = vspline::make_safe_evaluator < spline_type , double , 4 > ( bsp ) ; typedef typename decltype(vev)::in_ele_v vtype ; vtype vv(v) ; vtype vres ; vev.eval ( vv , vres ) ; cout << vv << " -> " << vres << endl ; #endif } } kfj-vspline-6e66cf7a7926/example/examples.sh000077500000000000000000000012251320375670700207050ustar00rootroot00000000000000#! 
/bin/bash # compile all examples for f in $@ do body=$(basename $f .cc) common_flags="-Ofast -std=c++11 -march=native -pthread" echo compiling $body with: echo g++ $common_flags -Wno-abi -o$body $f -lvigraimpex g++ $common_flags -Wno-abi -o$body $f -lvigraimpex echo g++ -DUSE_VC $common_flags -Wno-abi -o$body $f -lVc -lvigraimpex g++ -DUSE_VC $common_flags -Wno-abi -o$body $f -lVc -lvigraimpex echo clang++ $common_flags -o$body $f -lvigraimpex clang++ $common_flags -o$body $f -lvigraimpex echo clang++ -DUSE_VC $common_flags -o$body $f -lVc -lvigraimpex clang++ -DUSE_VC $common_flags -o$body $f -lVc -lvigraimpex done kfj-vspline-6e66cf7a7926/example/gradient.cc000066400000000000000000000143241320375670700206400ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /// \file gradient.cc /// /// \brief evaluating a specific spline, derivatives, precision /// /// If we create a b-spline over an array containing, at each grid point, /// the sum of the grid point's coordinates, each 1D row, column, etc will /// hold a linear gradient with first derivative == 1. If we use NATURAL /// BCs, evaluating the spline with real coordinates anywhere inside the /// defined range should produce precisely the sum of the coordinates. /// This is a good test for both the precision of the evaluation and it's /// correct functioning, particularly with higher-D arrays. 
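/// As a worked example of the property being tested (values chosen freely):
/// with knot point values v(i,j,k) = i + j + k and NATURAL boundary
/// conditions, evaluating the spline at the real coordinate (3.7, 10.2, 5.5)
/// should yield 3.7 + 10.2 + 5.5 = 19.4 up to numerical precision, and the
/// first derivative along each axis should come out as 1.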
/// /// compile: clang++ -O3 -DUSE_VC -march=native -std=c++11 -pthread -o gradient gradient.cc -lVc #include #include using namespace std ; int main ( int argc , char * argv[] ) { typedef vspline::bspline < double , 3 > spline_type ; typedef typename spline_type::shape_type shape_type ; typedef typename spline_type::view_type view_type ; typedef typename spline_type::bcv_type bcv_type ; // let's have a knot point array with nicely odd shape shape_type core_shape = { 35 , 43 , 19 } ; // we have to use a longish call to the constructor since we want to pass // 0.0 to 'tolerance' and it's way down in the argument list, so we have to // explicitly pass a few arguments which usually take default values before // we have a chance to pass the tolerance spline_type bspl ( core_shape , // shape of knot point array 3 , // cubic b-spline bcv_type ( vspline::NATURAL ) , // natural boundary conditions vspline::BRACED , // implicit scheme, bracing coeffs -1 , // horizon (unused w. BRACED) 0.0 ) ; // no tolerance // get a view to the bspline's core, to fill it with data view_type core = bspl.core ; // create the gradient in each dimension for ( int d = 0 ; d < bspl.dimension ; d++ ) { for ( int c = 0 ; c < core_shape[d] ; c++ ) core.bindAt ( d , c ) += c ; } // now prefilter the spline bspl.prefilter() ; // set up the evaluator type typedef vigra::TinyVector < double , 3 > coordinate_type ; typedef vspline::evaluator < coordinate_type , double > evaluator_type ; // we also want to verify the derivative along each axis typedef typename evaluator_type::derivative_spec_type deriv_t ; deriv_t dsx , dsy , dsz ; dsx[0] = 1 ; // first derivative along axis 0 dsy[1] = 1 ; // first derivative along axis 1 dsz[2] = 1 ; // first derivative along axis 2 // set up the evaluator for the underived result evaluator_type ev ( bspl ) ; // and evaluators for the three first derivatives evaluator_type ev_dx ( bspl , dsx ) ; evaluator_type ev_dy ( bspl , dsy ) ; evaluator_type ev_dz ( bspl , dsz ) ; // we want to bombard the evaluator with random in-range coordinates std::random_device rd; std::mt19937 gen(rd()); // std::mt19937 gen(12345); // fix starting value for reproducibility coordinate_type c ; // here comes our test, feed 100 random 3D coordinates and compare the // evaluator's result with the expected value, which is precisely the // sum of the coordinate's components. The printout of the derivatives // is boring: it's always 1. But this assures us that the b-spline is // perfectly plane, even off the grid points. for ( int times = 0 ; times < 100 ; times++ ) { for ( int d = 0 ; d < bspl.dimension ; d++ ) c[d] = ( core_shape[d] - 1 ) * std::generate_canonical(gen) ; double result ; ev.eval ( c , result ) ; double delta = result - sum ( c ) ; cout << "eval(" << c << ") = " << result << " -> delta = " << delta << endl ; ev_dx.eval ( c , result ) ; cout << "dx: " << result ; ev_dy.eval ( c , result ) ; cout << " dy: " << result ; ev_dz.eval ( c , result ) ; cout << " dz: " << result << std::endl ; } } kfj-vspline-6e66cf7a7926/example/gradient2.cc000066400000000000000000000141351320375670700207220ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. 
Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /// gradient.cc /// /// If we create a b-spline over an array containing, at each grid point, /// the sum of the grid point's coordinates, each 1D row, column, etc will /// hold a linear gradient with first derivative == 1. If we use NATURAL /// BCs, evaluating the spline with real coordinates anywhere inside the /// defined range should produce precisely the sum of the coordinates. /// This is a good test for both the precision of the evaluation and it's /// correct functioning, particularly with higher-D arrays. /// /// In this variant of the program, we use a vspline::domain functor. /// This functor provides a handy way to access a b-spline with normalized /// coordinates: instead of evaluating coordinates in the range of /// [ 0 , core_shape - 1 ] (natural spline coordinates), we pass coordinates /// in the range of [ 0 , 1 ] to the domain object, which is chained to /// the evaluator and passes natural spline coordinates to it. /// /// compile: clang++ -O3 -DUSE_VC -march=native -std=c++11 -pthread -o gradient2 gradient2.cc -lVc #include #include using namespace std ; int main ( int argc , char * argv[] ) { typedef vspline::bspline < double , 3 > spline_type ; typedef typename spline_type::shape_type shape_type ; typedef typename spline_type::view_type view_type ; typedef typename spline_type::bcv_type bcv_type ; // let's have a knot point array with nicely odd shape shape_type core_shape = { 35 , 43 , 19 } ; // we have to use a longish call to the constructor since we want to pass // 0.0 to 'tolerance' and it's way down in the argument list, so we have to // explicitly pass a few arguments which usually take default values before // we have a chance to pass the tolerance spline_type bspl ( core_shape , // shape of knot point array 3 , // cubic b-spline bcv_type ( vspline::NATURAL ) , // natural boundary conditions vspline::BRACED , // implicit scheme, bracing coeffs -1 , // horizon (unused w. 
BRACED) 0.0 ) ; // no tolerance // get a view to the bspline's core, to fill it with data view_type core = bspl.core ; // create the gradient in each dimension for ( int d = 0 ; d < bspl.dimension ; d++ ) { for ( int c = 0 ; c < core_shape[d] ; c++ ) core.bindAt ( d , c ) += c ; } // now prefilter the spline bspl.prefilter() ; // set up an evaluator typedef vigra::TinyVector < double , 3 > coordinate_type ; typedef vspline::evaluator < coordinate_type , double > evaluator_type ; evaluator_type inner_ev ( bspl ) ; // create the domain from the bspline: auto dom = vspline::domain < coordinate_type > ( bspl ) ; // chain domain and inner evaluator auto ev = dom + inner_ev ; // we want to bombard the evaluator with random in-range coordinates std::random_device rd; std::mt19937 gen(rd()); // std::mt19937 gen(12345); // fix starting value for reproducibility coordinate_type c ; // here comes our test, feed 100 random 3D coordinates and compare the // evaluator's result with the expected value, which is precisely the // sum of the coordinate's components for ( int times = 0 ; times < 100 ; times++ ) { for ( int d = 0 ; d < bspl.dimension ; d++ ) { // note the difference to the code in gardient.cc here: // our test coordinates are in the range of [ 0 , 1 ] c[d] = std::generate_canonical(gen) ; } double result ; ev.eval ( c , result ) ; // 'result' is the same as in gradient.cc, but we have to calculate // the expected value differently. double delta = result - sum ( c * ( core_shape - 1 ) ) ; cout << "eval(" << c << ") = " << result << " -> delta = " << delta << endl ; } } kfj-vspline-6e66cf7a7926/example/grok.cc000066400000000000000000000115551320375670700200100ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. 
*/ /* */ /************************************************************************/ /// \file grok.cc /// /// \brief demonstrates use of vspline::grok_type #include #include // sin_eval implements an eval-type function typical for vspline void sin_eval ( const double & input , double & output ) { output = sin ( input ) ; } // t_sin_eval is an equivalent template, it can evaluate SIMD data // just as non-vectorized types template < typename T > void t_sin_eval ( const T & input , T & output ) { output = sin ( input ) ; } struct sin_uf : public vspline::unary_functor < double , double > { template < typename IN , typename OUT > void eval ( const IN & in , OUT & out ) const { out = sin ( in ) ; } } ; int main ( int argc , char * argv[] ) { typedef vspline::grok_type < double , double > grok_t ; // we create a first grok_type object passing sin_eval to the // constructor. sin_eval has the functor syntax typical for vspline: // it takes both it's arguments per reference an has void return type. // Note how, with Vc in use, this grok_type object will broadcast // sin_eval to process vectorized types grok_t gk1 ( sin_eval ) ; // grok_type is quite flexible and can also take 'ordinary' functions. // again, with the single-argument constructor, sin will be broadcast. grok_t gk2 ( sin ) ; // and it can take vspline::unary_functors as constructor argument as well. // since sin_uf uses a template to implement it's eval routine, it actually // uses vector code when evaluating vector data. Hence the grok_type // created from it will also use vector code. sin_uf suf ; grok_t gk3 ( suf ) ; double x = 1.0 ; std::cout << x << " -> " << gk1 ( x ) << std::endl ; std::cout << x << " -> " << gk2 ( x ) << std::endl ; std::cout << x << " -> " << gk3 ( x ) << std::endl ; // argtype will be an SIMD type if Vc is used, otherwise it will // simply be double. typedef typename grok_t::in_v argtype ; argtype xx = 1.0 ; std::cout << xx << " -> " << gk1 ( xx ) << std::endl ; std::cout << xx << " -> " << gk2 ( xx ) << std::endl ; std::cout << xx << " -> " << gk3 ( xx ) << std::endl ; #ifdef USE_VC // here we use a two-argument constructor for a grok_type. // we pass a lambda expression for the vectorized evaluation, // and the evaluation will use vector code. grok_t gk4 ( sin , [] ( const argtype & x ) { return sin(x) ; } ) ; std::cout << xx << " -> " << gk4 ( xx ) << std::endl ; // another example for using the two-argument constructor // here we specialize t_sin_eval to provide acceptable aguments grok_t gk5 ( t_sin_eval , t_sin_eval ) ; std::cout << xx << " -> " << gk5 ( xx ) << std::endl ; #endif } kfj-vspline-6e66cf7a7926/example/gsm.cc000066400000000000000000000133301320375670700176250ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. 
Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /// \file gsm.cc /// /// \brief calculating the gradient squared magnitude, derivatives /// /// implementation of gsm.cc, performing the calculation of the /// gradient squared magnitude in a loop using two evaluators for /// the two derivatives, adding the squared magnitudes and writing /// the result to an image file /// /// compile with: /// clang++ -std=c++11 -march=native -o gsm -O3 -pthread -DUSE_VC gsm.cc -lvigraimpex -lVc /// /// invoke passing a colour image file. the result will be written to 'gsm.tif' #include #include #include #include // we silently assume we have a colour image typedef vigra::RGBValue pixel_type; // coordinate_type has a 2D coordinate typedef vigra::TinyVector < float , 2 > coordinate_type ; // target_type is a 2D array of pixels typedef vigra::MultiArray < 2 , pixel_type > target_type ; // b-spline evaluator producing float pixels typedef vspline::evaluator < coordinate_type , // incoming coordinate's type pixel_type // singular result data type > ev_type ; int main ( int argc , char * argv[] ) { if ( argc < 2 ) { std::cerr << "pass a colour image file as argument" << std::endl ; exit( -1 ) ; } vigra::ImageImportInfo imageInfo ( argv[1] ) ; // we want a b-spline with natural boundary conditions vigra::TinyVector < vspline::bc_code , 2 > bcv ( vspline::NATURAL ) ; // create cubic 2D b-spline object containing the image data vspline::bspline < pixel_type , 2 > bspl ( imageInfo.shape() , 3 , bcv ) ; // load the image data into the b-spline's core vigra::importImage ( imageInfo , bspl.core ) ; // prefilter the b-spline bspl.prefilter() ; // we create two evaluators for the b-spline, one for the horizontal and // one for the vertical gradient. The derivatives for a b-spline are requested // by passing a TinyVector with as many elements as the spline's dimension // with the desired derivative degree for each dimension. 
Here we want the // first derivative in x and y direction: const vigra::TinyVector < float , 2 > dx1_spec { 1 , 0 } ; const vigra::TinyVector < float , 2 > dy1_spec { 0 , 1 } ; // we pass the derivative specifications to the two evaluators' constructors ev_type xev ( bspl , dx1_spec ) ; ev_type yev ( bspl , dy1_spec ) ; // this is where the result should go: target_type target ( imageInfo.shape() ) ; // quick-shot solution, iterating in a loop, not vectorized auto start = vigra::createCoupledIterator ( target ) ; auto end = start.getEndIterator() ; for ( auto it = start ; it < end ; ++it ) { // we fetch the discrete coordinate from the coupled iterator // and create a coordinate_type from it. Note that we can't pass // the discrete coordinate directly to the evaluator's operator() // because this fails to be disambiguated. coordinate_type crd ( it.get<0>() ) ; // now we get the two gradients by evaluating the gradient evaluators // at the given coordinate pixel_type dx , dy ; xev.eval ( crd , dx ) ; yev.eval ( crd , dy ) ; // and conclude by writing the sum of the squared gradients to target it.get<1>() = dx * dx + dy * dy ; } // store the result with vigra impex vigra::ImageExportInfo eximageInfo ( "gsm.tif" ); vigra::exportImage ( target , eximageInfo .setPixelType("UINT8") ) ; exit ( 0 ) ; } kfj-vspline-6e66cf7a7926/example/gsm2.cc000066400000000000000000000205611320375670700177130ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /// \file gsm2.cc /// /// \brief calculating the gradient squared magnitude, derivatives /// /// alternative implementation of gsm.cc, performing the calculation of the /// gradient squared magnitude with a functor and transform(), which is faster /// since the whole operation is multithreaded and potentially vectorized. /// /// compile with: /// clang++ -std=c++11 -march=native -o gsm -O3 -pthread -DUSE_VC gsm.cc -lvigraimpex -lVc /// /// invoke passing an image file. 
the result will be written to 'gsm2.tif' #include #include #include #include // we silently assume we have a colour image typedef vigra::RGBValue pixel_type; // coordinate_type has a 2D coordinate typedef vigra::TinyVector < float , 2 > coordinate_type ; // type of b-spline object typedef vspline::bspline < pixel_type , 2 > spline_type ; // target_type is a 2D array of pixels typedef vigra::MultiArray < 2 , pixel_type > target_type ; // b-spline evaluator producing float pixels typedef vspline::evaluator < coordinate_type , // incoming coordinate's type pixel_type // singular result data type > ev_type ; /// we build a vspline::unary_functor which calculates the sum of gradient squared /// magnitudes. /// Note how the 'compound evaluator' we construct follows a pattern of /// - derive from vspline::unary_functor /// - keep copies 'inner' types /// - initialize these in the constructor, yielding a 'pure' functor /// - if the vector code is identical to the unvectorized code, implement /// eval() with a template // we start out by coding the evaluation functor. // this class does the actual computations: struct ev_gsm : public vspline::unary_functor < coordinate_type , pixel_type > { // we create two evaluators for the b-spline, one for the horizontal and // one for the vertical gradient. The derivatives for a b-spline are requested // by passing a TinyVector with as many elements as the spline's dimension // with the desired derivative degree for each dimension. Here we want the // first derivatives in x and in y direction: const vigra::TinyVector < float , 2 > dx1_spec { 1 , 0 } ; const vigra::TinyVector < float , 2 > dy1_spec { 0 , 1 } ; // we keep two 'inner' evaluators, one for each direction const ev_type xev , yev ; // which are initialized in the constructor, using the bspline and the // derivative specifiers ev_gsm ( const spline_type & bspl ) : xev ( bspl , dx1_spec ) , yev ( bspl , dy1_spec ) { } ; // since the code is the same for vectorized and unvectorized // operation, we can write a template: template < class IN , class OUT > void eval ( const IN & c , OUT & result ) const { OUT dx , dy ; xev.eval ( c , dx ) ; // get the gradient in x direction yev.eval ( c , dy ) ; // get the gradient in y direction // really, we'd like to write: // result = dx * dx + dy * dy ; // but fail due to problems with type inference: for vectorized // types, both dx and dy are vigra::TinyVectors of two simdized // values, and multiplying two such objects is not defined. // It's possible to activate code performing such operations by // defining the relevant traits in namespace vigra, but for this // example we'll stick with using multiply-assigns, which are defined. dx *= dx ; // square the gradients dy *= dy ; dx += dy ; // add them up result = dx ; // assign to result } } ; int main ( int argc , char * argv[] ) { if ( argc < 2 ) { std::cerr << "pass a colour image file as argument" << std::endl ; exit( -1 ) ; } // get the image file name vigra::ImageImportInfo imageInfo ( argv[1] ) ; // we want a b-spline with natural boundary conditions vigra::TinyVector < vspline::bc_code , 2 > bcv ( vspline::NATURAL ) ; // create cubic 2D b-spline object containing the image data spline_type bspl ( imageInfo.shape() , // the shape of the data for the spline 3 , // degree 3 == cubic spline bcv // specifies natural BCs along both axes ) ; // load the image data into the b-spline's core. 
This is a common idiom: // the spline's 'core' is a MultiArrayView to that part of the spline's // data container which precisely corresponds with the input data. // This saves loading the image to some memory first and then transferring // the data into the spline. Since the core is a vigra::MultiarrayView, // we can pass it to importImage as the desired target for loading the // image from disk. vigra::importImage ( imageInfo , bspl.core ) ; // prefilter the b-spline bspl.prefilter() ; // now we construct the gsm evaluator ev_gsm ev ( bspl ) ; // this is where the result should go: target_type target ( imageInfo.shape() ) ; // now we obtain the result by performing a vspline::transform. This function // successively passes discrete coordinates into the target to the evaluator // it's invoked with, storing the result of the evaluator's evaluation // at the self-same coordinates. This is done multithreaded and vectorized // automatically, so it's very convenient, if the evaluator is at hand. // So here we have invested moderately more coding effort in the evaluator // and are rewarded with being able to use the evaluator with vspline's // high-level code for a fast implementation of our gsm problem. // The difference is quite noticeable. On my system, processing a full HD // image, I get: // $ time ./gsm image.jpg // // real 0m0.838s // user 0m0.824s // sys 0m0.012s // // $ time ./gsm2 image.jpg // // real 0m0.339s // user 0m0.460s // sys 0m0.016s vspline::transform ( ev , target ) ; // store the result with vigra impex vigra::ImageExportInfo eximageInfo ( "gsm2.tif" ); vigra::exportImage ( target , eximageInfo .setPixelType("UINT8") ) ; exit ( 0 ) ; } kfj-vspline-6e66cf7a7926/example/impulse_response.cc000066400000000000000000000130441320375670700224350ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. 
*/ /* */ /************************************************************************/ /// \file impulse_response.cc /// /// \brief get the impulse response of a b-spline prefilter /// /// filter a unit pulse with a b-spline prefilter of a given degree /// and display the central section of the result /// /// compile with: /// g++ -std=c++11 -o impulse_response -O3 -pthread -DUSE_VC=1 impulse_response.cc -lVc /// /// to get the central section with values beyond +/- 0.0042 of a degree 5 b-spline: /// /// impulse_response 5 .0042 /// /// producing this output: /// /// long double ir_5[] = { /// -0.0084918610197410 , /// +0.0197222540252632 , /// -0.0458040841925519 , /// +0.1063780046433000 , /// -0.2470419274022756 , /// +0.5733258709616592 , /// -1.3217294729875093 , /// +2.8421709220216247 , /// -1.3217294729875098 , /// +0.5733258709616593 , /// -0.2470419274022757 , /// +0.1063780046433000 , /// -0.0458040841925519 , /// +0.0197222540252632 , /// -0.0084918610197410 , /// } ; /// /// which, when used as a convolution kernel, will have the same effect on a signal /// as applying the recursive filter itself, but with lessened precision due to windowing. /// /// note how three different ways of getting the result are given, the variants /// using lower-level access to the filter are commented out. #include #include #include #include int main ( int argc , char * argv[] ) { if ( argc < 3 ) { std::cerr << "please pass spline degree and cutoff on the command line" << std::endl ; exit ( -1 ) ; } int degree = std::atoi ( argv[1] ) ; long double cutoff = std::atof ( argv[2] ) ; std::cout << "calculating impulse resonse with spline degree " << degree << " and cutoff " << cutoff << std::endl ; assert ( degree >= 0 && degree < 25 ) ; int npoles = degree / 2 ; const long double * poles = vspline_constants::precomputed_poles [ degree ] ; // using the highest-level access to prefiltering, we code: vspline::bspline < long double , 1 > bsp ( 1001 , degree , vspline::MIRROR ) ; auto v1 = bsp.core ; v1 [ 500 ] = 1.0 ; bsp.prefilter() ; // using slightly lower-level access to the prefiltering code, we could achieve // the same result with: // // typedef vigra::MultiArray < 1 , long double > array_t ; // vigra::TinyVector < vspline::bc_code , 1 > bcv ; // bcv[0] = vspline::MIRROR ; // array_t v1 ( 1001 ) ; // v1[500] = 1.0 ; // vspline::solve < array_t , array_t , long double > // ( v1 , v1 , bcv , degree , 0.000000000001 ) ; // and, going yet one level lower, this code also produces the same result: // vigra::MultiArray < 1 , long double > v1 ( 1001 ) ; // v1[500] = 1.0 ; // typedef decltype ( v1.begin() ) iter_type ; // vspline::filter < iter_type , iter_type , long double > // f ( v1.size() , // vspline::overall_gain ( npoles , poles ) , // vspline::MIRROR , // npoles , // poles , // 0.000000000001 ) ; // f.solve ( v1.begin() ) ; std::cout << "long double ir_" << degree << "[] = {" << std::endl ; std::cout << std::fixed << std::showpos << std::showpoint << std::setprecision(std::numeric_limits::max_digits10) ; for ( int k = 0 ; k < 1001 ; k++ ) { if ( std::abs ( v1[k] ) > cutoff ) { std::cout << v1[k] << "L ," << std::endl ; } } std::cout << "} ;" << std::endl ; } kfj-vspline-6e66cf7a7926/example/mandelbrot.cc000066400000000000000000000162601320375670700211730ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. 
Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /// \file mandelbrot.cc /// /// \brief calculate an image of a section of the mandelbrot set /// /// to demonstrate that vspline's transform routines don't have to use /// b-splines at all, here's a simple example creating a functor /// to perform the iteration leading to the mandelbrot set, together /// with a vspline::domain adapting the coordinates and another /// functor to do the 'colorization'. The three functors are chained /// and fed to vspline::transform() to yield the image. /// /// compile with: /// clang++ -std=c++11 -march=native -o mandelbrot -O3 -pthread -DUSE_VC mandelbrot.cc -lvigraimpex -lVc /// /// invoke like /// /// mandelbrot -2 -1 1 1 /// /// ( -2 , -1 ) being lower left and ( 1 , 1 ) upper right /// /// the result will be written to 'mandelbrot.tif' #include #include #include #include #include // we want a colour image typedef vigra::RGBValue pixel_type; // coordinate_type has a 2D coordinate typedef vigra::TinyVector < double , 2 > coordinate_type ; typedef typename vspline::vector_traits < coordinate_type > :: type crd_v ; const int VSZ = vspline::vector_traits < coordinate_type > :: size ; typedef typename vspline::vector_traits < int , VSZ > :: type int_v ; // target_type is a 2D array of pixels typedef vigra::MultiArray < 2 , pixel_type > target_type ; struct mandelbrot_functor : public vspline::unary_functor < coordinate_type , int > { const int max_iterations = 255 ; const double threshold = 1000.0 ; // the single-value evaluation recognizably uses the well-known // iteration formula void eval ( const coordinate_type & c , int & m ) const { std::complex < double > cc ( c[0] , c[1] ) ; std::complex < double > z ( 0.0 , 0.0 ) ; for ( m = 0 ; m < max_iterations ; m++ ) { z = z * z + cc ; if ( std::abs ( z ) > threshold ) break ; } } #ifdef USE_VC // the vector code is abit more involved, since the vectorized type // for a std::complex is a vigra::TinyVector of two SIMD types, // and we implement the complex maths 'manually': void eval ( const crd_v & c , int_v & m ) const { // state of the iteration crd_v z { 0.0f , 0.0f } ; // iteration count m = 0.0f ; for ( int i = 0 ; i < max_iterations ; i++ ) { // z = z * z ; using complex maths auto rr = z[0] * z[0] ; auto ii = z[1] * z[1] ; auto ri = z[0] * z[1] ; z[0] = rr - ii ; z[1] = 2.0f * ri ; 
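    // worked out, for clarity: with z = a + b*i we have
    // z * z = ( a*a - b*b ) + 2*a*b*i, so 'rr - ii' is the new real part
    // and '2 * ri' the new imaginary part. rr + ii is |z|^2 of the old z,
    // which is why the mask just below compares it against
    // threshold * threshold rather than taking a square root.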
// create a mask for those values which haven't exceeded the // theshold yet auto mm = ( rr + ii < threshold * threshold ) ; // if the mask is empty, all values have exceeded the threshold // and we end the iteration if ( none_of ( mm ) ) break ; // now we add 'c', the coordinate z += c ; // and increase the iteration count for all values which haven't // exceeded the threshold m ( mm ) += 1 ; } } #endif } ; struct colorize : public vspline::unary_functor < int , pixel_type , VSZ > { // to 'colorize' we produce black-and-white from the incoming // value's LSB template < class IN , class OUT > void eval ( const IN & c , OUT & result ) const { result = 255 * ( c & 1 ) ; } ; } ; int main ( int argc , char * argv[] ) { // get the extent of the section to show if ( argc < 5 ) { std::cerr << "please pass x0, y0, x1 and y1 on the command line" << std::endl ; exit ( -1 ) ; } double x0 = atof ( argv[1] ) ; double y0 = atof ( argv[2] ) ; double x1 = atof ( argv[3] ) ; double y1 = atof ( argv[4] ) ; // this is where the result should go: target_type target ( vigra::Shape2 ( 1920 , 1080 ) ) ; // the domain maps the image coordinates to the coordinates of the // section we want to display. The mapped coordinates, now in the range // of ((x0,y0), (x1,y1)), are fed to the functor calculating the result // of the iteration, and it's results are fed to a 'colorize' object // which translates the iteration depth values to a pixel value. auto f = vspline::domain < coordinate_type , VSZ > ( coordinate_type ( 0 , 0 ) , coordinate_type ( 1920 , 1080 ) , coordinate_type ( x0 , y0 ) , coordinate_type ( x1 , y1 ) ) + mandelbrot_functor() + colorize() ; // the combined functor is passed to transform(), which uses it for // every coordinate pair in 'target' and deposits the result at the // corresponding location. vspline::transform ( f , target ) ; // store the result with vigra impex vigra::ImageExportInfo imageInfo ( "mandelbrot.tif" ); vigra::exportImage ( target , imageInfo .setPixelType("UINT8") .setCompression("100") .setForcedRangeMapping ( 0 , 255 , 0 , 255 ) ) ; } kfj-vspline-6e66cf7a7926/example/n_shift.cc000066400000000000000000000226001320375670700204710ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /// \file n_shift.cc /// /// \brief fidelity test /// /// This is a test to see how much a signal degrades when it is submitted /// to a sequence of operations: /// /// - create a b-spline over the signal /// - evaluate the spline at unit-spaced locations with an arbitrary offset /// - yielding a shifted signal, for which the process is repeated /// /// Finally, a last shift is performed which samples the penultimate version /// of the signal at points coinciding with coordinates 0...N-1 of the /// original signal. This last iteration should ideally recreate the /// original sequence. /// /// The test is done with a periodic signal to avoid margin effects. /// The inital sequence is created by evaluating a periodic high-degree /// b-spline of half the size at steps n/2. This way we start out /// with a (reasonably) band-limited signal, and a signal which can be /// approximated well with a b-spline. Optionally, the supersampling /// factor can be passed on the command line to experiment with different /// values than the default of 2. /// /// compile with: clang++ -pthread -O3 -std=c++11 n_shift.cc -o n_shift #include #include #include #include int main ( int argc , char * argv[] ) { if ( argc < 3 ) { std::cerr << "pass the spline's degree, the number of iterations" << std::endl << "and optionally the supersampling factor" << std::endl ; exit ( -1 ) ; } int degree = std::atoi ( argv[1] ) ; assert ( degree >= 0 && degree < 25 ) ; int iterations = 1 + std::atoi ( argv[2] ) ; const int sz = 1024 ; double widen = 2.0 ; if ( argc > 3 ) widen = atof ( argv[3] ) ; assert ( widen >= 1.0 ) ; int wsz = sz * widen ; vigra::MultiArray < 1 , double > original ( wsz ) ; vigra::MultiArray < 1 , double > target ( wsz ) ; // we start out by filling the first bit of 'original' with random data // between -1 and 1 std::random_device rd ; std::mt19937 gen ( rd() ) ; gen.seed ( 1 ) ; // level playing field std::uniform_real_distribution<> dis ( -1 , 1 ) ; for ( int x = 0 ; x < sz ; x++ ) original [ x ] = dis ( gen ) ; // create the bspline object to produce the data we'll work with vspline::bspline < double , // spline's data type 1 > // one dimension bsp ( sz , // sz values 20 , // high degree for smoothness vspline::PERIODIC , // periodic boundary conditions vspline::BRACED , // implicit scheme, bracing coeffs -1 , // horizon (unused w. BRACED) 0.0 ) ; // no tolerance vigra::MultiArrayView < 1 , double > initial ( vigra::Shape1(sz) , original.data() ) ; // pull in the data while prefiltering bsp.prefilter ( initial ) ; // create an evaluator to obtain interpolated values typedef vspline::evaluator < double , double > ev_type ; ev_type ev ( bsp ) ; // from the bspline object we just made auto fev = vspline::callable ( ev ) ; // for convenience // now we evaluate at 1/widen unit steps, into target for ( int x = 0 ; x < wsz ; x++ ) target [ x ] = fev ( double ( x ) / double ( widen ) ) ; // we take this signal as our original. Since this is a sampling // of a periodic signal (the spline in bsp) representing a full // period, we assume that a b-spline over this signal will, within // the spline's capacity, approximate the signal in 'original'. 
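  // spelling out why 'target' holds exactly one period: bsp is a PERIODIC
  // spline over sz knot points, and the loop above samples it at x / widen
  // for x = 0 ... wsz-1, i.e. at wsz = sz * widen points in [ 0 , sz ) -
  // one full period at 1/widen spacing.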
// what we want to see is how sampling at offsetted positions // and recreating a spline over the offsetted signal will degrade // the signal with different-degree b-splines and different numbers // of iterations. original = target ; // now we set up the working spline vspline::bspline < double , // spline's data type 1 > // one dimension bspw ( wsz , // wsz values degree , // degree as per command line vspline::PERIODIC , // periodic boundary conditions vspline::BRACED , // implicit scheme, bracing coeffs -1 , // horizon (unused w. BRACED) 0.0 ) ; // no tolerance // we pull in the working data we've just generated bspw.prefilter ( original ) ; // and set up the evaluator for the test ev_type evw ( bspw ) ; // we want to map the incoming coordinates into the defined range. // Since we're using a periodic spline, the range is fom 1...N, // rather than 1...N-1 for non-periodic splines auto gate = vspline::periodic ( 0.0 , double(wsz) ) ; // now we do a bit of functional programming. // create a callable (so we have operator()) of the // chained gate and evaluator: auto periodic_ev = vspline::callable ( gate + evw ) ; // we cumulate the offsets so we can 'undo' the cumulated offset // in the last iteration long double cumulated_offset = 0.0 ; for ( int n = 0 ; n < iterations ; n++ ) { using namespace vigra::multi_math ; using namespace vigra::acc; // use a random, largish offset (+/- 1000). any offset // will do, since we have a periodic gate, mapping the // coordinates for evaluation into the spline's range double offset = 1000.0 * dis ( gen ) ; // with the last iteration, we shift back to the original // 0-based locations. This last shift should recreate the // original signal as best as a spline of this degree can // do after so many iterations. if ( n == iterations - 1 ) offset = - cumulated_offset ; cumulated_offset += offset ; std::cout << "iteration " << n << " offset " << offset << " cumulated offset " << cumulated_offset << std::endl ; // we evaluate the spline at unit-stepped offsetted locations, // so, 0 + offset , 1 + offset ... // in the last iteration, this should ideally reproduce the original // signal. for ( int x = 0 ; x < wsz ; x++ ) { auto arg = x + offset ; target [ x ] = periodic_ev ( arg ) ; } // now we create a new spline over target, reusing bspw // note how this merely changes the coefficients of the spline, // the container for the coefficients is reused, and therefore // the evaluator (evw) will look at the new set of coefficients. // So we don't need to create a new evaluator. bspw.prefilter ( target ) ; // to convince ourselves that we really are working on a different // sampling of the signal signal - and to see how close we get to the // original signal after n iterations, when we use an offset to get // the sampling locations back to 0, 1, ... vigra::MultiArray < 1 , double > error_array ( vigra::multi_math::squaredNorm ( target - original ) ) ; AccumulatorChain < double , Select < Mean, Maximum > > ac ; extractFeatures ( error_array.begin() , error_array.end() , ac ) ; std::cout << "signal difference Mean: " << sqrt(get(ac)) << std::endl; std::cout << "signal difference Maximum: " << sqrt(get(ac)) << std::endl; } } kfj-vspline-6e66cf7a7926/example/quickstart.cc000066400000000000000000000123221320375670700212310ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. 
Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /// \file quickstart.cc /// /// \brief sample code from the documentation /// /// just the bits of code given in the 'Quickstart' section of the documentation overview /// /// compile: clang++ -std=c++11 -o quickstart -pthread quickstart.cc #include #include using namespace std ; using namespace vigra ; using namespace vspline ; int main ( int argc , char * argv[] ) { // given a vigra::MultiArray of data (initialization omitted) vigra::MultiArray < 2 , float > a ( 10 , 20 ) ; // let's initialize the whole array with 42 a = 42 ; typedef vspline::bspline < float , 2 > spline_type ; // fix the type of the spline spline_type bspl ( a.shape() ) ; // create bspline object 'bspl' suitable for your data bspl.core = a ; // copy the source data into the bspline object's 'core' area bspl.prefilter() ; // run prefilter() to convert original data to b-spline coefficients // for a 2D spline, we want 2D coordinates typedef vigra::TinyVector < float ,2 > coordinate_type ; // get the appropriate evaluator type typedef vspline::evaluator < coordinate_type , float > eval_type ; // create the evaluator eval_type ev ( bspl ) ; // create variables for input and output: float x = 3 , y = 4 ; coordinate_type coordinate ( x , y ) ; float result ; // use the evaluator to produce the result ev.eval ( coordinate , result ) ; // evaluate at (x,y) auto f = vspline::callable ( ev ) ; // wrap the evaluator in a vspline::callable float r = f ( coordinate ) ; // use the callable assert ( r == result ) ; // create a 1D array containing (2D) coordinates into 'a' vigra::MultiArray < 1 , coordinate_type > coordinate_array ( 3 ) ; // we initialize the coordinate array by hand... coordinate_array[0] = coordinate_array[1] = coordinate_array[2] = coordinate ; // create an array to accomodate the result of the remap operation vigra::MultiArray < 1 , float > target_array ( 3 ) ; // perform the remap vspline::remap ( a , coordinate_array , target_array ) ; auto ic = coordinate_array.begin() ; for ( auto k : target_array ) assert ( k == f ( *(ic++) ) ) ; // instead of the remap, we can use transform, passing the evaluator for // the b-spline over 'a' instead of 'a' itself. the result is the same. 
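  // if you want to convince yourself of that, the same check as after the
  // remap above can be repeated after the transform call just below, e.g.:
  //
  //   ic = coordinate_array.begin() ;
  //   for ( auto k : target_array ) assert ( k == f ( *(ic++) ) ) ;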
vspline::transform ( ev , coordinate_array , target_array ) ; // create a 2D array for the index-based transform operation vigra::MultiArray < 2 , float > target_array_2d ( 3 , 4 ) ; // use transform to evaluate the spline for the coordinates of // all values in this array vspline::transform ( ev , target_array_2d ) ; for ( int x = 0 ; x < 3 ; x ++ ) { for ( y = 0 ; y < 4 ; y++ ) { coordinate_type c { float(x) , float(y) } ; assert ( target_array_2d [ c ] == f ( c ) ) ; } } vigra::MultiArray < 2 , float > b ( 10 , 20 ) ; vspline::transform ( ev , b ) ; auto ia = a.begin() ; for ( auto r : b ) assert ( vigra::closeAtTolerance ( *(ia++) , r , .00001 ) ) ; } kfj-vspline-6e66cf7a7926/example/restore_test.cc000066400000000000000000000533671320375670700215770ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /// restore_test - create b-splines from random data and restore the original /// data from the spline. This has grown to function as a unit test /// instantiating all sorts of splines and evaluators and using the evaluators /// with the functions in transform.h. /// /// Ideally, this test should result in restored data which are identical to /// the initial random data, but several factors come into play: /// /// the precision of the coefficient calculation is parametrizable in vspline, /// via the 'tolerance' parameter, or, in some places, the 'horizon' parameter. /// /// the data type of the spline is important - single precision data are unfit /// to produce coefficients capable of reproducing the data very precisely /// /// the extrapolation method may be EXPLICIT or BRACED (in this test), and /// since the EXPLICIT extrapolation uses clever guesswork for the initial /// causal and anticausal coefficients at the margin of it's extrapolated /// data array, it usually comes out the winner precisionwise. But using /// EXPLICIT prefilter strategy takes a long time for higher-D splines with /// high precision, so I usually comment it out (look for 'EXPLICIT') /// /// the spline degree is important. 
High degrees need wide horizons, but wide /// horizons also need many calculation steps, which may introduce errors. /// /// The dimension of the spline is also important. Since higher-dimension /// splines need more calculations for prefiltering and evaluation, results are /// less precise. This test program tets up to 4D. /// /// With dimensions > 2 running this program in full takes a long time, since /// it tries to be comprehensive to catch any corner cases which might have /// escaped scrutiny. #include #include #include #include #include #include // for this test, we may define these two symbols, resulting in extra // tests on the mappers used. // #define ASSERT_IN_BOUNDS // #define ASSERT_CONSISTENT #include bool verbose = false ; // 'condense' aggregate types (TinyVectors etc.) into a single value template < typename T > double condense ( const T & t , std::true_type ) { return std::abs ( t ) ; } template < typename T > double condense ( const T & t , std::false_type ) { return sqrt ( sum ( t * t ) ) / t.size() ; } template < typename T > double condense ( const std::complex & t , std::false_type ) { return std::abs ( t ) ; } template < class T > using is_singular = typename std::conditional < std::is_fundamental < T > :: value , std::true_type , std::false_type > :: type ; template < typename T > double condense ( const T & t ) { return condense ( t , is_singular() ) ; } // compare two arrays and calculate the mean and maximum difference template < int dim , typename T > double check_diff ( vigra::MultiArrayView < dim , T > & reference , vigra::MultiArrayView < dim , T > & candidate ) { using namespace vigra::multi_math ; using namespace vigra::acc; assert ( reference.shape() == candidate.shape() ) ; vigra::MultiArray < 1 , double > error_array ( vigra::Shape1(reference.size() ) ) ; for ( int i = 0 ; i < reference.size() ; i++ ) { auto help = candidate[i] - reference[i] ; // std::cerr << reference[i] << " <> " << candidate[i] // << " CFD " << help << std::endl ; error_array [ i ] = condense ( help ) ; } AccumulatorChain < double , Select < Mean, Maximum > > ac ; extractFeatures ( error_array.begin() , error_array.end() , ac ) ; double mean = get(ac) ; double max = get(ac) ; if ( verbose ) { std::cout << "delta Mean: " << mean << std::endl; std::cout << "delta Maximum: " << max << std::endl; } return max ; } // template < int dim , class array_type , int spline_degree > // struct get_vigra_coeffs // { // void operator() ( array_type & target , const array_type & orig ) // { } ; // } ; // // template < class array_type , int spline_degree > // struct get_vigra_coeffs<2,array_type,spline_degree> // { // void operator() // ( array_type & target , const array_type & orig ) // { // vigra::SplineImageView // view ( orig ) ; // vigra::MultiArrayView < 2 , typename array_type::value_type > // v ( view.image() ) ; // target = v ; // } // } ; /// do a restore test. This test fills the array that's /// passed in with small random data, constructs a b-spline with the requested /// parameters over the data, then calls vspline::restore(), which evaluates the /// spline at discrete locations, either using transform() (for 1D data) or /// grid_eval(), which is more efficient for dimensions > 1. 
/// While this test fails to address several aspects (derivatives, behaviour /// at locations which aren't discrete), it does make sure that prefiltering /// has produced a correct result and reconstruction succeeds: If the spline /// coefficients were wrong, reconstruction would fail just as it would if /// the coefficients were right and the evaluation code was wrong. That both /// should be wrong and accidentally produce a correct result is highly unlikely. /// /// This routine has grown to be more of a unit test for all of vspline, /// additional vspline functions are executed and the results are inspected. /// This way we can assure that the transform-type routines are usable with /// all supported data types and vectorized and unvectorized results are /// consistent. template < int dim , typename T > double restore_test ( vigra::MultiArrayView < dim , T > & arr , vspline::bc_code bc , int spline_degree , vspline::prefilter_strategy pfs ) { if ( verbose ) std::cout << "**************************************" << std::endl << "testing type " << typeid(T()).name() << std::endl << " prefilter strategy " << ( pfs == vspline::BRACED ? "BRACED" : "EXPLICIT" ) << std::endl ; typedef vigra::MultiArray < dim , T > array_type ; typedef vspline::bspline < T , dim > spline_type ; vigra::TinyVector < vspline::bc_code , dim > bcv { bc } ; // // note that the 'acceptable error' may be exceeded if the arithmetic type // // used for calculations isn't sufficient. When using double precision maths, // // specifying an acceptable error of .000000001 will limit the error to the // // specification, but with single precision, especially with higher degree // // splines, the maximum error will often be higher. // // double acceptable_error = .000000000000000000001 ; spline_type bsp ( arr.shape() , spline_degree , bcv , pfs ) ; // -1 , -1.0 , 0.0 , headroom ) ; if ( verbose ) std::cout << "created b-spline:" << std::endl << bsp << std::endl ; std::random_device rd ; std::mt19937 gen ( rd() ) ; // gen.seed ( 765 ) ; // level playing field std::uniform_real_distribution<> dis ( -1.0 , 1.0 ) ; for ( auto & e : arr ) e = dis ( gen ) ; array_type reference = arr ; bsp.prefilter ( arr ) ; vspline::restore < dim , T > ( bsp , arr ) ; double emax = check_diff < dim , T > ( reference , arr ) ; if ( verbose ) { std::cout << "after restoration of original data:" << std::endl ; // print a summary, can use '| grep CF' on cout std::cout << typeid(T()).name() << " CF " << vspline::pfs_name [ pfs ][0] << " D " << dim << " " << arr.shape() << " BC " << vspline::bc_name[bc] << " DG " << spline_degree << " TOL " << bsp.tolerance << " EMAX " << emax << std::endl ; } // test the factory functions make_evaluator and make_safe_evaluator auto raw_ev = vspline::make_evaluator < spline_type , float > ( bsp ) ; typedef typename decltype ( raw_ev ) :: in_type fc_type ; auto rraw = raw_ev ( fc_type ( .123f ) ) ; fc_type cs ( .123f ) ; auto rs = raw_ev ( cs ) ; raw_ev.eval ( cs , rs ) ; // try evaluating the spline at it's lower and upper limit cs = fc_type ( vspline::unwrap ( bsp.lower_limit() ) ) ; raw_ev.eval ( cs , rs ) ; if ( verbose ) std::cout << cs << " -> " << rs << std::endl ; cs = fc_type ( vspline::unwrap ( bsp.upper_limit() ) ) ; raw_ev.eval ( cs , rs ) ; if ( verbose ) std::cout << cs << " -> " << rs << std::endl ; // this code would only compile for vectorizable data, // so the tests with long double's would not compile. // a specific _vsize will only be accepted for vectorizable // data types. 
// auto raw_ev_4 = vspline::make_evaluator // < spline_type , float , 4 > ( bsp ) ; // // auto rraw4 = raw_ev_4 ( fc_type ( .123f ) ) ; #ifdef USE_VC { // if the spline's data type is not vectorizable, // in_v will be the same as in_type, so this operation // wiil not be vectorized. typedef typename decltype ( raw_ev ) :: in_v fc_v ; fc_v cv ( .123f ) ; auto rv = raw_ev ( cv ) ; raw_ev.eval ( cv , rv ) ; // typedef typename decltype ( raw_ev_4 ) :: in_v fc_v4 ; // fc_v4 c4 ( .123f ) ; // auto r4 = raw_ev_4 ( c4 ) ; } #endif // additionally, we perform a test with a 'safe evaluator' and random // coordinates. For a change we use double precision coordinates auto _ev = vspline::make_safe_evaluator < spline_type , double > ( bsp ) ; enum { vsize = decltype ( _ev ) :: vsize } ; typedef typename decltype ( _ev ) :: in_type coordinate_type ; typedef typename decltype ( _ev ) :: in_ele_type rc_type ; typedef typename decltype ( _ev ) :: out_ele_type ele_type ; // throw a domain in to test that as well: auto dom = vspline::domain < coordinate_type , spline_type , vsize > ( bsp , coordinate_type(-.3377) , coordinate_type(3.11) ) ; auto ev = vspline::grok ( dom + _ev ) ; coordinate_type c ; vigra::MultiArray < 1 , coordinate_type > ca ( vigra::Shape1 ( 10003 ) ) ; vigra::MultiArray < 1 , T > ra ( vigra::Shape1 ( 10003 ) ) ; vigra::MultiArray < 1 , T > ra2 ( vigra::Shape1 ( 10003 ) ) ; auto pc = (rc_type*) &c ; // make sure we can evaluate at the lower and upper limit int k = 0 ; { for ( int e = 0 ; e < dim ; e++ ) pc[e] = 0.0 ; ra[k] = ev ( c ) ; ca[k] = c ; } k = 1 ; { for ( int e = 0 ; e < dim ; e++ ) pc[e] = 1.0 ; ra[k] = ev ( c ) ; ca[k] = c ; } k = 2 ; { for ( int e = 0 ; e < dim ; e++ ) pc[e] = -2.0 ; ra[k] = ev ( c ) ; ca[k] = c ; } k = 3 ; { for ( int e = 2.0 ; e < dim ; e++ ) pc[e] = 1.0 ; ra[k] = ev ( c ) ; ca[k] = c ; } // the remaining coordinates are picked randomly std::uniform_real_distribution<> dis2 ( -2.371 , 2.1113 ) ; for ( k = 4 ; k < 10003 ; k++ ) { for ( int e = 0 ; e < dim ; e++ ) pc[e] = dis2 ( gen ) ; ra[k] = ev ( c ) ; ca[k] = c ; } // run an index-based transform. With a domain in action, this // does *not* recreate the original data vspline::transform ( ev , arr ) ; // run an array-based transform. result should be identical // to single-value-eval above, within arithmetic precision limits // given the specific optimization level. // we can produce different views on the data to make sure feeding // the coordinates still works correctly: vigra::MultiArrayView < 2 , coordinate_type > ca2d ( vigra::Shape2 ( 99 , 97 ) , ca.data() ) ; vigra::MultiArrayView < 2 , T > ra2d ( vigra::Shape2 ( 99 , 97 ) , ra2.data() ) ; vigra::MultiArrayView < 2 , coordinate_type > ca2ds ( vigra::Shape2 ( 49 , 49 ) , vigra::Shape2 ( 2 , 200 ) , ca.data() ) ; vigra::MultiArrayView < 2 , T > ra2ds ( vigra::Shape2 ( 49 , 49 ) , vigra::Shape2 ( 2 , 200 ) , ra2.data() ) ; vigra::MultiArrayView < 3 , coordinate_type > ca3d ( vigra::Shape3 ( 20 , 21 , 19 ) , ca.data() ) ; vigra::MultiArrayView < 3 , T > ra3d ( vigra::Shape3 ( 20 , 21 , 19 ) , ra2.data() ) ; vspline::transform ( ev , ca , ra2 ) ; vspline::transform ( ev , ca2d , ra2d ) ; vspline::transform ( ev , ca2ds , ra2ds ) ; vspline::transform ( ev , ca3d , ra3d ) ; // vectorized and unvectorized operation may produce slightly // different results in optimized code. 
// usually assert ( ra == ra2 ) holds, but we don't consider // it an error if it doesn't and allow for a small difference auto dsv = check_diff < 1 , T > ( ra , ra2 ) ; auto tolerance = 10 * std::numeric_limits::epsilon() ; if ( dsv > tolerance ) std::cout << vspline::bc_name[bc] << " - max difference single/vectorized eval " << dsv << std::endl ; if ( dsv > .0001 ) { std::cout << vspline::bc_name[bc] << " - excessive difference single/vectorized eval " << dsv << std::endl ; for ( int k = 0 ; k < 10003 ; k++ ) { if ( ra[k] != ra2[k] ) { std::cout << "excessive at k = " << k << ": " << ca[k] << " -> " << ra[k] << ", " << ra2[k] << std::endl ; } } } // to test broadcasting, we create a broadcasting evaluator // from ev auto brd_ev = vspline::broadcast < decltype(ev) , vsize > ( ev ) ; vspline::transform ( brd_ev , ca , ra2 ) ; // with the broadcasted evaluator, this should hold: assert ( ra == ra2 ) ; return emax ; // crossreferencing with vigra is problematic, for several reasons: // - per default, vigra's SplineImageView will assume mirror BCs. vigra calls // them reflect, but it's mirroring on the bounds. If other BCs should be // tested, one would have to go one level deeper and directly use vigra's // filtering routines. Then, periodic BCs could be compared as well. // - vigra's SplineImageViews only work on 2D data // - vigra uses the spline's degree as a template argument and calls it ORDER // - the acceptable error can't be parametrized, it's fixed at 0.00001 // - small arrays with high degrees need a wide horizon, but vigra's reflect BC // does not do multiple reflections, so the coffs come out wrong. // with a widened eps parameter and sufficiently large 2D arrays, the coeffs // came out nearly the same, but I omit the test for the reasons above. // if ( dim == 2 && bc == vspline::MIRROR ) // { // std::cout << "coefficients crossreference with vigra:" << std::endl ; // array_type varr ; // get_vigra_coeffs() ( varr , reference ) ; // double emax = check_diff < dim , T > ( bsp.core , varr ) ; // } } using namespace vspline ; template < class arr_t > double view_test ( arr_t & arr ) { double emax = 0.0 ; enum { dimension = arr_t::actual_dimension } ; typedef typename arr_t::value_type value_type ; vspline::bc_code bc_seq[] { PERIODIC , MIRROR , REFLECT , NATURAL } ; vspline::prefilter_strategy pfs_seq [] { BRACED } ; // , EXPLICIT } ; // TODO: for degree-1 splines, I sometimes get different results // for unvectorized and vectorized operation. why? 
for ( int spline_degree = 0 ; spline_degree < 8 ; spline_degree++ ) { for ( auto bc : bc_seq ) { for ( auto pfs : pfs_seq ) { auto e = restore_test < dimension , value_type > ( arr , bc , spline_degree , pfs ) ; if ( e > emax ) emax = e ; } } } return emax ; } int d0[] { 2 , 3 , 5 , 8 , 13 , 16 , 21 , 34 , 55 } ; int d1[] { 2 , 3 , 5 , 8 , 13 , 16 , 21 , 34 } ; int d2[] { 2 , 3 , 5 , 8 , 13 , 21 } ; int d3[] { 2 , 3 , 5 , 8 , 13 } ; int* dext[] { d0 , d1 , d2 , d3 } ; int dsz[] { sizeof(d0) / sizeof(int) , sizeof(d1) / sizeof(int) , sizeof(d2) / sizeof(int) , sizeof(d3) / sizeof(int) } ; template < int dim , typename T > struct test { typedef vigra::TinyVector < int , dim > shape_type ; double emax = 0.0 ; double operator() () { shape_type dshape ; for ( int d = 0 ; d < dim ; d++ ) dshape[d] = dsz[d] ; vigra::MultiCoordinateIterator i ( dshape ) , end = i.getEndIterator() ; while ( i != end ) { shape_type shape ; for ( int d = 0 ; d < dim ; d++ ) shape[d] = * ( dext[d] + (*i)[d] ) ; vigra::MultiArray < dim , T > _arr ( 2 * shape + 1 ) ; auto stride = _arr.stride() * 2 ; vigra::MultiArrayView < dim , T > arr ( shape , stride , _arr.data() + long ( sum ( stride ) ) ) ; auto e = view_test ( arr ) ; if ( e > emax ) emax = e ; ++i ; // make sure that we have only written back to 'arr', leaving // _arr untouched for ( auto & e : arr ) e = T(0.0) ; for ( auto e : _arr ) assert ( e == T(0.0) ) ; } return emax ; } } ; template < typename T > struct test < 0 , T > { double operator() () { return 0.0 ; } ; } ; template < int dim , typename tuple_type , int ntypes = std::tuple_size::value > struct multitest { void operator() () { typedef typename std::tuple_element::type T ; auto e = test < dim , T >() () ; std::cout << "test for type " << typeid(T()).name() << ": max error = " << e << std::endl ; multitest < dim , tuple_type , ntypes - 1 >() () ; } } ; template < int dim , typename tuple_type > struct multitest < dim , tuple_type , 0 > { void operator() () { } } ; int main ( int argc , char * argv[] ) { std::cout << std::fixed << std::showpos << std::showpoint << std::setprecision(18); std::cerr << std::fixed << std::showpos << std::showpoint << std::setprecision(18); int test_dim = 2 ; if ( argc > 1 ) test_dim = std::atoi ( argv[1] ) ; if ( test_dim > 4 ) test_dim = 4 ; std::cout << "testing with " << test_dim << " dimensions" << std::endl ; // std::cout << "dvtype " // << vspline::dvtype < float , std::true_type > :: size // << " " // << vspline::dvtype < float , std::false_type > :: size // << std::endl ; // // std::cout << "vtraits " // << vspline::vector_traits < float > :: isv :: value // << " " // << vspline::vector_traits < float > :: size // << std::endl ; typedef std::tuple < vigra::TinyVector < double , 1 > , vigra::TinyVector < float , 1 > , vigra::TinyVector < double , 2 > , vigra::TinyVector < float , 2 > , vigra::TinyVector < long double , 3 > , vigra::TinyVector < double , 3 > , vigra::TinyVector < float , 3 > , vigra::RGBValue , vigra::RGBValue , vigra::RGBValue , std::complex < long double > , std::complex < double > , std::complex < float > , long double , double , float > tuple_type ; switch ( test_dim ) { case 1: multitest < 1 , tuple_type >() () ; break ; case 2: multitest < 2 , tuple_type >() () ; break ; // case 3: // multitest < 3 , tuple_type >() () ; // break ; // case 4: // multitest < 4 , tuple_type >() () ; // break ; default: break ; } std::cout << "terminating" << std::endl ; } 
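// --- a minimal sketch of the round trip which restore_test() exercises ---
// The shape, spline degree and boundary condition here are picked purely
// for illustration; the calls are the same ones used in restore_test()
// above, and check_diff() is the helper defined further up in this file.

double restore_sketch()
{
  // fill a small 1D array with random values
  vigra::MultiArray < 1 , double > data ( vigra::Shape1 ( 100 ) ) ;
  std::random_device rd ;
  std::mt19937 gen ( rd() ) ;
  std::uniform_real_distribution<> dis ( -1.0 , 1.0 ) ;
  for ( auto & e : data )
    e = dis ( gen ) ;
  vigra::MultiArray < 1 , double > reference = data ;

  // create a cubic b-spline with mirror boundary conditions and
  // prefilter the data into it
  vigra::TinyVector < vspline::bc_code , 1 > bcv { vspline::MIRROR } ;
  vspline::bspline < double , 1 > bsp ( data.shape() , 3 , bcv ) ;
  bsp.prefilter ( data ) ;

  // restore the signal from the coefficients; the restored data
  // should agree with the reference within the spline's tolerance
  vspline::restore < 1 , double > ( bsp , data ) ;
  return check_diff < 1 , double > ( reference , data ) ;
}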
kfj-vspline-6e66cf7a7926/example/roundtrip.cc000066400000000000000000000432571320375670700211000ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /// \file roundtrip.cc /// /// \brief benchmarking and testing code for various vspline capabilities /// /// load an image, create a b-spline from it, and restore the original data, /// both by normal evaluation and by convolution with the reconstruction kernel. /// all of this is done several times each with different boundary conditions, /// spline degrees and in float and double arithmetic, the processing times /// and differences between input and restored signal are printed to cout. /// /// obviously, this is not a useful program, it's to make sure the engine functions as /// intended and all combinations of float and double as values and coordinates compile /// and function as intended, also giving an impression of the speed of processing. /// On my system I can see here that vectorization with double values doesn't perform /// better than the unvectorized code. /// /// compile: /// clang++ -std=c++11 -march=native -o roundtrip -O3 -pthread -DUSE_VC roundtrip.cc -lvigraimpex -lVc /// /// invoke: roundtrip /// /// there is no image output. 
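/// the gist of the roundtrip, as a minimal sketch - the timed, exhaustive
/// version follows below; 'data', 'target', 'bcv' and 'coordinate_type'
/// stand in for the objects and types set up there:
///
///    vspline::bspline < pixel_type , 2 > bsp ( data.shape() , DEGREE , bcv ) ;
///    bsp.core = data ;      // fill the spline's core with the image data
///    bsp.prefilter() ;      // calculate the coefficients
///    vspline::evaluator < coordinate_type , pixel_type > ev ( bsp ) ;
///    // an index-based transform evaluates at the discrete knot point
///    // coordinates, so it should reproduce the original image:
///    vspline::transform ( ev , target ) ;
///    check_diff ( target , data ) ;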
#include #include #include #include #include #include #define PRINT_ELAPSED #ifdef PRINT_ELAPSED #include #include #endif using namespace std ; using namespace vigra ; /// check for differences between two arrays template < class view_type > long double check_diff ( const view_type& a , const view_type& b ) { using namespace vigra::multi_math ; using namespace vigra::acc; typedef typename view_type::value_type value_type ; typedef typename vigra::ExpandElementResult < value_type > :: type real_type ; typedef MultiArray error_array ; error_array ea ( vigra::multi_math::squaredNorm ( b - a ) ) ; AccumulatorChain > ac ; extractFeatures(ea.begin(), ea.end(), ac); std::cout << "warped image diff Mean: " << sqrt(get(ac)) << std::endl; long double max_error = sqrt(get(ac)) ; std::cout << "warped image diff Maximum: " << max_error << std::endl ; return max_error ; } template < class view_type , typename real_type , typename rc_type , int specialize > long double run_test ( view_type & data , vspline::bc_code bc , int DEGREE , int TIMES = 32 ) { typedef typename view_type::value_type pixel_type ; typedef typename view_type::difference_type Shape; typedef MultiArray < 2 , pixel_type > array_type ; typedef int int_type ; long double max_error = 0.0L ; long double error ; #ifdef USE_VC // we use simdized types with as many elements as vector_traits // considers appropriate for a given real_type, which is the elementary // type of the (pixel) data we process: const int vsize = vspline::vector_traits < real_type > :: size ; // for vectorized coordinates, we use simdized coordinates with as many // elements as the simdized values hold: typedef typename vspline::vector_traits < rc_type , vsize > :: type rc_v ; #else const int vsize = 1 ; #endif TinyVector < vspline::bc_code , 2 > bcv ( bc ) ; int Nx = data.width() ; int Ny = data.height() ; // cout << "Nx: " << Nx << " Ny: " << Ny << endl ; vspline::bspline < pixel_type , 2 > bsp ( data.shape() , DEGREE , bcv ) ; // , vspline::EXPLICIT ) ; bsp.core = data ; // cout << "created bspline object:" << endl << bsp << endl ; // first test: time prefilter #ifdef PRINT_ELAPSED std::chrono::system_clock::time_point start = std::chrono::system_clock::now(); std::chrono::system_clock::time_point end ; #endif for ( int times = 0 ; times < TIMES ; times++ ) bsp.prefilter() ; #ifdef PRINT_ELAPSED end = std::chrono::system_clock::now(); cout << "avg " << TIMES << " x prefilter:........................ " << std::chrono::duration_cast(end - start).count() / float(TIMES) << " ms" << endl ; #endif // to time and test 1D operation, we pretend the data are 1D, // prefilter and restore them. start = std::chrono::system_clock::now(); // cast data to 1D array vigra::MultiArrayView < 1 , pixel_type > fake_1d_array ( vigra::Shape1 ( prod ( data.shape() ) ) , data.data() ) ; vigra::TinyVector < vspline::bc_code , 1 > bcv1 ( bcv[0] ) ; vspline::bspline < pixel_type , 1 > bsp1 ( fake_1d_array.shape() , DEGREE , bcv1 ) ; bsp1.core = fake_1d_array ; for ( int times = 0 ; times < TIMES ; times++ ) bsp1.prefilter() ; #ifdef PRINT_ELAPSED end = std::chrono::system_clock::now(); cout << "avg " << TIMES << " x prefilter as fake 1D array:........... 
" << std::chrono::duration_cast(end - start).count() / float(TIMES) << " ms" << endl ; #endif // use fresh data, data above are useless after TIMES times filtering bsp1.core = fake_1d_array ; bsp1.prefilter() ; start = std::chrono::system_clock::now(); vigra::MultiArray < 1 , pixel_type > fake_1d_target ( vigra::Shape1 ( prod ( data.shape() ) ) ) ; vspline::restore < 1 , pixel_type , rc_type > ( bsp1 , fake_1d_target ) ; for ( int times = 1 ; times < TIMES ; times++ ) vspline::restore < 1 , pixel_type , rc_type > ( bsp1 , fake_1d_target ) ; #ifdef PRINT_ELAPSED end = std::chrono::system_clock::now(); cout << "avg " << TIMES << " x restore original data from 1D:........ " << std::chrono::duration_cast(end - start).count() / float(TIMES) << " ms" << endl ; #endif cout << "difference original data/restored data:" << endl ; error = check_diff ( fake_1d_array , fake_1d_target ) ; if ( error > max_error ) max_error = error ; // that's all of the 1D testing we do, back to the 2D data. // use fresh data, data above are useless after TIMES times filtering bsp.core = data ; bsp.prefilter() ; // get a view to the core coefficients (those which aren't part of the brace) view_type cfview = bsp.core ; // set the coordinate type typedef vigra::TinyVector < rc_type , 2 > coordinate_type ; // set the evaluator type typedef vspline::evaluator eval_type ; // create the evaluator for the b-spline, using plain evaluation (no derivatives) eval_type ev ( bsp ) ; // spline // type for coordinate array typedef vigra::MultiArray<2, coordinate_type> coordinate_array ; int Tx = Nx ; int Ty = Ny ; // now we create a warp array of coordinates at which the spline will be evaluated. // Also create a target array to contain the result. coordinate_array fwarp ( Shape ( Tx , Ty ) ) ; array_type _target ( Shape(Tx,Ty) ) ; view_type target ( _target ) ; rc_type dfx = 0.0 , dfy = 0.0 ; // currently evaluating right at knot point locations for ( int times = 0 ; times < 1 ; times++ ) { for ( int y = 0 ; y < Ty ; y++ ) { rc_type fy = (rc_type)(y) + dfy ; for ( int x = 0 ; x < Tx ; x++ ) { rc_type fx = (rc_type)(x) + dfx ; // store the coordinate to fwarp[x,y] fwarp [ Shape ( x , y ) ] = coordinate_type ( fx , fy ) ; } } } // second test. perform a transform using fwarp as warp array. Since fwarp contains // the discrete coordinates to the knot points, converted to float, the result // should be the same as the input within the given precision #ifdef PRINT_ELAPSED start = std::chrono::system_clock::now(); #endif for ( int times = 0 ; times < TIMES ; times++ ) vspline::transform ( ev , fwarp , target ) ; #ifdef PRINT_ELAPSED end = std::chrono::system_clock::now(); cout << "avg " << TIMES << " x transform with ready-made bspline:.... " << std::chrono::duration_cast(end - start).count() / float(TIMES) << " ms" << endl ; #endif error = check_diff ( target , data ) ; if ( error > max_error ) max_error = error ; // third test: do the same with the 'classic' remap routine which internally creates // a b-spline #ifdef PRINT_ELAPSED start = std::chrono::system_clock::now(); #endif for ( int times = 0 ; times < TIMES ; times++ ) vspline::remap ( data , fwarp , target , bcv , DEGREE ) ; #ifdef PRINT_ELAPSED end = std::chrono::system_clock::now(); cout << "avg " << TIMES << " x classic remap:........................ 
" << std::chrono::duration_cast(end - start).count() / float(TIMES) << " ms" << endl ; #endif error = check_diff ( target , data ) ; if ( error > max_error ) max_error = error ; // fourth test: perform an transform() directly using the b-spline evaluator // as the transform's functor. This is, yet again, the same, because // it evaluates at all discrete positions, but now without the warp array: // the index-based transform feeds the evaluator with the discrete coordinates. #ifdef PRINT_ELAPSED start = std::chrono::system_clock::now(); #endif for ( int times = 0 ; times < TIMES ; times++ ) vspline::transform ( ev , target ) ; #ifdef PRINT_ELAPSED end = std::chrono::system_clock::now(); cout << "avg " << TIMES << " x index-based transform................. " << std::chrono::duration_cast(end - start).count() / float(TIMES) << " ms" << endl ; #endif cout << "difference original data/restored data:" << endl ; error = check_diff ( target , data ) ; if ( error > max_error ) max_error = error ; // fifth test: use 'restore' which internally delegates to grid_eval. This is // usually slightly faster than the previous way to restore the original data, // but otherwise makes no difference. #ifdef PRINT_ELAPSED start = std::chrono::system_clock::now(); #endif for ( int times = 0 ; times < TIMES ; times++ ) vspline::restore < 2 , pixel_type , rc_type > ( bsp , target ) ; #ifdef PRINT_ELAPSED end = std::chrono::system_clock::now(); cout << "avg " << TIMES << " x restore original data: .......... " << std::chrono::duration_cast(end - start).count() / float(TIMES) << " ms" << endl ; #endif cout << "difference original data/restored data:" << endl ; error = check_diff ( data , target ) ; if ( error > max_error ) max_error = error ; cout << endl ; return max_error ; } template < class real_type , class rc_type > long double process_image ( char * name ) { long double max_error = 0.0L ; long double error ; cout << fixed << showpoint << setprecision(16) ; // the import and info-displaying code is taken from vigra: vigra::ImageImportInfo imageInfo(name); // print some information std::cout << "Image information:\n"; std::cout << " file format: " << imageInfo.getFileType() << std::endl; std::cout << " width: " << imageInfo.width() << std::endl; std::cout << " height: " << imageInfo.height() << std::endl; std::cout << " pixel type: " << imageInfo.getPixelType() << std::endl; std::cout << " color image: "; if (imageInfo.isColor()) std::cout << "yes ("; else std::cout << "no ("; std::cout << "number of channels: " << imageInfo.numBands() << ")\n"; typedef vigra::RGBValue pixel_type; typedef vigra::MultiArray<2, pixel_type> array_type ; typedef vigra::MultiArrayView<2, pixel_type> view_type ; // to test that strided data are processed correctly, we load the image // to an inner subarray of containArray // array_type containArray(imageInfo.shape()+vigra::Shape2(3,5)); // view_type imageArray = containArray.subarray(vigra::Shape2(1,2),vigra::Shape2(-2,-3)) ; // alternatively, just use the same for both array_type containArray ( imageInfo.shape() ); view_type imageArray ( containArray ) ; vigra::importImage(imageInfo, imageArray); // test these bc codes: vspline::bc_code bcs[] = { vspline::MIRROR , vspline::REFLECT , vspline::NATURAL , vspline::PERIODIC } ; for ( int b = 0 ; b < 4 ; b++ ) { vspline::bc_code bc = bcs[b] ; for ( int spline_degree = 0 ; spline_degree < 8 ; spline_degree++ ) { #ifdef USE_VC cout << "testing bc code " << vspline::bc_name[bc] << " spline degree " << spline_degree << " using Vc" << endl ; 
#else cout << "testing bc code " << vspline::bc_name[bc] << " spline degree " << spline_degree << endl ; #endif if ( spline_degree == 0 ) { std::cout << "using specialized evaluator" << std::endl ; error = run_test < view_type , real_type , rc_type , 0 > ( imageArray , bc , spline_degree ) ; if ( error > max_error ) max_error = error ; std::cout << "using unspecialized evaluator" << std::endl ; error = run_test < view_type , real_type , rc_type , -1 > ( imageArray , bc , spline_degree ) ; if ( error > max_error ) max_error = error ; } else if ( spline_degree == 1 ) { std::cout << "using specialized evaluator" << std::endl ; error = run_test < view_type , real_type , rc_type , 1 > ( imageArray , bc , spline_degree ) ; if ( error > max_error ) max_error = error ; std::cout << "using unspecialized evaluator" << std::endl ; error = run_test < view_type , real_type , rc_type , -1 > ( imageArray , bc , spline_degree ) ; if ( error > max_error ) max_error = error ; } else { error = run_test < view_type , real_type , rc_type , -1 > ( imageArray , bc , spline_degree ) ; if ( error > max_error ) max_error = error ; } } } return max_error ; } int main ( int argc , char * argv[] ) { long double max_error = 0.0L ; long double error ; cout << "testing float data, float coordinates" << endl ; error = process_image ( argv[1] ) ; if ( error > max_error ) max_error = error ; cout << "max error of float/float test: " << error << std::endl << std::endl ; cout << endl << "testing double data, double coordinates" << endl ; error = process_image ( argv[1] ) ; if ( error > max_error ) max_error = error ; cout << "max error of double/double test: " << error << std::endl << std::endl ; cout << endl << "testing long double data, float coordinates" << endl ; error = process_image ( argv[1] ) ; if ( error > max_error ) max_error = error ; cout << "max error of ldouble/float test: " << error << std::endl << std::endl ; cout << endl << "testing long double data, double coordinates" << endl ; error = process_image ( argv[1] ) ; if ( error > max_error ) max_error = error ; cout << "max error of ldouble/double test: " << error << std::endl << std::endl ; cout << "testing float data, double coordinates" << endl ; error = process_image ( argv[1] ) ; if ( error > max_error ) max_error = error ; cout << "max error of float/double test: " << error << std::endl << std::endl ; cout << endl << "testing double data, float coordinates" << endl ; error = process_image ( argv[1] ) ; if ( error > max_error ) max_error = error ; cout << "max error of double/float test: " << error << std::endl << std::endl ; cout << "reached end. max error of all tests: " << max_error << std::endl ; } kfj-vspline-6e66cf7a7926/example/slice.cc000066400000000000000000000121601320375670700201360ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. 
Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /// \file slice.cc /// /// \brief create 2D image data from a 3D spline /// /// build a 3D volume from samples of the RGB colour space /// build a spline over it and extract a 2D slice, using vspline::transform() /// /// compile with: /// clang++ -std=c++11 -march=native -o slice -O3 -pthread -DUSE_VC=1 slice.cc -lvigraimpex -lVc /// g++ also works. #include #include #include #include int main ( int argc , char * argv[] ) { // pixel_type is the result type, an RGB float pixel typedef vigra::TinyVector < float , 3 > pixel_type ; // voxel_type is the source data type - the same as pixel_type typedef vigra::TinyVector < float , 3 > voxel_type ; // coordinate_3d has a 3D coordinate typedef vigra::TinyVector < float , 3 > coordinate_3d ; // warp_type is a 2D array of coordinates typedef vigra::MultiArray < 2 , coordinate_3d > warp_type ; // target_type is a 2D array of pixels typedef vigra::MultiArray < 2 , pixel_type > target_type ; // we want a b-spline with natural boundary conditions vigra::TinyVector < vspline::bc_code , 3 > bcv ( vspline::NATURAL ) ; // create quintic 3D b-spline object containing voxels vspline::bspline < voxel_type , 3 > space ( vigra::Shape3 ( 10 , 10 , 10 ) , 5 , bcv ) ; // fill the b-spline's core with a three-way gradient for ( int z = 0 ; z < 10 ; z++ ) { for ( int y = 0 ; y < 10 ; y++ ) { for ( int x = 0 ; x < 10 ; x++ ) { voxel_type & c ( space.core [ vigra::Shape3 ( x , y , z ) ] ) ; c[0] = 25.5 * x ; c[1] = 25.5 * y ; c[2] = 25.5 * z ; } } } // prefilter the b-spline space.prefilter() ; // get an evaluator for the b-spline typedef vspline::evaluator < coordinate_3d , voxel_type > ev_type ; ev_type ev ( space ) ; // now make a warp array with 1920X1080 3D coordinates warp_type warp ( vigra::Shape2 ( 1920 , 1080 ) ) ; // we want the coordinates to follow this scheme: // warp(x,y) = (x,1-x,y) // scaled appropriately for ( int y = 0 ; y < 1080 ; y++ ) { for ( int x = 0 ; x < 1920 ; x++ ) { coordinate_3d & c ( warp [ vigra::Shape2 ( x , y ) ] ) ; c[0] = float ( x ) / 192.0 ; c[1] = 10.0 - c[0] ; c[2] = float ( y ) / 108.0 ; } } // this is where the result should go: target_type target ( vigra::Shape2 ( 1920 , 1080 ) ) ; // now we perform the transform, yielding the result vspline::transform ( ev , warp , target ) ; // store the result with vigra 
impex vigra::ImageExportInfo imageInfo ( "slice.tif" ); vigra::exportImage ( target , imageInfo .setPixelType("UINT8") .setCompression("100") .setForcedRangeMapping ( 0 , 255 , 0 , 255 ) ) ; std::cout << "result was written to slice.tif" << std::endl ; exit ( 0 ) ; } kfj-vspline-6e66cf7a7926/example/slice2.cc000066400000000000000000000145671320375670700202350ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /// \file slice2.cc /// /// \brief create 2D image data from a 3D spline /// /// build a 3D volume from samples of the RGB colour space /// build a spline over it and extract a 2D slice, using vspline::transform() /// /// while the result is just about the same as the one we get from slice.cc, /// here our volume contains double-precision voxels. Since the evaluator over /// the b-spline representing the volume of voxels produces double precision /// voxels as output (same as the source data type), but our target takes /// float pixels, we have to wrap the evaluator in an outer class which handles /// the conversion from double voxels to float pixels. This is done with a /// class derived from vspline::unary_functor. /// /// compile with: /// clang++ -std=c++11 -march=native -o slice2 -O3 -pthread -DUSE_VC=1 slice2.cc -lvigraimpex -lVc /// g++ also works. 
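/// the pattern in a nutshell - a condensed sketch of the code below, where
/// ev_type, voxel_type and pixel_type are defined in full:
///
///    struct downcast_type
///    : public vspline::unary_functor < voxel_type , pixel_type >
///    {
///      template < class IN , class OUT >
///      void eval ( const IN & c , OUT & result ) const
///      {
///        result = c ; // plain assignment does the conversion
///      }
///    } ;
///
///    ev_type ev ( space ) ;                        // evaluator yielding double voxels
///    downcast_type dc ;                            // converts them to float pixels
///    auto combined = vspline::chain ( ev , dc ) ;  // might also write ev + dc
///    vspline::transform ( combined , warp , target ) ;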
#include #include #include #include // pixel_type is the result type, an RGB float pixel typedef vigra::TinyVector < float , 3 > pixel_type ; // voxel_type is the source data type typedef vigra::TinyVector < double , 3 > voxel_type ; // coordinate_type has a 3D coordinate typedef vigra::TinyVector < float , 3 > coordinate_type ; // warp_type is a 2D array of coordinates typedef vigra::MultiArray < 2 , coordinate_type > warp_type ; // target_type is a 2D array of pixels typedef vigra::MultiArray < 2 , pixel_type > target_type ; // b-spline evaluator producing double precision voxels typedef vspline::evaluator < coordinate_type , voxel_type > ev_type ; // typedef vspline::callable < _ev_type > ev_type ; // to move from the b-splines result type (double precision voxel) to the // target type (float pixel) we wrap the b-spline evaluator with a class // derived from vspline::unary_functor: struct downcast_type : public vspline::unary_functor < voxel_type , pixel_type > { // here we define the method template doing the type conversion template < class IN , class OUT > void eval ( const IN & c , OUT & result ) const { result = c ; // downcasting is simple: just assign } ; } ; int main ( int argc , char * argv[] ) { // we want a b-spline with natural boundary conditions vigra::TinyVector < vspline::bc_code , 3 > bcv ( vspline::NATURAL ) ; // create quintic 3D b-spline object containing voxels vspline::bspline < voxel_type , 3 > space ( vigra::Shape3 ( 10 , 10 , 10 ) , 5 , bcv ) ; // fill the b-spline's core with a three-way gradient for ( int z = 0 ; z < 10 ; z++ ) { for ( int y = 0 ; y < 10 ; y++ ) { for ( int x = 0 ; x < 10 ; x++ ) { voxel_type & c ( space.core [ vigra::Shape3 ( x , y , z ) ] ) ; c[0] = 25.5 * x ; c[1] = 25.5 * y ; c[2] = 25.5 * z ; } } } // prefilter the b-spline space.prefilter() ; // get an evaluator for the b-spline ev_type ev ( space ) ; // wrap it in a downcast_type downcast_type dc ; // put the two functors in sequence. // might also write this as // auto combined = ev + dc ; auto combined = vspline::chain ( ev , dc ) ; // now make a warp array with 1920X1080 3D coordinates warp_type warp ( vigra::Shape2 ( 1920 , 1080 ) ) ; // we want the coordinates to follow this scheme: // warp(x,y) = (x,1-x,y) // scaled appropriately for ( int y = 0 ; y < 1080 ; y++ ) { for ( int x = 0 ; x < 1920 ; x++ ) { coordinate_type & c ( warp [ vigra::Shape2 ( x , y ) ] ) ; c[0] = float ( x ) / 192.0 ; c[1] = 10.0 - c[0] ; c[2] = float ( y ) / 108.0 ; } } // this is where the result should go: target_type target ( vigra::Shape2 ( 1920 , 1080 ) ) ; // now we perform the transform, yielding the result vspline::transform ( combined , warp , target ) ; // store the result with vigra impex vigra::ImageExportInfo imageInfo ( "slice.tif" ); vigra::exportImage ( target , imageInfo .setPixelType("UINT8") .setCompression("100") .setForcedRangeMapping ( 0 , 255 , 0 , 255 ) ) ; std::cout << "result was written to slice.tif" << std::endl ; exit ( 0 ) ; } kfj-vspline-6e66cf7a7926/example/slice3.cc000066400000000000000000000136771320375670700202370ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. 
Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /// \file slice3.cc /// /// \brief create 2D image data from a 3D spline /// /// build a 3D volume from samples of the RGB colour space /// build a spline over it and extract a 2D slice /// /// Here we use a quick shot without elaborate, possibly vectorized, wrapper /// classes. Oftentimes all it takes is a single run of an interpolation with /// as little programming effort as possible, never mind the performance. /// Again we use a b-spline with double-precision voxels as value_type, /// but instead of using vspline::transform, which requires a suitable functor /// yielding pixel_type, we simply use the evaluator directly and implicitly /// cast the result to pixel_type. /// /// compile with: /// clang++ -std=c++11 -o slice3 -O3 -pthread slice3.cc -lvigraimpex /// g++ also works. 
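/// the core of the 'quick shot' in a nutshell - a condensed sketch of the
/// code below, where space, warp and target are defined in full:
///
///    ev_type _ev ( space ) ;               // plain b-spline evaluator
///    auto ev = vspline::callable ( _ev ) ; // wrap it for function call syntax
///    auto coordinate_iter = warp.begin() ;
///    for ( auto & trg : target )
///    {
///      // the result (a double precision voxel) is implicitly cast down
///      // to pixel_type on assignment:
///      trg = ev ( *coordinate_iter ) ;
///      ++coordinate_iter ;
///    }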
#include #include #include #include // pixel_type is the result type, here we use a vigra::RGBValue for a change typedef vigra::RGBValue < unsigned char , 0 , 1 , 2 > pixel_type ; // voxel_type is the source data type typedef vigra::TinyVector < double , 3 > voxel_type ; // coordinate_type has a 3D coordinate typedef vigra::TinyVector < float , 3 > coordinate_type ; // warp_type is a 2D array of coordinates typedef vigra::MultiArray < 2 , coordinate_type > warp_type ; // target_type is a 2D array of pixels typedef vigra::MultiArray < 2 , pixel_type > target_type ; // b-spline evaluator producing double precision voxels typedef vspline::evaluator < coordinate_type , voxel_type > ev_type ; int main ( int argc , char * argv[] ) { // we want a b-spline with natural boundary conditions vigra::TinyVector < vspline::bc_code , 3 > bcv ( vspline::NATURAL ) ; // create quintic 3D b-spline object containing voxels vspline::bspline < voxel_type , 3 > space ( vigra::Shape3 ( 10 , 10 , 10 ) , 5 , bcv ) ; // fill the b-spline's core with a three-way gradient for ( int z = 0 ; z < 10 ; z++ ) { for ( int y = 0 ; y < 10 ; y++ ) { for ( int x = 0 ; x < 10 ; x++ ) { voxel_type & c ( space.core [ vigra::Shape3 ( x , y , z ) ] ) ; c[0] = 25.5 * x ; c[1] = 25.5 * y ; c[2] = 25.5 * z ; } } } // prefilter the b-spline space.prefilter() ; // get a callable evaluator for the b-spline ev_type _ev ( space ) ; auto ev = vspline::callable ( _ev ) ; // now make a warp array with 1920X1080 3D coordinates warp_type warp ( vigra::Shape2 ( 1920 , 1080 ) ) ; // we want the coordinates to follow this scheme: // warp(x,y) = (x,1-x,y) // scaled appropriately for ( int y = 0 ; y < 1080 ; y++ ) { for ( int x = 0 ; x < 1920 ; x++ ) { coordinate_type & c ( warp [ vigra::Shape2 ( x , y ) ] ) ; c[0] = float ( x ) / 192.0 ; c[1] = 10.0 - c[0] ; c[2] = float ( y ) / 108.0 ; } } // this is where the result should go: target_type target ( vigra::Shape2 ( 1920 , 1080 ) ) ; // we are sure warp and target have the same shape. We use iterators // over warp and target and perform the transform 'manually' by iterating over // warp and target synchronously: auto coordinate_iter = warp.begin() ; for ( auto & trg : target ) { // here we implicitly cast the result of the evaluator down to pixel_type: trg = ev ( *coordinate_iter ) ; ++coordinate_iter ; } // store the result with vigra impex vigra::ImageExportInfo imageInfo ( "slice.tif" ); vigra::exportImage ( target , imageInfo .setPixelType("UINT8") .setCompression("100") .setForcedRangeMapping ( 0 , 255 , 0 , 255 ) ) ; std::cout << "result was written to slice.tif" << std::endl ; exit ( 0 ) ; } kfj-vspline-6e66cf7a7926/example/splinus.cc000066400000000000000000000162701320375670700205420ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. 
Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /// \file splinus.cc /// /// \brief compare a periodic b-spline with a sine /// /// This is a simple example using a periodic b-spline /// over just two values: 1 and -1. This spline is used to approximate /// a sine function. You pass the spline's desired degree on the command /// line. Next you enter a number (interpreted as degrees) and the program /// will output the sine and the 'splinus' of the given angle. /// As you can see when playing with higher degrees, the higher the spline's /// degree, the closer the match with the sine. So apart from serving as a /// very simple demonstration of using a 1D periodic b-spline, it teaches us /// that a periodic b-spline can approximate a sine. /// To show off, we use long double as the spline's data type. /// This program also shows a bit of functional programming magic, putting /// together the 'splinus' functor from several of vspline's functional /// bulding blocks. /// /// compile with: clang++ -pthread -O3 -std=c++11 splinus.cc -o splinus #include #include int main ( int argc , char * argv[] ) { assert ( argc > 1 ) ; int degree = std::atoi ( argv[1] ) ; assert ( degree >= 0 && degree < 25 ) ; // create the bspline object typedef vspline::bspline < long double , 1 > spline_type ; spline_type bsp ( 2 , // two values degree , // degree as per command line vspline::PERIODIC , // periodic boundary conditions vspline::BRACED , // implicit scheme, bracing coeffs -1 , // horizon (unused w. BRACED) 0.0 ) ; // no tolerance // the bspline object's 'core' is a MultiArrayView to the knot point // data, which we set one by one for this simple example: bsp.core[0] = 1.0L ; bsp.core[1] = -1.0L ; // now we prefilter the data bsp.prefilter() ; // we build 'splinus' as a functional construct. Inside the brace, // we 'chain' several vspline::unary_functors: // - a 'domain' which scales and shifts input to the spline's range // we use the same values as for the gate: the gate has mapped // the input to [ 90 , 450 ], and this is now translated to // spline coordinates [ 0 , 2 ] by the domain. // - a periodic gate, mapping out-of-range values into the range. // we want a range from 0.0 to 2.0 here. // - a b-spline evaluator calculating our result value // from the result of chaining the two previous operations. 
// so the evaluator receives values in the range of [ 0 , 2 ]. // this 'chain' is wrapped into a vspline::callable object so that // we can conveniently access it by just calling it. // auto splinus // = vspline::grok // ( vspline::domain ( bsp , 90.0L , 450.0L ) // + vspline::periodic ( 0.0L , 2.0L ) // + vspline::evaluator < long double , long double > ( bsp ) ) ; // auto splinus // = vspline::grok // ( vspline::domain ( bsp , 90.0L , 450.0L ) // + vspline::make_safe_evaluator < spline_type , long double > ( bsp ) ) ; auto splinus = vspline::grok ( vspline::domain ( bsp , 90.0L , 450.0L ) + vspline::make_safe_evaluator < spline_type , long double > ( bsp ) ) ; // alternatively we can use this construct. This will work just about // the same, but has a potential flaw: If arithmetic imprecision should // land the output of the periodic gate just slightly below 90.0, the // domain may produce a value just below 0.0, resulting in an out-of-bounds // access. So the construct above is preferable. // Just to demonstrate that vspline::grok aso produces an object that // can be used with function call syntax, we use vspline::grok here instead // of vspline::callable. auto splinus2 = vspline::grok ( vspline::periodic ( 90.0L , 450.0L ) + vspline::domain ( bsp , 90.0L , 450.0L ) + vspline::evaluator < long double , long double > ( bsp ) ) ; // now we let the user enter arbitrary angles, calculate the sine // and the 'splinus' and print the result and difference: while ( true ) { std::cout << " > " ; long double x ; std::cin >> x ; // get an angle long double xs = x * M_PI / 180.0L ; // note: sin() uses radians // finally we can produce both results. Note how we can use periodic_ev, // the combination of gate and evaluator, like an ordinary function. std::cout << "sin(" << x << ") = " << sin ( xs ) << std::endl << "splinus(" << x << ") = " << splinus ( x ) << std::endl << "splinus2(" << x << ") = " << splinus2 ( x ) << std::endl << "difference sin/splinus: " << sin ( xs ) - splinus ( x ) << std::endl << "difference sin/splinus2: " << sin ( xs ) - splinus2 ( x ) << std::endl << "difference splinus/splinus2: " << splinus2 ( x ) - splinus ( x ) << std::endl ; } } kfj-vspline-6e66cf7a7926/example/use_map.cc000066400000000000000000000113071320375670700204720ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015 - 2017 by Kay F. Jahnke */ /* */ /* The git repository for this software is at */ /* */ /* https://bitbucket.org/kfj/vspline */ /* */ /* Please direct questions, bug reports, and contributions to */ /* */ /* kfjahnke+vspline@gmail.com */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. 
*/ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /*! \file use_map.cc \brief test program for code in map.h The program creates one gate type each of the types provided in map.h over the interval [0,1]. Then it queries a value and uses the gates on this value in turn, printing out the result. */ #include #include #include #include // Little tester program using map.h // TODO: expand this to a unit test const int VSIZE = vspline::vector_traits < double > :: rsize ; template < class gate_type > void test ( gate_type gx , double x , const char * mode ) { std::cout << mode << std::endl ; auto tester = vspline::mapper < typename gate_type::in_type > ( gx ) ; typedef double crd_type ; const crd_type crd { x } ; crd_type res ; tester.eval ( crd , res ) ; std::cout << "single-value operation:" << std::endl ; std::cout << crd << " -> " << res << std::endl ; if ( VSIZE > 1 ) { typedef vspline::vector_traits < crd_type , VSIZE > :: ele_v crd_v ; crd_v inv ( crd ) ; crd_v resv ; tester.eval ( inv , resv ) ; std::cout << "vectorized operation:" << std::endl << inv << " -> " << resv << std::endl ; } } int main ( int argc , char * argv[] ) { double x ; std::cout << std::fixed << std::showpos << std::showpoint << std::setprecision(5); while ( true ) { std::cout << "enter coordinate to map to [ 0.0 : 1.0 ]" << std::endl ; std::cin >> x ; try { test ( vspline::reject ( 0.0 , 1.0 ) , x , "REJECT:" ) ; } catch ( vspline::out_of_bounds ) { std::cout << "exception out_of_bounds" << std::endl ; } test ( vspline::clamp ( 0.0 , 1.0 , 0.0 , 1.0 ) , x , "CLAMP:" ) ; test ( vspline::mirror ( 0.0 , 1.0 ) , x , "MIRROR:" ) ; test ( vspline::periodic ( 0.0 , 1.0 ) , x , "PERIODIC:" ) ; } } kfj-vspline-6e66cf7a7926/example/verify.cc000066400000000000000000000162001320375670700203420ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /*! \file verify.cc \brief verify bspline interpolation against polynomial A b-spline is a piecewise polynomial function. Therefore, it should model a polynomial of the same degree precisely. This program tests this assumption. While the test should hold throughout, we have to limit it to 'reasonable' degrees, because we have to create the spline over a range sufficiently large to make margin errors disappear, so even if we want to, say, only look at the spline's values between 0 and 1, we have to have a few ten or even 100 values to the left and right of this interval, because the polynomial does not exhibit convenient features like periodicity or mirroring on the bounds. But since a polynomial, outside [-1,1], grows with the power of it's degree, the values around the test interval become very large very quickly with high degrees. We can't reasonable expect to calculate a meaningful spline over such data. The test shows how the measured fidelity degrades with higher degrees due to this effect. still , with 'reasonable' degrees, we see that the spline fits the signal very well indeed, demonstrating that the spline can faithfully represent a polynomial of equal degree. */ #include #include #include #include #include #include #include template < class dtype > struct random_polynomial { int degree ; std::vector < dtype > coefficient ; // set up a polynomial with random coefficients in [0,1] random_polynomial ( int _degree ) : degree ( _degree ) , coefficient ( _degree + 1 ) { std::random_device rd ; std::mt19937 gen ( rd() ) ; std::uniform_real_distribution<> dis ( -1.0 , 1.0 ) ; for ( auto & e : coefficient ) e = dis ( gen ) ; } // evaluate the polynomial at x dtype operator() ( dtype x ) { dtype power = 1 ; dtype result = 0 ; for ( auto e : coefficient ) { result += e * power ; power *= x ; } return result ; } } ; template < class dtype > void polynominal_test ( int degree , const char * type_name ) { // this is the function we want to model: random_polynomial < long double > rp ( degree ) ; // we evaluate this function in the range [-200,200[ // note that for high degrees, the signal will grow very // large outside [-1,1], 'spoiling' the test vigra::MultiArray < 1 , dtype > data ( vigra::Shape1 ( 400 ) ) ; for ( int i = 0 ; i < 400 ; i++ ) { dtype x = ( i - 200 ) ; data[i] = rp ( x ) ; } // we create a b-spline over these data vspline::bspline < dtype , 1 > bspl ( 400 , degree , vspline::NATURAL , vspline::BRACED , -1 , 0.0 ) ; bspl.prefilter ( data ) ; auto _ev = vspline::make_evaluator < decltype(bspl), dtype > ( bspl ) ; // for convenience of notation auto ev = vspline::callable ( _ev ) ; // to test the spline against the polynomial, we generate random // arguments in [-10,10] std::random_device rd ; std::mt19937 gen ( rd() ) ; std::uniform_real_distribution dis ( -2.0 , 2.0 ) ; long double signal = 0 ; long double spline = 0 ; long double noise = 0 ; long double error ; long double max_error = 0 ; // now we evaluate the spline and the polynomial at equal arguments // and do the statistics for ( int i = 0 ; i < 10000 ; i++ ) { long double x = dis ( gen ) ; long double p = rp ( dtype ( x ) ) ; // note how we have to translate 
x to spline coordinates here long double s = ev ( dtype ( x + 200 ) ) ; error = std::fabs ( p - s ) ; if ( error > max_error ) max_error = error ; signal += std::fabs ( p ) ; spline += std::fabs ( s ) ; noise += error ; } // note: with optimized code, this does not work: if ( std::isnan ( noise ) || std::isnan ( noise ) ) { std::cout << type_name << " aborted due to numeric overflow" << std::endl ; return ; } long double mean_error = noise / 10000.0L ; // finally we echo the results of the test std::cout << type_name << " Mean error: " << mean_error << " Maximum error: " << max_error << " SNR " << int ( 20 * std::log10 ( signal / noise ) ) << "dB" << std::endl ; } int main ( int argc , char * argv[] ) { std::cout << std::fixed << std::showpos << std::showpoint << std::setprecision(18) ; for ( int degree = 1 ; degree < 15 ; degree++ ) { std::cout << "testing spline against polynomial, degree " << degree << std::endl ; polynominal_test < float > ( degree , "using float........" ) ; polynominal_test < double > ( degree , "using double......." ) ; polynominal_test < long double > ( degree , "using long double.." ) ; } } kfj-vspline-6e66cf7a7926/filter.h000066400000000000000000002320231320375670700165350ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015 - 2017 by Kay F. Jahnke */ /* */ /* The git repository for this software is at */ /* */ /* https://bitbucket.org/kfj/vspline */ /* */ /* Please direct questions, bug reports, and contributions to */ /* */ /* kfjahnke+vspline@gmail.com */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /*! \file filter.h \brief generic implementation of an n-pole forward-backward IIR filter for nD arrays This code was initially part of vspline's prefilter.h, but I factored it out and disentangled it from the remainder of the code, since it's more general and not specific to B-splines. The code in this file provides efficient filtering of nD arrays with an n-pole forward-backward recursive filter, accepting a variety of boundary conditions and optionally using multithreading and/or vectorization to speed things up. 
The data have to be presented as vigra MultiArrayViews of elementary floating point types or their 'aggregates' (TinyVectors, pixels, etc.), the code is dimension-agnostic but templated to the array types used, so the dimensionality is not a run time parameter. Note the code organization is bottom-up, so the highest-level code comes last. Most code using filter.h will only call the final routine, filter_nd - and user code, working with vspline::bspline, will not directly call code in this file at all. While the initial purpose for the code in this file was, of course, b-spline prefiltering, the generalized version I present here can be used for arbitrary filters. There is probably one other filter which is most useful in the context of vspline: passing a single positive pole in the range of ] 0 , 1 [ smoothes the signal very efficiently. */ // include common.h for the border condition codes #include #include "common.h" #ifndef VSPLINE_FILTER_H #define VSPLINE_FILTER_H namespace vspline { /// overall_gain is a helper routine: /// Simply executing the filtering code by itself will attenuate the signal. Here /// we calculate the gain which, pre-applied to the signal, will cancel this effect. /// While this code was initially part of the filter's constructor, I took it out /// to gain some flexibility by passing in the gain as a parameter. static long double overall_gain ( const int & nbpoles , const long double * const pole ) { long double lambda = 1.0L ; for ( int k = 0 ; k < nbpoles ; k++ ) lambda = lambda * ( 1.0L - pole[k] ) * ( 1.0L - 1.0L / pole[k] ) ; return lambda ; } /// for each pole passed in, this filter will perform a forward-backward /// first order IIR filter, initially on the data passed in via in_iter, subsequently /// on the result of the application of the previous pole, using these recursions: /// /// forward filter: /// /// x[n]' = x[n] + p * x[n-1] /// /// backward filter: /// /// x[n]'' = p * ( x[n+1]' - x[n]' ) /// /// the result will be deposited via out_iter, which may be an iterator over /// the same data in_iter iterates over, in which case operation is in-place. /// in_iter can be a const iterator, it's never used for writing data. /// /// class filter needs three template arguments, one for the type of iterator over the /// incoming data, one for the type of iterator to the resultant coefficients, and one /// for the real type used in arithmetic operations. The iterators' types will usually /// be the same, but formulating the code with two separate types makes it more /// versatile. The third template argument will usually be the elementary /// type of the iterator's value_type. When the value_types are vigra aggregates /// (TinyVectors etc.) vigra's ExpandElementResult mechanism will provide, but at times /// we may wish to be explicit here, e.g. when iterating over simdized types. template < typename in_iter , // iterator over the knot point values typename out_iter , // iterator over the coefficient array typename real_type > // type for single real value for calculations class filter { // both iterators must define value_type and have the same value_type typedef typename in_iter::value_type value_type ; static_assert ( std::is_same < typename out_iter::value_type , value_type > :: value , "prefilter input and output iterator must have the same value_type" ) ; // // both iterators should be random access iterators. 
// // currently not enforced // typedef typename std::iterator_traits < in_iter > :: iterator_category in_cat ; // static_assert ( std::is_same < in_cat , std::random_access_iterator_tag > :: value , // "prefilter input iterator must be random access iterator" ) ; // // typedef typename std::iterator_traits < out_iter > :: iterator_category out_cat ; // static_assert ( std::is_same < out_cat , std::random_access_iterator_tag > :: value , // "prefilter output iterator must be random access iterator" ) ; /// typedef the fully qualified type for brevity, to make the typedefs below /// a bit more legible typedef filter < in_iter , out_iter , real_type > filter_type ; const long double* pole ; ///< poles of the IIR filter std::vector horizon ; ///< corresponding horizon values const real_type lambda ; ///< (potentiated) overall gain. const int npoles ; ///< Number of filter poles const int M ; ///< length of the data /// the solving routine and initial coefficient finding routines are called via method pointers. /// these pointers are typedefed for better legibility: typedef void ( filter_type::*p_solve ) ( in_iter input , out_iter output ) ; typedef value_type ( filter_type::*p_icc1 ) ( in_iter input , int k ) ; typedef value_type ( filter_type::*p_icc2 ) ( out_iter input , int k ) ; typedef value_type ( filter_type::*p_iacc ) ( out_iter input , int k ) ; // these are the method pointers used: p_solve _p_solve ; ///< pointer to the solve method p_icc1 _p_icc1 ; ///< pointer to calculation of initial causal coefficient with different p_icc2 _p_icc2 ; ///< and equal data types of input and output p_iacc _p_iacc ; ///< pointer to calculation of initial anticausal coefficient public: /// solve() takes two iterators, one to the input data and one to the output space. /// The containers must have the same size. It's safe to use solve() in-place. void solve ( in_iter input , out_iter output ) { (this->*_p_solve) ( input , output ) ; } /// for in-place operation we use the same filter routine. /// I checked: a handcoded in-place routine using only a single /// iterator is not noticeably faster than using one with two separate iterators. void solve ( out_iter data ) { (this->*_p_solve) ( data , data ) ; } // I use adapted versions of P. Thevenaz' code to calculate the initial causal and // anticausal coefficients for the filter. The code is changed just a little to work // with an iterator instead of a C vector. private: /// The code for mirrored BCs is adapted from P. Thevenaz' code, the other routines are my /// own doing, with aid from a digest of spline formulae I received from P. Thevenaz and which /// were helpful to verify the code against a trusted source. /// /// note how, in the routines to find the initial causal coefficient, there are two different /// cases: first the 'accelerated loop', which is used when the theoretically infinite sum of /// terms has reached sufficient precision, and the 'full loop', which implements the mathematically /// precise representation of the limes of the infinite sum towards an infinite number of terms, /// which happens to be calculable due to the fact that the absolute value of all poles is < 1 and /// /// lim n a /// sum a * q ^ k = --- /// n->inf k=0 1-q /// /// first are mirror BCs. This is mirroring 'on bounds', /// f(-x) == f(x) and f(n-1 - x) == f(n-1 + x) /// /// note how mirror BCs are equivalent to requiring the first derivative to be zero in the /// linear algebra approach. 
Obviously with mirrored data this has to be the case; the location /// where mirroring occurs is always an extremum. So this case covers 'FLAT' BCs as well /// /// the initial causal coefficient routines are templated by iterator type, because depending /// on the circumstances, they may be used either on the input or the output iterator. template < class IT > value_type icc_mirror ( IT c , int k ) { value_type z = value_type ( pole[k] ) ; value_type zn, z2n, iz; value_type Sum ; int n ; if (horizon[k] < M) { /* accelerated loop */ zn = z; Sum = c[0]; for (n = 1; n < horizon[k]; n++) { Sum += zn * c[n]; zn *= z; } } else { /* full loop */ zn = z; iz = value_type(1.0) / z; z2n = value_type ( pow(double(pole[k]), double(M - 1)) ); Sum = c[0] + z2n * c[M - 1]; z2n *= z2n * iz; for (n = 1; n <= M - 2; n++) { Sum += (zn + z2n) * c[n]; zn *= z; z2n *= iz; } Sum /= (value_type(1.0) - zn * zn); } // cout << "icc_mirror: " << Sum << endl ; return(Sum); } /// the initial anticausal coefficient routines are always called with the output iterator, /// so they needn't be templated like the icc routines. /// /// I still haven't understood the 'magic' which allows to calculate the initial anticausal /// coefficient from just two results of the causal filter, but I assume it's some exploitation /// of the symmetry of the data. This code is adapted from P. Thevenaz'. value_type iacc_mirror ( out_iter c , int k ) { value_type z = value_type ( pole[k] ) ; return( value_type( z / ( z * z - value_type(1.0) ) ) * ( c [ M - 1 ] + z * c [ M - 2 ] ) ); } /// next are 'antimirrored' BCs. This is the same as 'natural' BCs: the signal is /// extrapolated via point mirroring at the ends, resulting in point-symmetry at the ends, /// which is equivalent to the second derivative being zero, the constraint used in /// the linear algebra approach to calculate 'natural' BCs: /// /// f(x) - f(0) == f(0) - f(-x); f(x+n-1) - f(n-1) == f(n-1) - f (n-1-x) template < class IT > value_type icc_natural ( IT c , int k ) { value_type z = value_type ( pole[k] ) ; value_type zn, z2n, iz; value_type Sum , c02 ; int n ; // f(x) - f(0) == f(0) - f(-x) // f(-x) == 2 * f(0) - f(x) if (horizon[k] < M) { c02 = c[0] + c[0] ; zn = z; Sum = c[0]; for (n = 1; n < horizon[k]; n++) { Sum += zn * ( c02 - c[n] ) ; zn *= z; } return(Sum); } else { zn = z; iz = value_type(1.0) / z; z2n = value_type ( pow(double(pole[k]), double(M - 1)) ); Sum = value_type( ( value_type(1.0) + z ) / ( value_type(1.0) - z ) ) * ( c[0] - z2n * c[M - 1] ); z2n *= z2n * iz; // z2n == z^2M-3 for (n = 1; n <= M - 2; n++) { Sum -= (zn - z2n) * c[n]; zn *= z; z2n *= iz; } return(Sum / (value_type(1.0) - zn * zn)); } } /// I still haven't understood the 'magic' which allows to calculate the initial anticausal /// coefficient from just two results of the causal filter, but I assume it's some exploitation /// of the symmetry of the data. This code is adapted from P. Thevenaz' formula. value_type iacc_natural ( out_iter c , int k ) { value_type z = value_type ( pole[k] ) ; return - value_type( z / ( ( value_type(1.0) - z ) * ( value_type(1.0) - z ) ) ) * ( c [ M - 1 ] - z * c [ M - 2 ] ) ; } /// next are reflective BCs. This is mirroring 'between bounds': /// /// f ( -1 - x ) == f ( x ) and f ( n + x ) == f ( n-1 - x ) /// /// I took Thevenaz' routine for mirrored data as a template and adapted it. 
/// 'reflective' BCs have some nice properties which make them more suited than mirror BCs in /// some situations: /// - the artificial discontinuity is 'pushed out' half a unit spacing /// - the extrapolated data are just as long as the source data /// - they play well with even splines template < class IT > value_type icc_reflect ( IT c , int k ) { value_type z = value_type ( pole[k] ) ; value_type zn, z2n, iz; value_type Sum ; int n ; if (horizon[k] < M) { zn = z; Sum = c[0]; for (n = 0; n < horizon[k]; n++) { Sum += zn * c[n]; zn *= z; } return(Sum); } else { zn = z; iz = value_type(1.0) / z; z2n = value_type ( pow(double(pole[k]), double(2 * M)) ); Sum = 0 ; for (n = 0; n < M - 1 ; n++) { Sum += (zn + z2n) * c[n]; zn *= z; z2n *= iz; } Sum += (zn + z2n) * c[n]; return c[0] + Sum / (value_type(1.0) - zn * zn) ; } } /// I still haven't understood the 'magic' which allows to calculate the initial anticausal /// coefficient from just one result of the causal filter, but I assume it's some exploitation /// of the symmetry of the data. I have to thank P. Thevenaz for his formula which let me code: value_type iacc_reflect ( out_iter c , int k ) { value_type z = value_type ( pole[k] ) ; return c[M - 1] / ( value_type(1.0) - value_type(1.0) / z ) ; } /// next is periodic BCs. so, f(x) = f(x+N) /// /// Implementing this is more straightforward than implementing the various mirrored types. /// The mirrored types are, in fact, also periodic, but with a period twice as large, since they /// repeat only after the first reflection. So especially the code for the full loop is more complex /// for mirrored types. The down side here is the lack of symmetry to exploit, which made me code /// a loop for the initial anticausal coefficient as well. template < class IT > value_type icc_periodic ( IT c , int k ) { value_type z = value_type ( pole[k] ) ; value_type zn ; value_type Sum ; int n ; if (horizon[k] < M) { zn = z ; Sum = c[0] ; for ( n = M - 1 ; n > ( M - horizon[k] ) ; n-- ) { Sum += zn * c[n]; zn *= z; } } else { zn = z; Sum = c[0]; for ( n = M - 1 ; n > 0 ; n-- ) { Sum += zn * c[n]; zn *= z; } Sum /= ( value_type(1.0) - zn ) ; } return Sum ; } // TODO doublecheck this routine! value_type iacc_periodic ( out_iter c , int k ) { value_type z = value_type ( pole[k] ) ; value_type zn ; value_type Sum ; if (horizon[k] < M) { zn = z ; Sum = c[M-1] * z ; for ( int n = 0 ; n < horizon[k] ; n++ ) { zn *= z; Sum += zn * c[n]; } Sum = -Sum ; } else { zn = z; Sum = c[M-1]; for ( int n = 0 ; n < M - 1 ; n++ ) { Sum += zn * c[n]; zn *= z; } Sum = z * Sum / ( zn - value_type(1.0) ); } return Sum ; } /// guess the initial coefficient. This tries to minimize the effect /// of starting out with a hard discontinuity as it occurs with zero-padding, /// while at the same time requiring little arithmetic effort /// /// for the forward filter, we guess an extrapolation of the signal to the left /// repeating c[0] indefinitely, which is cheap to compute: template < class IT > value_type icc_guess ( IT c , int k ) { return c[0] * value_type ( 1.0 / ( 1.0 - pole[k] ) ) ; } // for the backward filter, we assume mirror BC, which is also cheap to compute: value_type iacc_guess ( out_iter c , int k ) { return iacc_mirror ( c , k ) ; } template < class IT > value_type icc_identity ( IT c , int k ) { return c[0] ; } value_type iacc_identity ( out_iter c , int k ) { return c[M-1] ; } /// now we come to the solving, or prefiltering code itself. 
/// there are some variants - a bit of code bloat due to the explicit handling of a few /// distinct cases; since this is core code I have opted to suffer some code duplication /// in exchange for maximum efficiency. /// The code itself is adapted from P. Thevenaz' code. /// /// This variant uses a 'carry' element, 'X', to carry the result of the recursion /// from one iteration to the next instead of using the direct implementation of the /// recursion formula, which would read the previous value of the recursion from memory /// by accessing x[n-1], or, x[n+1], respectively. void solve_gain_inlined ( in_iter c , out_iter x ) { assert ( M > 1 ) ; // use a buffer of one value_type for the recursion (see below) value_type X ; real_type p = real_type ( pole[0] ) ; // process first pole, applying overall gain in the process // of consuming the input. This gain may be a power of the 'orthodox' // lambda from Thevenaz' code. This is done when the input is multidimensional, // in which case it's wasteful to apply lambda in each dimension. In this situation // it makes more sense to apply pow(lambda,dimensions) when solving along the // first axis and apply no gain when solving along the other axes. // Also note that the application of the gain is performed during the processing // of the first (maybe the only) pole of the filter, instead of running a separate // loop over the input to apply it before processing starts. // note how the gain is applied to the initial causal coefficient. This is // equivalent to first applying the gain to the input and then calculating // the initial causal coefficient from the amplified input. // note the seemingly strange = X clause in the asignment. By performing this // assignment, we buffer the result of the current filter step to be used in the // next iteration instead of fetching it again from memory. In my trials, this // performed better, especially on SIMD data. x[0] = X = value_type ( lambda ) * (this->*_p_icc1) (c, 0); /* causal recursion */ // the gain is applied to each input value as it is consumed for (int n = 1; n < M; n++) { x[n] = X = value_type ( lambda ) * c[n] + value_type ( p ) * X ; } // now the input is used up and won't be looked at any more; all subsequent // processing operates on the output. /* anticausal initialization */ x[M - 1] = X = (this->*_p_iacc)(x, 0); /* anticausal recursion */ for (int n = M - 2; 0 <= n; n--) { x[n] = X = value_type ( p ) * ( X - x[n]); } // for the remaining poles, if any, don't apply the gain // and process the result from applying the first pole for (int k = 1; k < npoles; k++) { p = pole[k] ; /* causal initialization */ x[0] = X = (this->*_p_icc2)(x, k); /* causal recursion */ for (int n = 1; n < M; n++) { x[n] = X = x[n] + value_type ( p ) * X ; } /* anticausal initialization */ x[M - 1] = X = (this->*_p_iacc)(x, k); /* anticausal recursion */ for (int n = M - 2; 0 <= n; n--) { x[n] = X = value_type ( p ) * ( X - x[n] ); } } } /// solve routine without application of any gain, it is assumed that this has been /// done already during an initial run with the routine above, or in some other way. 
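  /// A hedged illustration (not part of the original code): stripped of the careful
  /// handling of the initial coefficients above, the core of a single-pole pass over
  /// a plain array x of M values is just
  ///
  ///   value_type X = x [ 0 ] ;                    // simplistic causal init
  ///   for ( int n = 1 ; n < M ; n++ )
  ///     x [ n ] = X = x [ n ] + p * X ;           // causal (forward) sweep
  ///   X = x [ M - 1 ] ;                           // simplistic anticausal init
  ///   for ( int n = M - 2 ; n >= 0 ; n-- )
  ///     x [ n ] = X = p * ( X - x [ n ] ) ;       // anticausal (backward) sweep
  ///
  /// the routine below performs precisely these sweeps, but obtains the initial
  /// causal and anticausal coefficients via the boundary-condition-specific
  /// icc/iacc routines.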
void solve_no_gain ( in_iter c , out_iter x ) { assert ( M > 1 ) ; value_type X ; real_type p = real_type ( pole[0] ) ; // process first pole, consuming the input /* causal initialization */ x[0] = X = (this->*_p_icc1)(c, 0); /* causal recursion */ for ( int n = 1; n < M; n++) { x[n] = X = c[n] + value_type ( p ) * X ; } /* anticausal initialization */ x[M - 1] = X = (this->*_p_iacc)(x, 0); /* anticausal recursion */ for ( int n = M - 2; 0 <= n; n--) { x[n] = X = value_type ( p ) * ( X - x[n]); } // for the remaining poles, if any, work on the result // of processing the first pole for ( int k = 1 ; k < npoles; k++) { p = pole[k] ; /* causal initialization */ x[0] = X = (this->*_p_icc2)(x, k); /* causal recursion */ for (int n = 1; n < M; n++) { x[n] = X = x[n] + value_type ( p ) * X ; } /* anticausal initialization */ x[M - 1] = X = (this->*_p_iacc)(x, k); /* anticausal recursion */ for (int n = M - 2; 0 <= n; n--) { x[n] = X = value_type ( p ) * ( X - x[n] ); } } } /// shortcircuit routine, copies input to output /// /// this routine can also be used for splines of degree 0 and 1, for simplicity's sake void solve_identity ( in_iter c , out_iter x ) { if ( &(*x) == &(*c) ) // if operation is in-place we needn't do anything return ; for ( int n = 0 ; n < M ; n++ ) // otherwise, copy input to output x[n] = c[n] ; } /// The last bit of work left in class filter is the constructor. /// The number of input/output values is passed into the constructur, limiting the /// filter to operate on data precisely of this length. apply_gain isn't immediately /// obvious: it's not a mere flag, but contains the exponent which should be applied /// to the gain. If, for example, a 2D spline is built, one might pass in 2 here for /// the first dimension, and 0 for the second. This way, one set of multiplications is /// saved, at the cost of slightly reduced accuracy for large spline degrees. For high /// spline degrees and higher dimensions, it's advisable to not use this mechanism and /// pass in apply_gain = 1 for all dimensions; the calling code in filter.h decides this /// with a heuristic. /// The number of poles and a pointer to the poles themselves are passed in with the /// parameters _nbpoles and _pole, respectively. /// Finally, the last parameter, tolerance, gives a measure of the acceptable error. 
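  /// A hedged usage sketch (illustrative only, not from the original docs): applying
  /// a one-pole smoothing filter to a std::vector of doubles in-place might look like
  ///
  ///   std::vector < double > v ( 1000 ) ;                   // some signal
  ///   long double p [ ] = { 0.5L } ;                         // single positive pole
  ///   double gain = vspline::overall_gain ( 1 , p ) ;        // cancel attenuation
  ///   typedef std::vector < double > :: iterator it_t ;
  ///   vspline::filter < it_t , it_t , double >
  ///     solver ( v.size() , gain , vspline::MIRROR , 1 , p , 0.000001 ) ;
  ///   solver.solve ( v.begin() ) ;                           // in-place operation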
public: filter ( int _M , ///< number of input/output elements (DataLength) double gain , ///< gain to apply to the signal to cancel attenuation bc_code bc , ///< boundary conditions for this filter int _npoles , ///< number of poles const long double * _pole , ///< pointer to _npoles doubles holding the filter poles double tolerance ) ///< acceptable loss of precision, absolute value : M ( _M ) , npoles ( _npoles ) , pole ( _pole ) , lambda ( gain ) { if ( npoles < 1 ) { // zero poles means there's nothing to do but possibly // copying the input to the output, which solve_identity // will do if the operation isn't in-place _p_solve = & filter_type::solve_identity ; return ; } // calculate the horizon for each pole, this is the number of iterations // the filter must perform on a unit impulse (TODO doublecheck) for it to // decay below 'tolerance' for ( int i = 0 ; i < npoles ; i++ ) { if ( tolerance ) horizon.push_back ( ceil ( log ( tolerance ) / log ( fabs ( pole[i] ) ) ) ) ; else horizon.push_back ( M ) ; } if ( gain == 1.0 ) { // gain == 1.0 has no effect, we can use this solve variant, applying no gain: _p_solve = & filter_type::solve_no_gain ; } else { // if gain isn't 1.0, we use the solve variant which applies it // to the signal as it goes along. _p_solve = & filter_type::solve_gain_inlined ; } // while the forward/backward IIR filter in the solve_... routines is the same for all // boundary conditions, the calculation of the initial causal and anticausal coefficients // depends on the boundary conditions and is handled by a call through a method pointer // in the solve_... routines. Here we fix these method pointers: if ( bc == MIRROR ) { _p_icc1 = & filter_type::icc_mirror ; _p_icc2 = & filter_type::icc_mirror ; _p_iacc = & filter_type::iacc_mirror ; } else if ( bc == NATURAL ) { _p_icc1 = & filter_type::icc_natural ; _p_icc2 = & filter_type::icc_natural ; _p_iacc = & filter_type::iacc_natural ; } else if ( bc == PERIODIC ) { _p_icc1 = & filter_type::icc_periodic ; _p_icc2 = & filter_type::icc_periodic ; _p_iacc = & filter_type::iacc_periodic ; } else if ( bc == REFLECT ) { _p_icc1 = & filter_type::icc_reflect ; _p_icc2 = & filter_type::icc_reflect ; _p_iacc = & filter_type::iacc_reflect ; } else if ( bc == ZEROPAD ) { _p_icc1 = & filter_type::icc_identity ; _p_icc2 = & filter_type::icc_identity ; _p_iacc = & filter_type::iacc_identity ; } else if ( bc == IDENTITY ) { _p_solve = & filter_type::solve_identity ; } else if ( bc == GUESS ) { _p_icc1 = & filter_type::icc_guess ; _p_icc2 = & filter_type::icc_guess ; _p_iacc = & filter_type::iacc_guess ; } else { std::cout << "boundary condition " << bc << " not supported by vspline::filter" << std::endl ; throw not_supported ( "boundary condition not supported by vspline::filter" ) ; } } } ; // end of class filter // Now that we have generic code for 1D filtering, we want to apply this code to // n-dimensional arrays. We use the following strategy: // - perform the prefiltering collinear to each axis separately // - when processing a specific axis, split the array(s) into chunks and use one job per chunk // - perform a traversal on each chunk, copying out subsets collinear to the processing axis // to a buffer // - perform the filter on the buffer // - copy the filtered data to the target // The code is organized bottom-up, with the highest-level routines furthest down, saving // on forward declarations. The section of code immediately following doesn't use vectorization, // the vector code follows. /// 'monadic' gather and scatter. 
gather picks up count source_type which are stride apart, /// starting at source and depositing compactly at target. scatter performs the reverse /// operation. source_type and target_type can be different; on assignment source_type is /// simply cast to target_type. /// /// index_type is passed in as a template argument, allowing for wider types than int, /// so these routines can also operate on very large areas of memory. template < typename source_type , typename target_type = source_type , typename index_type = int > void gather ( const source_type* source , target_type* target , const index_type & stride , index_type count ) { while ( count-- ) { *target = target_type ( *source ) ; source += stride ; ++target ; } } template < typename source_type , typename target_type = source_type , typename index_type = int > void scatter ( const source_type* source , target_type* target , const index_type & stride , index_type count ) { while ( count-- ) { *target = target_type ( *source ) ; ++source ; target += stride ; } } /// nonaggregating_filter subsequently copies all 1D subarrays of source collinear to axis /// into a 1D buffer, performs the filter 'solver' on the buffer, then writes the filtered /// data to the corresponding 1D subarray of target (which may be the same as source). /// While the buffering consumes some time, it saves time on the actual filter calculation, /// especially with higher-order filters. On my system, I found I broke even even with only /// one pole, so there is no special treatment here for low-order filtering (TODO confirm) /// note the use of range_type, which is from multithread.h /// we derive the index type for the call to the monadic gather/scatter routines /// automatically, so here it comes out as vigra's difference_type_1 template < class source_view_type , class target_view_type , class math_type > void nonaggregating_filter ( vspline::range_type < typename source_view_type::difference_type > range , source_view_type * p_original_source , target_view_type * p_original_target , int axis , double gain , bc_code bc , int nbpoles , const long double * pole , double tolerance ) { typedef typename source_view_type::value_type source_type ; typedef typename target_view_type::value_type target_type ; // we're in the single-threaded code now. multithread() has simply forwarded // the source and target MultiArrayViews and a range, here we use the range // to pick out the subarrays of original_source and original_target which we // are meant to process in this thread: const auto source = p_original_source->subarray ( range[0] , range[1] ) ; auto target = p_original_target->subarray ( range[0] , range[1] ) ; auto count = source.shape ( axis ) ; /// we use a buffer of count value_types vigra::MultiArray < 1 , math_type > buffer ( count ) ; // avoiding being specific about the iterator's type allows us to slot in // any old iterator we can get by calling begin() on buffer typedef decltype ( buffer.begin() ) iter_type ; typedef filter < iter_type , iter_type , math_type > filter_type ; filter_type solver ( count , gain , bc , nbpoles , pole , tolerance ) ; // next slice is this far away: auto source_stride = source.stride ( axis ) ; auto source_base_adress = source.data() ; auto buffer_base_adress = buffer.data() ; auto target_base_adress = target.data() ; if ( source.stride() == target.stride() ) { // we already know that both arrays have the same shape. If the strides are also the same, // both arrays have the same structure in memory. 
// If both arrays have the same structure, we can save ourselves the index calculations // for the second array, since the indices would come out the same. target_base_adress // may be the same as source_base_adress, in which case the operation is in-place, but // we can't derive any performance benefit from the fact. // TODO: doublecheck if there really is a performance benefit from the 'shared' // indexes. using the else case below for all situations would simplify the code. // pick the first slice of source along the processing axis auto source_slice = source.bindAt ( axis , 0 ) ; // we permute the slice's strides to ascending order to make the memory access // as efficient as possible. auto permuted_slice = source_slice.permuteStridesAscending() ; // we iterate over the elements in this slice - not to access them, but to // calculate their offset from the first one. This may not be the most efficient // way but it's simple and foolproof and will only be needed once per count values. auto source_sliter = permuted_slice.begin() ; auto source_sliter_end = permuted_slice.end() ; while ( source_sliter < source_sliter_end ) { // copy from the array to the buffer with a monadic gather, casting to // math_type in the process auto source_index = &(*source_sliter) - source_base_adress ; gather < source_type , math_type > ( source_base_adress + source_index , buffer_base_adress , source_stride , count ) ; // finally (puh): apply the prefilter, using the solver in-place, iterating over // the vectors in buffer with maximum efficiency. solver.solve ( buffer.begin() ) ; // and perform a monadic scatter to write the filtered data to the destination, // casting to target_type in the process scatter< math_type , target_type > ( buffer_base_adress , target_base_adress + source_index , source_stride , count ) ; ++source_sliter ; } } else { // pretty much the same as the previouse operation, with the distinction that // copying the filtered data from the buffer to the target now needs it's own // index etc., since all these may be different. // TODO we might permute source_slice's strides to ascending and apply the same // permutation to target_slice. auto source_slice = source.bindAt ( axis , 0 ) ; auto source_sliter = source_slice.begin() ; auto source_sliter_end = source_slice.end() ; auto target_slice = target.bindAt ( axis , 0 ) ; auto target_stride = target.stride ( axis ) ; auto target_sliter = target_slice.begin() ; while ( source_sliter < source_sliter_end ) { auto source_index = &(*source_sliter) - source_base_adress ; auto target_index = &(*target_sliter) - target_base_adress ; gather < source_type , math_type > ( source_base_adress + source_index , buffer_base_adress , source_stride , count ) ; solver.solve ( buffer.begin() ) ; scatter< math_type , target_type > ( buffer_base_adress , target_base_adress + target_index , target_stride , count ) ; ++source_sliter ; ++target_sliter ; } } } // the use of Vc has to be switched on with the flag USE_VC. // before we can code the vectorized analogon of nonaggregating_filter, we need // some more infrastructure code: #ifdef USE_VC /// extended gather and scatter routines taking 'extrusion parameters' /// which handle how many times and with which stride the gather/scatter /// operation is repeated. With these routines, strided memory can be /// copied to a compact chunk of properly aligned memory and back. /// The gather routine gathers from source, which points to strided memory, /// and deposits in target, which is compact. 
/// The scatter routine scatters from source, which points to compact memory, /// and deposits in target, which points to strided memory. /// Initially I coded using load/store operations to access the 'non-compact' /// memory as well, if the indexes were contiguous, but surprisingly, this was /// slower. I like the concise expression with this code - instead of having /// variants for load/store vs. gather/scatter and masked/unmasked operation, /// the modus operandi is determined by the indices and mask passed, which is /// relatively cheap as it occurs only once, while the inner loop can just /// rip away. /// per default, the type used for gather/scatter indices (gs_indexes_type) /// will be what Vc deems appropriate. This comes out as an SIMD type composed /// of int, and ought to result in the fastest code on the machine level. /// But since the only *requirement* on gather/scatter indices is that they /// offer a subscript operator (and hold enough indices), other types can be /// used as gs_indexes_type as well. Below I make the disticzion and pass in /// a TinyVector of ptrdiff_t if int isn't sufficiently large to hold the /// intended indices. On my system, this is actually faster. template < typename source_type , // (singular) source type typename target_type , // (simdized) target type typename index_type , // (singular) index type for stride, count typename gs_indexes_type > // type for gather/scatter indices void gather ( const source_type * source , target_type * target , const gs_indexes_type & indexes , const typename target_type::Mask & mask , const index_type & stride , index_type count ) { // fix the type into which to gather source data enum { vsize = target_type::Size } ; typedef typename Vc::SimdArray < source_type , vsize > simdized_source_type ; // if the mask is all-true, load the data with an unmasked gather operation if ( mask.isFull() ) { while ( count-- ) { // while Vc hadn't yet implemented gathering using intrinsics (for AVX2) // I played with using tem directly to see if I could get better performance. // So far it looks like as if the prefiltering code doesn't benefit. 
// __m256i ix = _mm256_loadu_si256 ( (const __m256i *)&(indexes) ) ; // __m256 fv = _mm256_i32gather_ps (source, ix, 4) ; simdized_source_type x ( source , indexes ) ; * target = target_type ( x ) ; source += stride ; ++ target ; } } else { // if there is a partially filled mask, perform a masked gather operation while ( count-- ) { simdized_source_type x ( source , indexes , mask ) ; * target = target_type ( x ) ; source += stride ; ++ target ; } } } template < typename source_type , // (simdized) source type typename target_type , // (singular) target type typename index_type , // (singular) index type for stride, count typename gs_indexes_type > // type for gather/scatter indices void scatter ( const source_type * source , target_type * target , const gs_indexes_type & indexes , const typename source_type::Mask & mask , const index_type & stride , index_type count ) { // fix the type from which to scatter target data enum { vsize = source_type::Size } ; typedef typename Vc::SimdArray < target_type , vsize > simdized_target_type ; // if the mask is full, deposit with an unmasked scatter if ( mask.isFull() ) { while ( count-- ) { simdized_target_type x ( *source ) ; x.scatter ( target , indexes ) ; ++ source ; target += stride ; } } else { // if there is a partially filled mask, perform a masked scatter operation while ( count-- ) { simdized_target_type x ( *source ) ; x.scatter ( target , indexes , mask ) ; ++ source ; target += stride ; } } } /// aggregating_filter keeps a buffer of vector-aligned memory, which it fills from /// vsize 1D subarrays of the source array which are collinear to the processing axis. /// Note that the vectorization, or aggregation axis is *orthogonal* to the processing /// axis, since the adjacency of neighbours along the processing axis needs to be /// preserved for filtering. /// The buffer is then submitted to vectorized forward-backward recursive filtering /// and finally stored back to the corresponding memory area in target, which may /// be the same as source, in which case the operation is seemingly performed /// in-place (while in fact the buffer is still used). Buffering takes the bulk /// of the processing time (on my system), the vectorized maths are fast by /// comparison. Depending on data type, array size and spline degree, sometimes the /// nonvectorized code is faster. But as both grow, bufering soon comes out on top. /// ele_aggregating_filter is a subroutine processing arrays of elementary value_type. /// It's used by aggregating_filter, after element-expanding the array(s). /// With this vectorized routine and the size of gather/scatter indices used by Vc /// numeric overflow could occur: the index type is only int, while it's assigned a /// ptrdiff_t, which it may not be able to represent. The overflow can happen when /// a gather/scatter spans a too-large memory area. The gather/scatter indices will /// be set up so that the first index is always 0 (by using the adress of the first /// storee, not the array base adress), but even though this makes it less likely for /// the overflow to occur, it still can happen. In this case the code falls back /// to using a vigra::TinyVector < ptrdiff_t > as gather/scatter index type, which /// may cause Vc to use less performant code for the gather/scatter operations but /// is safe. // TODO: using different vsize for different axes might be faster. 
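// A hedged illustration of the aggregation scheme (not part of the original code):
// with vsize == 4, four 1D lines a, b, c, d - all collinear to the processing axis,
// and adjacent along some other axis - are gathered into the buffer so that
//
//   buffer [ n ]  ==  { a [ n ] , b [ n ] , c [ n ] , d [ n ] }
//
// one forward-backward sweep over the buffer then filters all four lines at once,
// and the subsequent scatter writes the filtered values back to their strided
// locations in the target array.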
template < typename source_view_type , typename target_view_type , typename math_type > void ele_aggregating_filter ( source_view_type &source , target_view_type &target , int axis , double gain , bc_code bc , int nbpoles , const long double * pole , double tolerance ) { // for prefiltering, using Vc::Vectors seems faster than using SimdArrays of twice the size, // which are used as simdized type in evaluation const int vsize = vspline::vector_traits < math_type > :: rsize ; typedef typename vspline::vector_traits < math_type , vsize > :: type simdized_math_type ; typedef typename source_view_type::value_type source_type ; typedef typename vspline::vector_traits < source_type > :: type simdized_source_type ; typedef typename target_view_type::value_type target_type ; typedef typename vspline::vector_traits < target_type > :: type simdized_target_type ; // indexes for gather/scatter. first the 'optimal' type, which Vc produces as // the IndexType for simdized_math_type. Next a wider type composed of std::ptrdiff_t, // to be used initially when calculating the indices, and optionally later for the // actual gather/scatter operations if gs_indexes_type isn't wide enough. typedef typename simdized_math_type::IndexType gs_indexes_type ; typedef vigra::TinyVector < std::ptrdiff_t , vsize > comb_type ; // mask type for masked operation typedef typename simdized_math_type::MaskType mask_type ; auto count = source.shape ( axis ) ; // number of vectors we'll process // I initially tried to use Vc::Memory, but the semantics of the iterator obtained // by buffer.begin() didn't work for me. // anyway, a MultiArray with the proper allocator works just fine, and the dereferencing // of the iterator needed in the solver works without further ado. vigra::MultiArray < 1 , simdized_math_type , Vc::Allocator > buffer ( count ) ; // avoiding being specific about the iterator's type allows us to slot in // any old iterator we can get by calling begin() on buffer typedef decltype ( buffer.begin() ) viter_type ; // set of offsets into the source slice which will be used for gather/scatter comb_type source_indexes ; // while we don't hit the last odd few 1D subarrays the mask is all-true mask_type mask ( true ) ; // next slice is this far away: auto source_stride = source.stride ( axis ) ; // we want to use the extended gather/scatter (with 'extrusion'), so we need the // source and target pointers. Casting buffer's data pointer to math_type is safe, // Since the simdized_type objects stored there are merely raw math_type data // in disguise. auto source_base_adress = source.data() ; auto buffer_base_adress = buffer.data() ; auto target_base_adress = target.data() ; gs_indexes_type source_gs_indexes ; gs_indexes_type target_gs_indexes ; // we create a solver object capable of handling the iterator producing the successive // simdized_types from the buffer. While the unvectorized code can omit passing the third // template argument (the elementary type used inside the solver) we pass it here, as we // don't define an element-expansion via vigra::ExpandElementResult for simdized_type. typedef filter < viter_type , viter_type , math_type > filter_type ; filter_type solver ( count , gain , bc , nbpoles , pole , tolerance ) ; if ( source.stride() == target.stride() ) { // we already know that both arrays have the same shape. If the strides are also the same, // both arrays have the same structure in memory. 
// If both arrays have the same structure, we can save ourselves the index calculations // for the second array, since the indexes would come out the same. target_base_adress // may be the same as source_base_adress, in which case the operation is in-place, but // we can't derive any performance benefit from the fact. // pick the first slice of source along the processing axis auto source_slice = source.bindAt ( axis , 0 ) ; // we permute the slice's strides to ascending order to make the memory access // as efficient as possible. auto permuted_slice = source_slice.permuteStridesAscending() ; // we iterate over the elements in this slice - not to access them, but to // calculate their offset from the first one. This may not be the most efficient // way but it's simple and foolproof and will only be needed once per count values. auto source_sliter = permuted_slice.begin() ; auto source_sliter_end = permuted_slice.end() ; while ( source_sliter < source_sliter_end ) { // try loading vsize successive offsets into an comb_type int e ; // we base the operation so that the first entry in source_indexes // will come out 0. auto first_source_adress = &(*source_sliter) ; auto offset = first_source_adress - source_base_adress ; auto first_target_adress = target_base_adress + offset ; for ( e = 0 ; e < vsize && source_sliter < source_sliter_end ; ++e , ++source_sliter ) source_indexes[e] = &(*source_sliter) - first_source_adress ; if ( e < vsize ) // have got less than vsize? must be the last few items. // mask was all-true before, so now we limit it to the first e fields: mask = ( simdized_math_type::IndexesFromZero() < e ) ; // next we assign the indices (which are ptrdiff_t) to the intended type // for gather/scatter indices - which is what Vc deems appropriate. This should // be the optimal choice in terms of performance. Yet we can't be certain that // the ptrdiff_t values actually fit into this type, which is usually composed of // int only. So we test if the assigned value compares equal to the assignee. // If the test fails for any of the indices, we switch to code using a // vigra::TinyVector < ptrdiff_t > for the indices, which is permissible, since // TinyVector offers operator[], but may be less efficient. // Note: Vc hasn't implemented the gather with intrinsics for AVX2, that's why // using gs_indexes_type can't yet have a speedup effect. // Note: since the gathers are often from widely spaced locations, there is // not too much benefit to be expected. bool fits = true ; for ( e = 0 ; fits && ( e < vsize ) ; e++ ) { source_gs_indexes[e] = source_indexes[e] ; if ( source_gs_indexes[e] != source_indexes[e] ) fits = false ; } if ( fits ) { // perform extended gather with extrusion parameters to transport the unfiltered data // to the buffer, passing in source_gs_indexes for best performance. gather ( first_source_adress , buffer_base_adress , source_gs_indexes , mask , source_stride , count ) ; // finally (puh): apply the prefilter, using the solver in-place, iterating over // the vectors in buffer with maximum efficiency. solver.solve ( buffer.begin() ) ; // and perform extended scatter with extrusion parameters to write the filtered data // to the destination scatter ( buffer_base_adress , first_target_adress , source_gs_indexes , mask , source_stride , count ) ; } else { // Since the indices did not fit into the optimal type for gather/scatter // indices, we pass in a wider type, which may reduce performance, but is // necessary under the circumstances. 
But this should rarely happen: // it would mean a gather/scatter spanning several GB. gather ( first_source_adress , buffer_base_adress , source_indexes , mask , source_stride , count ) ; solver.solve ( buffer.begin() ) ; scatter ( buffer_base_adress , first_target_adress , source_indexes , mask , source_stride , count ) ; } } } else { // pretty much the same as the if(...) case, with the distinction that copying // the filtered data from the buffer to the target now needs it's own set of // indexes etc., since all these may be different. // TODO we might permute source_slice's strides to ascending and apply the same // permutation to target_slice. auto source_slice = source.bindAt ( axis , 0 ) ; auto source_sliter = source_slice.begin() ; auto source_sliter_end = source_slice.end() ; auto target_slice = target.bindAt ( axis , 0 ) ; auto target_stride = target.stride ( axis ) ; auto target_sliter = target_slice.begin() ; comb_type target_indexes ; while ( source_sliter < source_sliter_end ) { int e ; auto first_source_adress = &(*source_sliter) ; auto first_target_adress = &(*target_sliter) ; for ( e = 0 ; e < vsize && source_sliter < source_sliter_end ; ++e , ++source_sliter , ++target_sliter ) { source_indexes[e] = &(*source_sliter) - first_source_adress ; target_indexes[e] = &(*target_sliter) - first_target_adress ; } if ( e < vsize ) mask = ( simdized_math_type::IndexesFromZero() < e ) ; // similar code here for the idexes, see notes above. bool fits = true ; for ( e = 0 ; fits && ( e < vsize ) ; e++ ) { source_gs_indexes[e] = source_indexes[e] ; target_gs_indexes[e] = target_indexes[e] ; if ( source_gs_indexes[e] != source_indexes[e] || target_gs_indexes[e] != target_indexes[e] ) fits = false ; } if ( fits ) { gather ( first_source_adress , buffer_base_adress , source_gs_indexes , mask , source_stride , count ) ; solver.solve ( buffer.begin() ) ; scatter ( buffer_base_adress , first_target_adress , target_gs_indexes , mask , target_stride , count ) ; } else { gather ( first_source_adress , buffer_base_adress , source_indexes , mask , source_stride , count ) ; solver.solve ( buffer.begin() ) ; scatter ( buffer_base_adress , first_target_adress , target_indexes , mask , target_stride , count ) ; } } } } /// here we provide a common routine 'aggregating_filter', which works for elementary /// value_types and also for aggregate value_types. Processing is different for these /// two cases, because the vector code can only process elementary types, and if /// value_type isn't elementary, we need to element-expand the source and target /// arrays. Since this routine is the functor passed to multithread() and therefore /// receives a range parameter to pick out a subset of the data to process in the /// single thread, we also take the opportunity here to pick out the subarrays /// for further processing. 
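/// A hedged example of the element-expansion (illustrative only): for an array of
/// 3-channel pixels,
///
///   vigra::MultiArrayView < 2 , vigra::TinyVector < float , 3 > > rgb = ... ;
///   auto expanded = rgb.expandElements ( 0 ) ;
///
/// 'expanded' is a 3D view of plain floats with the channel axis as axis 0, which
/// is why the code below passes 'axis + 1' as the processing axis for the
/// element-expanded arrays.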
template < class source_type , class target_type , typename math_type > void aggregating_filter ( range_type < typename source_type::difference_type > range , source_type * p_original_source , target_type * p_original_target , int axis , double gain , bc_code bc , int nbpoles , const long double * pole , double tolerance ) { const int dim = source_type::actual_dimension ; typedef typename source_type::value_type value_type ; static_assert ( std::is_same < value_type , typename target_type::value_type > :: value , "aggregating_filter: both arrays must have the same value_type" ) ; typedef typename vigra::ExpandElementResult < value_type > :: type ele_type ; // continue processing on the subarrays of source and target specified by 'range': auto source = p_original_source->subarray ( range[0] , range[1] ) ; auto target = p_original_target->subarray ( range[0] , range[1] ) ; // value_type may be an aggregate type, but we want to operate on elementary types // so we element-expand the array and call ele_aggregating_filter, which works on // arrays with elementary types. If value_type is elementary already, the call to // expandElements inserts a singleton dimension, but this has next to no performance // impact, so contrary to my initial implementation I don't handle the 1-channel // case separately any more. auto expanded_source = source.expandElements ( 0 ) ; auto expanded_target = target.expandElements ( 0 ) ; // with the element-expanded arrays at hand, we can now delegate to ele_aggregating_filter: ele_aggregating_filter < decltype ( expanded_source ) , decltype ( expanded_target ) , math_type > ( expanded_source , expanded_target , axis + 1 , gain , bc , nbpoles , pole , tolerance ) ; } #else // just need the 'hull' for the compiler, will never be called template < class source_type , class target_type , typename math_type > void aggregating_filter ( ) { assert ( false ) ; } ; #endif /// Now we have the routines which perform the buffering and filtering for a chunk of data, /// We add code for multithreading. This is done by using utility code from multithread.h. /// /// Note the template parameter 'is_1d': we use specialized code for 1D arrays, see below. /// Since there *is* a specialization for is_1d == true_type, this variant will only be /// called if is_1d == false_type template < int dim , typename is_1d , typename input_array_type , ///< type of array with knot point data typename output_array_type , ///< type of array for coefficients (may be the same) typename math_type , ///< real data type used for calculations inside the filter int rsize = vspline::vector_traits < math_type > :: rsize > class filter_1d { public: void operator() ( input_array_type &input , ///< source data. can also operate in-place, output_array_type &output , ///< where input == output. int axis , double gain , bc_code bc , ///< boundary treatment for this solver int nbpoles , const long double * pole , double tolerance , int njobs = default_njobs ) ///< number of jobs to use when multithreading { typedef typename input_array_type::value_type value_type ; // depending on whether Vc is used or not, we choose the appropriate (single-threaded) // filtering routine, which is to be passed to multitheread() typedef typename vigra::ExpandElementResult < value_type > :: type ele_type ; auto pf = & aggregating_filter < input_array_type , output_array_type , ele_type > ; // obtain a partitioning of the data array into subranges. 
We do this 'manually' here // because we must instruct shape_splitter not to chop up the current processing axis // (by passing axis as the 'forbid' parameter) auto partitioning = shape_splitter::part ( input.shape() , njobs , axis ) ; // now use multithread() to distribute ranges of data to individual jobs which are // executed by the it's thread pool. multithread ( pf , partitioning , &input , &output , axis , gain , bc , nbpoles , pole , tolerance ) ; } } ; /// specialization for rsize == 1, no vectorization is used even if Vc is available. /// we need to specify is_1d == false_type - the specialization below for is_1d == true_type, /// is meant to catch all 1D cases. template < int dim , typename input_array_type , ///< type of array with knot point data typename output_array_type , ///< type of array for coefficients (may be the same) typename math_type ///< real data type used for calculations inside the filter > class filter_1d < dim , std::false_type , input_array_type , output_array_type , math_type , 1 > { public: void operator() ( input_array_type &input , ///< source data. can also operate in-place, output_array_type &output , ///< where input == output. int axis , double gain , bc_code bc , ///< boundary treatment for this solver int nbpoles , const long double * pole , double tolerance , int njobs = default_njobs ) ///< number of jobs to use when multithreading { typedef typename input_array_type::value_type value_type ; // depending on whether Vc is used or not, we choose the appropriate (single-threaded) // filtering routine, which is to be passed to multitheread() auto pf = & nonaggregating_filter < input_array_type , output_array_type , value_type > ; // obtain a partitioning of the data array into subranges. We do this 'manually' here // because we must instruct shape_splitter not to chop up the current processing axis // (by passing axis as the 'forbid' parameter) auto partitioning = shape_splitter::part ( input.shape() , njobs , axis ) ; // now use multithread() to distribute ranges of data to individual jobs which are // executed by the thread pool. multithread ( pf , partitioning , &input , &output , axis , gain , bc , nbpoles , pole , tolerance ) ; } } ; /// now here's the specialization for *1D arrays*. It may come as a surprise that it looks /// nothing like the nD routine. This is due to the fact that we follow a specific strategy: /// We 'fold up' the 1D array into a 'fake 2D' array, process this 2D array with the nD code /// which is very efficient, and 'mend' the stripes along the margins of the fake 2D array /// which contain wrong results due to the fact that some boundary condition appropriate /// for the 2D case was applied. /// With this 'cheat' we can handle 1D arrays with full multithreading and vectorization, /// while the 'orthodox' approach would have to process the data in linear order with /// a single thread. Cleaning up the 'dirty' margins is cheap for large arrays. /// The code is making guesses as to whether it's worth while to follow this strategy; /// the array has to be 'quite large' before 'fake 2D processing' is actually applied. 
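/// A hedged numeric illustration (not from the original docs): with, say, 1000000
/// input samples, a horizon (and hence runup) of 32 at the given tolerance, and 8
/// lanes, the signal is treated as a fake 2D array of 8 lines of 125000 samples
/// each. Filtering that 2D array yields wrong values only in stripes of +/- runup
/// samples around the points where one lane wraps into the next; these stripes,
/// and the very beginning and end of the signal, are afterwards overwritten with
/// correctly filtered data taken from small auxiliary buffers.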
template < typename input_array_type , ///< type of array with knot point data typename output_array_type , ///< type of array for coefficients (may be the same) typename math_type , ///< type for calculations inside filter int rsize > class filter_1d < 1 , // specialize for 1D std::true_type , // specialize for is_1d == true_type input_array_type , output_array_type , math_type , rsize > { public: void operator() ( input_array_type &input , ///< source data. can operate in-place output_array_type &output , ///< where input == output. int axis , double gain , bc_code bc , ///< boundary treatment for this solver int nbpoles , const long double * pole , double tolerance , int njobs = default_njobs ) ///< number of jobs to use { typedef typename input_array_type::value_type value_type ; typedef decltype ( input.begin() ) input_iter_type ; typedef decltype ( output.begin() ) output_iter_type ; typedef vspline::filter < input_iter_type , output_iter_type , double > filter_type ; typedef typename vigra::ExpandElementResult < value_type > :: type ele_type ; if ( nbpoles <= 0 ) { // nbpoles == 0 means we're prefiltering for a degree 0 or 1 spline. // so we don't need to filter anything, but if we're not operating // in-place, we want the input copied to the output. // we use the simple single-threaded implementation here for now. // this should be memory-bound, so multithreading it might be futile // TODO: might use multithreaded code to copy input to otput, test auto it1 = input.begin() ; auto it2 = output.begin() ; void * pi = &(*(it1)) ; void * po = &(*(it2)) ; // if operation isn't in-place if ( pi != po ) { // copy input to output auto ie = input.end() ; while ( it1 != ie ) { *it2 = *it1 ; ++it1 ; ++it2 ; } } return ; // return prematurely, saving us an else clause } const int bands = vigra::ExpandElementResult < value_type > :: size ; int runup ; // if we can multithread, start out with as many lanes as the desired number of threads int lanes = njobs ; #ifdef USE_VC const int vsize = vector_traits < ele_type > :: size ; // const int vsize = Vc::Vector < ele_type > :: Size ; // if we can use vector code, the number of lanes is multiplied by the // number of elements a simdized type inside the vector code can handle lanes *= vsize ; #endif // we give the filter some space to run up to precision if ( tolerance <= 0.0 ) { // we can't use the fake_2d method if the tolerance is 0.0 lanes = 1 ; } else { // there are minimum requirements for using the fake 2D filter. First find // the horizon at the given tolerance int horizon = ceil ( log ( tolerance ) / log ( fabs ( pole[0] ) ) ) ; // this is just as much as we want for the filter to run up to precision // starting with BC code 'ZEROPAD' at the margins runup = horizon ; // the absolute minimum to successfully run the fake 2D filter is this: // TODO we might rise the threshold, min_length, here int min_length = 4 * runup * lanes + 2 * runup ; // input is too short to bother with fake 2D, just single-lane it if ( input.shape(0) < min_length ) { lanes = 1 ; } else { // input is larger than the absolute minimum, maybe we can even increase // the number of lanes some more? we'd like to do this if the input is // very large, since we use buffering and don't want the buffers to become // overly large. But the smaller the run along the split x axis, the more // incorrect margin values we have to mend, so we need a compromise. // assume a 'good' length for input: some length where further splitting // would not be wanted anymore. 
TODO: do some testing, find a good value int good_length = 64 * runup * lanes + 2 * runup ; int split = 1 ; // suppose we split input.shape(0) in ( 2 * split ) parts, is it still larger // than this 'good' length? If not, leave split factor as it is. while ( input.shape(0) / ( 2 * split ) >= good_length ) { // if yes, double split factor, try again split *= 2 ; } lanes *= split ; // increase number of lanes by additional split } } // if there's only one lane we just use this simple code: if ( lanes == 1 ) { // this is a simple single-threaded implementation filter_type solver ( input.shape(0) , gain , bc , nbpoles , pole , 0.0 ) ; solver.solve ( input.begin() , output.begin() ) ; return ; // return prematurely, saving us an else clause } // the input qualifies for fake 2D processing. // std::cout << "fake 2D processing with " << lanes << " lanes" << std::endl ; // we want as many chunks as we have lanes. There may be some data left // beyond the chunks (tail_size of value_type) int core_size = input.shape(0) ; int chunk_size = core_size / lanes ; core_size = lanes * chunk_size ; int tail_size = input.shape(0) - core_size ; // just doublecheck assert ( core_size + tail_size == input.shape(0) ) ; // now here's the strategy: we treat the data as if they were 2D. This will // introduce errors along the 'vertical' margins, since there the 2D treatment // will start with some boundary condition along the x axis instead of looking // at the neighbouring line where the actual continuation is. // first we deal with the very beginning and end of the signal. This requires // special treatment, because here we want the boundary conditions to take // effect. So we copy the beginning and end of the signal to a buffer, being // generous with how many data we pick. The resulting buffer will have an // unusable part in the middle, where tail follows head, but since we've made // sure that this location is surrounded by enough 'runup' data, the effect // will only be detectable at +/- runup from the point where tail follows head. // The beginning of head and the end of tail are at the beginning and end // of the buffer, though, so that applying the boundary condition will // have the desired effect. What we'll actually use of the buffer is not // the central bit with the effects of the clash of head and tail, but // only the bits at the ends which aren't affected because they are far enough // away. // note how this code fixes a bug in my initial implementation, which produced // erroneous results with periodic splines, because the boundary condition // was not properly honoured. // calculate the sizes of the parts of the signal we'll put into the buffer int front = 2 * runup ; int back = tail_size + 2 * runup ; int total = front + back ; // create the buffer and copy the beginning and end of the signal into it vigra::MultiArray < 1 , value_type > head_and_tail ( total ) ; auto target_it = head_and_tail.begin() ; auto source_it = input.begin() ; for ( int i = 0 ; i < front ; i++ ) { *target_it = *source_it ; ++target_it ; ++source_it ; } source_it = input.end() - back ; for ( int i = 0 ; i < back ; i++ ) { *target_it = *source_it ; ++target_it ; ++source_it ; } // set up the filter for this buffer and apply it filter_type head_and_tail_solver ( head_and_tail.size() , gain , bc , nbpoles , pole , 0.0 ) ; head_and_tail_solver.solve ( head_and_tail.begin() ) ; // set up two MultiArrayViews corresponding to the portions of the data // we copied into the buffer. 
The first bit of 'head' and the last bit // of 'tail' hold valid data and will be used further down. vigra::MultiArrayView < 1 , value_type > head ( vigra::Shape1 ( front ) , head_and_tail.data() ) ; vigra::MultiArrayView < 1 , value_type > tail ( vigra::Shape1 ( back ) , head_and_tail.data() + front ) ; // end of bug fix for periodic splines // head now has runup correct values at the beginning, succeeded by runup invalid // values, and tail has tail_size + runup correct values at the end, preceded by // runup values which aren't usable. // now we create a fake 2D view to the margin of the data. Note how we let the // view begin 2 * runup before the end of the first line, capturing the 'wraparound' // right in the middle of the view typedef vigra::MultiArrayView < 2 , value_type > fake_2d_type ; fake_2d_type fake_2d_margin ( vigra::Shape2 ( 4 * runup , lanes - 1 ) , vigra::Shape2 ( input.stride(0) , input.stride(0) * chunk_size ) , input.data() + chunk_size - 2 * runup ) ; // again we create a buffer and filter into the buffer vigra::MultiArray < 2 , value_type > margin_buffer ( fake_2d_margin.shape() ) ; filter_1d < 2 , std::false_type , fake_2d_type , fake_2d_type , math_type > () ( fake_2d_margin , margin_buffer , 0 , gain , GUESS , nbpoles , pole , tolerance , 1 ) ; // now we have filtered data for the margins in margin_buffer, of which the central half // is usable, the remainder being runup data which we'll ignore. Here's a view to the // central half: vigra::MultiArrayView < 2 , value_type > margin = margin_buffer.subarray ( vigra::Shape2 ( runup , 0 ) , vigra::Shape2 ( 3 * runup , lanes - 1 ) ) ; // we already create a view to the target array's margin which we intend to overwrite, // but the data will only be copied in from margin after the treatment of the core. vigra::MultiArrayView < 2 , value_type > margin_target ( vigra::Shape2 ( 2 * runup , lanes - 1 ) , vigra::Shape2 ( output.stride(0) , output.stride(0) * chunk_size ) , output.data() + chunk_size - runup ) ; // next we fake a 2D array from input and filter it to output, this may be an // in-place operation, since we've extracted all margin information earlier and // deposited what we need in buffers fake_2d_type fake_2d_source ( vigra::Shape2 ( chunk_size , lanes ) , vigra::Shape2 ( input.stride(0) , input.stride(0) * chunk_size ) , input.data() ) ; fake_2d_type fake_2d_target ( vigra::Shape2 ( chunk_size , lanes ) , vigra::Shape2 ( output.stride(0) , output.stride(0) * chunk_size ) , output.data() ) ; // now we filter the fake 2D source to the fake 2D target filter_1d < 2 , std::false_type , fake_2d_type , fake_2d_type , math_type > () ( fake_2d_source , fake_2d_target , 0 , gain , GUESS , nbpoles , pole , tolerance , njobs ) ; // we now have filtered data in target, but the stripes along the magin // in x-direction (1 runup wide) are wrong, because we applied GUESS BC. // this is why we have the data in 'margin', and we now copy them to the // relevant section of 'target' margin_target = margin ; // finally we have to fix the first and last few values, which weren't touched // by the margin operation (due to margin's offset and length) typedef vigra::Shape1 dt ; output.subarray ( dt(0) , dt(runup) ) = head.subarray ( dt(0) , dt(runup) ) ; output.subarray ( dt(output.size() - tail_size - runup ) , dt(output.size()) ) = tail.subarray ( dt(tail.size() - tail_size - runup ) , dt(tail.size()) ) ; } } ; /// This routine calls the 1D filtering routine for all axes in turn. 
This is the /// highest-level routine in filter.h, and the only routine used by other code in /// vspline. It has no code specific to b-splines, any set of poles will be processed. /// To use this routine for b-splines, the correct poles have to be passed in, which /// is done in prefilter.h, where the code for prefiltering the knot point data /// calls filter_nd with the poles needed for a b-spline. /// /// This routine takes the following parameters: /// /// - input, output: MultiArrayViews of the source and target array /// - bc: TinyVector of boundary condition codes, allowing separate values for each axis /// - nbpoles: number of filter poles /// - pole: pointer to nbpoles long doubles containing the filter poles /// - tolerance: acceptable error /// - njobs: number of jobs to use when multithreading // TODO look into treatment of singleton dimensions template < typename input_array_type , // type of array with knot point data typename output_array_type , // type of array for coefficients (may be the same) typename math_type > // type used for arithmetic operations in filter void filter_nd ( input_array_type & input , output_array_type & output , vigra::TinyVector bc , int nbpoles , const long double * pole , double tolerance , int njobs = default_njobs ) { // check if operation is in-place. I assume that the test performed here // is sufficient to determine if the operation is in-place. bool in_place = false ; if ( (void*)(input.data()) == (void*)(output.data()) ) in_place = true ; // if input == output, with degree <= 1 we needn't do anything at all. if ( in_place && nbpoles < 1 ) return ; // do a bit of compatibility checking const int dim = input_array_type::actual_dimension ; if ( output_array_type::actual_dimension != dim ) { throw dimension_mismatch ( "input and output array must have the same dimension" ) ; } typedef typename input_array_type::difference_type diff_t ; diff_t shape = input.shape() ; if ( output.shape() != shape ) { throw shape_mismatch ( "input and output array must have the same shape" ) ; } // normally the gain is the same for all dimensions. double gain_d0 = overall_gain ( nbpoles , pole ) ; double gain_dn = gain_d0 ; // deactivating the code below may produce slightly more precise results // This bit of code results in applictation of the cumulated gain for all dimensions // while processing axis 0, and no gain application for subsequent axes. // heuristic. for high degrees, below optimization reduces precision too much // TODO: the effect of this optimization seems negligible. if ( dim > 1 && pow ( nbpoles , dim ) < 32 ) { gain_d0 = pow ( gain_d0 , dim ) ; gain_dn = 1.0 ; } // even if degree <= 1, we'll only arrive here if input != output. // So we still have to copy the input data to the output (solve_identity) typedef std::integral_constant < bool , dim == 1 > is_1d ; filter_1d < dim , is_1d , input_array_type , output_array_type , math_type > () ( input , output , 0 , gain_d0 , bc[0] , nbpoles , pole , tolerance , njobs ) ; // but if degree <= 1 we're done already, since copying the data again // in dimensions 1... is futile if ( nbpoles > 0 ) { // so for the remaining dimensions we also call the filter. 
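    // note: the remaining axes are filtered in-place on 'output', receiving
    // gain_dn, which is 1.0 if the cumulated gain for all dimensions was
    // already applied along axis 0 (see the heuristic above), and the
    // ordinary per-axis gain otherwise.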
for ( int d = 1 ; d < dim ; d++ ) filter_1d < dim , is_1d , output_array_type , output_array_type , math_type , vspline::vector_traits < math_type > :: rsize > () ( output , output , d , gain_dn , bc[d] , nbpoles , pole , tolerance , njobs ) ; } } } ; // namespace vspline #endif // VSPLINE_FILTER_H kfj-vspline-6e66cf7a7926/map.h000066400000000000000000000503721320375670700160320ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015 - 2017 by Kay F. Jahnke */ /* */ /* The git repository for this software is at */ /* */ /* https://bitbucket.org/kfj/vspline */ /* */ /* Please direct questions, bug reports, and contributions to */ /* */ /* kfjahnke+vspline@gmail.com */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /*! \file map.h \brief code to handle out-of-bounds coordinates. Incoming coordinates may not be inside the range which can be evaluated by a functor. There is no one correct way of dealing with out-of-bounds coordinates, so we provide a few common ways of doing it. If the 'standard' gate types don't suffice, the classes provided here can serve as templates. The basic type handling the operation is a 'gate type', which 'treats' a single value or single simdized value. For nD coordinates, we use a set of these gate_type objects, one for each component; each one may be of a distinct type specific to the axis the component belongs to. Application of the gates is via a 'mapper' object, which contains the gate_types and applies them to the components in turn. The final mapper object is a functor which converts an arbitrary incoming coordinate into a 'treated' coordinate (or, for REJECT mode, may throw an out_of_bounds exception). mapper objects are derived from vspline::unary_functor, so they fit in well with other code in vspline and can easily be combined with other unary_functor objects, or used stand-alone. They are used inside vspline to implement the factory function vspline::make_safe_evaluator, which chains a suitable mapper and an evaluator to create an object allowing safe evaluation of a b-spline with arbitrary coordinates where out-of-range coordinates are mapped to the defined range in a way fitting the b-spline's boundary conditions. 
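    As a quick sketch of stand-alone use - the concrete gate types, limits
    and the 2D float coordinate chosen here are merely one possible
    combination for illustration, not a fixed requirement:

    \code
    typedef vigra::TinyVector < float , 2 > crd_t ;

    // one gate per axis: mirror 'folds' axis 0 into [ 0 , 100 ] ,
    // periodic maps axis 1 into [ 0 , 200 )

    auto m = vspline::mapper < crd_t >
             ( vspline::mirror ( 0.0f , 100.0f ) ,
               vspline::periodic ( 0.0f , 200.0f ) ) ;

    crd_t in ( -3.7f , 205.0f ) , out ;
    m.eval ( in , out ) ; // out now holds a coordinate inside the defined range
    \endcode

    For use with a specific vspline::bspline object, the factory function
    vspline::make_safe_evaluator mentioned above sets up suitable gates
    automatically.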
*/ #ifndef VSPLINE_MAP_H #define VSPLINE_MAP_H #include #include // for production code, the two following #defines should be // commented out. // define to check that results are in expected bounds // #define ASSERT_IN_BOUNDS // define to check that vectorized and unvectorized code // produce identical results // #define ASSERT_CONSISTENT namespace vspline { /// class pass_gate passes it's input to it's output unmodified. template < typename rc_type , int _vsize = vspline::vector_traits < rc_type > :: size > struct pass_gate : public vspline::unary_functor < rc_type , rc_type , _vsize > { template < class T > void eval ( const T & c , T & result ) const { result = c ; } } ; /// factory function to create a pass_gate type functor template < typename rc_type , int _vsize = vspline::vector_traits < rc_type > :: size > vspline::pass_gate < rc_type , _vsize > pass() { return vspline::pass_gate < rc_type , _vsize >() ; } /// reject_gate throws vspline::out_of_bounds for invalid coordinates template < typename rc_type , int _vsize = vspline::vector_traits < rc_type > :: size > struct reject_gate : public vspline::unary_functor < rc_type , rc_type , _vsize > { const rc_type lower ; const rc_type upper ; reject_gate ( rc_type _lower , rc_type _upper ) : lower ( _lower ) , upper ( _upper ) { } ; void eval ( const rc_type & c , rc_type & result ) const { if ( c < lower || c > upper ) throw vspline::out_of_bounds() ; result = c ; } #ifdef USE_VC // the vectorized eval() is coded as a template. This way it is // a worse match than the single-value eval when eval is called // with single values. All other possible calls pass vectorized // data and will match this template template < class rc_v > void eval ( const rc_v & c , rc_v & result ) const { if ( any_of ( ( c < lower ) | ( c > upper ) ) ) throw vspline::out_of_bounds() ; result = c ; } #endif } ; /// factory function to create a reject_gate type functor given /// a lower and upper limit for the allowed range. template < typename rc_type , int _vsize = vspline::vector_traits < rc_type > :: size > vspline::reject_gate < rc_type , _vsize > reject ( rc_type lower , rc_type upper ) { return vspline::reject_gate < rc_type , _vsize > ( lower , upper ) ; } /// clamp gate clamps out-of-bounds values. clamp_gate takes /// four arguments: the lower and upper limit of the gate, and /// the values which are returned if the input is outside the /// range: 'lfix' if it is below 'lower' and 'ufix' if it is /// above 'upper' template < typename rc_type , int _vsize = vspline::vector_traits < rc_type > :: size > struct clamp_gate : public vspline::unary_functor < rc_type , rc_type , _vsize > { const rc_type lower ; const rc_type upper ; const rc_type lfix ; const rc_type ufix ; clamp_gate ( rc_type _lower , rc_type _upper , rc_type _lfix , rc_type _ufix ) : lower ( _lower <= _upper ? _lower : _upper ) , upper ( _upper >= _lower ? _upper : _lower ) , lfix ( _lower <= _upper ? _lfix : _ufix ) , ufix ( _upper >= _lower ? 
_ufix : _lfix ) { assert ( lower < upper ) ; } ; void eval ( const rc_type & c , rc_type & result ) const { if ( c < lower ) result = lfix ; else if ( c > upper ) result = ufix ; else result = c ; } #ifdef USE_VC template < class rc_v > void eval ( const rc_v & c , rc_v & result ) const { result = c ; result ( c < lower ) = lfix ; result ( c > upper ) = ufix ; } #endif } ; /// factory function to create a clamp_gate type functor given /// a lower and upper limit for the allowed range, and, optionally, /// the values to use if incoming coordinates are out-of-range template < typename rc_type , int _vsize = vspline::vector_traits < rc_type > :: size > vspline::clamp_gate < rc_type , _vsize > clamp ( rc_type lower , rc_type upper , rc_type lfix , rc_type rfix ) { return vspline::clamp_gate < rc_type , _vsize > ( lower , upper , lfix , rfix ) ; } /// for the vectorized versions of mirror_gate and periodic_gate, /// we use a vectorized fmod function. v_fmod_broadcast is used /// to verify it's operation. #ifdef USE_VC template rc_v v_fmod_broadcast ( const rc_v & lhs , const typename rc_v::EntryType & rhs ) { rc_v result ; // using std::fmod, the additional test above is not necessary, // but the broadcasted operation takes longer. for ( int e = 0 ; e < rc_v::size() ; e++ ) result[e] = std::fmod ( lhs[e] , rhs ) ; #ifdef ASSERT_IN_BOUNDS assert ( all_of ( std::abs ( result ) < std::abs ( rhs ) ) ) ; assert ( all_of ( std::abs ( result ) >= 0 ) ) ; #endif return result ; } /// vectorized fmod function using std::trunc, which is fast, but /// checking the result to make sure it's always <= rhs. template rc_v v_fmod ( const rc_v & lhs , const typename rc_v::EntryType & rhs ) { auto result = lhs - rhs * std::trunc ( lhs / rhs ) ; // due to arithmetic imprecision, result may come out >= rhs // so we doublecheck and set result to 0 when this occurs result ( std::abs(result) >= std::abs(rhs) ) = 0 ; #ifdef ASSERT_IN_BOUNDS assert ( all_of ( std::abs ( result ) < std::abs ( rhs ) ) ) ; assert ( all_of ( std::abs ( result ) >= 0 ) ) ; #endif #ifdef ASSERT_CONSISTENT auto reference = v_fmod_broadcast ( lhs , rhs ) ; assert ( all_of ( result == reference ) ) ; #endif return result ; } #endif /// mirror gate 'folds' coordinates into the range. From the infinite /// number of mirror images resulting from mirroring the input on the /// bounds, the only one inside the range is picked as the result. /// When using this gate type with splines with MIRROR boundary conditions, /// if the shape of the core for the axis in question is M, _lower would be /// passed 0 and _upper M-1. /// For splines with REFLECT boundary conditions, we'd pass -0.5 to /// _lower and M-0.5 to upper, since here we mirror 'between bounds' /// and the defined range is wider. /// /// Note how this mode of 'mirroring' allows use of arbitrary coordinates, /// rather than limiting the range of acceptable input to the first reflection, /// as some implementations do. template < typename rc_type , int _vsize = vspline::vector_traits < rc_type > :: size > struct mirror_gate : public vspline::unary_functor < rc_type , rc_type , _vsize > { const rc_type lower ; const rc_type upper ; mirror_gate ( rc_type _lower , rc_type _upper ) : lower ( _lower <= _upper ? _lower : _upper ) , upper ( _upper >= _lower ? 
_upper : _lower ) { assert ( lower < upper ) ; } ; void eval ( const rc_type & c , rc_type & result ) const { rc_type cc ( c - lower ) ; auto w = upper - lower ; cc = std::abs ( cc ) ; // left mirror, v is now >= 0 if ( cc >= w ) { cc = fmod ( cc , 2 * w ) ; // map to one full period cc -= w ; // center cc = std::abs ( cc ) ; // map to half period cc = w - cc ; // flip } result = cc + lower ; #ifdef ASSERT_IN_BOUNDS assert ( result >= lower ) ; assert ( result <= upper ) ; #endif } #ifdef USE_VC template < class rc_v > void eval ( const rc_v & c , rc_v & result ) const { rc_v cc ( c - lower ) ; auto w = upper - lower ; cc = std::abs ( cc ) ; // left mirror, v is now >= 0 auto mask = ( cc >= w ) ; if ( any_of ( mask ) ) { auto cm = v_fmod ( cc , 2 * w ) ; // map to one full period cm -= w ; // center cm = std::abs ( cm ) ; // map to half period cm = w - cm ; // flip cc ( mask ) = cm ; } result = cc + lower ; #ifdef ASSERT_IN_BOUNDS assert ( all_of ( result >= lower ) ) ; assert ( all_of ( result <= upper ) ) ; #endif #ifdef ASSERT_CONSISTENT cc = result ; for ( int e = 0 ; e < rc_v::size() ; e++ ) { rc_type x = cc[e] ; eval ( x , x ) ; cc[e] = x ; } assert ( all_of ( cc == result ) ) ; #endif } #endif } ; /// factory function to create a mirror_gate type functor given /// a lower and upper limit for the allowed range. template < typename rc_type , int _vsize = vspline::vector_traits < rc_type > :: size > vspline::mirror_gate < rc_type , _vsize > mirror ( rc_type lower , rc_type upper ) { return vspline::mirror_gate < rc_type , _vsize > ( lower , upper ) ; } /// the periodic mapping also folds the incoming value into the allowed range. /// The resulting value will be ( N * period ) from the input value and inside /// the range, period being upper - lower. /// For splines done with PERIODIC boundary conditions, if the shape of /// the core for this axis is M, we'd pass 0 to _lower and M to _upper. template < typename rc_type , int _vsize = vspline::vector_traits < rc_type > :: size > struct periodic_gate : public vspline::unary_functor < rc_type , rc_type , _vsize > { const rc_type lower ; const rc_type upper ; periodic_gate ( rc_type _lower , rc_type _upper ) : lower ( _lower <= _upper ? _lower : _upper ) , upper ( _upper >= _lower ? 
_upper : _lower ) { assert ( lower < upper ) ; } ; void eval ( const rc_type & c , rc_type & result ) const { rc_type cc = c - lower ; auto w = upper - lower ; if ( ( cc < 0 ) | ( cc >= w ) ) { cc = fmod ( cc , w ) ; if ( cc < 0 ) cc += w ; // due to arithmetic imprecision, even though cc < 0 // cc+w may come out == w, so we need to test again: if ( cc >= w ) cc = 0 ; } result = cc + lower ; #ifdef ASSERT_IN_BOUNDS assert ( result >= lower ) ; assert ( result < upper ) ; #endif } #ifdef USE_VC template < class rc_v > void eval ( const rc_v & c , rc_v & result ) const { rc_v cc ; cc = c - lower ; auto w = upper - lower ; auto mask_below = ( cc < 0 ) ; auto mask_above = ( cc >= w ) ; auto mask_any = mask_above | mask_below ; if ( any_of ( mask_any ) ) { auto cm = v_fmod ( cc , w ) ; cm ( mask_below ) += w ; // due to arithmetic imprecision, even though cc < 0 // cc+w may come out == w, so we need to test again: cm ( cm >= w ) = 0 ; cc ( mask_any ) = cm ; } result = cc + lower ; #ifdef ASSERT_IN_BOUNDS assert ( all_of ( result >= lower ) ) ; assert ( all_of ( result < upper ) ) ; #endif #ifdef ASSERT_CONSISTENT cc = result ; for ( int e = 0 ; e < rc_v::size() ; e++ ) { rc_type x = cc[e] ; eval ( x , x ) ; cc[e] = x ; } assert ( all_of ( cc == result ) ) ; #endif } #endif } ; /// factory function to create a periodic_gate type functor given /// a lower and upper limit for the allowed range. template < typename rc_type , int _vsize = vspline::vector_traits < rc_type > :: size > vspline::periodic_gate < rc_type , _vsize > periodic ( rc_type lower , rc_type upper ) { return vspline::periodic_gate < rc_type , _vsize > ( lower , upper ) ; } /// finally we define class mapper which is initialized with a set of /// gate objects (of arbitrary type) which are applied to each component /// of an incoming nD coordinate in turn. /// The trickery with the variadic template argument list is necessary, /// because we want to be able to combine arbitrary gate types (which /// have distinct types) to make the mapper as efficient as possible. /// the only requirement for a gate type is that it has to provide the /// necessary eval() functions. template < typename nd_rc_type , int _vsize , class ... gate_types > struct map_functor : public vspline::unary_functor < nd_rc_type , nd_rc_type , _vsize > { typedef typename vspline::unary_functor < nd_rc_type , nd_rc_type , _vsize > base_type ; typedef typename base_type::in_type in_type ; typedef typename base_type::out_type out_type ; enum { vsize = _vsize } ; // typedef typename vspline::vector_traits // < in_type > :: ele_type in_ele_type ; // // typedef typename vspline::vector_traits // < out_type > :: ele_type out_ele_type ; // // typedef typename vspline::vector_traits // < in_type , vsize > :: ele_v in_ele_v ; // // typedef typename vspline::vector_traits // < out_type , vsize > :: ele_v out_ele_v ; // // typedef typename vspline::vector_traits // < in_type , vsize > :: type in_v ; // // typedef typename vspline::vector_traits // < out_type , vsize > :: type out_v ; enum { dimension = vigra::ExpandElementResult < nd_rc_type > :: size } ; // we hold the 1D mappers in a tuple typedef std::tuple < gate_types... > mvec_type ; // mvec holds the 1D gate objects passed to the constructor const mvec_type mvec ; // the constructor receives gate objects map_functor ( gate_types ... args ) : mvec ( args... 
) { } ; // constructor variant taking a tuple of gates map_functor ( const mvec_type & _mvec ) : mvec ( _mvec ) { } ; // to handle the application of the 1D gates, we use a recursive // helper type which applies the 1D gate for a specific axis and // then recurses to the next axis until axis 0 is reached. // We also pass 'dimension' as template argument, so we can specialize // for 1D operation (see below) template < int level , int dimension , typename nd_coordinate_type > struct _map { void operator() ( const mvec_type & mvec , const nd_coordinate_type & in , nd_coordinate_type & out ) const { std::get(mvec).eval ( in[level] , out[level] ) ; _map < level - 1 , dimension , nd_coordinate_type >() ( mvec , in , out ) ; } } ; // at level 0 the recursion ends template < int dimension , typename nd_coordinate_type > struct _map < 0 , dimension , nd_coordinate_type > { void operator() ( const mvec_type & mvec , const nd_coordinate_type & in , nd_coordinate_type & out ) const { std::get<0>(mvec).eval ( in[0] , out[0] ) ; } } ; // here's the specialization for 1D operation template < typename coordinate_type > struct _map < 0 , 1 , coordinate_type > { void operator() ( const mvec_type & mvec , const coordinate_type & in , coordinate_type & out ) const { std::get<0>(mvec).eval ( in , out ) ; } } ; // now we define eval for unvectorized and vectorized operation // by simply delegating to struct _map at the top level. template < class in_type , class out_type > void eval ( const in_type & in , out_type & out ) const { _map < dimension - 1 , dimension , in_type >() ( mvec , in , out ) ; } } ; /// factory function to create a mapper type functor given /// a set of gate_type objects. Please see vspline::make_safe_evaluator /// for code to automatically create a mapper object suitable for a /// specific vspline::bspline. template < typename nd_rc_type , int _vsize = vspline::vector_traits < nd_rc_type > :: size , class ... gate_types > vspline::map_functor < nd_rc_type , _vsize , gate_types... > mapper ( gate_types ... args ) { return vspline::map_functor < nd_rc_type , _vsize , gate_types... > ( args... ) ; } } ; // namespace vspline #endif // #ifndef VSPLINE_MAP_H kfj-vspline-6e66cf7a7926/multithread.h000066400000000000000000000614661320375670700176050ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015 - 2017 by Kay F. Jahnke */ /* */ /* The git repository for this software is at */ /* */ /* https://bitbucket.org/kfj/vspline */ /* */ /* Please direct questions, bug reports, and contributions to */ /* */ /* kfjahnke+vspline@gmail.com */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. 
*/ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /// \file multithread.h /// /// \brief code to distribute the processing of bulk data to several threads /// /// The code in this header provides a resonably general method to perform /// processing of manifolds of data with several threads in parallel. In vspline, /// there are several areas where potentially large numbers of individual values /// have to be processed independently of each other or in a dependence which /// can be preserved in partitioning. To process such 'bulk' data effectively, /// vspline employs two strategies: multithreading and vectorization. /// This file handles the multithreading. /// /// To produce generic code for the purpose, we first introduce a model of what /// we intend to do. This model looks at the data as occupying a 'range' having /// a defined starting point and end point. We keep with the convention of defining /// ranges so that the start point is inside and the end point outside the data /// set described by the range, just like iterators obtained by begin() and end(). /// This range is made explicit, even if it is implicit in the data which we want to /// submit to multithreading, and there is a type for the purpose: struct range_type. /// range_type merely captures the concept of a range, taking 'limit_type' as it's /// template parameter, so that any type of range can be accomodated. A range is /// defined by it's lower and upper limit. /// /// Next we define an object holding a set of ranges, modeling a partitioning of /// an original/whole range into subranges, which, within the context of this code, /// are disparate and in sequence. This object is modeled as struct partition_type, /// taking a range_type as it's template argument. /// /// With these types, we model concrete ranges and partitionings. The most important /// one is dealing with multidimensional shapes, where a range extends from a 'lower' /// coordinate to just below a 'higer' coordinate. These two coordinates can be /// used directly to call vigra's 'subarray' function. /// /// Next we provide code to partition ranges into sets of subranges. /// /// Finally we can express a generalized multithreading routine. This routine takes /// a functor capable of processing a range specification and a parameter pack of /// arbitrary further parameters, of which some will usually be refering to manifolds /// of data for which the given range makes sense. We call this routine with a /// partitioning of the original range and the same parameter pack that is to be passed /// to the functor. The multithreading routine proceeds to set up 'tasks' as needed, /// providing each with the functor as it's functional, a subrange from /// the partitioning, and the parameter pack as arguments. The routine to be used /// to partition the 'whole' range is passed in. /// /// The tasks, once prepared, are handed over to a 'joint_task' object which handles /// the interaction with the thread pool (in thread_pool.h). 
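///
/// As an aside, here is a minimal sketch of a call from user code. The payload
/// function 'fill_with' and the float array are made up for the illustration;
/// only multithread(), the range/partitioning types and partition_to_tiles
/// are from this header:
///
///     void fill_with ( vspline::shape_range_type<2> range ,
///                      vigra::MultiArrayView < 2 , float > * p_data ,
///                      float value )
///     {
///       // each task initializes only its share of the data
///       auto chunk = p_data->subarray ( range[0] , range[1] ) ;
///       chunk.init ( value ) ;
///     }
///
///     vigra::MultiArray < 2 , float > data ( vigra::Shape2 ( 1000 , 1000 ) ) ;
///     vigra::MultiArrayView < 2 , float > view ( data ) ;
///     vspline::shape_range_type<2> whole ( vigra::Shape2() , data.shape() ) ;
///
///     vspline::multithread ( &fill_with ,
///                            vspline::partition_to_tiles<2> ,
///                            vspline::default_njobs ,
///                            whole , &view , 1.0f ) ;
///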
While my initial code /// used one thread per task, this turned out inefficient, because it was not granular /// enough: the slowest thread became the limiting factor. Now the job at hand is split /// into more individual tasks (something like 8 times the number of cores), resulting /// in a fair compromise concerning granularity. multithread() waits for all tasks to /// terminate and returns when it's certain that the job is complete. /// /// With this method, we assure that on return of multithread() we can safely access /// whatever results we anticipate. While it might be useful to launch the tasks and /// return to continue the main thread, picking up the result later when it becomes /// ready, I chose to suspend the calling thread until the result arrives. This makes /// the logic simpler, and should be what most use cases need: there is often little else /// to do but to wait for the result anyway. If asynchronous operation is needed, a thread /// can be launched to initiate and collect from the multithreading. It's safe to have /// several threads using this multithreading code, since each task is linked to a /// 'coordinator', see struct joint_task below. #ifndef VSPLINE_MULTITHREAD_H #define VSPLINE_MULTITHREAD_H #include #include #include #include #include #include #include #include #include namespace vspline { /// number of CPU cores in the system. const int ncores = std::thread::hardware_concurrency() ; /// when multithreading, use this number of jobs per default. This is /// an attempt at a compromise: too many jobs will produce too much overhead, /// too few will not distribute the load well and make the system vulnerable /// to 'straggling' threads const int default_njobs = 8 * ncores ; /// given limit_type, we define range_type as a TinyVector of two limit_types, /// the first denoting the beginning of the range and the second it's end, with /// end being outside of the range. template < class limit_type > using range_type = vigra::TinyVector < limit_type , 2 > ; /// given range_type, we define partition_type as a std::vector of range_type. /// This data type is used to hold the partitioning of a range into subranges. template < class range_type > using partition_type = std::vector < range_type > ; /// given a dimension, we define a shape_type as a TinyVector of /// vigra::MultiArrayIndex of this dimension. /// This is equivalent to vigra's shape type. // TODO: might instead define as: vigra::MultiArrayShape template < int dimension > using shape_type = vigra::TinyVector < vigra::MultiArrayIndex , dimension > ; /// given a dimension, we define shape_range_type as a range defined by /// two shapes of the given dimension. This definition allows us to directly /// pass the two shapes as arguments to a call of subarray() on a MultiArrayView /// of the given dimension. Note the subarray semantics: if the range is /// [2,2] to [4,4], it refers to elements [2,2], [3,2], [2,3], [3,3]. template < int dimension > using shape_range_type = range_type < shape_type < dimension > > ; template < int dimension > using shape_partition_type = partition_type < shape_range_type < dimension > > ; // currently unused // // iterator_splitter will try to set up n ranges from a range. the partial // // ranges are stored in a std::vector. The split may succeed producing n // // or less ranges, and if iter_range can't be split at all, a single range // // encompassing the whole of iter_range will be returned in the result vector. 
// // template < class _iterator_type > // struct iterator_splitter // { // typedef _iterator_type iterator_type ; // typedef vigra::TinyVector < iterator_type , 2 > range_type ; // typedef std::vector < range_type > partition_type ; // // static partition_type part ( const range_type & iter_range , // int n ) // { // std::vector < range_type > res ; // assert ( n > 0 ) ; // // iterator_type start = iter_range [ 0 ] ; // iterator_type end = iter_range [ 1 ] ; // int size = end - start ; // if ( n > size ) // n = size ; // // int chunk_size = size / n ; // will be at least 1 // // for ( int i = 0 ; i < n - 1 ; i++ ) // { // res.push_back ( range_type ( start , start + chunk_size ) ) ; // start += chunk_size ; // } // res.push_back ( range_type ( start , end ) ) ; // return res ; // } // } ; /// shape_splitter will try to split a shape into n ranges by 'chopping' it /// along the outermost axis that can be split n-ways. The additional parameter /// 'forbid' prevents a particular axis from being split. The split may succeed /// producing n or less ranges, and if 'shape' can't be split at all, a single range /// encompassing the whole of 'shape' will be returned in the result vector. This /// object is used for partitioning when one axis has to be preserved intact, like /// for b-spline prefiltering, but it's not used per default for all shape splitting, /// since the resulting partitioning performs not so well in certain situations /// (see the partitioning into tiles below for a better general-purpose splitter) // TODO: with some shapes, splitting will result in subranges which aren't optimal // for b-spline prefiltering (these are fastest with extents which are a multiple of // the simdized data type), so we might add code to preferably use cut locations // coinciding with those extents. And with small extents being split, the result // becomes very inefficient for filtering. template < int dim > struct shape_splitter { typedef shape_type < dim > shape_t ; typedef range_type < shape_t > range_t ; typedef partition_type < range_t > partition_t ; static partition_t part ( const shape_t & shape , ///< shape to be split n-ways int n = default_njobs , ///< intended number of chunks int forbid = -1 ) ///< axis which shouldn't be split { partition_t res ; // find the outermost dimension that can be split n ways, and it's extent int split_dim = -1 ; int max_extent = -1 ; for ( int md = dim - 1 ; md >= 0 ; md-- ) { if ( md != forbid && shape[md] > max_extent && shape[md] >= n ) { max_extent = shape[md] ; split_dim = md ; break ; } } // if the search did not yet succeed: if ( max_extent == -1 ) { // repeat process with relaxed conditions: now the search will also succeed // if there is an axis which can be split less than n ways for ( int md = dim - 1 ; md >= 0 ; md-- ) { if ( md != forbid && shape[md] > 1 ) { max_extent = shape[md] ; split_dim = md ; break ; } } } if ( split_dim == -1 ) { // we have not found a dimension for splitting. 
We pass back res with // a range over the whole initial shape as it's sole member res.push_back ( range_t ( shape_t() , shape ) ) ; } else { // we can split the shape along split_dim int w = shape [ split_dim ] ; // extent of the dimension we can split n = std::min ( n , w ) ; // just in case, if that is smaller than n int * cut = new int [ n ] ; // where to chop up this dimension for ( int i = 0 ; i < n ; i++ ) cut[i] = ( (i+1) * w ) / n ; // roughly equal chunks, but certainly last cut == a.end() shape_t start , end = shape ; for ( int i = 0 ; i < n ; i++ ) { end [ split_dim ] = cut [ i ]; // apply the cut locations res.push_back ( range_t ( start , end ) ) ; start [ split_dim ] = end [ split_dim ] ; } delete[] cut ; // clean up } return res ; } } ; /// partition a shape range into 'stripes'. This uses shape_splitter with /// 'forbid' left at the default of -1, resulting in a split along the /// outermost dimension that can be split n ways or the next best thing /// shape_splitter can come up with. If the intended split is merely to /// distribute the work load without locality considerations, this should /// be the split to use. When locality is an issue, consider the next variant. template < int d > partition_type < shape_range_type > partition_to_stripes ( shape_range_type range , int nparts ) { if ( range[0].any() ) { // the lower limit of the range is not at the origin, so get the shape // of the region between range[0] and range[1], call shape_splitter with // this shape, and add the offset to the lower limit of the original range // to the partial ranges in the result auto shape = range[1] - range[0] ; auto res = shape_splitter < d > :: part ( shape , nparts ) ; for ( auto & r : res ) { r[0] += range[0] ; r[1] += range[0] ; } return res ; } // if range[0] is at the origin, we don't have to use an offset return shape_splitter < d > :: part ( range[1] , nparts ) ; } /// alternative partitioning into tiles. For the optimal situation, where /// the view isn't rotated or pitched much, the partitioning into bunches /// of lines (above) seems to perform slightly better, but with more difficult /// transformations (like 90 degree rotation), performance suffers (like, -20%), /// whereas with this tiled partitioning it is roughly the same, supposedly due /// to identical locality in both cases. So currently I am using this partitioning. /// note that the current implementation ignores the argument 'nparts' and /// produces tiles 160X160. // TODO code is a bit clumsy... // TODO it may be a good idea to have smaller portions towards the end // of the partitioning, since they will be processed last, and if the // last few single-threaded operations are short, they may result in less // situations where a long single-threaded operation has just started when // all other tasks are already done, causing the system to idle on the other // cores. or at least the problem would not persist for so long. 
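// as a concrete illustration of the tiling below: with the fixed tile edge
// of 160, a 2D range of shape 2000 X 1000 is cut at multiples of 160 along
// each axis, yielding 12 X 6 tiles; the last tile in each direction absorbs
// the remainder (here 240 and 200 wide, respectively). If the scheme produces
// fewer tiles than 'nparts', the code falls back to partition_to_stripes().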
template < int d > partition_type < shape_range_type > partition_to_tiles ( shape_range_type range , int nparts = default_njobs ) { // To help with the dilemma that this function is really quite specific // for images, for the time being I delegate to return partition_to_stripes() // for dimensions != 2 if ( d != 2 ) return partition_to_stripes ( range , nparts ) ; auto shape = range[1] - range[0] ; // currently disregarding incoming nparts parameter: // int nelements = prod ( shape ) ; // int ntile = nelements / nparts ; // int nedge = pow ( ntile , ( 1.0 / d ) ) ; // TODO fixing this size is system-specific! int nedge = 160 ; // heuristic, fixed size tiles auto tiled_shape = shape / nedge ; typedef std::vector < int > stopv ; stopv stops [ d ] ; for ( int a = 0 ; a < d ; a++ ) { stops[a].push_back ( 0 ) ; for ( int k = 1 ; k < tiled_shape[a] ; k++ ) stops[a].push_back ( k * nedge ) ; stops[a].push_back ( shape[a] ) ; } for ( int a = 0 ; a < d ; a++ ) tiled_shape[a] = stops[a].size() - 1 ; int k = prod ( tiled_shape ) ; // If this partitioning scheme fails to produce a partitioning with // at least nparts components, fall back to using partition_to_stripes() if ( k < nparts ) return partition_to_stripes ( range , nparts ) ; nparts = k ; partition_type < shape_range_type > res ( nparts ) ; for ( int a = 0 ; a < d ; a++ ) { int j0 = 1 ; for ( int h = 0 ; h < a ; h++ ) j0 *= tiled_shape[h] ; int i = 0 ; int j = 0 ; for ( int k = 0 ; k < nparts ; k++ ) { res[k][0][a] = stops[a][i] ; res[k][1][a] = stops[a][i+1] ; ++j ; if ( j == j0 ) { j = 0 ; ++i ; if ( i >= tiled_shape[a] ) i = 0 ; } } } for ( auto & e : res ) { e[0] += range[0] ; e[1] += range[0] ; // std::cout << "tile: " << e[0] << e[1] << std::endl ; } return res ; } // /// specialization for 1D shape range. Obviously we can't make tiles // /// from 1D data... // // template<> // partition_type < shape_range_type<1> > // partition_to_tiles ( shape_range_type<1> range , // int nparts ) // { // auto size = range[1][0] - range[0][0] ; // auto part_size = size / nparts ; // if ( part_size < 1 ) // part_size = size ; // // nparts = int ( size / part_size ) ; // if ( nparts * part_size < size ) // nparts++ ; // // partition_type < shape_range_type<1> > res ( nparts ) ; // // auto start = range[0] ; // auto stop = start + part_size ; // for ( auto & e : res ) // { // e[0] = start ; // e[1] = stop ; // start = stop ; // stop = start + part_size ; // } // res[nparts-1][1] = size ; // return res ; // } /// action_wrapper wraps a functional into an outer function which /// first calls the functional and then checks if this was the last /// of a bunch of actions to complete, by incrementing the counter /// p_done points to and comparing the result to 'nparts'. If the /// test succeeds, the caller is notified via the condition variable /// p_pool_cv points to, under the mutex p_pool_mutex points to. static void action_wrapper ( std::function < void() > payload , int nparts , std::mutex * p_pool_mutex , std::condition_variable * p_pool_cv , int * p_done ) { // execute the 'payload' payload() ; // under the coordinator's pool mutex, increase the caller's // 'done' counter and test if it's now equal to 'nparts', the total // number of actions in this bunch // TODO initially I had the notify_all call after closing the scope of // the lock guard, but I had random crashes. Changing the code to call // notify_all with the lock guard still in effect seemed to remove the // problem, but made me unsure of my logic. 
// 2017-06-23 after removing a misplaced semicolon after the conditional // below I recoded to perform the notification after closing the lock_guard's // scope, and now there doesn't seem to be any problem any more. I leave // these comments in for reference in case things go wrong // TODO remove this and previous comment if all is well // 2017-10-12 when stress-testing with restore_test, I had random crashes // and failure to join again, so I've taken the notify call into the lock // guard's scope again to see if that fixes it which seems to be the case. { std::lock_guard lk ( * p_pool_mutex ) ; if ( ++ ( * p_done ) == nparts ) { // this was the last action originating from the coordinator // notify the coordinator that the joint task is now complete p_pool_cv->notify_one() ; } } } // with this collateral code at hand, we can now implement multithread(). /// multithread uses a thread pool of worker threads to perform /// a multithreaded operation. It receives a functor (a single-threaded /// function used for all individual tasks), a partitioning, which contains /// information on which part of the data each task should work, and /// a set of additional parameters to pass on to the functor. /// The individual 'payload' tasks are created by binding the functor with /// /// - a range from the partitioning, describing it's share of the data /// /// - the remaining parameters /// /// These tasks are bound to a wrapper routine which takes care of /// signalling when the last task has completed. static thread_pool common_thread_pool ; // keep a thread pool only for multithread() template < class range_type , class ...Types > int multithread ( void (*pfunc) ( range_type , Types... ) , partition_type < range_type > partitioning , Types ...args ) { // get the number of ranges in the partitioning int nparts = partitioning.size() ; // guard against empty or wrong partitioning if ( nparts <= 0 ) { return 0 ; } if ( nparts == 1 ) { // if only one part is in the partitioning, we take a shortcut // and execute the function right here: (*pfunc) ( partitioning[0] , args... ) ; return 1 ; } // alternatively, 'done' can be coded as std::atomic. I tried // but couldn't detect any performance benefit, even though allegedly // atomics are faster than using mutexes... so I'm leaving the code // as it was, using an int and a mutex. int done = 0 ; // number of completed tasks std::mutex pool_mutex ; // mutex to guard access to done and pool_cv std::condition_variable pool_cv ; // for signalling completion { // under the thread pool's task_mutex, fill tasks into task queue std::lock_guard lk ( common_thread_pool.task_mutex ) ; for ( int i = 0 ; i < nparts ; i++ ) { // first create the 'payload' function std::function < void() > payload = std::bind ( pfunc , partitioning[i] , args... ) ; // now bind it to the action wrapper and enqueue it std::function < void() > action = std::bind ( action_wrapper , payload , nparts , &pool_mutex , &pool_cv , &done ) ; common_thread_pool.task_queue.push ( action ) ; } } // alert all worker threads common_thread_pool.task_cv.notify_all() ; { // now wait for the last task to complete. 
This is signalled by // action_wrapper by notfying on pool_cv and doublechecked // by testing for done == nparts std::unique_lock lk ( pool_mutex ) ; // the predicate done == nparts rejects spurious wakes pool_cv.wait ( lk , [&] { return done == nparts ; } ) ; } // all jobs are done return nparts ; } /// This variant of multithread() takes a pointer to a function performing /// the partitioning of the incoming range. The partitioning function is /// invoked on the incoming range (provided nparts is greater than 1) and /// the resulting partitioning is used as an argument to the first variant /// of multithread(). // TODO It might be better to code this using std::function objects. // TODO may use move semantics for forwarding instead of relying on the // optimizer to figure this out template < class range_type , class ...Types > int multithread ( void (*pfunc) ( range_type , Types... ) , partition_type < range_type > (*partition) ( range_type , int ) , int nparts , range_type range , Types ...args ) { if ( nparts <= 1 ) { // if only one part is requested, we take a shortcut and execute // the function right here: (*pfunc) ( range , args... ) ; return 1 ; } // partition the range using the function pointed to by 'partition' auto partitioning = (*partition) ( range , nparts ) ; // then pass pfunc, the partitioning and the remaining arguments // to the variant of multithread() accepting a partitioning return multithread ( pfunc , partitioning , args... ) ; } } ; // end if namespace vspline #endif // #ifndef VSPLINE_MULTITHREAD_H kfj-vspline-6e66cf7a7926/poles.h000066400000000000000000000607451320375670700164040ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015 - 2017 by Kay F. Jahnke */ /* */ /* The git repository for this software is at */ /* */ /* https://bitbucket.org/kfj/vspline */ /* */ /* Please direct questions, bug reports, and contributions to */ /* */ /* kfjahnke+vspline@gmail.com */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /*! 
\file poles.h \brief precalculated prefilter poles and basis function values The contents of this file below the comments can be generated using prefilter_poles.cc both the precalculated basis function values and the prefilter poles can be generated in long double precision, so the constants below are given as long doubles - concrete splines will downcast them to whatever precision they use for prefiltering (usually the same type as the data type's elementary type). The values defined here are used in several places in vspline. They are precomputed because calculating them when needed can be (potentially very) expensive, and providing them by definitions evaluated at compile time slows compilation. Great care is taken to provide very exact values. The code in prefilter_poles.cc goes beyond simply calculating polynomial roots with gsl/blas, adding a polishing stage done in long double arithmetic. The set of values provided here is sufficient to calculate the b-spline basis function for all spline degrees for arbitrary arguments (including the spline's derivatives) - see basis.h. The poles are needed for prefiltering. */ #ifndef VSPLINE_POLES_H namespace vspline_constants { const long double K0[] = { 1L , // basis(0) } ; const long double K1[] = { 1L , // basis(0) 0.5L , // basis(0.5) } ; const long double K2[] = { 0.75L , // basis(0) 0.5L , // basis(0.5) 0.125L , // basis(1) } ; const long double Poles_2[] = { -0.171572875253809916743L , } ; const long double K3[] = { 0.666666666666666666685L , // basis(0) 0.479166666666666666658L , // basis(0.5) 0.166666666666666666671L , // basis(1) 0.0208333333333333333339L , // basis(1.5) } ; const long double Poles_3[] = { -0.267949192431122719925L , } ; const long double K4[] = { 0.598958333333333333315L , // basis(0) 0.458333333333333333342L , // basis(0.5) 0.197916666666666666671L , // basis(1) 0.0416666666666666666678L , // basis(1.5) 0.00260416666666666666674L , // basis(2) } ; const long double Poles_4[] = { -0.3613412259002201794L , -0.0137254292973391213608L , } ; const long double K5[] = { 0.550000000000000000011L , // basis(0) 0.438020833333333333299L , // basis(0.5) 0.216666666666666666668L , // basis(1) 0.0617187500000000000007L , // basis(1.5) 0.00833333333333333333373L , // basis(2) 0.000260416666666666666679L , // basis(2.5) } ; const long double Poles_5[] = { -0.430575347099973781842L , -0.0430962882032646538245L , } ; const long double K6[] = { 0.511024305555555555529L , // basis(0) 0.419444444444444444487L , // basis(0.5) 0.228797743055555555543L , // basis(1) 0.0791666666666666666658L , // basis(1.5) 0.0156684027777777777777L , // basis(2) 0.00138888888888888888895L , // basis(2.5) 2.17013888888888888899e-05L , // basis(3) } ; const long double Poles_6[] = { -0.488294589303044755213L , -0.0816792710762375126011L , -0.00141415180832581775119L , } ; const long double K7[] = { 0.479365079365079365124L , // basis(0) 0.402596416170634920605L , // basis(0.5) 0.236309523809523809522L , // basis(1) 0.0940243675595238095248L , // basis(1.5) 0.0238095238095238095252L , // basis(2) 0.00337766617063492063474L , // basis(2.5) 0.000198412698412698412712L , // basis(3) 1.55009920634920634931e-06L , // basis(3.5) } ; const long double Poles_7[] = { -0.535280430796438168935L , -0.122554615192326690523L , -0.00914869480960827692899L , } ; const long double K8[] = { 0.45292096819196428568L , // basis(0) 0.387375992063492063515L , // basis(0.5) 0.24077768477182539682L , // basis(1) 0.10647321428571428571L , // basis(1.5) 0.0321269686259920634935L , // 
basis(2) 0.00612599206349206349226L , // basis(2.5) 0.000634765624999999999979L , // basis(3) 2.4801587301587301589e-05L , // basis(3.5) 9.68812003968253968318e-08L , // basis(4) } ; const long double Poles_8[] = { -0.574686909248765430565L , -0.163035269297280935283L , -0.0236322946948448500203L , -0.00015382131064169091176L , } ; const long double K9[] = { 0.430417768959435626143L , // basis(0) 0.373602402567653218697L , // basis(0.5) 0.243149250440917107593L , // basis(1) 0.1168385769744819224L , // basis(1.5) 0.0402557319223985890627L , // basis(2) 0.0094531293058311287482L , // basis(2.5) 0.00138337742504409171077L , // basis(3) 0.000105885769744819223983L , // basis(3.5) 2.75573192239858906539e-06L , // basis(4) 5.38228891093474426835e-09L , // basis(4.5) } ; const long double Poles_9[] = { -0.607997389168625778982L , -0.201750520193153238738L , -0.0432226085404817521351L , -0.00212130690318081842041L , } ; const long double K10[] = { 0.410962642824418540556L , // basis(0) 0.361098434744268077622L , // basis(0.5) 0.244066156188857197966L , // basis(1) 0.12543871252204585538L , // basis(1.5) 0.0479833489204420194007L , // basis(2) 0.013183421516754850087L , // basis(2.5) 0.00245328523074087852745L , // basis(3) 0.000279155643738977072321L , // basis(3.5) 1.5887978636188271604e-05L , // basis(4) 2.75573192239858906539e-07L , // basis(4.5) 2.69114445546737213417e-10L , // basis(5) } ; const long double Poles_10[] = { -0.636550663969423858484L , -0.238182798377573284936L , -0.0657270332283085515493L , -0.00752819467554869064324L , -1.69827628232746642329e-05L , } ; const long double K11[] = { 0.393925565175565175585L , // basis(0) 0.349702231887443069057L , // basis(0.5) 0.243960287397787397795L , // basis(1) 0.132561165432106594207L , // basis(1.5) 0.0552020202020202020231L , // basis(2) 0.017163149607531321399L , // basis(2.5) 0.00382387866762866762846L , // basis(3) 0.000571286261263271354464L , // basis(3.5) 5.10060926727593394244e-05L , // basis(4) 2.16679942326914983149e-06L , // basis(4.5) 2.50521083854417187763e-08L , // basis(5) 1.22324747975789642462e-11L , // basis(5.5) } ; const long double Poles_11[] = { -0.661266068900734706973L , -0.272180349294785885644L , -0.0897595997937133099328L , -0.0166696273662346560986L , -0.000510557534446502057208L , } ; const long double K12[] = { 0.378844084544729991474L , // basis(0) 0.339272950236491903185L , // basis(0.5) 0.243130918010144694708L , // basis(1) 0.138451466550424883762L , // basis(1.5) 0.0618676680090413254889L , // basis(2) 0.0212685824013949013948L , // basis(2.5) 0.0054581869256967252307L , // basis(3) 0.000998474744134466356579L , // basis(3.5) 0.000120913920591875371615L , // basis(4) 8.52397987814654481317e-06L , // basis(4.5) 2.70861650696991408761e-07L , // basis(5) 2.08767569878680989809e-09L , // basis(5.5) 5.09686449899123510276e-13L , // basis(6) } ; const long double Poles_12[] = { -0.682864884197723324803L , -0.30378079328825415993L , -0.11435052002713588501L , -0.0288361901986638037087L , -0.00251616621726133559223L , -1.88330564506390264383e-06L , } ; const long double K13[] = { 0.365370869485452818838L , // basis(0) 0.329689879585910011891L , // basis(0.5) 0.241788417986334653017L , // basis(1) 0.143315017471742083661L , // basis(1.5) 0.0679749672588214254906L , // basis(2) 0.0254044062949849888001L , // basis(2.5) 0.00731223669591725147264L , // basis(3) 0.00156717310816563305462L , // basis(3.5) 0.000237629847004847004809L , // basis(4) 2.3492285420207986942e-05L , // basis(4.5) 1.31330860497527164195e-06L 
, // basis(5) 3.12537574712392963204e-08L , // basis(5.5) 1.60590438368216146005e-10L , // basis(6) 1.96033249961201350104e-14L , // basis(6.5) } ; const long double Poles_13[] = { -0.701894251816807861958L , -0.333107232930623592306L , -0.138901113194319430138L , -0.0432138667403636696352L , -0.0067380314152449140003L , -0.000125100113214418715975L , } ; const long double K14[] = { 0.353239156699189298447L , // basis(0) 0.320850245020631925431L , // basis(0.5) 0.240082990415587342031L , // basis(1) 0.147321800946242910537L , // basis(1.5) 0.0735410325640670609783L , // basis(2) 0.0294998002323771172986L , // basis(2.5) 0.00934108185451225690503L , // basis(3) 0.002275919650051594496L , // basis(3.5) 0.000411090511493721967216L , // basis(4) 5.20463745910174481555e-05L , // basis(4.5) 4.2229561084936041826e-06L , // basis(5) 1.87764634689237863838e-07L , // basis(5.5) 3.34863577512474229312e-09L , // basis(6) 1.14707455977297247148e-11L , // basis(6.5) 7.00118749861433393235e-16L , // basis(7) } ; const long double Poles_14[] = { -0.718783787239944450614L , -0.360319071916961058049L , -0.163033514799298691019L , -0.0590894821948310188073L , -0.0132467567348479146521L , -0.000864024040953337935742L , -2.09130967752753285063e-07L , } ; const long double K15[] = { 0.342240261355340720458L , // basis(0) 0.312666606251760809706L , // basis(0.5) 0.238123194910707311519L , // basis(1) 0.150611949803996986836L , // basis(1.5) 0.0785952538667485757425L , // basis(2) 0.0335038025716498355252L , // basis(2.5) 0.0115022744874968750639L , // basis(3) 0.00311749394849886391303L , // basis(3.5) 0.000648549006353239157451L , // basis(4) 9.94402494389464625065e-05L , // basis(4.5) 1.05720042682674957797e-05L , // basis(5) 7.06839790279879631815e-07L , // basis(5.5) 2.50459906544562629208e-08L , // basis(6) 3.34864254293932428696e-10L , // basis(6.5) 7.64716373181981647651e-13L , // basis(7) 2.33372916620477797745e-17L , // basis(7.5) } ; const long double Poles_15[] = { -0.733872571684837351016L , -0.385585734278435207065L , -0.186520108450964346307L , -0.0759075920476681992858L , -0.0217520657965404714602L , -0.00280115148207645487303L , -3.09356804514744232418e-05L , } ; const long double K16[] = { 0.33220826914249586032L , // basis(0) 0.305064427814943222976L , // basis(0.5) 0.235988316876636090489L , // basis(1) 0.153300931440152308626L , // basis(1.5) 0.0831729750455189804676L , // basis(2) 0.0373810339101848175097L , // basis(2.5) 0.0137576309094881893988L , // basis(3) 0.00408087253210770282541L , // basis(3.5) 0.000954482867889482399375L , // basis(4) 0.000170727005056277129691L , // basis(4.5) 2.23489506378181871124e-05L , // basis(5) 2.00416604212280468886e-06L , // basis(5.5) 1.10747187961685068735e-07L , // basis(6) 3.13146575340689097298e-09L , // basis(6.5) 3.13935464480574627987e-11L , // basis(7) 4.77947733238738529782e-14L , // basis(7.5) 7.29290364438993117953e-19L , // basis(8) } ; const long double Poles_16[] = { -0.747432387766468507421L , -0.409073604757250910887L , -0.209228719339539694189L , -0.0932547189802406259143L , -0.0318677061204539004258L , -0.00625840678512598491248L , -0.000301565363306959580638L , -2.32324863642123170546e-08L , } ; const long double K17[] = { 0.323009394156998706679L , // basis(0) 0.297979958708191627778L , // basis(0.5) 0.233736749230651110002L , // basis(1) 0.155484036150159998435L , // basis(1.5) 0.0873116407701823031193L , // basis(2) 0.0411080643091169147709L , // basis(2.5) 0.0160739219909647846453L , // basis(3) 0.00515282387357668068747L , // 
basis(3.5) 0.00133081257213353628318L , // basis(4) 0.000270404925830189195488L , // basis(4.5) 4.18215496949898696688e-05L , // basis(5) 4.69571537987106402301e-06L , // basis(5.5) 3.56439418392324554771e-07L , // basis(6) 1.63149746984798566173e-08L , // basis(6.5) 3.68452719010997878094e-10L , // basis(7) 2.77001951208101220232e-12L , // basis(7.5) 2.81145725434552076336e-15L , // basis(8) 2.14497166011468564099e-20L , // basis(8.5) } ; const long double Poles_17[] = { -0.759683224071932781984L , -0.430939653180396579961L , -0.231089843599271833896L , -0.110828993316247253478L , -0.0432139114566841576597L , -0.0112581836894716022219L , -0.00118593312515217673908L , -7.68756258125468272936e-06L , } ; const long double K18[] = { 0.314534400858646718218L , // basis(0) 0.291358446651083303363L , // basis(0.5) 0.231411779366461601095L , // basis(1) 0.157240113462067456344L , // basis(1.5) 0.0910485005933913615569L , // basis(2) 0.0446704749601585298685L , // basis(2.5) 0.0184229286904982474763L , // basis(3) 0.00631911641019581553075L , // basis(3.5) 0.00177727765574329432895L , // basis(4) 0.000402198030910974421746L , // basis(4.5) 7.13838910691101004477e-05L , // basis(5) 9.59071055865801927793e-06L , // basis(5.5) 9.27104774298620103304e-07L , // basis(6) 5.97340832600638683586e-08L , // basis(6.5) 2.26850789267494323566e-09L , // basis(7) 4.0941846266406646114e-11L , // basis(7.5) 2.30834980193975490196e-13L , // basis(8) 1.56192069685862264628e-16L , // basis(8.5) 5.95825461142968233597e-22L , // basis(9) } ; const long double Poles_18[] = { -0.770805051407208484079L , -0.451328733377817488268L , -0.252074574682638764562L , -0.128412836796714573015L , -0.0554629671385220109096L , -0.0176623776847852202014L , -0.00301193072899483019657L , -0.000106337355887136664016L , -2.5812403962571557538e-09L , } ; const long double K19[] = { 0.306693101737982424604L , // basis(0) 0.285152657447631086029L , // basis(0.5) 0.229045645681183776321L , // basis(1) 0.158634625338890750896L , // basis(1.5) 0.0944192951167601057425L , // basis(2) 0.0480605454253507002694L , // basis(2.5) 0.0207811493712450163664L , // basis(3) 0.00756538341267226747638L , // basis(3.5) 0.00229186688915413342573L , // basis(4) 0.000568952290899485013994L , // basis(4.5) 0.000113413200680775915678L , // basis(5) 1.76630333585591612417e-05L , // basis(5.5) 2.06939934562068942136e-06L , // basis(6) 1.72752478435489838162e-07L , // basis(6.5) 9.46832953509055358168e-09L , // basis(7) 2.98700491781092245297e-10L , // basis(7.5) 4.30981599947724409224e-12L , // basis(8) 1.8223814805986014024e-14L , // basis(8.5) 8.22063524662432971747e-18L , // basis(9) 1.56796173984991640424e-23L , // basis(9.5) } ; const long double Poles_19[] = { -0.780946444851732297302L , -0.470372819467643025522L , -0.272180376283034490098L , -0.145850893757564512081L , -0.068345906124880463918L , -0.0252650733448555957229L , -0.00593665959108296996236L , -0.000508410194680816563286L , -1.91547865621224798646e-06L , } ; const long double K20[] = { 0.299410290320012640323L , // basis(0) 0.279321655993642289261L , // basis(0.5) 0.226662421857486947625L , // basis(1) 0.159722117626588762783L , // basis(1.5) 0.0974575566598727567998L , // basis(2) 0.0512754651380133029385L , // basis(2.5) 0.0231293383380602931465L , // basis(3) 0.00887770910234364912608L , // basis(3.5) 0.00287124002002061356517L , // basis(4) 0.000772619967256821964494L , // basis(4.5) 0.000170150730850241728814L , // basis(5) 3.00088196466905304557e-05L , // basis(5.5) 
4.11670330038509039567e-06L , // basis(6) 4.2192794922896485481e-07L , // basis(6.5) 3.04930466565191773934e-08L , // basis(7) 1.42412826466311255693e-09L , // basis(7.5) 3.73544185013320677245e-11L , // basis(8) 4.30989409551208702348e-13L , // basis(8.5) 1.36678612573657801533e-15L , // basis(9) 4.11031762331216485864e-19L , // basis(9.5) 3.91990434962479101051e-25L , // basis(10) } ; const long double Poles_20[] = { -0.790231117480724852336L , -0.488191260639127939892L , -0.291421601474782553581L , -0.163033534850263082434L , -0.0816481156195889353096L , -0.0338494795539119584577L , -0.0099730290200587274223L , -0.00146832175710434043464L , -3.77465731975190254876e-05L , -2.86799448817251264668e-10L , } ; const long double K21[] = { 0.292622687231434779222L , // basis(0) 0.2738298047486301248L , // basis(0.5) 0.224280093878832764107L , // basis(1) 0.160548212661644545846L , // basis(1.5) 0.10019429073492722872L , // basis(2) 0.0543159666272729709679L , // basis(2.5) 0.0254519832636627386328L , // basis(3) 0.0102430008488452902521L , // basis(3.5) 0.00351110777263132730256L , // basis(4) 0.00101430459325298737959L , // basis(4.5) 0.000243612424661332394002L , // basis(5) 4.77978392444135000019e-05L , // basis(5.5) 7.4865177795402407054e-06L , // basis(6) 9.07561579439142494542e-07L , // basis(6.5) 8.15879097942759735833e-08L , // basis(7) 5.11508190667103638761e-09L , // basis(7.5) 2.03836837750990982686e-10L , // basis(8) 4.44822374203723964682e-12L , // basis(8.5) 4.10470018922697156696e-14L , // basis(9) 9.76275807924129018926e-17L , // basis(9.5) 1.95729410633912612316e-20L , // basis(10) 9.33310559434474050122e-27L , // basis(10.5) } ; const long double Poles_21[] = { -0.798762885665731613507L , -0.504891537448370302096L , -0.309823196415790324431L , -0.179884666798535838096L , -0.0952008124607312005328L , -0.0432139184407157353102L , -0.0150454999872945547364L , -0.00317200396388417259226L , -0.000219902957631632330337L , -4.77976468942536316244e-07L , } ; const long double K22[] = { 0.286276614055386039554L , // basis(0) 0.268645940276898897319L , // basis(0.5) 0.221912073096871506056L , // basis(1) 0.161151214470108255204L , // basis(1.5) 0.102657889534264013345L , // basis(2) 0.0571852901048010636068L , // basis(2.5) 0.0277367831200034982673L , // basis(3) 0.0116492037590350826642L , // basis(3.5) 0.00420655579826186278524L , // basis(4) 0.00129434332740911861009L , // basis(4.5) 0.000335529281985329123513L , // basis(5) 7.2224788646371748003e-05L , // basis(5.5) 1.26713837947481474385e-05L , // basis(6) 1.76823505790900777494e-06L , // basis(6.5) 1.89938914670434336315e-07L , // basis(7) 1.50102063224714874089e-08L , // basis(7.5) 8.1770577437810697864e-10L , // basis(8) 2.78332478768553791978e-11L , // basis(8.5) 5.05570941840879253739e-13L , // basis(9) 3.73156430983189829773e-15L , // basis(9.5) 6.6564259722400510509e-18L , // basis(10) 8.89679139245057328719e-22L , // basis(10.5) 2.12116036235107738666e-28L , // basis(11) } ; const long double Poles_22[] = { -0.806629499469070434376L , -0.520570235969917462454L , -0.327416474110574466536L , -0.196352826127515360616L , -0.108872451984077597159L , -0.0531816045858892408468L , -0.0210356609306781013348L , -0.00570661364597512596405L , -0.00072254796507425291654L , -1.34581549833048808237e-05L , -3.18664326043226950698e-11L , } ; const long double K23[] = { 0.280326198549807545016L , // basis(0) 0.26374269458034057742L , // basis(0.5) 0.219568310050317182092L , // basis(1) 0.161563403314335434516L , // basis(1.5) 
0.104874182876882497498L , // basis(2) 0.0598884046006764718112L , // basis(2.5) 0.0299741594490754701042L , // basis(3) 0.0130854031040473308021L , // basis(3.5) 0.00495230970916637213348L , // basis(4) 0.00161240876694443049683L , // basis(4.5) 0.000447314117341397825535L , // basis(5) 0.000104464763013597038396L , // basis(5.5) 2.0225085344373592521e-05L , // basis(6) 3.18289046923990635356e-06L , // basis(6.5) 3.96798661290086831989e-07L , // basis(7) 3.78552338529272869352e-08L , // basis(7.5) 2.63467348901839379208e-09L , // basis(8) 1.24884104983961410861e-10L , // basis(8.5) 3.63383071656837423748e-12L , // basis(9) 5.49595855548087519785e-14L , // basis(9.5) 3.24484704026298260285e-16L , // basis(10) 4.34114737527508147485e-19L , // basis(10.5) 3.86817017063068403804e-23L , // basis(11) 4.61121817902408127552e-30L , // basis(11.5) } ; const long double Poles_23[] = { -0.813905623561800423832L , -0.535314083668485659803L , -0.344236276919378086036L , -0.212404660540193033032L , -0.122561160991938728421L , -0.0636024801541136769679L , -0.0278116620379407188025L , -0.0090795953352951493023L , -0.0017112714467817036613L , -9.57339435007398175279e-05L , -1.19369188160664947563e-07L , } ; const long double K24[] = { 0.274731973521188101471L , // basis(0) 0.259095933885492246126L , // basis(0.5) 0.217256122184060208605L , // basis(1) 0.161812082117910165327L , // basis(1.5) 0.106866566729597120988L , // basis(2) 0.0624314258543732094415L , // basis(2.5) 0.0321568163257983379026L , // basis(3) 0.014541849599514216045L , // basis(3.5) 0.00574294462662439229233L , // basis(4) 0.00196761740283894750425L , // basis(4.5) 0.000580049962700882370756L , // basis(5) 0.000145635431566187893506L , // basis(5.5) 3.07460180528882923814e-05L , // basis(6) 5.37040360961471687212e-06L , // basis(6.5) 7.60169776706315293316e-07L , // basis(7) 8.48619490096167514872e-08L , // basis(7.5) 7.20452818709766667257e-09L , // basis(8) 4.42291850046729626149e-10L , // basis(8.5) 1.82614999388872219242e-11L , // basis(9) 4.54526283883070886422e-13L , // basis(9.5) 5.72536381119234370332e-15L , // basis(10) 2.70404290721556569e-17L , // basis(10.5) 2.71321710999844103522e-20L , // basis(11) 1.61173757109611834921e-24L , // basis(11.5) 9.6067045396335026575e-32L , // basis(12) } ; const long double Poles_24[] = { -0.820655181072953240136L , -0.549200963957329733618L , -0.360319071376742828281L , -0.228020147227906865145L , -0.136188500068012711185L , -0.074351497277369400168L , -0.0352441266658319660804L , -0.0132463753253008104888L , -0.00329768262348645466398L , -0.000358071544108679951611L , -4.8126755633104826723e-06L , -3.54070880733606722549e-12L , } ; const long double* const precomputed_poles[] = { 0, 0, Poles_2, Poles_3, Poles_4, Poles_5, Poles_6, Poles_7, Poles_8, Poles_9, Poles_10, Poles_11, Poles_12, Poles_13, Poles_14, Poles_15, Poles_16, Poles_17, Poles_18, Poles_19, Poles_20, Poles_21, Poles_22, Poles_23, Poles_24, } ; const long double* const precomputed_basis_function_values[] = { K0, K1, K2, K3, K4, K5, K6, K7, K8, K9, K10, K11, K12, K13, K14, K15, K16, K17, K18, K19, K20, K21, K22, K23, K24, } ; } ; // end of namespace vspline_constants #define VSPLINE_POLES_H #endif kfj-vspline-6e66cf7a7926/prefilter.h000066400000000000000000000326301320375670700172460ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015 - 2017 by Kay F. 
Jahnke */ /* */ /* The git repository for this software is at */ /* */ /* https://bitbucket.org/kfj/vspline */ /* */ /* Please direct questions, bug reports, and contributions to */ /* */ /* kfjahnke+vspline@gmail.com */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /*! \file prefilter.h \brief Code to create the coefficient array for a b-spline. Note: the bulk of the code was factored out to filter.h, while this text still outlines the complete filtering process. B-spline coefficients can be generated in two ways (that I know of): the first is by solving a set of equations which encode the constraints of the spline. A good example of how this is done can be found in libeinspline. I term it the 'linear algebra approach'. In this implementation, I have chosen what I call the 'DSP approach'. In a nutshell, the DSP approach looks at the b-spline's reconstruction as a convolution of the coefficients with a specific kernel. This kernel acts as a low-pass filter. To counteract the effect of this filter and obtain the input signal from the convolution of the coefficients, a high-pass filter with the inverse transfer function to the low-pass is used. This high-pass has infinite support, but can still be calculated precisely within the bounds of the arithmetic precision the CPU offers, due to the properties it has. I recommend [CIT2000] for a formal explanation. At the core of my prefiltering routines there is code from Philippe Thevenaz' accompanying code to this paper, with slight modifications translating it to C++ and making it generic. The greater part of this file deals with 'generifying' the process and to employing multithreading and the CPU's vector units to gain speed. This code makes heavy use of vigra, which provides handling of multidimensional arrays and efficient handling of aggregate types - to only mention two of it's many qualities. The vectorization is done with Vc, which allowed me to code the horizontal vectorization I use in a generic fashion. In another version of this code I used vigra's BSplineBase class to obtain prefilter poles. This required passing the spline degree/order as a template parameter. Doing it like this allows to make the Poles static members of the solver, but at the cost of type proliferation. 
Here I chose not to follow this path and pass the spline order as a parameter to the spline's constructor, thus reducing the number of solver specializations and allowing automated testing with loops over the degree. This variant may be slightly slower. The prefilter poles I use are precalculated externally with gsl/blas and polished in long double precision to provide the most precise data possible. This avoids using vigra's polynomial root code which failed for high degrees when I used it. In addition to the code following the 'implicit scheme' proposed by Thevenaz, I provide code to use an 'explicit scheme' to obtain the b-spline coefficients. The implicit scheme makes assumptions about the continuation of the signal outside of the window of data which is accessible: that the data continue mirrored, reflected, etc. - and it proceeds to capture these assumptions in formulae deriving suitable initial causal/anticausal coefficients from them. Usually this is done with a certain 'horizon' which takes into account the limited arithmetic precision of the calculations and abbreviates the initial coefficient calculation to a certain chosen degree of precision. The same effect can be achieved by simply embedding the knot point data into a frame containing extrapolated knot point data. If the frame is chosen so wide that margin effects don't 'disturb' the core data, we end up with an equally (im)precise result with an explicit scheme. The width of the frame now takes the role of the horizon used in the implicit scheme and has the same effect. While the explicit scheme needs more memory, it has several advantages: - there is no need to code specific routines for initial coefficient generation - nor any need to explicitly run such code - the iteration over the input becomes more straightforward - arbitrary unconventional extrapolation schemes can be used easily A disadvantage, apart from the higher memory consumption, is that one cannot give a 'precise' solution, which the implicit scheme can do for the cases it can handle. But what is 'precise'? Certainly there is no precision beyond the arithmetic precision offered by the underlying system. So if the horizon is chosen wide enough, the resulting coefficients become 'just about' the same with all schemes. They are interchangeable. In an image-processing context, the extra memory needed would typically be a small single-digit percentage - not really a bother. In my trials, I found the runtime differences between the two approaches negligible and the simplification of the code so attractive that I was tempted to choose the explicit scheme over the implicit. Yet since the code for the implicit scheme is there already and some of it is even used in the explicit scheme I keep both methods in the code base for now. Note that using the explicit scheme also makes it possible to, if necessary, widen the shape of the complete coefficient array (including the 'frame') so that it becomes vector-friendly. Currently, this is not done. [CIT2000] Interpolation Revisited by Philippe Thévenaz, Member, IEEE, Thierry Blu, Member, IEEE, and Michael Unser, Fellow, IEEE in IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 19, NO. 7, JULY 2000, */ // TODO instead of erecting a horizon-wide frame around the core coefficients for the explicit // extrapolation, one might only widen the buffer and extrapolate inside the buffer, writing back // to a smaller array of core coefficients, optionally with a brace.
The only drawback is in // handling extrapolation schemes which pick values for extrapolation which aren't collinear // to the buffered data, like SPHERICAL BCs, which is currently the only one exhibiting such // behaviour. One option would be to abolish SPHERICAL BCs and force users to use MANUAL // prefiltering strategy for spherical data, which would, in a way, be 'purer' anyway, since // SPHERICAL BCs are not really what you'd expect in a general-purpose b-spline library, as // they are quite specific to panoramic image processing. #ifndef VSPLINE_PREFILTER_H #define VSPLINE_PREFILTER_H #include "common.h" #include "filter.h" #include "basis.h" namespace vspline { using namespace std ; using namespace vigra ; /// With large data sets, and with higher dimensionality, processing separately along each /// axis consumes a lot of memory bandwidth. There are ways out of this dilemma by interleaving /// the code. Disregarding the calculation of initial causal and anticausal coefficients, the code /// to do this would perform the forward filtering step for all axes at the same time and then, later, /// the backward filtering step for all axes at the same time. This is possible, since the order /// of the filter steps is irrelevant, and the traversal of the data can be arranged so that /// values needed for context of the filter are always present (the filters are recursive and only /// 'look' one way). I have investigated these variants, but especially the need to calculate /// initial causal/anticausal coefficients, and the additional complications arising from /// vectorization, have kept me from choosing this path for the current body of code. With the /// inclusion of the explicit scheme for prefiltering, dimension-interleaved prefiltering becomes /// more feasible, and I anticipate revisiting it. /// /// Here I am using a scheme where I make access to 1D subsets of the data very efficient /// (by buffering lines/stripes of data) and rely on the fact that such simple, fast access plays /// well with the compiler's optimizer and pipelining in the CPU. From the trials on my own system /// I conclude that this approach does not perform significantly worse than interleaving schemes /// and is much easier to formulate and understand. And with fast access to 1D subsets, higher order /// splines become less of an issue; the extra arithmetic to prefilter for, say, quintic splines is /// done very quickly, since no additional memory access is needed beyond a buffer's worth of data /// already present in core memory. /// /// 'solve' is just a thin wrapper around filter_nd in filter.h, injecting the actual number of poles /// and the poles themselves. /// /// Note how smoothing comes into play here: it's done simply by /// prepending an additional pole to the filter cascade, taking a positive value between /// 0 (no smoothing) and 1 (total blur) if 'smoothing' is not 0.0. While I'm not sure about /// the precise mathematics (yet) this does what is intended very efficiently. Why smooth? /// If the signal is scaled down when remapping, we'd have aliasing of higher frequencies /// into the output, producing artifacts. Pre-smoothing with an adequate factor removes the /// higher frequencies (more or less), avoiding the problem. /// /// Using this simple method, pre-smoothing is computationally cheap, but the method used /// here isn't (?) equivalent to convolving with a Gaussian, though the effect is quite similar. /// I think the method is called exponential smoothing.
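// For orientation, a rough sketch of what prepending the smoothing pole amounts to,
// assuming the extra pole is processed just like the 'genuine' prefilter poles in
// filter.h (which is all 'solve' does with it): with a pole p, 0 < p < 1, the filter
// runs the usual causal and anticausal first-order recursions over the signal x,
// roughly
//
//   y[k] = x[k] + p * y[k-1]          (causal, forward pass)
//   c[k] = p * ( c[k+1] - y[k] )      (anticausal, backward pass)
//
// so the additional pole acts as a forward-backward exponential smoother; the
// overall gain is handled by the filter code, just as for the other poles. This is
// only meant as an illustration of the effect, not as the missing formal treatment.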
// TODO: establish the proper maths for this smoothing method template < typename input_array_type , ///< type of array with knot point data typename output_array_type , ///< type of array for coefficients (may be the same) typename math_type > ///< type for arithmetic operations in filter void solve ( input_array_type & input , output_array_type & output , TinyVector < bc_code , input_array_type::actual_dimension > bcv , int degree , double tolerance , double smoothing = 0.0 , int njobs = default_njobs ) { if ( smoothing != 0.0 ) { assert ( smoothing > 0.0 && smoothing < 1.0 ) ; int npoles = degree / 2 + 1 ; long double *pole = new long double [ npoles ] ; pole[0] = smoothing ; for ( int i = 1 ; i < npoles ; i++ ) pole[i] = vspline_constants::precomputed_poles [ degree ] [ i - 1 ] ; filter_nd < input_array_type , output_array_type , math_type > ( input , output , bcv , npoles , pole , tolerance , njobs ) ; delete[] pole ; } else filter_nd < input_array_type , output_array_type , math_type > ( input , output , bcv , degree / 2 , vspline_constants::precomputed_poles [ degree ] , tolerance , njobs ) ; } } ; // namespace vspline #endif // VSPLINE_PREFILTER_H kfj-vspline-6e66cf7a7926/prefilter_poles.cc000066400000000000000000000215351320375670700206100ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform rational b-splines */ /* */ /* Copyright 2015, 2016 by Kay F. Jahnke */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /*! \file prefilter_poles.cc \brief calculates the poles of the b-spline prefilter using gsl and BLAS this doesn't have to be done for installing vspline if poles.cc is already present. Providing degrees up to 24 is just about what gsl can handle, with such high degrees the evaluation becomes quite imprecise as well, especially for floats. compile with: clang++ -O3 -std=c++11 prefilter_poles.cc -oprefilter_poles -lgsl -lblas run ./prefilter_poles > poles.cc In this latest version, the double precision roots calculated by gsl/blas are now polished as long doubles to the best precision we can provide. TODO: could do with some TLC...
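For reference, the relation the code below exploits (following [CIT2000]): the
prefilter poles for a given degree are the roots with magnitude < 1 of the
polynomial whose coefficients are the values of the basis function at the
integers. calculatePrefilterCoefficients() sets up exactly these coefficients
in a[] (and, in long double precision, in la[]), hands them to gsl's root
finder, and newton() is then used to polish the roots in long double precision.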
*/ #include #include #include #include #include #include using namespace std ; using namespace vigra ; // newton() is an implementation of the newton method to approach a zero // of a polynomial from a starting point nearby. We use this routine to // polish the zeros we get from gsl/blas, since these values are only // calculated in double precision and we want to take the precision as // high as we possibly can. // newton() calculates the augmented value from the current value, it's // run in a loop until some termination criterion is satisfied. template < typename dtype > dtype newton ( dtype p , int ncoeff , dtype * coeff ) { dtype * pk = coeff + ncoeff - 1 ; // point to last coefficient dtype pot = 1 ; // powers of p int ex = 0 ; // exponent yielding the power int xd = 1 ; // ex derivative, d(x^n) = n(x^(n-1)) dtype fx = 0 ; // f(p) dtype dfx = 0 ; // f'(p) // we work our way back to front, starting with p^0 and raising // the power with each iteration fx += pot * *pk ; for ( ; ; ) { --pk ; if ( pk == coeff ) break ; dfx += pot * xd * *pk ; xd++ ; ex++ ; pot = pow ( p , ex ) ; fx += pot * *pk ; } pot *= p ; fx += pot * *pk ; // std::cout << "fx: " << fx << " dfx: " << dfx << std::endl ; dtype next = p - fx / dfx ; // std::cout << "p: " << p << " p': " << next << std::endl ; return next ; } template < class T > ArrayVector calculatePrefilterCoefficients(int DEGREE) { ArrayVector res; const int r = DEGREE / 2; double a[2*r+1] ; long double la[2*r+1] ; double z[4*r+2] ; cout << "const long double K" << DEGREE << "[] = {" << endl ; // we calculate the basis function values at 0.5 intervals int imax = 2 * r ; if ( DEGREE & 1 ) imax++ ; for(int i = 0; i <= imax ; ++i) { long double v = vspline::cdb_bspline_basis_2 ( i , DEGREE , 0 ) ; cout << " " << v << "L , // basis(" << i * .5 << ")" << endl ; if ( ! ( i & 1 ) ) { // for even i, we put the value in a[] as well - only even i // correspond to the value of the basis function at integral values // which we need for the poles int ih = i / 2 ; a [ r - ih ] = a [ r + ih ] = v ; la [ r - ih ] = la [ r + ih ] = v ; } } // alternatively, gen_bspline_basis can be used to the same effect, // the result is identical. // // for(int i = 0; i <= imax ; ++i) // { // long double half_i = i / (long double) 2.0 ; // long double v // = vspline::gen_bspline_basis // ( half_i , DEGREE , 0 ) ; // // cout << " " << v << "L , // basis(" << half_i << ")" << endl ; // if ( ! ( i & 1 ) ) // { // // for even i, we put the value in a[] as well - only even i // // correspond to the value of the basis function at integral values // // which we need for the poles // int ih = i / 2 ; // a [ r - ih ] = a [ r + ih ] = v ; // la [ r - ih ] = la [ r + ih ] = v ; // } // } cout << " } ; " << endl ; if(DEGREE > 1) { ArrayVector roots; // we set up the environment gsl needs to find the roots gsl_poly_complex_workspace * w = gsl_poly_complex_workspace_alloc (2*r+1); // now we call gsl's root finder gsl_poly_complex_solve (a, 2*r+1, w, z); // and release it's workspace gsl_poly_complex_workspace_free (w); // we only look at the real parts of the values, which are stored // interleaved real/imag. And we take them back to front, even though // it doesn't matter which end we start with - but conventionally // Pole[0] is the root with the largest absolute, so I stick with that. 
for(int i = 2 * r - 2 ; i >= 0; i-=2) if(VIGRA_CSTD::fabs(z[i]) < 1.0) res.push_back(z[i]); } // // can use eigen alternatively: // { // using namespace Eigen ; // // Eigen::PolynomialSolver solver; // Eigen::Matrix coeff(2*r+1); // // long double * pa = a ; // for ( int i = 0 ; i < 2*r+1 ; i++ ) // { // coeff[i] = *pa++ ; // } // // solver.compute(coeff); // // const Eigen::PolynomialSolver::RootsType & r // = solver.roots(); // // for ( int i = r.rows() - 1 ; i >= 0 ; i-- ) // { // if ( std::fabs ( r[i].real() ) < 1 ) // res.push_back(r[i].real() ); // } // } // we polish the roots with the newton method for ( auto & p : res ) { long double pa = p , pb = 0 ; while ( ( pa - pb ) * ( pa - pb ) >= LDBL_EPSILON ) { pb = pa ; pa = newton ( pa , 2*r+1 , la ) ; } // std::cout << "....... settled on " << pa << std::endl ; p = pa ; } return res; } // TODO ugly mishmash of prints and calculations... void print_poles ( int degree ) { ArrayVector res = calculatePrefilterCoefficients ( degree ) ; if ( degree > 1 ) { cout << "const long double Poles_" << degree << "[] = {" << endl ; for ( auto r : res ) cout << r << "L ," << endl ; cout << "} ;" << endl ; } } int main ( int argc , char * argv[] ) { cout << setprecision(std::numeric_limits::max_digits10) ; for ( int degree = 0 ; degree < 25 ; degree++ ) print_poles(degree) ; cout << noshowpos ; cout << "const long double* const precomputed_poles[] = {" << endl ; cout << " 0, " << endl ; cout << " 0, " << endl ; for ( int i = 2 ; i < 25 ; i++ ) cout << " Poles_" << i << ", " << endl ; cout << "} ;" << endl ; cout << "const long double* const precomputed_basis_function_values[] = {" << endl ; for ( int i = 0 ; i < 25 ; i++ ) cout << " K" << i << ", " << endl ; cout << "} ;" << endl ; } kfj-vspline-6e66cf7a7926/thread_pool.h000066400000000000000000000145101320375670700175470ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015 - 2017 by Kay F. Jahnke */ /* */ /* The git repository for this software is at */ /* */ /* https://bitbucket.org/kfj/vspline */ /* */ /* Please direct questions, bug reports, and contributions to */ /* */ /* kfjahnke+vspline@gmail.com */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. 
*/ /* */ /************************************************************************/ /// \file thread_pool.h /// /// \brief provides a thread pool for vspline's multithread() routine /// /// class thread_pool aims to provide a simple and straightforward implementation /// of a thread pool for multithread() in multithread.h, but the class might find /// use elsewhere. The operation is simple, I think of it as 'piranha mode' ;) /// /// a set of worker threads is launched which wait for 'tasks', which come in the shape /// of std::function, from a queue. When woken, a worker thread tries to obtain /// a task. If it succeeds, the task is executed, and the worker thread tries to get /// another task. If none is to be had, it goes to sleep, waiting to be woken once /// there are new tasks. #include <thread> #include <mutex> #include <condition_variable> #include <queue> #include <functional> #include <vector> namespace vspline { class thread_pool { // used to switch off the worker threads at program termination. // access under task_mutex. bool stay_alive = true ; // the thread pool itself is held in this variable. The pool // does not change after construction std::vector < std::thread * > pool ; public: // mutex and condition variable for interaction with the task queue // and stay_alive std::mutex task_mutex ; std::condition_variable task_cv ; // queue to hold tasks. access under task_mutex std::queue < std::function < void() > > task_queue ; private: /// code to run a worker thread /// We use a thread pool of worker threads. These threads have a very /// simple cycle: They try and obtain a task (std::function). /// If there is one to be had, it is invoked, otherwise they wait on /// task_cv. When woken up, the flag stay_alive is checked, and if it /// is found to be false, the worker thread ends. void worker_thread() { while ( true ) { // under task_mutex, check stay_alive and try to obtain a task std::unique_lock < std::mutex > task_lock ( task_mutex ) ; if ( ! stay_alive ) { task_lock.unlock() ; break ; // die } if ( task_queue.size() ) { // there are tasks in the queue, take one auto task = task_queue.front() ; task_queue.pop() ; task_lock.unlock() ; // got a task, perform it, then try for another one task() ; } else { // no luck. wait. task_cv.wait ( task_lock ) ; // simply wait, spurious alert is okay } // start next cycle, either after having completed a job // or after having been woken by an alert } } public: thread_pool ( int nthreads = 4 * std::thread::hardware_concurrency() ) { // to launch a thread with a method, we need to bind it to the object: std::function < void() > wf = std::bind ( &thread_pool::worker_thread , this ) ; // now we can fill the pool with worker threads for ( int t = 0 ; t < nthreads ; t++ ) pool.push_back ( new std::thread ( wf ) ) ; } int get_nthreads() const { return pool.size() ; } ~thread_pool() { { // under task_mutex, set stay_alive to false std::lock_guard < std::mutex > task_lock ( task_mutex ) ; stay_alive = false ; } // wake all inactive worker threads, // join all worker threads once they are finished task_cv.notify_all() ; for ( auto threadp : pool ) { threadp->join() ; } // once all are joined, delete their std::thread object for ( auto threadp : pool ) { delete threadp ; } } } ; } ; // end of namespace vspline kfj-vspline-6e66cf7a7926/transform.h000066400000000000000000001763601320375670700172740ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015 - 2017 by Kay F.
Jahnke */ /* */ /* The git repository for this software is at */ /* */ /* https://bitbucket.org/kfj/vspline */ /* */ /* Please direct questions, bug reports, and contributions to */ /* */ /* kfjahnke+vspline@gmail.com */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /// \file transform.h /// /// \brief set of generic remap, transform and apply functions /// /// My foremost reason to have efficient B-spline processing is the formulation of /// generic remap-like functions. remap() is a function which takes an array of real-valued /// nD coordinates and an interpolator over a source array. Now each of the real-valued /// coordinates is fed into the interpolator in turn, yielding a value, which is placed /// in the output array at the same place the coordinate occupies in the coordinate /// array. To put it concisely, if we have /// /// - c, the coordinate array (or 'warp' array) /// - a, the source array /// - i, the interpolator over a /// - j, a coordinate into c and t /// - t, the target array /// /// remap defines the operation /// /// t[j] = i(c[j]) for all j /// /// Now we widen the concept of remapping to a 'transform' /// function. Instead of limiting the process to the use of an 'interpolator', we use /// an arbitrary unary functor transforming incoming values to outgoing values, where /// the type of the incoming and outgoing values is determined by the functor. If the /// functor actually is an interpolator, we have a 'true' remap transforming coordinates /// into values, but this is merely a special case. So here we have: /// /// - c, an array containing input values /// - f, a unary functor converting input to output values /// - j, a coordinate into c and t /// - t, the target array /// /// transform performs the operation /// /// t[j] = f(c[j]) for all j /// /// remaps/transforms to other-dimensional objects are supported. This makes it possible to, /// for example, remap from a volume to a 2D image, using a 2D coordinate array containing /// 3D coordinates. /// /// There is also a variant of this transform function in this file, which doesn't take an /// input array. Instead, for every target location, the location's discrete coordinates /// are passed to the unary_functor type object. 
This way, transformation-based remaps /// can be implemented easily: the user code just has to provide a suitable functor /// to yield values for coordinates. This functor will internally take the discrete /// incoming coordinates (into the target array) and take it from there, eventually /// producing values of the target array's value_type. /// Here we have: /// /// - f, a unary functor converting discrete coordinates to output values /// - j, a discrete coordinate into t /// - t, the target array /// /// 'index-based' transform performs the operation /// /// t[j] = f(j) for all j /// /// This file also has code to evaluate a b-spline at positions in a mesh grid, which can /// be used for scaling, and for separable geometric transformations. /// /// Finally there is a function to restore the original data from a b-spline to the /// precision possible with the given data type and degree of the spline. This is done /// with a call to transform for 1D splines, and a grid_eval for higher dimensions. /// /// The current implementation of the remap functionality uses a straightforward mode of /// operation, which factors out the various needed tasks into separate bits of code. The /// result data are acquired by 'pulling' them into the target array by repeatedly calling /// a functor yielding the results. This functor is a closure containing all logic needed /// to produce the result values in scan order of the target array. While remap, transform /// and grid_eval should cover most use cases, it's quite possible to use the routine fill() /// itself, passing in a suitable functor - but note it's in namespace vspline::detail. /// /// While the code presented here is quite involved and there are several types and routines /// the use(fulness) of which isn't immediately apparent, most use cases will be able to get /// by using only remap() or transform(). Calling these functions is simplified by /// the fact that their template arguments match function parameters. Hence remap and /// transform can be called without specifying the template arguments. /// /// Note: Currently, the calls to multithread() are hardwired to use partition_to_tiles() /// as their partitioner. partition_to_tiles() falls back to partition_to_stripes() if /// it's 'own' partitioning scheme fails to produce the desired number of parts or if /// the data are not 2D. This way, most use cases should receive adequate treatment. /// /// Coding remap functions for vspline is an interesting problem, because of vspline's /// scope. We want a solution which is dimension-agnostic, can handle all of vspline's /// potential value types, multithreads, and vectorizes transparently for such types /// which can be used with hardware vectorization, automatically falling back to /// unvectorized code if the value_type in question can't be vectorized. On top of /// that it should scale well and hide all this complexity in the implementation, /// providing only a clean, simple interface without the scary detail. /// /// It turns out that all these demands can be taken into account at the same time. /// The current solution is reasonably complex, but 'does the trick'. #ifndef VSPLINE_TRANSFORM_H #define VSPLINE_TRANSFORM_H #include "multithread.h" #include "eval.h" namespace vspline { using namespace std ; using namespace vigra ; template < int dimension > using bcv_type = vigra::TinyVector < bc_code , dimension > ; // we start out with the workhorse code. // The implementation of remap(), transform() etc. is after namespace detail. 
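// As a rough usage sketch (the names f, c and t are made up for this illustration;
// for the actual signatures see the function definitions after namespace detail):
//
//   // f: a unary_functor type object, mapping f's in_type to its out_type
//   // c: vigra::MultiArrayView holding input values (e.g. coordinates)
//   // t: vigra::MultiArrayView of the same shape, receiving the results
//
//   vspline::transform ( f , c , t ) ;   // array-based: t[j] = f(c[j]) for all j
//   vspline::transform ( f , t ) ;       // index-based:  t[j] = f(j)   for all j
//
// a 'true' remap is the array-based case with an interpolator as the functor and
// real-valued nD coordinates in c.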
namespace detail { /// struct _fill contains the implementation of the 'engine' used for transform-like /// functions. The design logic is this: a transform will ultimately produce an array /// of results. This array is filled in standard scan order sequence by repeated /// calls to a functor containing all the logic to produce values in the required /// order. The functor is like a closure, being set up initially with all parameters /// needed for the task at hand (like with a warp array, a transformation, a genuine /// generator function etc.). Since the functor controls the path of the calculation /// from whatever starting point to the production of the final result, there are no /// intermediate containers for intermediate results. Since the remap process is /// mainly memory-bound, this strategy helps keeping memory use low. The data can /// be produced one by one, but the code has vectorized operation as well, which /// brings noticeable performance gain. With vectorized operation, instead of producing /// single values, the engine produces vectors of values. This operation is transparent /// to the caller, since the data are picked up and deposited in normal interleaved /// fashion. The extra effort for vectorized operation is in the implementation of the /// generator functor and reasonably straightforward. If only the standard remap /// functions are used, the user can remain ignorant of the vectorization. /// /// struct _fill's operator() takes an object of class generator_type. This object /// has to satisfy a few requirements: /// /// - it has to have an overloaded operator() accepting two signatures: one taking /// a pointer to vsize value_type, one taking a reference to a single value_type. /// these arguments specify where to deposit the generator's output. /// /// - it has to offer a bindOuter routine producing a subdimensional generator /// to supply values for a slice of output /// /// - it has to offer a subrange routine, limiting output to a subarray /// of the 'whole' output // TODO might write an abstract base class specifying the interface /// In the current implementation, the hierarchical descent to subdimensional slices /// is always taken to the lowest level, leaving the actual calls of the functor to /// occur there. While the hierarchical access may consume some processing time, mainly /// to establish the bounds for the 1D operation - but possibly optimized away, /// the operation on 1D data can use optimizations which gain more than is needed /// for the hierarchical descent. Especially when vectorized code is used, operation /// on 1D data is very efficient, since the data can be accessed using load/store /// or gather/scatter operations, even when the arrays involved are strided. /// Taking the hierarchical descent down to level 0 is encoded in fill() and it's /// workhorse code, the generator objects implemented here depend on the descent /// going all the way down to 1D. /// /// Note: support of hardware-assisted gather/scatter operations is reasonably new /// in Vc. On my (AVX2) system, using hardware gather/scatter increases performance /// for my typical applications in the order of magnitude of 10%, so if your code /// is time-critical, make sure your Vc is up-to-date - best build from source. 
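// To visualize the descent coded below for, say, a 3D target array (schematic only,
// with G standing for the generator type):
//
//   _fill < G , 3 >  slices the target along axis 2 (bindOuter) and recurses
//   _fill < G , 2 >  slices each such slice along axis 1 and recurses
//   _fill < G , 1 >  runs along axis 0, calling the generator for one batch of
//                    vsize values at a time, with a scalar loop for the leftovers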
// TODO while the current implementation has to issue load/store operations without // passing Vc::Aligned, since there is no guarantee that individual lines of data // are aligned, using special MultiArrays where the underlying memory's shape along // dimension 0 is extended to coincide with a vector boundary would allow using // aligned operation. But this would require further specialization, and/or looking // at the array's strides. Alternatively, a method Vc uses (hinted at further down) // processes single data until it hits an aligned memory location, from where on it // processes vectorized with aligned operations. I followed this lead but could not // produce a performance gain. // Note the third template argument, _vsize. This only comes into play at level 0 // (see the specializations below) and is needed to differentiate between cases // where the operation can be vectorized (because the underlying functor can do it) // and situations where it can't be (like, when using functors with only single-value // eval), in which case _vsize comes in as 1 and the code automatically falls back // to unvectorized mode. template < typename generator_type , // functor object yielding values int dim_out , // number of dimensions of output array int _vsize = 0 > struct _fill { void operator() ( generator_type & gen , MultiArrayView < dim_out , typename generator_type::value_type > & output ) { // we're not yet at the intended lowest level of recursion, // so we slice output and generator and feed the slices to the // next lower recursion level for ( int c = 0 ; c < output.shape ( dim_out - 1 ) ; c++ ) { // recursively call _fill for each slice along the highest axis auto sub_output = output.bindOuter ( c ) ; auto sub_gen = gen.bindOuter ( c ) ; _fill < decltype ( sub_gen ) , dim_out - 1 , generator_type::vsize >() ( sub_gen , sub_output ) ; } } } ; // browsing Vc's code base, I noticed the undocumented functions // simd_for_each and simd_for_each_n, which do simple iterations over // contiguous single-channel memory - the code is in 'algorithms.h'. // what's interesting there is that the code iterates with scalar // values until it has reached an aligned address. then it continues // by passing vectors to the unary functor as long as full vectors // can be found, and finally the remaining values are also passed as // scalars. The effect is that the central loop which is processing // vectors will certainly load from an aligned adress, and hence the // load operation can be issued with Vc::Aligned set true. // #defining RUNUP_TO_ALIGNED implements this behaviour, but in my // tests, the resulting code was slower. I suspect this is due to // the runup code making it harder for the optimizer. Another factor // is my specific vector unit (AVX2) - AFAIK AVX2 handles unaligned // access efficiently, while older vector units may perform badly // with unaligned access. for these vector units, using RUNUP_TO_ALIGNED // might produce performance gain TODO try // #define RUNUP_TO_ALIGNED /// specialization of _fill for level 0 ends the recursive descent. /// Here, with template argument _vsize unfixed, we have the vector code, /// below is a specialization for _vsize == 1 which is unvectorized. 
template < typename generator_type , int _vsize > struct _fill < generator_type , 1 , _vsize > { typedef typename generator_type::value_type value_type ; // get the functor's type and use it to fix a few types needed for // vectorized operation typedef typename generator_type::functor_type functor_type ; enum { dimension = functor_type::dim_out } ; enum { vsize = functor_type::vsize } ; enum { advance = dimension * vsize } ; typedef typename functor_type::out_v out_v ; typedef typename functor_type::out_ele_type ele_type ; typedef typename functor_type::out_ele_v ele_v ; typedef typename vspline::vector_traits < ele_type , vsize > :: index_type index_type ; inline void store ( const ele_v & src , ele_type * dp ) { #ifdef RUNUP_TO_ALIGNED src.store ( dp , Vc::Aligned ) ; #else src.store ( dp ) ; #endif } // compiler needs this overload, but it is never called // TODO: avoid it altogether inline void store ( const TinyVector < ele_v , dimension > & src , ele_type * dp ) { assert ( dimension == 1 ) ; src[0].store ( dp ) ; } inline void scatter ( const ele_v & src , ele_type * dp , const index_type & indexes ) { src.scatter ( dp , indexes ) ; } inline void scatter ( const TinyVector < ele_v , dimension > & src , ele_type * dp , const index_type & indexes ) { for ( int e = 0 ; e < dimension ; e++ ) src[e].scatter ( dp + e , indexes ) ; } void operator() ( generator_type & gen , MultiArrayView < 1 , typename generator_type::value_type > & output ) { auto target_it = output.begin() ; int leftover = output.elementCount() ; ele_type * dp = (ele_type*) ( output.data() ) ; #ifdef RUNUP_TO_ALIGNED while ( leftover && ( ! vspline::is_aligned ( dp ) ) ) { gen ( *target_it ) ; ++target_it ; --leftover ; dp = (ele_type*) &(*target_it) ; } #endif int aggregates = leftover / vsize ; // number of full vectors leftover -= aggregates * vsize ; // remaining leftover single values out_v target_buffer ; if ( output.isUnstrided() ) { if ( dimension == 1 ) { // best case: unstrided operation on 1D data, we can use // efficient SIMD store operation for ( int a = 0 ; a < aggregates ; a++ , dp += advance ) { gen ( target_buffer ) ; // and store it to destination with a SIMD store. store ( target_buffer , dp ) ; } } else { // second best: unstrided operation on nD data for ( int a = 0 ; a < aggregates ; a++ , dp += advance ) { gen ( target_buffer ) ; // and store it to destination with a scatter operation. scatter ( target_buffer , dp , index_type::IndexesFromZero() * dimension ) ; } } } else { // worst case: strided operation. here, instead of using 'advance' // directly (which is compile-time constant and therefore potentially // very good for the optimizer) we have to use a run-time value for // advancing dp. auto strided_advance = advance * output.stride(0) ; for ( int a = 0 ; a < aggregates ; a++ , dp += strided_advance ) { // here we generate to a simdized target type gen ( target_buffer ) ; // and store it to destination using a scatter operation. scatter ( target_buffer , dp , index_type::IndexesFromZero() * dimension * output.stride(0) ) ; } } // if there aren't any leftovers, we can return straight away. if ( ! leftover ) return ; // otherwise, advance target_it to remaining single values target_it += aggregates * vsize ; // process leftovers. If vc isn't used, this loop does all the processing while ( leftover-- ) { // process leftovers with single-value evaluation gen ( *target_it ) ; ++target_it ; } } } ; /// unvectorized variant of 1D _fill object. This is very straightforward. 
template < typename generator_type > struct _fill < generator_type , 1 , 1 > { typedef typename generator_type::value_type value_type ; void operator() ( generator_type & gen , MultiArrayView < 1 , typename generator_type::value_type > & output ) { auto target_it = output.begin() ; auto target_end = output.end() ; // process leftovers. If vc isn't used, this loop does all the processing while ( target_it != target_end ) { // process leftovers with single-value evaluation gen ( *target_it ) ; ++target_it ; } } } ; /// single-threaded fill. This routine receives the range to process and the generator /// object capable of providing result values. The generator object is set up to provide /// values for the desired subrange and then passed to _fill, which handles the calls to /// the generator object and the depositing of the result values into the target array. template < typename generator_type , // functor object yielding values int dim_out > // number of dimensions of output array void st_fill ( shape_range_type < dim_out > range , generator_type * const p_gen , MultiArrayView < dim_out , typename generator_type::value_type > * p_output ) { // pick out output's subarray specified by 'range' auto output = p_output->subarray ( range[0] , range[1] ) ; // get a new generator to cover the same range. we need an instance here! // the generator carries state, we're in the single thread, processing one // chunk out of the partitioning, so the generator we have here won't be // used by other threads (which would be wrong, since it carries state). // but it may be subdivided into yet more generators if fill decides to slice // it and process slices. auto gen = p_gen->subrange ( range ) ; // have the results computed and put into the target _fill < generator_type , dim_out , generator_type::vsize >() ( gen , output ) ; } /// multithreaded fill. This is the top-level fill routine. It takes a functor capable /// of delivering successive result values (in the target array's scan order), and calls /// this functor repeatedly until 'output' is full. /// this task is distributed to several worker threads by means of 'multithread', which in /// turn uses st_fill, the single-threaded fill routine. template < typename generator_type , // functor object yielding values int dim_target > // number of dimensions of output array void fill ( generator_type & gen , MultiArrayView < dim_target , typename generator_type::value_type > & output ) { // set up 'range' to cover a complete array of output's size shape_range_type < dim_target > range ( shape_type < dim_target > () , output.shape() ) ; // heuristic. minumum desired number of partitions; partition_to_tiles // only uses this value when it delegates to partition_to_stripes. int njobs = vspline::common_thread_pool.get_nthreads() ; // call multithread(), specifying the single-threaded fill routine as the // functor to invoke the threads with, and the partitioner to use on 'range'. // next come desired number of partitions and the original, 'whole' range, // followed by the other parameters the single-threaded fill needs, which is // pretty much the set of parameters we've received here, with the difference // that we don't pass anything on by reference and use pointers instead. 
multithread ( & detail::st_fill < generator_type , dim_target > , vspline::partition_to_tiles < dim_target > , njobs , // desired number of partitions range , // 'full' range which is to be partitioned &gen , // generator_type object &output ) ; // target array } ; /// Next we code 'generators' for use with fill(). These objects can yield values /// to the fill routine, each in it's specific way. The first type we define is /// warp_generator. This generator yields data from an array, which, in the context /// of a remap-like function, will provide the coordinates to feed to the interpolator. /// Seen from the generalized context, it provides arguments to the functor to use /// to produce result values, and might more aptly be called something like 'picker', /// since it picks successive batches of input values from the input array. /// /// First is warp_generator for dimensions > 1. Here we provide 'subrange' and /// 'bindOuter' to be used for the hierarchical descent in _fill. The current /// implementation relies of the hierarchical descent going all the way to 1D, /// and does not implement operator() until the 1D specialization. /// /// note the flag strided_warp. If the warp array is strided in dimension 0, /// this flag has to be set true. template < int dimension , typename unary_functor_type , bool strided_warp > struct warp_generator { typedef unary_functor_type functor_type ; typedef typename unary_functor_type::out_type value_type ; typedef typename unary_functor_type::in_type nd_rc_type ; enum { vsize = unary_functor_type::vsize } ; typedef MultiArrayView < dimension , nd_rc_type > warp_array_type ; const warp_array_type warp ; // must not use reference here! const unary_functor_type & itp ; const unary_functor_type & get_functor() { return itp ; } warp_generator ( const warp_array_type & _warp , const unary_functor_type & _itp ) : warp ( _warp ) , itp ( _itp ) { } ; warp_generator < dimension , unary_functor_type , strided_warp > subrange ( const shape_range_type < dimension > & range ) const { return warp_generator < dimension , unary_functor_type , strided_warp > ( warp.subarray ( range[0] , range[1] ) , itp ) ; } warp_generator < dimension - 1 , unary_functor_type , strided_warp > bindOuter ( const int & c ) const { return warp_generator < dimension - 1 , unary_functor_type , strided_warp > ( warp.bindOuter ( c ) , itp ) ; } } ; /// here we have the 1D specialization of warp_generator, where the actual /// processing takes place. template < typename unary_functor_type , bool strided_warp > struct warp_generator < 1 , unary_functor_type , strided_warp > { typedef unary_functor_type functor_type ; typedef typename unary_functor_type::in_type nd_rc_type ; enum { dimension = unary_functor_type::dim_in } ; typedef typename unary_functor_type::out_type value_type ; enum { vsize = unary_functor_type::vsize } ; typedef MultiArrayView < 1 , nd_rc_type > warp_array_type ; const warp_array_type warp ; // must not use reference here! typedef typename unary_functor_type::in_ele_type ele_type ; const ele_type * dp ; typename warp_array_type::const_iterator witer ; const unary_functor_type & itp ; const unary_functor_type & get_functor() { return itp ; } warp_generator ( const warp_array_type & _warp , const unary_functor_type & _itp ) : warp ( _warp ) , itp ( _itp ) , witer ( _warp.begin() ) , dp ( (ele_type*) ( _warp.data() ) ) { } ; /// If vectorization isn't used, this routine does all the work. /// This is the overload taking a straight value_type & as it's /// argument. 
Below is code for vectorized operation. /// We dispatch on strided_warp: void operator() ( value_type & target ) { operator() ( target , std::integral_constant < bool , strided_warp > () ) ; } /// unvectorized operator() for strided warp arrays void operator() ( value_type & target , std::true_type ) { itp.eval ( *((nd_rc_type*)dp) , target ) ; dp += dimension * warp.stride(0) ; } /// unvectorized operator() for unstrided warp arrays void operator() ( value_type & target , std::false_type ) { itp.eval ( *((nd_rc_type*)dp) , target ) ; dp += dimension ; } #ifdef USE_VC enum { advance = dimension * vsize } ; typedef typename vector_traits < ele_type , vsize > :: ele_v ele_v ; typedef typename vspline::vector_traits < ele_type , vsize > :: index_type index_type ; const index_type indexes = vspline::vector_traits < ele_type , vsize > :: IndexesFromZero() * dimension ; typedef typename unary_functor_type::in_ele_v source_ele_type ; typedef vigra::TinyVector < source_ele_type , dimension > source_type ; // initially I implemented a single operator() with conditionals on // strided_warp and dimension, expecting that the compiler would // pick out the right code without performance impact, but this turned // out wrong. so now I'm using a dispatch mechanism which picks the // appropriate code, effectively forcing the compiler to do the right // thing. TODO: this teaches me a lesson. I think I have relied on // dead code elimination in several places, so I may have to go through // the inner loops looking for similar situations. The performance // difference was not large but consistently measurable. /// dispatch to the operator() variant for strided or unstrided warp. /// while the code for both variants is very similar, the differentiation /// is important, because the unstrided case can use advance (which is /// a compile-time constant) directly, while the second case has to /// multiply with the stride, which is a run-time value. /// we write this as a member function template, making it a worse match /// for operator() ( value_type & ) and so assuring that this overload /// will only match if T is *not* a straight value_type, in which case /// we can be assured that we're running vector code. template < class T > inline void operator() ( T & target ) { static_assert ( vsize > 1 , "this code must not be called for vsize == 1" ) ; operator() ( target , std::integral_constant < bool , strided_warp > () ) ; } /// vectorized variant of operator() for strided warp arrays /// here we don't need to dispatch further, since the stride forces /// us to use gather operations even for 1D data. template < class T > inline void operator() ( T & target , std::true_type ) // strided warp array { source_type buffer ; for ( int e = 0 ; e < dimension ; e++ ) buffer[e].gather ( dp + e , indexes * warp.stride(0) ) ; itp.eval ( unwrap(buffer) , target ) ; dp += advance * warp.stride(0) ; } /// vectorized variant of operator() for unstrided warp arrays /// this variant of operator() further dispatches on 1D/nD data, which /// would be futile for strided data (which have to use gather anyway) /// but, with unstrided data, if the data are 1D, can result in a (fast) /// SIMD load operation. Otherwise it's gathers. 
template < class T > inline void operator() ( T & target , std::false_type ) // unstrided warp array { source_type buffer ; load ( buffer , std::integral_constant < bool , dimension == 1 > () ) ; itp.eval ( unwrap(buffer) , target ) ; dp += advance ; } /// loading 1D data from unstrided memory can use SIMD load instruction: inline void load ( source_type & buffer , std::true_type // data are 1D, use SIMD load ) { buffer[0].load ( dp ) ; } /// nD data have to be gathered instead, and buffer is indexable inline void load ( source_type & buffer , std::false_type // not 1D, use gather ) { for ( int e = 0 ; e < dimension ; e++ ) buffer[e].gather ( dp + e , indexes ) ; } #endif /// subrange is used to create a warp_generator from part of the data /// while we are at the lowest level here, we still need the subrange routine /// for cases where the data are 1D in the first place: in this situation, /// we need to be able to split up the range as well. warp_generator < 1 , unary_functor_type , strided_warp > subrange ( const shape_range_type < 1 > & range ) const { return warp_generator < 1 , unary_functor_type , strided_warp > ( warp.subarray ( range[0] , range[1] ) , itp ) ; } } ; /// for transform() from indexes we need a different generator object: here we don't /// pick input values at successive locations from an array, but instead pass the nD /// indices which correspond to these locations - and are the same at which /// output will be stored, as well. In fact it is feasible to implement /// warp_generator using index_generator, by simply picking data from the input array /// at the indexes index_generator produces. I tried that, but due to the index maths /// needed, it came out slower than the implementation I give here. /// /// class index_generator provides nD indices as input to it's functor which coincide /// with the location in the target array for which the functor is called. The data type /// of these indices is derived from the functor's input type. Again we presume that /// fill() will recurse to level 0, so index_generator's operator() will only be called /// at the lowest level of recursion, and we needn't even define it for higher levels. template < typename unary_functor_type , int level > struct index_generator { typedef unary_functor_type functor_type ; typedef typename unary_functor_type::out_type value_type ; enum { dimension = unary_functor_type::dim_in } ; enum { vsize = unary_functor_type :: vsize } ; const unary_functor_type & itp ; const shape_range_type < dimension > range ; const unary_functor_type & get_functor() { return itp ; } index_generator ( const unary_functor_type & _itp , const shape_range_type < dimension > _range ) : itp ( _itp ) , range ( _range ) { } ; index_generator < unary_functor_type , level > subrange ( const shape_range_type < dimension > range ) const { return index_generator < unary_functor_type , level > ( itp , range ) ; } index_generator < unary_functor_type , level - 1 > bindOuter ( const int & c ) const { auto slice_start = range[0] , slice_end = range[1] ; slice_start [ level ] += c ; slice_end [ level ] = slice_start [ level ] + 1 ; return index_generator < unary_functor_type , level - 1 > ( itp , shape_range_type < dimension > ( slice_start , slice_end ) ) ; } } ; /// specialization of index_generator for level 0. Here, the indices for all higher /// dimensions have been fixed by the hierarchical descent, and we only need to concern /// ourselves with the index(es) for dimension 0, and supply the operator() implementations. 
/// Note how we derive the concrete type of index from the functor. This way, whatever /// the functor takes is provided with no need of type conversion, which would be necessary /// if we'd only produce integral indices here. template < typename unary_functor_type > struct index_generator < unary_functor_type , 0 > { typedef unary_functor_type functor_type ; typedef typename unary_functor_type::in_ele_type index_ele_type ; typedef typename unary_functor_type::out_type value_type ; enum { dimension = unary_functor_type::dim_in } ; typedef vigra::TinyVector < index_ele_type , dimension > index_type ; enum { vsize = unary_functor_type::vsize } ; #ifdef USE_VC typedef typename unary_functor_type::out_v out_v ; typedef typename unary_functor_type::in_ele_v index_ele_v ; typedef vigra::TinyVector < index_ele_v , dimension > index_v ; index_v current_v ; // current vectorized index to feed to functor #endif index_type current ; // singular index const unary_functor_type & itp ; const shape_range_type < dimension > range ; const unary_functor_type & get_functor() { return itp ; } index_generator ( const unary_functor_type & _itp , const shape_range_type < dimension > _range ) : itp ( _itp ) , range ( _range ) { // initially, set the singular index to the beginning of the range current = index_type ( range[0] ) ; #ifdef USE_VC // vectorized processing will be done only if vsize > 1. // vectorized processing will process the bulk of the data, leaving // only a few 'stragglers' to mop up afterwards. But if vsize == 1, // we're using the unvectorized code as fallback, in which case // all values are treated as stragglers ;) if ( vsize > 1 ) { // initialize current_v to hold the first simdized index for ( int d = 0 ; d < dimension ; d++ ) current_v[d] = index_ele_v ( range[0][d] ) ; current_v[0] += vspline::vector_traits < index_ele_type , vsize > :: IndexesFromZero() ; // if vc is used, the singular index will only be used for mop-up action // after all aggregates have been processed. int size = range[1][0] - range[0][0] ; int aggregates = size / vsize ; } #endif } ; /// single-value evaluation. This will be used for all values if vc isn't used, /// or only for mop-up action after all full vectors are processed. If operator() /// is called for straight value_type, this is the best matching overload. void operator() ( value_type & target ) { itp.eval ( unwrap ( current ) , target ) ; current[0] += index_ele_type ( 1 ) ; } #ifdef USE_VC /// vectorized evaluation. Hierarchical decent has left us with only the /// level0 coordinate to increase, making this code very efficient. /// Here we have T as a template argument. This version will only match /// if T is not a straight value_type, because if it were, the first operator() /// variant would be preferred. template < class T > void operator() ( T & target ) { static_assert ( vsize > 1 , "this code must not be called for vsize == 1" ) ; current_v[0] = index_ele_v::IndexesFromZero() + index_ele_v ( current[0] ) ; itp.eval ( current_v , target ) ; current[0] += vsize ; } #endif /// while we are at the lowest level here, we still need the subrange routine /// for cases where the data are 1D in the first place: in this situation, /// we need to be able to split up the range as well. 
index_generator < unary_functor_type , 0 > subrange ( const shape_range_type < dimension > range ) const { return index_generator < unary_functor_type , 0 > ( itp , range ) ; } } ; } ; // namespace detail /// implementation of transform() by delegation to the more general fill() routine, /// passing in the input array and the interpolator via a generator object. /// This is a generalization of a remap routine: the remap concept looks at the incoming /// data as coordinates, at the functor as an interpolator yielding values for coordinates, /// and at the output as an array of thusly generated values. /// Here, incoming and outgoing data aren't necessarily coordinates or the result of /// an interpolation, they can be any pair of types which the functor can handle. /// /// transform takes two template arguments: /// /// - 'unary_functor_type', which is a class satisfying the interface laid down in /// unary_functor.h. Typically, this would be a type inheriting from /// vspline::unary_functor, but any type will do as long as it provides the required /// typedefs and an the relevant eval() routines. /// /// - the type of the output array /// /// this overload of transform takes three parameters: /// /// - a reference to a const unary_functor_type object providing the functionality needed /// to generate values from coordinates. /// /// - a reference to a const MultiArrayView holding values to feed to the unary functor /// object. It has to have the same shape as the target array and contain data of /// the unary_functor's in_type. /// /// - a reference to a MultiArrayView to use as a target. This is where the resulting /// data are put, so it has to contain data of unary_functor's out_type. It has to have /// the same shape as the input array. template < typename unary_functor_type , // functor yielding values for coordinates typename output_type > // type of output array void transform ( const unary_functor_type & ev , const MultiArrayView < output_type::actual_dimension , typename unary_functor_type::in_type > & input , output_type & output ) { // make sure the functor's output type matches the otput array's value_type static_assert ( std::is_same < typename unary_functor_type::out_type , typename output_type::value_type > :: value , "functor's output type and output's value_type must match" ) ; // check shape compatibility if ( output.shape() != input.shape() ) { throw shape_mismatch ( "transform: the shapes of the input array and the output array do not match" ) ; } enum { dim_target = output_type::actual_dimension } ; // we test if the input array is unstrided in dimension 0. If that is so, even // if it is strided in higher dimensions, via the hierarchical descent we will // eventually arrive in dimension 0 and iterate over an unstrided array. // This only matters if Vc is used, because if the input array is unstrided, // the coordinates can be loaded more effectively. Note that this method // requires that the hierarchical access goes down all the way to 1D. // this test determines the type of input generator we need. With this type // fixed, we proceed to set up the appropriate generator object and pass // it to fill, together with the output array to receive the results. 
if ( input.isUnstrided ( 0 ) ) { typedef detail::warp_generator < dim_target , unary_functor_type , false // unstrided > gen_t ; gen_t gen ( input , ev ) ; detail::fill < gen_t , dim_target > ( gen , output ) ; } else { // input array is strided even in dimension 0 typedef detail::warp_generator < dim_target , unary_functor_type , true > gen_t ; // strided gen_t gen ( input , ev ) ; detail::fill < gen_t , dim_target > ( gen , output ) ; } } /// for backward compatibility, deprecated. /// up to vspline 0.2.1, the function above was also named 'remap', but I decided to /// rename it to 'transform', which names it more aptly. template < typename unary_functor_type , // functor yielding values for coordinates typename output_type > // type of output array void remap ( const unary_functor_type & ev , const MultiArrayView < output_type::actual_dimension , typename unary_functor_type::in_type > & input , output_type & output ) { vspline::transform ( ev , input , output ) ; } /// we code 'apply' as a special variant of transform where the output /// is also used as input, so the effect is to feed the unary functor /// each 'output' value in turn, let it process it and store the result /// back to the same location. template < typename unary_functor_type , // functor yielding values for coordinates typename output_type > // type of output array void apply ( const unary_functor_type & ev , output_type & output ) { // make sure the functor's output type matches the otput array's value_type static_assert ( std::is_same < typename unary_functor_type::out_type , typename output_type::value_type > :: value , "functor's value_type and array's value_type must match" ) ; // make sure the functor's input and output type are the same static_assert ( std::is_same < typename unary_functor_type::in_type , typename unary_functor_type::out_type > :: value , "functor's input and output type must match" ) ; transform ( ev , output , output ) ; } /// Implementation of 'classic' remap, which directly takes an array of values and remaps /// it, internally creating a b-spline of given order just for the purpose. This is used for /// one-shot remaps where the spline isn't reused, and specific to b-splines, since /// the functor used is a b-spline evaluator. The spline defaults to a cubic b-spline /// with mirroring on the bounds. /// /// So here we have the 'classic' remap, where the input array holds coordinates and /// the functor used is actually an interpolator. Since this is merely a special case /// of using transform(), we delegate to transform(). 
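///
/// A minimal usage sketch - not part of the library, names and sizes are
/// made up for illustration only:
///
/// \code
/// // 2D knot point data, and a warp array holding one 2D coordinate
/// // per target location
/// vigra::MultiArray < 2 , float > image ( vigra::Shape2 ( 1024 , 768 ) ) ;
/// vigra::MultiArray < 2 , vigra::TinyVector < float , 2 > >
///   warp ( vigra::Shape2 ( 1024 , 768 ) ) ;
/// vigra::MultiArray < 2 , float > target ( warp.shape() ) ;
/// // ... fill image with data and warp with coordinates into image ...
/// // one-shot remap, defaulting to a cubic b-spline with mirror boundaries
/// vspline::remap ( image , warp , target ) ;
/// \endcode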
template < typename input_type , typename warp_type , typename output_type > void remap ( const input_type & input , const warp_type & warp , output_type & output , bcv_type < input_type::actual_dimension > bcv = bcv_type < input_type::actual_dimension > ( MIRROR ) , int degree = 3 ) { // fix the type for coordinates typedef typename warp_type::value_type coordinate_type ; // fix the type for values/coefficients typedef typename input_type::value_type value_type ; static_assert ( std::is_same < typename input_type::value_type , typename output_type::value_type > :: value , "input and output array's value_type must match" ) ; static_assert ( warp_type::actual_dimension == output_type::actual_dimension , "warp aray's and output array's dimension must match" ) ; enum { dim_in = input_type::actual_dimension } ; static_assert ( dim_in == coordinate_type::static_size , "warp array must contain values with same dimension as input array" ) ; // check shape compatibility if ( output.shape() != warp.shape() ) { throw shape_mismatch ( "the shapes of the warp array and the output array must match" ) ; } // create the bspline object // TODO may want to specify tolerance here instead of using default bspline < value_type , dim_in > bsp ( input.shape() , degree , bcv ) ; // prefilter, taking data in 'input' as knot point data bsp.prefilter ( input ) ; // create an evaluator over the bspline typedef evaluator < coordinate_type , value_type > evaluator_type ; evaluator_type ev ( bsp ) ; // and call transform(), passing in the evaluator, // the coordinate array and the target array transform ( ev , warp , output ) ; } /// this overload of transform() is very similar to the previous one, but instead of /// picking input from an array, it feeds the discrete coordinates to the successive /// places data should be rendered to to the unary_functor_type object. /// /// this transform overload takes one template argument: /// /// - 'unary_functor_type', which is a class satisfying the interface laid down in /// unary_functor.h. This is an object which can provide values given *discrete* /// coordinates, like class evaluator, but generalized to allow for arbitrary ways /// of achieving it's goal. The unary functor's in_type determines the number of /// dimensions of the indices - since they are indices into the target array, the /// functor's input type has to have the same number of dimensions as the target. /// /// it takes two parameters: /// /// - a reference to a const unary_functor_type object providing the functionality needed /// to generate values from discrete coordinates /// /// - a reference to a MultiArrayView to use as a target. This is where the resulting /// data are put. template < class unary_functor_type > void transform ( const unary_functor_type & ev , MultiArrayView < unary_functor_type::dim_in , typename unary_functor_type::out_type > & output ) { enum { dim_target = unary_functor_type::dim_in } ; typedef typename unary_functor_type::out_type value_type ; typedef TinyVector < int , dim_target > nd_ic_type ; typedef detail::index_generator < unary_functor_type , dim_target - 1 > gen_t ; shape_range_type < dim_target > range ( nd_ic_type() , output.shape() ) ; gen_t gen ( ev , range ) ; detail::fill < gen_t , dim_target > ( gen , output ) ; } /// for backward compatibility, deprecated /// up to vspline 0.2.1, the function above was named 'index_remap', but I decided to /// rename it 'transform', which is more apt. 
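///
/// In new code, use the index-based transform() overload above directly.
/// A minimal sketch - hypothetical names, assuming 'bsp' is a prefiltered
/// 2D b-spline over float data:
///
/// \code
/// typedef vigra::TinyVector < float , 2 > crd_type ;
/// vspline::evaluator < crd_type , float > ev ( bsp ) ;
/// vigra::MultiArray < 2 , float > target ( vigra::Shape2 ( 1024 , 768 ) ) ;
/// // feed every discrete index into 'target' to ev and store the result there
/// vspline::transform ( ev , target ) ; // previously: index_remap ( ev , target )
/// \endcode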
template < class unary_functor_type > void index_remap( const unary_functor_type & ev , MultiArrayView < unary_functor_type::dim_in , typename unary_functor_type::out_type > & output ) { transform ( ev , output ) ; } namespace detail // workhorse code for grid_eval { // in grid_weight, for every dimension we have a set of ORDER weights // for every position in this dimension. in grid_ofs, we have the // partial offset for this dimension for every position. these partial // offsets are the product of the index for this dimension at the position // and the stride for this dimension, so that the sum of the partial // offsets for all dimensions yields the offset into the coefficient array // to the window of coefficients where the weights are to be applied. template < typename evaluator_type , int level , int _vsize = 0 > struct _grid_eval { typedef typename evaluator_type::ele_type weight_type ; typedef MultiArrayView < level + 1 , typename evaluator_type::value_type > target_type ; void operator() ( int initial_ofs , MultiArrayView < 2 , weight_type > & weight , weight_type** const & grid_weight , const int & ORDER , int ** const & grid_ofs , const evaluator_type & itp , target_type & result ) { for ( int ofs = 0 ; ofs < result.shape ( level ) ; ofs++ ) { for ( int e = 0 ; e < ORDER ; e++ ) weight [ vigra::Shape2 ( e , level ) ] = grid_weight [ level ] [ ORDER * ofs + e ] ; int cum_ofs = initial_ofs + grid_ofs [ level ] [ ofs ] ; auto region = result.bindAt ( level , ofs ) ; _grid_eval < evaluator_type , level - 1 , evaluator_type::vsize >() ( cum_ofs , weight , grid_weight , ORDER , grid_ofs , itp , region ) ; } } } ; /// Here, with template argument _vsize unfixed, we have the vector code, /// below is a specialization for _vsize == 1 which is unvectorized. template < typename evaluator_type , int _vsize > struct _grid_eval < evaluator_type , 0 , _vsize > { typedef typename evaluator_type::ele_type weight_type ; typedef MultiArrayView < 1 , typename evaluator_type::value_type > target_type ; // on my system, using clang++, the vectorized code is slightly slower // than the unvectorized code. With g++, the vectorized code is faster // than either clang version, but the unvectorized code is much slower. 
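// types and small helpers for the vectorized evaluation and for
// scattering the results to target memory: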
enum { vsize = evaluator_type::vsize } ; enum { channels = evaluator_type::channels } ; typedef typename evaluator_type::value_type value_type ; typedef typename evaluator_type::ele_type ele_type ; typedef typename evaluator_type::ic_v ic_v ; typedef typename evaluator_type::ele_v ele_v ; typedef typename evaluator_type::out_v mc_ele_v ; typedef typename evaluator_type::out_v out_v ; typedef typename ele_v::IndexType index_type ; inline void _scatter ( const out_v & src , ele_type * dp , index_type indexes , std::true_type ) { src.scatter ( dp , indexes ) ; } inline void _scatter ( const out_v & src , ele_type * dp , index_type indexes , std::false_type ) { for ( int e = 0 ; e < channels ; e++ ) src[e].scatter ( dp + e , indexes ) ; } inline void scatter ( const out_v & src , ele_type * dp , index_type indexes ) { _scatter ( src , dp , indexes , typename std::is_same < ele_v , out_v > :: type () ) ; } void operator() ( int initial_ofs , MultiArrayView < 2 , weight_type > & weight , weight_type** const & grid_weight , const int & ORDER , int ** const & grid_ofs , const evaluator_type & itp , target_type & region ) { auto iter = region.begin() ; int ofs_start = 0 ; // number of vectorized results int aggregates = region.size() / vsize ; // vectorized weights MultiArray < 2 , ele_v > vweight ( weight.shape() ) ; // vectorized offset ic_v select ; // buffer for target data mc_ele_v vtarget ; // initialize the vectorized weights for dimensions > 0 for ( int d = 1 ; d < weight.shape(1) ; d++ ) { for ( int o = 0 ; o < ORDER ; o++ ) vweight [ vigra::Shape2 ( o , d ) ] = weight [ vigra::Shape2 ( o , d ) ] ; } // get a pointer to the target array's data (seen as elementary type) ele_type * p_target = (ele_type*) ( region.data() ) ; // and the stride, if any, also in terms of the elementary type, from // one cluster of target data to the next int stride = vsize * channels * region.stride(0) ; for ( int a = 0 ; a < aggregates ; a++ ) { // gather the individual weights into the vectorized form for ( int o = 0 ; o < ORDER ; o++ ) { vweight[ vigra::Shape2 ( o , 0 ) ].gather ( grid_weight [ 0 ] + ORDER * a * vsize , ORDER * ic_v::IndexesFromZero() + o ) ; } select.load ( grid_ofs [ 0 ] + a * vsize ) ; // get the offsets from grid_ofs select += initial_ofs ; // add cumulated offsets from higher dimensions // now we can call the vectorized eval routine itp.eval ( select , vweight , vtarget ) ; // finally we scatter the vectorized result to target memory scatter ( vtarget , p_target , ic_v::IndexesFromZero() * channels * region.stride(0) ) ; // and set p_target to the next cluster of target values p_target += stride ; } // adapt the iterator into target array iter += aggregates * vsize ; // and the initial offset ofs_start += aggregates * vsize ; // now we finish off the stragglers: for ( int ofs = ofs_start ; ofs < region.shape ( 0 ) ; ofs++ ) { for ( int e = 0 ; e < ORDER ; e++ ) weight [ vigra::Shape2 ( e , 0 ) ] = grid_weight [ 0 ] [ ORDER * ofs + e ] ; int cum_ofs = initial_ofs + grid_ofs [ 0 ] [ ofs ] ; itp.eval ( cum_ofs , weight , *iter ) ; ++iter ; } } } ; template < typename evaluator_type > struct _grid_eval < evaluator_type , 0 , 1 > { typedef typename evaluator_type::ele_type weight_type ; typedef MultiArrayView < 1 , typename evaluator_type::value_type > target_type ; void operator() ( int initial_ofs , MultiArrayView < 2 , weight_type > & weight , weight_type** const & grid_weight , const int & ORDER , int ** const & grid_ofs , const evaluator_type & itp , target_type & region ) { auto iter = 
region.begin() ; int ofs_start = 0 ; // if Vc wasn't used, we start with ofs = 0 and this loop // does all the processing: for ( int ofs = ofs_start ; ofs < region.shape ( 0 ) ; ofs++ ) { for ( int e = 0 ; e < ORDER ; e++ ) weight [ vigra::Shape2 ( e , 0 ) ] = grid_weight [ 0 ] [ ORDER * ofs + e ] ; int cum_ofs = initial_ofs + grid_ofs [ 0 ] [ ofs ] ; itp.eval ( cum_ofs , weight , *iter ) ; ++iter ; } } } ; /// Here is the single-threaded code for the grid_eval function. /// The first argument is a shape range, defining the subsets of data /// to process in a single thread. the remainder are forwards of the /// arguments to grid_eval, partly as pointers. The call is affected /// via 'multithread()' which sets up the partitioning and distribution /// to threads from a thread pool. template < typename evaluator_type , // b-spline evaluator type int dim_out > // dimension of target void st_grid_eval ( shape_range_type < dim_out > range , typename evaluator_type::rc_type ** const _grid_coordinate , const evaluator_type * itp , MultiArrayView < dim_out , typename evaluator_type::value_type > * p_result ) { typedef typename evaluator_type::ele_type weight_type ; typedef typename evaluator_type::rc_type rc_type ; typedef MultiArrayView < dim_out , typename evaluator_type::value_type > target_type ; const int ORDER = itp->get_order() ; // pick the subarray of the 'whole' target array pertaining to this thread's range auto result = p_result->subarray ( range[0] , range[1] ) ; // pick the subset of coordinates pertaining to this thread's range const rc_type * grid_coordinate [ dim_out ] ; for ( int d = 0 ; d < dim_out ; d++ ) grid_coordinate[d] = _grid_coordinate[d] + range[0][d] ; // set up storage for precalculated weights and offsets weight_type * grid_weight [ dim_out ] ; int * grid_ofs [ dim_out ] ; // get some metrics TinyVector < int , dim_out > shape ( result.shape() ) ; TinyVector < int , dim_out > estride ( itp->get_estride() ) ; // allocate space for the per-axis weights and offsets for ( int d = 0 ; d < dim_out ; d++ ) { grid_weight[d] = new weight_type [ ORDER * shape [ d ] ] ; grid_ofs[d] = new int [ shape [ d ] ] ; } int select ; rc_type tune ; // fill in the weights and offsets, using the interpolator's split() to split // the coordinates received in grid_coordinate, the interpolator's obtain_weights // method to produce the weight components, and the strides of the coefficient // array to convert the integral parts of the coordinates into offsets. 
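// as a concrete example: for a cubic b-spline, ORDER is 4, so for an axis
// of length 100, grid_weight[d] receives 4 * 100 weights (one set of four
// consecutive weights per grid position), while grid_ofs[d] receives 100
// offsets, each of them the integral part ('select') of the split coordinate
// multiplied by the coefficient array's stride along axis d.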
for ( int d = 0 ; d < dim_out ; d++ ) { for ( int c = 0 ; c < shape [ d ] ; c++ ) { itp->split ( grid_coordinate [ d ] [ c ] , select , tune ) ; itp->obtain_weights ( grid_weight [ d ] + ORDER * c , d , tune ) ; grid_ofs [ d ] [ c ] = select * estride [ d ] ; } } // allocate storage for a set of singular weights MultiArray < 2 , weight_type > weight ( vigra::Shape2 ( ORDER , dim_out ) ) ; // now call the recursive workhorse routine detail::_grid_eval < evaluator_type , dim_out - 1 , evaluator_type::vsize >() ( 0 , weight , grid_weight , ORDER , grid_ofs , *itp , result ) ; // clean up for ( int d = 0 ; d < dim_out ; d++ ) { delete[] grid_weight[d] ; delete[] grid_ofs[d] ; } } } ; // end of namespace detail /// this is the multithreaded version of grid_eval, which sets up the /// full range over 'result' and calls 'multithread' to do the rest /// /// grid_eval evaluates a b-spline object /// at points whose coordinates are distributed in a grid, so that for /// every axis there is a set of as many coordinates as this axis is long, /// which will be used in the grid as the coordinate for this axis at the /// corresponding position. The resulting coordinate matrix (which remains /// implicit) is like a mesh grid made from the per-axis coordinates. /// /// If we have two dimensions and x coordinates x0, x1 and x2, and y /// coordinates y0 and y1, the resulting implicit coordinate matrix is /// /// (x0,y0) (x1,y0) (x2,y0) /// /// (x0,y1) (x1,y1) (x2,y1) /// /// since the offsets and weights needed to perform an interpolation /// only depend on the coordinates, this highly redundant coordinate array /// can be processed more efficiently by precalculating the offset component /// and weight component for all axes and then simply permutating them to /// obtain the result. Especially for higher-degree and higher-dimensional /// splines this saves quite some time, since the generation of weights /// is computationally expensive. /// /// grid_eval is useful for generating a scaled representation of the original /// data, but when scaling down, aliasing will occur and the data should be /// low-pass-filtered adequately before processing. Let me hint here that /// low-pass filtering can be achieved by using b-spline reconstruction on /// raw data (a 'smoothing spline') - or by prefiltering with exponential /// smoothing, which can be activated by passing the 'smoothing' parameter /// to the prefiltering routine. Of course any other way of smoothing can /// be used just the same, like a Burt filter or Gaussian smoothing. /// /// Note that this code is specific to b-spline evaluators and relies /// on evaluator_type offering several b-spline specific methods which /// are not present in other interpolators, like split() and /// obtain_weights(). Since the weight generation for b-splines can /// be done separately for each axis and is a computationally intensive /// task, precalculating these per-axis weights makes sense. Coding for /// the general case (other interpolators), the only achievement would be /// the permutation of the partial coordinates, so little would be gained, /// and instead a transform where the indices are used to pick up /// the coordinates can be written easily: have a unary_functor taking /// discrete coordinates, 'loaded' with the per-axis coordinates, and an /// eval routine yielding the picked coordinates. 
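///
/// A minimal usage sketch - hypothetical names and sizes, assuming 'bsp'
/// is a prefiltered 2D b-spline over float data:
///
/// \code
/// typedef vigra::TinyVector < float , 2 > crd_type ;
/// typedef vspline::evaluator < crd_type , float > ev_type ;
/// ev_type ev ( bsp ) ;
/// // one coordinate per grid position, separately for each axis
/// float x [ 200 ] , y [ 100 ] ;
/// // ... fill x and y with the desired per-axis coordinates ...
/// float * grid_coordinate [ 2 ] = { x , y } ;
/// // the result array's extent along each axis matches the number of
/// // coordinates given for that axis
/// vigra::MultiArray < 2 , float > result ( vigra::Shape2 ( 200 , 100 ) ) ;
/// vspline::grid_eval < ev_type , 2 > ( grid_coordinate , ev , result ) ;
/// \endcode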
template < typename evaluator_type , // b-spline evaluator int dim_out > // dimension of target void grid_eval ( typename evaluator_type::rc_type ** const grid_coordinate , const evaluator_type & itp , MultiArrayView < dim_out , typename evaluator_type::value_type > & result ) { shape_range_type < dim_out > range ( shape_type < dim_out > () , result.shape() ) ; multithread ( detail::st_grid_eval < evaluator_type , dim_out > , vspline::partition_to_tiles < dim_out > , ncores * 8 , range , grid_coordinate , &itp , &result ) ; } /// grid_eval allows us to code a function to restore the original knot point /// date from a bspline. We simply fill in the discrete coordinates into the /// grid coordinate vectors and call grid_eval with them. /// note that this routine can't operate in-place, so you can't overwrite /// a bspline object's core with the restored knot point data, you have to /// provide a separate target array. /// This routine is potentially faster than running an transform with /// the same target, due to the precalculated weight components. For 1D data, /// a transform is used, because here we'd just precalculate a weight for /// each individual value, which would actually be slower. template < int dimension , typename value_type , typename rc_type = float > void restore ( const vspline::bspline < value_type , dimension > & bspl , vigra::MultiArrayView < dimension , value_type > & target ) { if ( target.shape() != bspl.core.shape() ) throw shape_mismatch ( "restore: spline's core shape and target array shape must match" ) ; typedef vigra::TinyVector < rc_type , dimension > coordinate_type ; typedef vigra::MultiArrayView < dimension , value_type > target_type ; typedef typename vigra::ExpandElementResult < value_type > :: type weight_type ; typedef vspline::evaluator < coordinate_type , value_type > ev_type ; ev_type ev ( bspl ) ; // TODO: might catch cases with spline degree < 2 where data can be // simply copied - or not even that, if source == target // for now we unconditionally give the caller a 'proper' restore. if ( dimension == 1 ) { // for 1D splines, it's futile to do a grid_eval vspline::transform ( ev , target ) ; } else { // set up the coordinate component vectors rc_type * p_ruler [ dimension ] ; for ( int d = 0 ; d < dimension ; d++ ) { p_ruler[d] = new rc_type [ target.shape ( d ) ] ; for ( int i = 0 ; i < target.shape ( d ) ; i++ ) p_ruler[d][i] = rc_type(i) ; } vspline::grid_eval < ev_type , dimension > // target_type , weight_type , rc_type > ( p_ruler , ev , target ) ; for ( int d = 0 ; d < dimension ; d++ ) delete[] p_ruler[d] ; } } } ; // end of namespace vspline #endif // VSPLINE_TRANSFORM_H kfj-vspline-6e66cf7a7926/unary_functor.h000066400000000000000000001006661320375670700201550ustar00rootroot00000000000000/************************************************************************/ /* */ /* vspline - a set of generic tools for creation and evaluation */ /* of uniform b-splines */ /* */ /* Copyright 2015 - 2017 by Kay F. 
Jahnke */ /* */ /* The git repository for this software is at */ /* */ /* https://bitbucket.org/kfj/vspline */ /* */ /* Please direct questions, bug reports, and contributions to */ /* */ /* kfjahnke+vspline@gmail.com */ /* */ /* Permission is hereby granted, free of charge, to any person */ /* obtaining a copy of this software and associated documentation */ /* files (the "Software"), to deal in the Software without */ /* restriction, including without limitation the rights to use, */ /* copy, modify, merge, publish, distribute, sublicense, and/or */ /* sell copies of the Software, and to permit persons to whom the */ /* Software is furnished to do so, subject to the following */ /* conditions: */ /* */ /* The above copyright notice and this permission notice shall be */ /* included in all copies or substantial portions of the */ /* Software. */ /* */ /* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND */ /* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES */ /* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND */ /* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT */ /* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, */ /* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING */ /* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR */ /* OTHER DEALINGS IN THE SOFTWARE. */ /* */ /************************************************************************/ /*! \file unary_functor.h \brief interface definition for unary functors vspline's evaluation and remapping code relies on a unary functor template which is used as the base for vspline::evaluator and also constitutes the type of object accepted by most of the functions in transform.h. This template produces functors which are meant to yield a single output for a single input, where both the input and output types may be single types or vigra::TinyVectors, and their elementary types may be vectorized. The functors are expected to provide methods named eval() which are capable of performing the required functionality. These eval routines take both their input and output by reference - the input is taken by const &, and the output as plain &. The result type of the eval routines is void. While such unary functors can be hand-coded, the class template 'unary_functor' provides services to create such functors in a uniform way, with a specifc system of associated types and some convenience code. Using unary_functor is meant to facilitate the creation of the unary functors used in vspline. Using unary_functor generates objects which can be easily combined into more complex unary functors, a typical use would be to 'chain' two unary_functors, see class template 'chain_type' below, which also provides an example for the use of unary_functor. class unary_functor takes three template arguments: - the argument type, IN - the result type, OUT - the number of fundamentals in a vector, _vsize When using Vc, the vectorized argument and result type are deduced from IN, OUT and _vsize by querying vspline::vector_traits. So where is eval() or operator()? Not in class unary_functor. The actual functionality is provided by the derived class. There is deliberately no code concerning evaluation in class unary_functor. My initial implementation had pure virtual functions to define the interface for evaluation, but this required explicitly providing the overloads in the derived class. 
Simply omitting any reference to evaluation allows the derived
class to accomplish evaluation with a template if the code is syntactically
the same for vectorized and unvectorized operation. To users of concrete
functors inheriting from unary_functor this makes no difference. The only
drawback is that it's not possible to perform evaluation via a base class
pointer or reference. But this is best avoided anyway because it degrades
performance.

If the need arises to have several unary_functors with the same template
signature share a common type, there's a mechanism to make the internals
opaque by 'grokking'. Grokking provides a wrapper around a unary_functor
which hides its type: vspline::grok_type directly inherits from unary_functor
and the only template arguments are IN, OUT and _vsize. This hurts
performance a little - just as calling via a base class pointer/reference
would, but the code is outside class unary_functor and therefore only
activated when needed.

Class vspline::evaluator is itself coded as a vspline::unary_functor and
can serve as another example for the use of the code in this file.

*/

#ifndef VSPLINE_UNARY_FUNCTOR_H
#define VSPLINE_UNARY_FUNCTOR_H

#include
#include

namespace vspline {

/// we derive all vspline::unary_functors from this empty class, to have
/// a common base type for all of them. This enables us to easily check if
/// a type is a vspline::unary_functor without having to wrangle with
/// unary_functor's template arguments.

template < int _vsize >
struct unary_functor_tag { } ;

/// class unary_functor provides a functor object which offers a system
/// of types for concrete unary functors derived from it. If Vc isn't used,
/// this is trivial, but with Vc in use, we get vectorized types derived from
/// plain IN and OUT via query of vspline::vector_traits.
///
/// class unary_functor does not provide anything callable, this is left to
/// the concrete functors inheriting from unary_functor. It is expected, though,
/// that the derived classes provide evaluation capability, either as a
/// method template or as (overloaded) method(s) 'eval'. eval is to be coded
/// as taking its argument as a const&, and writing its result to its
/// second argument, to which it receives a reference. eval's return type
/// is void.
///
/// Why not lay down an interface with a pure virtual function eval()
/// which derived classes would need to override? Suppose you had, in
/// unary_functor,
///
/// virtual void eval ( const in_type & in , out_type & out ) = 0 ;
///
/// Then, in a derived class, you'd have to provide an override with this
/// signature. Initially, this seems reasonable enough, but if you want to
/// implement eval() as a member function template in the derived class, you
/// still would have to provide the override (calling an instantiated version
/// of your template), because your template won't be recognized as a viable
/// way to override the pure virtual base class member function. Since
/// providing eval as a template is common (oftentimes vectorized and
/// unvectorized code are the same) I've decided against having virtual eval
/// routines, to avoid the need for explicitly overriding them in derived
/// classes which provide eval() as a template.
///
/// How about providing operator() in unary_functor? operator() would need to
/// delegate to eval(), but eval is not known in unary_functor; only classes
/// derived from unary_functor provide eval().
We could only call the derived /// class' eval if it were virtual and declared in unary_functor itself. Since /// this has been ruled out, we can't have operator() either. Initially this /// looks like an annoying ommision: we have a functor and it doesn't provide /// operator()? But it turns out that /// /// - most of the time we don't use objects derived from unary_functor directly /// /// - and if we do and need function call syntax, we can use vspline::callable /// or vspline::grok which produce objects with operator(). /// /// finally, if you define your own functors derived from unary_functor, nothing /// is stopping you from supplying operator() for your derived classes, if you /// don't want wrap them in callable types vspline provides. /// /// With no virtual member functions, class unary_functor becomes very simple, /// which is desirable from a design standpoint, and also makes unary_functors /// smaller, avoiding the creation of the virtual function table. template < typename IN , // argument or input type typename OUT , // result type int _vsize = vspline::vector_traits < IN > :: size > struct unary_functor : public unary_functor_tag < _vsize > { // number of elements in simdized data. For code not using Vc, and also for // cases where vectorization isn't possible, vsize will be 1 and the vectorized // types will 'collapse' to the unvectorized types. enum { vsize = _vsize } ; // number of dimensions. This may well be different for IN and OUT. // note how we rely on vigra's ExpandElementResult mechanism to inspect IN and OUT. enum { dim_in = vigra::ExpandElementResult < IN > :: size } ; enum { dim_out = vigra::ExpandElementResult < OUT > :: size } ; // typedefs for incoming (argument) and outgoing (result) type. These two types // are non-vectorized types, like vigra::TinyVector < float , 2 >. Since such types // consist of elements of the same type, the corresponding vectorized type can be // easily automatically determined. typedef IN in_type ; typedef OUT out_type ; // elementary types of same. we rely on vigra's ExpandElementResult mechanism // to provide these types. typedef typename vigra::ExpandElementResult < IN > :: type in_ele_type ; typedef typename vigra::ExpandElementResult < OUT > :: type out_ele_type ; // for vectorized operation, we need a few extra typedefs. I use a _v // suffix instead of the _type suffix above to indicate vectorized types. // If Vc is not in use, the _v types simply collapse to the unvectorized // types, having them does no harm, but it's not safe to assume that, // for example, in_v and in_type are in fact different types. /// a simdized type of the elementary type of result_type, /// which is used for coefficients and results. this is fixed via /// the traits class vector_traits (in common.h). Note how we derive /// this type using vsize from the template argument, not what /// vspline::vector_traits deems appropriate for ele_type - though /// both numbers will be the same in most cases. typedef typename vector_traits < IN , vsize > :: ele_v in_ele_v ; typedef typename vector_traits < OUT , vsize > :: ele_v out_ele_v ; /// vectorized in_type and out_type. vspline::vector_traits supplies these /// types so that multidimensional/multichannel data come as vigra::TinyVectors, /// while 'singular' data won't be made into TinyVectors of one element. 
typedef typename vector_traits < IN , vsize > :: type in_v ; typedef typename vector_traits < OUT , vsize > :: type out_v ; /// vsize wide vector of ints, used for gather/scatter indexes typedef typename vector_traits < int , vsize > :: ele_v ic_v ; // out_type_of checks if it's template argument is the same type as // in_type. If this is so, the corresponding result type is out_type. // If not, the input must be vectorized and the corresponding vectorized // result type is queried from vspline::vector_traits template < class argument_type > using out_type_of = typename std::conditional < std::is_same < argument_type , in_type > :: value , out_type , typename vspline::vector_traits < out_type , vsize > :: type > :: type ; } ; /// class callable_type provides a wrapper for vspline::unary_functors /// which provides operator(). class unary_functor, as it is currently /// implemented, cannot provide operator(), since the eval routines /// are only defined in the concrete functors which inherit from it. /// To be able to provide operator() in class unary_functor, I tried /// an implementation which pulled in the eval functionality as a /// policy. This worked, but the code was verbose and convoluted, /// so I abandoned it. This is my new way of providing a callable. /// If the callable is needed, it provides, and the code for /// class unary_functor is much simpler. /// to obtain the return type O for a given argument type I, /// we use class unary_functor's out_typ_of mechanism. /// This class is typical for a class derived from unary_functor: /// It inherits from unary_functor. It defines it's in_type, out_type, /// vsize and base_type, then proceeds to define it's evaluation /// code and other capabilities it may provide. template < typename inner_type > struct callable_type : public vspline::unary_functor < typename inner_type::in_type , typename inner_type::out_type , inner_type::vsize > { // definition of in_type, out_type, vsize and base_type typedef typename inner_type::in_type in_type ; typedef typename inner_type::out_type out_type ; enum { vsize = inner_type::vsize } ; typedef vspline::unary_functor < in_type , out_type , vsize > base_type ; // the wrapped, 'inner' functor const inner_type inner ; // the constructor initializes the inner functor callable_type ( const inner_type _inner ) : inner ( _inner ) { } ; // we set up a member function template 'eval' which delegates to // the inner type's eval. template < class I , class O > void eval ( const I & i , O & o ) const { inner.eval ( i , o ) ; } // here comes the implementation of operator(). We provide a // member function template to handle all parameter signatures // uniformly. Note how the return type is derived from the // argument type by using unary_functor's out_type_of mechanism. template < class I , class O = typename base_type::template out_type_of > O operator() ( const I & in ) const { O result ; inner.eval ( in , result ) ; return result ; } } ; /// callable() is a factory function wrapping a vspline::unary_functor /// in a vspline::callable_type which provides operator() overloads /// processing vectorized and unvectorized arguments. template < class inner_type > vspline::callable_type < inner_type > callable ( const inner_type & inner ) { return vspline::callable_type < inner_type > ( inner ) ; } /// class chain_type is a helper class to pass one unary functor's result /// as argument to another one. We rely on T1 and T2 to provide a few of the /// standard types used in unary functors. 
Typically, T1 and T2 will both be /// vspline::unary_functors, but the type requirements could also be fulfilled /// manually. template < typename T1 , typename T2 > struct chain_type : public vspline::unary_functor < typename T1::in_type , typename T2::out_type , T1::vsize > { // definition of in_type, out_type, vsize and base_type typedef typename T1::in_type in_type ; typedef typename T2::out_type out_type ; enum { vsize = T1::vsize } ; typedef vspline::unary_functor < in_type , out_type , vsize > base_type ; // we require both functors to share the same vectorization width static_assert ( T1::vsize == T2::vsize , "can only chain unary_functors with the same vector width" ) ; // hold the two functors by value const T1 t1 ; const T2 t2 ; // the constructor initializes them chain_type ( const T1 & _t1 , const T2 & _t2 ) : t1 ( _t1 ) , t2 ( _t2 ) { } ; // the actual eval needs a bit of trickery to determine the type of // the intermediate type from the type of the first argument. template < typename argument_type , typename result_type > void eval ( const argument_type & argument , result_type & result ) const { typedef typename T1::template out_type_of < argument_type > intermediate_type ; intermediate_type intermediate ; t1.eval ( argument , intermediate ) ; // evaluate first functor into it t2.eval ( intermediate , result ) ; // feed it as input to second functor } } ; /// chain is a factory function yielding the result of chaining /// two unary_functors. template < class T1 , class T2 > vspline::chain_type < T1 , T2 > chain ( const T1 & t1 , const T2 & t2 ) { return vspline::chain_type < T1 , T2 > ( t1 , t2 ) ; } /// using operator overloading, we can exploit operator+'s semantics /// to chain several unary functors. We need to specifically enable /// this for types derived from unary_functor_tag to avoid a catch-all /// situation. template < typename T1 , typename T2 , typename std::enable_if < std::is_base_of < vspline::unary_functor_tag < T2::vsize > , T1 > :: value , int > :: type = 0 , typename std::enable_if < std::is_base_of < vspline::unary_functor_tag < T1::vsize > , T2 > :: value , int > :: type = 0 > vspline::chain_type < T1 , T2 > operator+ ( const T1 & t1 , const T2 & t2 ) { return vspline::chain ( t1 , t2 ) ; } /// sometimes, vectorized code for a vspline::unary_functor is not at hand /// for some specific evaluation. class broadcast_type can broadcast unvectorized /// evaluation code, so that vectorized data can be procesed with this code, /// albeit less efficiently. 
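///
/// A minimal sketch - the functor 'twice' is hypothetical, and the vector
/// width of 8 is picked arbitrarily; broadcasting it yields a functor which
/// can also digest simdized arguments:
///
/// \code
/// struct twice
/// : public vspline::unary_functor < float , float , 1 >
/// {
///   void eval ( const float & in , float & out ) const
///   { out = in + in ; }
/// } ;
///
/// auto b = vspline::broadcast < twice , 8 > ( twice() ) ;
/// // b can now be used where functors processing packets of 8 floats
/// // are expected, e.g. with the transform family of functions
/// \endcode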
template < class inner_type , int _vsize > struct broadcast_type : public vspline::unary_functor < typename inner_type::in_type , typename inner_type::out_type , _vsize > { // definition of in_type, out_type, vsize and base_type typedef typename inner_type::in_type in_type ; typedef typename inner_type::out_type out_type ; enum { vsize = _vsize } ; typedef vspline::unary_functor < in_type , out_type , vsize > base_type ; const inner_type inner ; broadcast_type ( const inner_type & _inner ) : inner ( _inner ) { } ; /// single-value evaluation simply delegates to inner void eval ( const in_type & in , out_type & out ) const { inner.eval ( in , out ) ; } // vector_traits for in_type and out_type typedef typename vspline::vector_traits < in_type , _vsize > iv_traits ; typedef typename vspline::vector_traits < out_type , _vsize > ov_traits ; // now we implement the actual broadcast /// vectorized evaluation will match this template: template < class in_v , class out_v> void eval ( const in_v & in , out_v & out ) const { // we want TinyVectors even if there is only one channel. // this way we can iterate over the channels. typedef typename iv_traits::nd_ele_v iv_type ; typedef typename ov_traits::nd_ele_v ov_type ; // we instantiate an iv_type by 'wrapping' the incoming in_v, which // produces a TinyVector with one element for types which aren't // already TinyVectors. iv_type iv ( wrap ( in ) ) ; // we have an ov_type handy for the output ov_type ov ; // now the broadcast: // we want a definite TinyVector for unvectorized input and output types // so that we can iterate over them, even if they have only one element typename iv_traits::nd_ele_type i ; typename ov_traits::nd_ele_type o ; // at the same time, we want to be able to feed these data to eval, // which does *not* like TinyVectors with one element. in_type & iref ( reinterpret_cast < in_type& > ( i ) ) ; out_type & oref ( reinterpret_cast < out_type& > ( o ) ) ; for ( int e = 0 ; e < _vsize ; e++ ) { // extract the eth input value from the simdized input for ( int d = 0 ; d < iv_traits::dimension ; d++ ) i[d] = iv[d][e] ; // process it with eval, passing the eval-compatible references inner.eval ( iref , oref ) ; // now distribute eval's result to the SIMD output for ( int d = 0 ; d < ov_traits::dimension ; d++ ) ov[d][e] = o[d] ; } // finally, assign ov to out. We use unwrap() here, in case ov is a // TinyVector with one element. out = unwrap ( ov ) ; } } ; /// type of a std::function for unvectorized evaluation: template < class IN , class OUT > using eval_type = std::function < void ( const IN & , OUT & ) > ; /// helper class hold_type holds a single-element evaluation function template < class IN , class OUT > struct hold_type : public vspline::unary_functor < IN , OUT , 1 > { const eval_type < IN , OUT > eval ; hold_type ( eval_type < IN , OUT > _eval ) : eval ( _eval ) { } ; } ; /// factory function to create a broadcast_type from another vspline::unary_functor /// This will pick the other functor's unvectorized eval routine and broadcast it, /// The vectorized eval routine of the other functor (if present) is ignored. template < class inner_type , int _vsize > broadcast_type < inner_type , _vsize > broadcast ( const inner_type & inner ) { return broadcast_type < inner_type , _vsize > ( inner ) ; } /// factory function to create a broadcast_type from a std::function /// implementing the unvectorized evaluation. /// to broadcast a single-value evaluation function, we package it /// in a hold_type, which broadcast can handle. 
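///
/// For instance - hypothetical, with the vector width of 8 again picked
/// arbitrarily:
///
/// \code
/// vspline::eval_type < float , float > f =
///   [] ( const float & in , float & out ) { out = in * in ; } ;
/// auto b = vspline::broadcast < float , float , 8 > ( f ) ;
/// \endcode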
template < class IN , class OUT , int _vsize > broadcast_type < hold_type < IN , OUT > , _vsize > broadcast ( eval_type < IN , OUT > _eval ) { return broadcast_type < hold_type < IN , OUT > , _vsize > ( hold_type < IN , OUT > ( _eval ) ) ; } /// eval_wrap is a helper function template, wrapping an 'ordinary' /// function which returns some value given some input in a void function /// taking input as const reference and writing output to a reference, /// which is the signature used for evaluation in vspline::unary_functors. template < class IN , class OUT > std::function < void ( const IN& , OUT& ) > eval_wrap ( std::function < OUT ( const IN& ) > f ) { return [f] ( const IN& in , OUT& out ) { out = f ( in ) ; } ; } /// class grok_type is a helper class wrapping a vspline::unary_functor /// so that it's type becomes opaque - a technique called 'type erasure', /// here applied to vspline::unary_functors with their specific /// capability of providing both vectorized and unvectorized operation /// in one common object. /// /// While 'grokking' a unary_functor may degrade performance slightly, /// the resulting type is less complex, and when working on complex /// constructs involving several unary_functors, it can be helpful to /// wrap the whole bunch into a grok_type for some time to make compiler /// messages more palatable. I even suspect that the resulting functor, /// which simply delegates to two std::functions, may optimize better at /// times than a more complex functor in the grokkee. /// /// Performance aside, 'grokking' a vspline::unary_functor produces a /// simple, consistent type that can hold *any* unary_functor with the /// given input and output type(s), so it allows to hold and use a /// variety of (intrinsically differently typed) functors at runtime /// via a common handle which is a vspline::unary_functor itself and /// can be passed to the transform-type routines. With unary_functors /// being first-class, copyable objects, this also makes it possible /// to pass around unary_functors between different TUs where user /// code can provide new functors at will which can simply be used /// without having to recompile to make their type known, at the cost /// of a call through a std::function. /// /// grok_type also provides a convenient way to introduce functors into /// vspline. Since the functionality is implemented with std::functions, /// we allow direct initialization of these std::functions on top of /// 'grokking' the capabilities of another unary_functor via lambda /// expressions. 'Ordinary' functions can also be grokked. /// /// For grok_type objects where _vsize is greater 1, there are /// constructor overloads taking only a single function. These /// constructors broadcast the unvectorized function to process /// vectorized data, providing a quick way to produce code which /// runs with vector data, albeit less efficiently than true vector /// code. /// /// finally, for convenience, grok_type also provides operator(), just /// like vspline::callable, to use the grok_type object with function /// call syntax, and it also provides the common 'eval' routine(s), just /// like any other unary_functor. template < typename IN , // argument or input type typename OUT , // result type int _vsize = vspline::vector_traits < IN > :: size > struct grok_type : public vspline::unary_functor < IN , OUT , _vsize > { // if Vc is in use, we provide code for this template class. 
// otherwise it's just empty - only the specialization for // _vsize == 1 will be used, see below #ifdef USE_VC typedef vspline::unary_functor < IN , OUT , _vsize > base_type ; using base_type::vsize ; using typename base_type::in_type ; using typename base_type::out_type ; typedef std::function < void ( const in_type & , out_type & ) > eval_type ; typedef std::function < out_type ( const in_type & ) > call_type ; eval_type _ev ; // here, we are certain to have vsize > 1: we have a specialization // for class grok_type with vsize == 1 below. We derive in_v, and out_v, // the data types for vetorized evaluation. typedef typename vspline::vector_traits < in_type , vsize > :: type in_v ; typedef typename vspline::vector_traits < out_type , vsize > :: type out_v ; // given these types, we can define the types for the std::function // we will use to wrap the grokkee's evaluation code in. typedef std::function < void ( const in_v & , out_v & ) > v_eval_type ; // this is the class member holding the std::function: v_eval_type _v_ev ; // we also define a std::function type using 'normal' call/return syntax typedef std::function < out_v ( const in_v & ) > v_call_type ; /// we provide a default constructor so we can create an empty /// grok_type and assign to it later. Calling the empty grok_type's /// eval will result in an exception. grok_type() { } ; /// direct initialization of the internal evaluation functions /// this overload, with two arguments, specifies the unvectorized /// and the vectorized evaluation function explicitly. grok_type ( const eval_type & fev , const v_eval_type & vfev ) : _ev ( fev ) , _v_ev ( vfev ) { } ; /// constructor taking a call_type and a v_call_type, /// initializing the two std::functions _ev and _v_ev /// with wrappers around these arguments which provide /// the 'standard' vspline evaluation functor signature grok_type ( call_type f , v_call_type vf ) : _ev ( eval_wrap ( f ) ) , _v_ev ( eval_wrap ( vf ) ) { } ; /// constructor from 'grokkee' using lambda expressions /// to initialize the std::functions _ev and _v_ev. /// we enable this if grokkee_type is a vspline::unary_functor template < class grokkee_type , typename std::enable_if < std::is_base_of < vspline::unary_functor_tag < vsize > , grokkee_type > :: value , int > :: type = 0 > grok_type ( grokkee_type grokkee ) : _ev ( [ grokkee ] ( const IN & in , OUT & out ) { grokkee.eval ( in , out ) ; } ) , _v_ev ( [ grokkee ] ( const in_v & in , out_v & out ) { grokkee.eval ( in , out ) ; } ) { } ; /// constructor taking only an unvectorized evaluation function. /// this function is broadcast, providing evaluation of SIMD types /// with non-vector code, which is less efficient. grok_type ( const eval_type & fev ) : _ev ( fev ) , _v_ev ( [ fev ] ( const in_v & in , out_v & out ) { vspline::broadcast < IN , OUT , vsize > (fev) .eval ( in , out ) ; } ) { } ; /// constructor taking only one call_type, which is also broadcast, /// since the call_type std::function is wrapped to provide a /// std::function with vspline's standard evaluation functor signature /// and the result is fed to the single-argument functor above. grok_type ( const call_type & f ) : grok_type ( eval_wrap ( f ) ) { } ; /// unvectorized evaluation. This is delegated to _ev. void eval ( const IN & i , OUT & o ) const { _ev ( i , o ) ; } /// vectorized evaluation function template /// the eval overload above will catch calls with (in_type, out_type) /// while this overload template will catch vectorized evaluations. 
  /// vectorized evaluation function template
  /// the eval overload above will catch calls with (in_type, out_type)
  /// while this overload template will catch vectorized evaluations.

  template < class in_v , class out_v >
  void eval ( const in_v & i , out_v & o ) const
  {
    _v_ev ( unwrap ( i ) , o ) ;
  }

  /// since grok_type is meant as a hold-all, type-erased shell for
  /// unary_functors, we also equip it with operator(), to save users
  /// from having to wrap it in a vspline::callable if they want to
  /// use function-call syntax with it. Since here the call is done
  /// via the std::functions, performance may be worse than what a
  /// 'straight' vspline::callable delivers.

  template < class I , class O = typename base_type::template out_type_of < I > >
  O operator() ( const I & in ) const
  {
    O result ;
    eval ( in , result ) ;
    return result ;
  }

#endif

} ;

/// specialization of grok_type for _vsize == 1
/// this is the only possible specialization if Vc is not in use.
/// here we don't use _v_ev but only the unvectorized evaluation.

template < typename IN ,   // argument or input type
           typename OUT    // result type
         >
struct grok_type < IN , OUT , 1 >
: public vspline::unary_functor < IN , OUT , 1 >
{
  typedef vspline::unary_functor < IN , OUT , 1 > base_type ;
  using base_type::vsize ;
  using typename base_type::in_type ;
  using typename base_type::out_type ;

  typedef std::function < void ( const in_type & , out_type & ) > eval_type ;
  typedef std::function < out_type ( const in_type & ) > call_type ;

  eval_type _ev ;

  grok_type() { } ;

  template < class grokkee_type ,
             typename std::enable_if
               < std::is_base_of < vspline::unary_functor_tag < 1 > ,
                                   grokkee_type
                                 > :: value ,
                 int
               > :: type = 0
           >
  grok_type ( grokkee_type grokkee )
  : _ev ( [ grokkee ] ( const IN & in , OUT & out )
            { grokkee.eval ( in , out ) ; } )
  { } ;

  grok_type ( const eval_type & fev )
  : _ev ( fev )
  { } ;

  grok_type ( call_type f )
  : _ev ( eval_wrap ( f ) )
  { } ;

  void eval ( const IN & i , OUT & o ) const
  {
    _ev ( i , o ) ;
  }

  template < class I , class O = typename base_type::template out_type_of < I > >
  O operator() ( const I & in ) const
  {
    O result ;
    eval ( in , result ) ;
    return result ;
  }

} ;

/// grok() is the corresponding factory function, wrapping grokkee
/// in a vspline::grok_type.

template < class grokkee_type >
vspline::grok_type < typename grokkee_type::in_type ,
                     typename grokkee_type::out_type ,
                     grokkee_type::vsize >
grok ( const grokkee_type & grokkee )
{
  return vspline::grok_type < typename grokkee_type::in_type ,
                              typename grokkee_type::out_type ,
                              grokkee_type::vsize >
           ( grokkee ) ;
}

} ; // end of namespace vspline

#endif // VSPLINE_UNARY_FUNCTOR_H

kfj-vspline-6e66cf7a7926/vspline.doxy000066400000000000000000003037211320375670700174700ustar00rootroot00000000000000# Doxyfile 1.8.6 # This file describes the settings to be used by the documentation system # doxygen (www.doxygen.org) for a project. # # All text after a double hash (##) is considered a comment and is placed in # front of the TAG it is preceding. # # All text after a single hash (#) is considered a comment and will be ignored. # The format is: # TAG = value [value, ...] # For lists, items can also be appended using: # TAG += value [value, ...] # Values that contain spaces should be placed between quotes (\" \"). #--------------------------------------------------------------------------- # Project related configuration options #--------------------------------------------------------------------------- # This tag specifies the encoding used for all characters in the config file # that follow. The default is UTF-8 which is also the encoding used for all text # before the first occurrence of this tag.
Doxygen uses libiconv (or the iconv # built into libc) for the transcoding. See http://www.gnu.org/software/libiconv # for the list of possible encodings. # The default value is: UTF-8. DOXYFILE_ENCODING = UTF-8 # The PROJECT_NAME tag is a single word (or a sequence of words surrounded by # double-quotes, unless you are using Doxywizard) that should identify the # project for which the documentation is generated. This name is used in the # title of most generated pages and in a few other places. # The default value is: My Project. PROJECT_NAME = "vspline" # The PROJECT_NUMBER tag can be used to enter a project or revision number. This # could be handy for archiving the generated documentation or if some version # control system is used. PROJECT_NUMBER = 0.3.1 # Using the PROJECT_BRIEF tag one can provide an optional one line description # for a project that appears at the top of each page and should give viewer a # quick idea about the purpose of the project. Keep the description short. PROJECT_BRIEF = "Generic C++ Code for Uniform B-Splines" # With the PROJECT_LOGO tag one can specify an logo or icon that is included in # the documentation. The maximum height of the logo should not exceed 55 pixels # and the maximum width should not exceed 200 pixels. Doxygen will copy the logo # to the output directory. PROJECT_LOGO = # The OUTPUT_DIRECTORY tag is used to specify the (relative or absolute) path # into which the generated documentation will be written. If a relative path is # entered, it will be relative to the location where doxygen was started. If # left blank the current directory will be used. OUTPUT_DIRECTORY = ../kfj.bitbucket.org # If the CREATE_SUBDIRS tag is set to YES, then doxygen will create 4096 sub- # directories (in 2 levels) under the output directory of each output format and # will distribute the generated files over these directories. Enabling this # option can be useful when feeding doxygen a huge amount of source files, where # putting all generated files in the same directory would otherwise causes # performance problems for the file system. # The default value is: NO. CREATE_SUBDIRS = NO # The OUTPUT_LANGUAGE tag is used to specify the language in which all # documentation generated by doxygen is written. Doxygen will use this # information to generate all constant output in the proper language. # Possible values are: Afrikaans, Arabic, Armenian, Brazilian, Catalan, Chinese, # Chinese-Traditional, Croatian, Czech, Danish, Dutch, English (United States), # Esperanto, Farsi (Persian), Finnish, French, German, Greek, Hungarian, # Indonesian, Italian, Japanese, Japanese-en (Japanese with English messages), # Korean, Korean-en (Korean with English messages), Latvian, Lithuanian, # Macedonian, Norwegian, Persian (Farsi), Polish, Portuguese, Romanian, Russian, # Serbian, Serbian-Cyrillic, Slovak, Slovene, Spanish, Swedish, Turkish, # Ukrainian and Vietnamese. # The default value is: English. OUTPUT_LANGUAGE = English # If the BRIEF_MEMBER_DESC tag is set to YES doxygen will include brief member # descriptions after the members that are listed in the file and class # documentation (similar to Javadoc). Set to NO to disable this. # The default value is: YES. BRIEF_MEMBER_DESC = YES # If the REPEAT_BRIEF tag is set to YES doxygen will prepend the brief # description of a member or function before the detailed description # # Note: If both HIDE_UNDOC_MEMBERS and BRIEF_MEMBER_DESC are set to NO, the # brief descriptions will be completely suppressed. 
# The default value is: YES. REPEAT_BRIEF = YES # This tag implements a quasi-intelligent brief description abbreviator that is # used to form the text in various listings. Each string in this list, if found # as the leading text of the brief description, will be stripped from the text # and the result, after processing the whole list, is used as the annotated # text. Otherwise, the brief description is used as-is. If left blank, the # following values are used ($name is automatically replaced with the name of # the entity):The $name class, The $name widget, The $name file, is, provides, # specifies, contains, represents, a, an and the. ABBREVIATE_BRIEF = # If the ALWAYS_DETAILED_SEC and REPEAT_BRIEF tags are both set to YES then # doxygen will generate a detailed section even if there is only a brief # description. # The default value is: NO. ALWAYS_DETAILED_SEC = NO # If the INLINE_INHERITED_MEMB tag is set to YES, doxygen will show all # inherited members of a class in the documentation of that class as if those # members were ordinary class members. Constructors, destructors and assignment # operators of the base classes will not be shown. # The default value is: NO. INLINE_INHERITED_MEMB = NO # If the FULL_PATH_NAMES tag is set to YES doxygen will prepend the full path # before files name in the file list and in the header files. If set to NO the # shortest path that makes the file name unique will be used # The default value is: YES. FULL_PATH_NAMES = YES # The STRIP_FROM_PATH tag can be used to strip a user-defined part of the path. # Stripping is only done if one of the specified strings matches the left-hand # part of the path. The tag can be used to show relative paths in the file list. # If left blank the directory from which doxygen is run is used as the path to # strip. # # Note that you can specify absolute paths here, but also relative paths, which # will be relative from the directory where doxygen is started. # This tag requires that the tag FULL_PATH_NAMES is set to YES. STRIP_FROM_PATH = # The STRIP_FROM_INC_PATH tag can be used to strip a user-defined part of the # path mentioned in the documentation of a class, which tells the reader which # header file to include in order to use a class. If left blank only the name of # the header file containing the class definition is used. Otherwise one should # specify the list of include paths that are normally passed to the compiler # using the -I flag. STRIP_FROM_INC_PATH = # If the SHORT_NAMES tag is set to YES, doxygen will generate much shorter (but # less readable) file names. This can be useful is your file systems doesn't # support long names like on DOS, Mac, or CD-ROM. # The default value is: NO. SHORT_NAMES = NO # If the JAVADOC_AUTOBRIEF tag is set to YES then doxygen will interpret the # first line (until the first dot) of a Javadoc-style comment as the brief # description. If set to NO, the Javadoc-style will behave just like regular Qt- # style comments (thus requiring an explicit @brief command for a brief # description.) # The default value is: NO. JAVADOC_AUTOBRIEF = NO # If the QT_AUTOBRIEF tag is set to YES then doxygen will interpret the first # line (until the first dot) of a Qt-style comment as the brief description. If # set to NO, the Qt-style will behave just like regular Qt-style comments (thus # requiring an explicit \brief command for a brief description.) # The default value is: NO. 
QT_AUTOBRIEF = NO # The MULTILINE_CPP_IS_BRIEF tag can be set to YES to make doxygen treat a # multi-line C++ special comment block (i.e. a block of //! or /// comments) as # a brief description. This used to be the default behavior. The new default is # to treat a multi-line C++ comment block as a detailed description. Set this # tag to YES if you prefer the old behavior instead. # # Note that setting this tag to YES also means that rational rose comments are # not recognized any more. # The default value is: NO. MULTILINE_CPP_IS_BRIEF = YES # If the INHERIT_DOCS tag is set to YES then an undocumented member inherits the # documentation from any documented member that it re-implements. # The default value is: YES. INHERIT_DOCS = YES # If the SEPARATE_MEMBER_PAGES tag is set to YES, then doxygen will produce a # new page for each member. If set to NO, the documentation of a member will be # part of the file/class/namespace that contains it. # The default value is: NO. SEPARATE_MEMBER_PAGES = NO # The TAB_SIZE tag can be used to set the number of spaces in a tab. Doxygen # uses this value to replace tabs by spaces in code fragments. # Minimum value: 1, maximum value: 16, default value: 4. TAB_SIZE = 4 # This tag can be used to specify a number of aliases that act as commands in # the documentation. An alias has the form: # name=value # For example adding # "sideeffect=@par Side Effects:\n" # will allow you to put the command \sideeffect (or @sideeffect) in the # documentation, which will result in a user-defined paragraph with heading # "Side Effects:". You can put \n's in the value part of an alias to insert # newlines. ALIASES = # This tag can be used to specify a number of word-keyword mappings (TCL only). # A mapping has the form "name=value". For example adding "class=itcl::class" # will allow you to use the command class in the itcl::class meaning. TCL_SUBST = # Set the OPTIMIZE_OUTPUT_FOR_C tag to YES if your project consists of C sources # only. Doxygen will then generate output that is more tailored for C. For # instance, some of the names that are used will be different. The list of all # members will be omitted, etc. # The default value is: NO. OPTIMIZE_OUTPUT_FOR_C = NO # Set the OPTIMIZE_OUTPUT_JAVA tag to YES if your project consists of Java or # Python sources only. Doxygen will then generate output that is more tailored # for that language. For instance, namespaces will be presented as packages, # qualified scopes will look different, etc. # The default value is: NO. OPTIMIZE_OUTPUT_JAVA = NO # Set the OPTIMIZE_FOR_FORTRAN tag to YES if your project consists of Fortran # sources. Doxygen will then generate output that is tailored for Fortran. # The default value is: NO. OPTIMIZE_FOR_FORTRAN = NO # Set the OPTIMIZE_OUTPUT_VHDL tag to YES if your project consists of VHDL # sources. Doxygen will then generate output that is tailored for VHDL. # The default value is: NO. OPTIMIZE_OUTPUT_VHDL = NO # Doxygen selects the parser to use depending on the extension of the files it # parses. With this tag you can assign which parser to use for a given # extension. Doxygen has a built-in mapping, but you can override or extend it # using this tag. The format is ext=language, where ext is a file extension, and # language is one of the parsers supported by doxygen: IDL, Java, Javascript, # C#, C, C++, D, PHP, Objective-C, Python, Fortran, VHDL. For instance to make # doxygen treat .inc files as Fortran files (default is PHP), and .f files as C # (default is Fortran), use: inc=Fortran f=C. 
# # Note For files without extension you can use no_extension as a placeholder. # # Note that for custom extensions you also need to set FILE_PATTERNS otherwise # the files are not read by doxygen. EXTENSION_MAPPING = # If the MARKDOWN_SUPPORT tag is enabled then doxygen pre-processes all comments # according to the Markdown format, which allows for more readable # documentation. See http://daringfireball.net/projects/markdown/ for details. # The output of markdown processing is further processed by doxygen, so you can # mix doxygen, HTML, and XML commands with Markdown formatting. Disable only in # case of backward compatibilities issues. # The default value is: YES. MARKDOWN_SUPPORT = YES # When enabled doxygen tries to link words that correspond to documented # classes, or namespaces to their corresponding documentation. Such a link can # be prevented in individual cases by by putting a % sign in front of the word # or globally by setting AUTOLINK_SUPPORT to NO. # The default value is: YES. AUTOLINK_SUPPORT = YES # If you use STL classes (i.e. std::string, std::vector, etc.) but do not want # to include (a tag file for) the STL sources as input, then you should set this # tag to YES in order to let doxygen match functions declarations and # definitions whose arguments contain STL classes (e.g. func(std::string); # versus func(std::string) {}). This also make the inheritance and collaboration # diagrams that involve STL classes more complete and accurate. # The default value is: NO. BUILTIN_STL_SUPPORT = NO # If you use Microsoft's C++/CLI language, you should set this option to YES to # enable parsing support. # The default value is: NO. CPP_CLI_SUPPORT = NO # Set the SIP_SUPPORT tag to YES if your project consists of sip (see: # http://www.riverbankcomputing.co.uk/software/sip/intro) sources only. Doxygen # will parse them like normal C++ but will assume all classes use public instead # of private inheritance when no explicit protection keyword is present. # The default value is: NO. SIP_SUPPORT = NO # For Microsoft's IDL there are propget and propput attributes to indicate # getter and setter methods for a property. Setting this option to YES will make # doxygen to replace the get and set methods by a property in the documentation. # This will only work if the methods are indeed getting or setting a simple # type. If this is not the case, or you want to show the methods anyway, you # should set this option to NO. # The default value is: YES. IDL_PROPERTY_SUPPORT = YES # If member grouping is used in the documentation and the DISTRIBUTE_GROUP_DOC # tag is set to YES, then doxygen will reuse the documentation of the first # member in the group (if any) for the other members of the group. By default # all members of a group must be documented explicitly. # The default value is: NO. DISTRIBUTE_GROUP_DOC = NO # Set the SUBGROUPING tag to YES to allow class member groups of the same type # (for instance a group of public functions) to be put as a subgroup of that # type (e.g. under the Public Functions section). Set it to NO to prevent # subgrouping. Alternatively, this can be done per class using the # \nosubgrouping command. # The default value is: YES. SUBGROUPING = YES # When the INLINE_GROUPED_CLASSES tag is set to YES, classes, structs and unions # are shown inside the group in which they are included (e.g. using \ingroup) # instead of on a separate page (for HTML and Man pages) or section (for LaTeX # and RTF). 
# # Note that this feature does not work in combination with # SEPARATE_MEMBER_PAGES. # The default value is: NO. INLINE_GROUPED_CLASSES = NO # When the INLINE_SIMPLE_STRUCTS tag is set to YES, structs, classes, and unions # with only public data fields or simple typedef fields will be shown inline in # the documentation of the scope in which they are defined (i.e. file, # namespace, or group documentation), provided this scope is documented. If set # to NO, structs, classes, and unions are shown on a separate page (for HTML and # Man pages) or section (for LaTeX and RTF). # The default value is: NO. INLINE_SIMPLE_STRUCTS = NO # When TYPEDEF_HIDES_STRUCT tag is enabled, a typedef of a struct, union, or # enum is documented as struct, union, or enum with the name of the typedef. So # typedef struct TypeS {} TypeT, will appear in the documentation as a struct # with name TypeT. When disabled the typedef will appear as a member of a file, # namespace, or class. And the struct will be named TypeS. This can typically be # useful for C code in case the coding convention dictates that all compound # types are typedef'ed and only the typedef is referenced, never the tag name. # The default value is: NO. TYPEDEF_HIDES_STRUCT = NO # The size of the symbol lookup cache can be set using LOOKUP_CACHE_SIZE. This # cache is used to resolve symbols given their name and scope. Since this can be # an expensive process and often the same symbol appears multiple times in the # code, doxygen keeps a cache of pre-resolved symbols. If the cache is too small # doxygen will become slower. If the cache is too large, memory is wasted. The # cache size is given by this formula: 2^(16+LOOKUP_CACHE_SIZE). The valid range # is 0..9, the default is 0, corresponding to a cache size of 2^16=65536 # symbols. At the end of a run doxygen will report the cache usage and suggest # the optimal cache size from a speed point of view. # Minimum value: 0, maximum value: 9, default value: 0. LOOKUP_CACHE_SIZE = 0 #--------------------------------------------------------------------------- # Build related configuration options #--------------------------------------------------------------------------- # If the EXTRACT_ALL tag is set to YES doxygen will assume all entities in # documentation are documented, even if no documentation was available. Private # class members and static file members will be hidden unless the # EXTRACT_PRIVATE respectively EXTRACT_STATIC tags are set to YES. # Note: This will also disable the warnings about undocumented members that are # normally produced when WARNINGS is set to YES. # The default value is: NO. EXTRACT_ALL = NO # If the EXTRACT_PRIVATE tag is set to YES all private members of a class will # be included in the documentation. # The default value is: NO. EXTRACT_PRIVATE = NO # If the EXTRACT_PACKAGE tag is set to YES all members with package or internal # scope will be included in the documentation. # The default value is: NO. EXTRACT_PACKAGE = NO # If the EXTRACT_STATIC tag is set to YES all static members of a file will be # included in the documentation. # The default value is: NO. EXTRACT_STATIC = NO # If the EXTRACT_LOCAL_CLASSES tag is set to YES classes (and structs) defined # locally in source files will be included in the documentation. If set to NO # only classes defined in header files are included. Does not have any effect # for Java sources. # The default value is: YES. EXTRACT_LOCAL_CLASSES = YES # This flag is only useful for Objective-C code. 
When set to YES local methods, # which are defined in the implementation section but not in the interface are # included in the documentation. If set to NO only methods in the interface are # included. # The default value is: NO. EXTRACT_LOCAL_METHODS = NO # If this flag is set to YES, the members of anonymous namespaces will be # extracted and appear in the documentation as a namespace called # 'anonymous_namespace{file}', where file will be replaced with the base name of # the file that contains the anonymous namespace. By default anonymous namespace # are hidden. # The default value is: NO. EXTRACT_ANON_NSPACES = NO # If the HIDE_UNDOC_MEMBERS tag is set to YES, doxygen will hide all # undocumented members inside documented classes or files. If set to NO these # members will be included in the various overviews, but no documentation # section is generated. This option has no effect if EXTRACT_ALL is enabled. # The default value is: NO. HIDE_UNDOC_MEMBERS = NO # If the HIDE_UNDOC_CLASSES tag is set to YES, doxygen will hide all # undocumented classes that are normally visible in the class hierarchy. If set # to NO these classes will be included in the various overviews. This option has # no effect if EXTRACT_ALL is enabled. # The default value is: NO. HIDE_UNDOC_CLASSES = NO # If the HIDE_FRIEND_COMPOUNDS tag is set to YES, doxygen will hide all friend # (class|struct|union) declarations. If set to NO these declarations will be # included in the documentation. # The default value is: NO. HIDE_FRIEND_COMPOUNDS = NO # If the HIDE_IN_BODY_DOCS tag is set to YES, doxygen will hide any # documentation blocks found inside the body of a function. If set to NO these # blocks will be appended to the function's detailed documentation block. # The default value is: NO. HIDE_IN_BODY_DOCS = NO # The INTERNAL_DOCS tag determines if documentation that is typed after a # \internal command is included. If the tag is set to NO then the documentation # will be excluded. Set it to YES to include the internal documentation. # The default value is: NO. INTERNAL_DOCS = NO # If the CASE_SENSE_NAMES tag is set to NO then doxygen will only generate file # names in lower-case letters. If set to YES upper-case letters are also # allowed. This is useful if you have classes or files whose names only differ # in case and if your file system supports case sensitive file names. Windows # and Mac users are advised to set this option to NO. # The default value is: system dependent. CASE_SENSE_NAMES = YES # If the HIDE_SCOPE_NAMES tag is set to NO then doxygen will show members with # their full class and namespace scopes in the documentation. If set to YES the # scope will be hidden. # The default value is: NO. HIDE_SCOPE_NAMES = NO # If the SHOW_INCLUDE_FILES tag is set to YES then doxygen will put a list of # the files that are included by a file in the documentation of that file. # The default value is: YES. SHOW_INCLUDE_FILES = YES # If the SHOW_GROUPED_MEMB_INC tag is set to YES then Doxygen will add for each # grouped member an include statement to the documentation, telling the reader # which file to include in order to use the member. # The default value is: NO. SHOW_GROUPED_MEMB_INC = NO # If the FORCE_LOCAL_INCLUDES tag is set to YES then doxygen will list include # files with double quotes in the documentation rather than with sharp brackets. # The default value is: NO. FORCE_LOCAL_INCLUDES = NO # If the INLINE_INFO tag is set to YES then a tag [inline] is inserted in the # documentation for inline members. 
# The default value is: YES. INLINE_INFO = YES # If the SORT_MEMBER_DOCS tag is set to YES then doxygen will sort the # (detailed) documentation of file and class members alphabetically by member # name. If set to NO the members will appear in declaration order. # The default value is: YES. SORT_MEMBER_DOCS = YES # If the SORT_BRIEF_DOCS tag is set to YES then doxygen will sort the brief # descriptions of file, namespace and class members alphabetically by member # name. If set to NO the members will appear in declaration order. Note that # this will also influence the order of the classes in the class list. # The default value is: NO. SORT_BRIEF_DOCS = NO # If the SORT_MEMBERS_CTORS_1ST tag is set to YES then doxygen will sort the # (brief and detailed) documentation of class members so that constructors and # destructors are listed first. If set to NO the constructors will appear in the # respective orders defined by SORT_BRIEF_DOCS and SORT_MEMBER_DOCS. # Note: If SORT_BRIEF_DOCS is set to NO this option is ignored for sorting brief # member documentation. # Note: If SORT_MEMBER_DOCS is set to NO this option is ignored for sorting # detailed member documentation. # The default value is: NO. SORT_MEMBERS_CTORS_1ST = NO # If the SORT_GROUP_NAMES tag is set to YES then doxygen will sort the hierarchy # of group names into alphabetical order. If set to NO the group names will # appear in their defined order. # The default value is: NO. SORT_GROUP_NAMES = NO # If the SORT_BY_SCOPE_NAME tag is set to YES, the class list will be sorted by # fully-qualified names, including namespaces. If set to NO, the class list will # be sorted only by class name, not including the namespace part. # Note: This option is not very useful if HIDE_SCOPE_NAMES is set to YES. # Note: This option applies only to the class list, not to the alphabetical # list. # The default value is: NO. SORT_BY_SCOPE_NAME = NO # If the STRICT_PROTO_MATCHING option is enabled and doxygen fails to do proper # type resolution of all parameters of a function it will reject a match between # the prototype and the implementation of a member function even if there is # only one candidate or it is obvious which candidate to choose by doing a # simple string match. By disabling STRICT_PROTO_MATCHING doxygen will still # accept a match between prototype and implementation in such cases. # The default value is: NO. STRICT_PROTO_MATCHING = NO # The GENERATE_TODOLIST tag can be used to enable ( YES) or disable ( NO) the # todo list. This list is created by putting \todo commands in the # documentation. # The default value is: YES. GENERATE_TODOLIST = YES # The GENERATE_TESTLIST tag can be used to enable ( YES) or disable ( NO) the # test list. This list is created by putting \test commands in the # documentation. # The default value is: YES. GENERATE_TESTLIST = YES # The GENERATE_BUGLIST tag can be used to enable ( YES) or disable ( NO) the bug # list. This list is created by putting \bug commands in the documentation. # The default value is: YES. GENERATE_BUGLIST = YES # The GENERATE_DEPRECATEDLIST tag can be used to enable ( YES) or disable ( NO) # the deprecated list. This list is created by putting \deprecated commands in # the documentation. # The default value is: YES. GENERATE_DEPRECATEDLIST= YES # The ENABLED_SECTIONS tag can be used to enable conditional documentation # sections, marked by \if ... \endif and \cond # ... \endcond blocks. 
ENABLED_SECTIONS = # The MAX_INITIALIZER_LINES tag determines the maximum number of lines that the # initial value of a variable or macro / define can have for it to appear in the # documentation. If the initializer consists of more lines than specified here # it will be hidden. Use a value of 0 to hide initializers completely. The # appearance of the value of individual variables and macros / defines can be # controlled using \showinitializer or \hideinitializer command in the # documentation regardless of this setting. # Minimum value: 0, maximum value: 10000, default value: 30. MAX_INITIALIZER_LINES = 30 # Set the SHOW_USED_FILES tag to NO to disable the list of files generated at # the bottom of the documentation of classes and structs. If set to YES the list # will mention the files that were used to generate the documentation. # The default value is: YES. SHOW_USED_FILES = YES # Set the SHOW_FILES tag to NO to disable the generation of the Files page. This # will remove the Files entry from the Quick Index and from the Folder Tree View # (if specified). # The default value is: YES. SHOW_FILES = YES # Set the SHOW_NAMESPACES tag to NO to disable the generation of the Namespaces # page. This will remove the Namespaces entry from the Quick Index and from the # Folder Tree View (if specified). # The default value is: YES. SHOW_NAMESPACES = YES # The FILE_VERSION_FILTER tag can be used to specify a program or script that # doxygen should invoke to get the current version for each file (typically from # the version control system). Doxygen will invoke the program by executing (via # popen()) the command command input-file, where command is the value of the # FILE_VERSION_FILTER tag, and input-file is the name of an input file provided # by doxygen. Whatever the program writes to standard output is used as the file # version. For an example see the documentation. FILE_VERSION_FILTER = # The LAYOUT_FILE tag can be used to specify a layout file which will be parsed # by doxygen. The layout file controls the global structure of the generated # output files in an output format independent way. To create the layout file # that represents doxygen's defaults, run doxygen with the -l option. You can # optionally specify a file name after the option, if omitted DoxygenLayout.xml # will be used as the name of the layout file. # # Note that if you run doxygen from a directory containing a file called # DoxygenLayout.xml, doxygen will parse it automatically even if the LAYOUT_FILE # tag is left empty. LAYOUT_FILE = # The CITE_BIB_FILES tag can be used to specify one or more bib files containing # the reference definitions. This must be a list of .bib files. The .bib # extension is automatically appended if omitted. This requires the bibtex tool # to be installed. See also http://en.wikipedia.org/wiki/BibTeX for more info. # For LaTeX the style of the bibliography can be controlled using # LATEX_BIB_STYLE. To use this feature you need bibtex and perl available in the # search path. Do not use file names with spaces, bibtex cannot handle them. See # also \cite for info how to create references. CITE_BIB_FILES = #--------------------------------------------------------------------------- # Configuration options related to warning and progress messages #--------------------------------------------------------------------------- # The QUIET tag can be used to turn on/off the messages that are generated to # standard output by doxygen. If QUIET is set to YES this implies that the # messages are off. 
# The default value is: NO. QUIET = NO # The WARNINGS tag can be used to turn on/off the warning messages that are # generated to standard error ( stderr) by doxygen. If WARNINGS is set to YES # this implies that the warnings are on. # # Tip: Turn warnings on while writing the documentation. # The default value is: YES. WARNINGS = YES # If the WARN_IF_UNDOCUMENTED tag is set to YES, then doxygen will generate # warnings for undocumented members. If EXTRACT_ALL is set to YES then this flag # will automatically be disabled. # The default value is: YES. WARN_IF_UNDOCUMENTED = YES # If the WARN_IF_DOC_ERROR tag is set to YES, doxygen will generate warnings for # potential errors in the documentation, such as not documenting some parameters # in a documented function, or documenting parameters that don't exist or using # markup commands wrongly. # The default value is: YES. WARN_IF_DOC_ERROR = YES # This WARN_NO_PARAMDOC option can be enabled to get warnings for functions that # are documented, but have no documentation for their parameters or return # value. If set to NO doxygen will only warn about wrong or incomplete parameter # documentation, but not about the absence of documentation. # The default value is: NO. WARN_NO_PARAMDOC = NO # The WARN_FORMAT tag determines the format of the warning messages that doxygen # can produce. The string should contain the $file, $line, and $text tags, which # will be replaced by the file and line number from which the warning originated # and the warning text. Optionally the format may contain $version, which will # be replaced by the version of the file (if it could be obtained via # FILE_VERSION_FILTER) # The default value is: $file:$line: $text. WARN_FORMAT = "$file:$line: $text" # The WARN_LOGFILE tag can be used to specify a file to which warning and error # messages should be written. If left blank the output is written to standard # error (stderr). WARN_LOGFILE = #--------------------------------------------------------------------------- # Configuration options related to the input files #--------------------------------------------------------------------------- # The INPUT tag is used to specify the files and/or directories that contain # documented source files. You may enter file names like myfile.cpp or # directories like /usr/src/myproject. Separate the files or directories with # spaces. # Note: If this tag is empty the current directory is searched. INPUT = . example # This tag can be used to specify the character encoding of the source files # that doxygen parses. Internally doxygen uses the UTF-8 encoding. Doxygen uses # libiconv (or the iconv built into libc) for the transcoding. See the libiconv # documentation (see: http://www.gnu.org/software/libiconv) for the list of # possible encodings. # The default value is: UTF-8. INPUT_ENCODING = UTF-8 # If the value of the INPUT tag contains directories, you can use the # FILE_PATTERNS tag to specify one or more wildcard patterns (like *.cpp and # *.h) to filter out the source-files in the directories. If left blank the # following patterns are tested:*.c, *.cc, *.cxx, *.cpp, *.c++, *.java, *.ii, # *.ixx, *.ipp, *.i++, *.inl, *.idl, *.ddl, *.odl, *.h, *.hh, *.hxx, *.hpp, # *.h++, *.cs, *.d, *.php, *.php4, *.php5, *.phtml, *.inc, *.m, *.markdown, # *.md, *.mm, *.dox, *.py, *.f90, *.f, *.for, *.tcl, *.vhd, *.vhdl, *.ucf, # *.qsf, *.as and *.js. FILE_PATTERNS = # The RECURSIVE tag can be used to specify whether or not subdirectories should # be searched for input files as well. 
# The default value is: NO. RECURSIVE = YES # The EXCLUDE tag can be used to specify files and/or directories that should be # excluded from the INPUT source files. This way you can easily exclude a # subdirectory from a directory tree whose root is specified with the INPUT tag. # # Note that relative paths are relative to the directory from which doxygen is # run. EXCLUDE = # The EXCLUDE_SYMLINKS tag can be used to select whether or not files or # directories that are symbolic links (a Unix file system feature) are excluded # from the input. # The default value is: NO. EXCLUDE_SYMLINKS = NO # If the value of the INPUT tag contains directories, you can use the # EXCLUDE_PATTERNS tag to specify one or more wildcard patterns to exclude # certain files from those directories. # # Note that the wildcards are matched against the file with absolute path, so to # exclude all test directories for example use the pattern */test/* EXCLUDE_PATTERNS = # The EXCLUDE_SYMBOLS tag can be used to specify one or more symbol names # (namespaces, classes, functions, etc.) that should be excluded from the # output. The symbol name can be a fully qualified name, a word, or if the # wildcard * is used, a substring. Examples: ANamespace, AClass, # AClass::ANamespace, ANamespace::*Test # # Note that the wildcards are matched against the file with absolute path, so to # exclude all test directories use the pattern */test/* EXCLUDE_SYMBOLS = # The EXAMPLE_PATH tag can be used to specify one or more files or directories # that contain example code fragments that are included (see the \include # command). EXAMPLE_PATH = # If the value of the EXAMPLE_PATH tag contains directories, you can use the # EXAMPLE_PATTERNS tag to specify one or more wildcard pattern (like *.cpp and # *.h) to filter out the source-files in the directories. If left blank all # files are included. EXAMPLE_PATTERNS = # If the EXAMPLE_RECURSIVE tag is set to YES then subdirectories will be # searched for input files to be used with the \include or \dontinclude commands # irrespective of the value of the RECURSIVE tag. # The default value is: NO. EXAMPLE_RECURSIVE = NO # The IMAGE_PATH tag can be used to specify one or more files or directories # that contain images that are to be included in the documentation (see the # \image command). IMAGE_PATH = # The INPUT_FILTER tag can be used to specify a program that doxygen should # invoke to filter for each input file. Doxygen will invoke the filter program # by executing (via popen()) the command: # # # # where is the value of the INPUT_FILTER tag, and is the # name of an input file. Doxygen will then use the output that the filter # program writes to standard output. If FILTER_PATTERNS is specified, this tag # will be ignored. # # Note that the filter must not add or remove lines; it is applied before the # code is scanned, but not when the output code is generated. If lines are added # or removed, the anchors will not be placed correctly. INPUT_FILTER = # The FILTER_PATTERNS tag can be used to specify filters on a per file pattern # basis. Doxygen will compare the file name with each pattern and apply the # filter if there is a match. The filters are a list of the form: pattern=filter # (like *.cpp=my_cpp_filter). See INPUT_FILTER for further information on how # filters are used. If the FILTER_PATTERNS tag is empty or if none of the # patterns match the file name, INPUT_FILTER is applied. 
FILTER_PATTERNS = # If the FILTER_SOURCE_FILES tag is set to YES, the input filter (if set using # INPUT_FILTER ) will also be used to filter the input files that are used for # producing the source files to browse (i.e. when SOURCE_BROWSER is set to YES). # The default value is: NO. FILTER_SOURCE_FILES = NO # The FILTER_SOURCE_PATTERNS tag can be used to specify source filters per file # pattern. A pattern will override the setting for FILTER_PATTERN (if any) and # it is also possible to disable source filtering for a specific pattern using # *.ext= (so without naming a filter). # This tag requires that the tag FILTER_SOURCE_FILES is set to YES. FILTER_SOURCE_PATTERNS = # If the USE_MDFILE_AS_MAINPAGE tag refers to the name of a markdown file that # is part of the input, its contents will be placed on the main page # (index.html). This can be useful if you have a project on for instance GitHub # and want to reuse the introduction page also for the doxygen output. USE_MDFILE_AS_MAINPAGE = #--------------------------------------------------------------------------- # Configuration options related to source browsing #--------------------------------------------------------------------------- # If the SOURCE_BROWSER tag is set to YES then a list of source files will be # generated. Documented entities will be cross-referenced with these sources. # # Note: To get rid of all source code in the generated output, make sure that # also VERBATIM_HEADERS is set to NO. # The default value is: NO. SOURCE_BROWSER = NO # Setting the INLINE_SOURCES tag to YES will include the body of functions, # classes and enums directly into the documentation. # The default value is: NO. INLINE_SOURCES = NO # Setting the STRIP_CODE_COMMENTS tag to YES will instruct doxygen to hide any # special comment blocks from generated source code fragments. Normal C, C++ and # Fortran comments will always remain visible. # The default value is: YES. STRIP_CODE_COMMENTS = YES # If the REFERENCED_BY_RELATION tag is set to YES then for each documented # function all documented functions referencing it will be listed. # The default value is: NO. REFERENCED_BY_RELATION = NO # If the REFERENCES_RELATION tag is set to YES then for each documented function # all documented entities called/used by that function will be listed. # The default value is: NO. REFERENCES_RELATION = NO # If the REFERENCES_LINK_SOURCE tag is set to YES and SOURCE_BROWSER tag is set # to YES, then the hyperlinks from functions in REFERENCES_RELATION and # REFERENCED_BY_RELATION lists will link to the source code. Otherwise they will # link to the documentation. # The default value is: YES. REFERENCES_LINK_SOURCE = YES # If SOURCE_TOOLTIPS is enabled (the default) then hovering a hyperlink in the # source code will show a tooltip with additional information such as prototype, # brief description and links to the definition and documentation. Since this # will make the HTML file larger and loading of large files a bit slower, you # can opt to disable this feature. # The default value is: YES. # This tag requires that the tag SOURCE_BROWSER is set to YES. SOURCE_TOOLTIPS = YES # If the USE_HTAGS tag is set to YES then the references to source code will # point to the HTML generated by the htags(1) tool instead of doxygen built-in # source browser. The htags tool is part of GNU's global source tagging system # (see http://www.gnu.org/software/global/global.html). You will need version # 4.8.6 or higher. 
# # To use it do the following: # - Install the latest version of global # - Enable SOURCE_BROWSER and USE_HTAGS in the config file # - Make sure the INPUT points to the root of the source tree # - Run doxygen as normal # # Doxygen will invoke htags (and that will in turn invoke gtags), so these # tools must be available from the command line (i.e. in the search path). # # The result: instead of the source browser generated by doxygen, the links to # source code will now point to the output of htags. # The default value is: NO. # This tag requires that the tag SOURCE_BROWSER is set to YES. USE_HTAGS = NO # If the VERBATIM_HEADERS tag is set the YES then doxygen will generate a # verbatim copy of the header file for each class for which an include is # specified. Set to NO to disable this. # See also: Section \class. # The default value is: YES. VERBATIM_HEADERS = YES #--------------------------------------------------------------------------- # Configuration options related to the alphabetical class index #--------------------------------------------------------------------------- # If the ALPHABETICAL_INDEX tag is set to YES, an alphabetical index of all # compounds will be generated. Enable this if the project contains a lot of # classes, structs, unions or interfaces. # The default value is: YES. ALPHABETICAL_INDEX = YES # The COLS_IN_ALPHA_INDEX tag can be used to specify the number of columns in # which the alphabetical index list will be split. # Minimum value: 1, maximum value: 20, default value: 5. # This tag requires that the tag ALPHABETICAL_INDEX is set to YES. COLS_IN_ALPHA_INDEX = 5 # In case all classes in a project start with a common prefix, all classes will # be put under the same header in the alphabetical index. The IGNORE_PREFIX tag # can be used to specify a prefix (or a list of prefixes) that should be ignored # while generating the index headers. # This tag requires that the tag ALPHABETICAL_INDEX is set to YES. IGNORE_PREFIX = #--------------------------------------------------------------------------- # Configuration options related to the HTML output #--------------------------------------------------------------------------- # If the GENERATE_HTML tag is set to YES doxygen will generate HTML output # The default value is: YES. GENERATE_HTML = YES # The HTML_OUTPUT tag is used to specify where the HTML docs will be put. If a # relative path is entered the value of OUTPUT_DIRECTORY will be put in front of # it. # The default directory is: html. # This tag requires that the tag GENERATE_HTML is set to YES. HTML_OUTPUT = . # The HTML_FILE_EXTENSION tag can be used to specify the file extension for each # generated HTML page (for example: .htm, .php, .asp). # The default value is: .html. # This tag requires that the tag GENERATE_HTML is set to YES. HTML_FILE_EXTENSION = .html # The HTML_HEADER tag can be used to specify a user-defined HTML header file for # each generated HTML page. If the tag is left blank doxygen will generate a # standard header. # # To get valid HTML the header file that includes any scripts and style sheets # that doxygen needs, which is dependent on the configuration options used (e.g. # the setting GENERATE_TREEVIEW). It is highly recommended to start with a # default header using # doxygen -w html new_header.html new_footer.html new_stylesheet.css # YourConfigFile # and then modify the file new_header.html. See also section "Doxygen usage" # for information on how to generate the default header that doxygen normally # uses. 
# Note: The header is subject to change so you typically have to regenerate the # default header when upgrading to a newer version of doxygen. For a description # of the possible markers and block names see the documentation. # This tag requires that the tag GENERATE_HTML is set to YES. HTML_HEADER = # The HTML_FOOTER tag can be used to specify a user-defined HTML footer for each # generated HTML page. If the tag is left blank doxygen will generate a standard # footer. See HTML_HEADER for more information on how to generate a default # footer and what special commands can be used inside the footer. See also # section "Doxygen usage" for information on how to generate the default footer # that doxygen normally uses. # This tag requires that the tag GENERATE_HTML is set to YES. HTML_FOOTER = # The HTML_STYLESHEET tag can be used to specify a user-defined cascading style # sheet that is used by each HTML page. It can be used to fine-tune the look of # the HTML output. If left blank doxygen will generate a default style sheet. # See also section "Doxygen usage" for information on how to generate the style # sheet that doxygen normally uses. # Note: It is recommended to use HTML_EXTRA_STYLESHEET instead of this tag, as # it is more robust and this tag (HTML_STYLESHEET) will in the future become # obsolete. # This tag requires that the tag GENERATE_HTML is set to YES. HTML_STYLESHEET = # The HTML_EXTRA_STYLESHEET tag can be used to specify an additional user- # defined cascading style sheet that is included after the standard style sheets # created by doxygen. Using this option one can overrule certain style aspects. # This is preferred over using HTML_STYLESHEET since it does not replace the # standard style sheet and is therefor more robust against future updates. # Doxygen will copy the style sheet file to the output directory. For an example # see the documentation. # This tag requires that the tag GENERATE_HTML is set to YES. HTML_EXTRA_STYLESHEET = # The HTML_EXTRA_FILES tag can be used to specify one or more extra images or # other source files which should be copied to the HTML output directory. Note # that these files will be copied to the base HTML output directory. Use the # $relpath^ marker in the HTML_HEADER and/or HTML_FOOTER files to load these # files. In the HTML_STYLESHEET file, use the file name only. Also note that the # files will be copied as-is; there are no commands or markers available. # This tag requires that the tag GENERATE_HTML is set to YES. HTML_EXTRA_FILES = # The HTML_COLORSTYLE_HUE tag controls the color of the HTML output. Doxygen # will adjust the colors in the stylesheet and background images according to # this color. Hue is specified as an angle on a colorwheel, see # http://en.wikipedia.org/wiki/Hue for more information. For instance the value # 0 represents red, 60 is yellow, 120 is green, 180 is cyan, 240 is blue, 300 # purple, and 360 is red again. # Minimum value: 0, maximum value: 359, default value: 220. # This tag requires that the tag GENERATE_HTML is set to YES. HTML_COLORSTYLE_HUE = 220 # The HTML_COLORSTYLE_SAT tag controls the purity (or saturation) of the colors # in the HTML output. For a value of 0 the output will use grayscales only. A # value of 255 will produce the most vivid colors. # Minimum value: 0, maximum value: 255, default value: 100. # This tag requires that the tag GENERATE_HTML is set to YES. 
HTML_COLORSTYLE_SAT = 100 # The HTML_COLORSTYLE_GAMMA tag controls the gamma correction applied to the # luminance component of the colors in the HTML output. Values below 100 # gradually make the output lighter, whereas values above 100 make the output # darker. The value divided by 100 is the actual gamma applied, so 80 represents # a gamma of 0.8, The value 220 represents a gamma of 2.2, and 100 does not # change the gamma. # Minimum value: 40, maximum value: 240, default value: 80. # This tag requires that the tag GENERATE_HTML is set to YES. HTML_COLORSTYLE_GAMMA = 80 # If the HTML_TIMESTAMP tag is set to YES then the footer of each generated HTML # page will contain the date and time when the page was generated. Setting this # to NO can help when comparing the output of multiple runs. # The default value is: YES. # This tag requires that the tag GENERATE_HTML is set to YES. HTML_TIMESTAMP = YES # If the HTML_DYNAMIC_SECTIONS tag is set to YES then the generated HTML # documentation will contain sections that can be hidden and shown after the # page has loaded. # The default value is: NO. # This tag requires that the tag GENERATE_HTML is set to YES. HTML_DYNAMIC_SECTIONS = NO # With HTML_INDEX_NUM_ENTRIES one can control the preferred number of entries # shown in the various tree structured indices initially; the user can expand # and collapse entries dynamically later on. Doxygen will expand the tree to # such a level that at most the specified number of entries are visible (unless # a fully collapsed tree already exceeds this amount). So setting the number of # entries 1 will produce a full collapsed tree by default. 0 is a special value # representing an infinite number of entries and will result in a full expanded # tree by default. # Minimum value: 0, maximum value: 9999, default value: 100. # This tag requires that the tag GENERATE_HTML is set to YES. HTML_INDEX_NUM_ENTRIES = 100 # If the GENERATE_DOCSET tag is set to YES, additional index files will be # generated that can be used as input for Apple's Xcode 3 integrated development # environment (see: http://developer.apple.com/tools/xcode/), introduced with # OSX 10.5 (Leopard). To create a documentation set, doxygen will generate a # Makefile in the HTML output directory. Running make will produce the docset in # that directory and running make install will install the docset in # ~/Library/Developer/Shared/Documentation/DocSets so that Xcode will find it at # startup. See http://developer.apple.com/tools/creatingdocsetswithdoxygen.html # for more information. # The default value is: NO. # This tag requires that the tag GENERATE_HTML is set to YES. GENERATE_DOCSET = NO # This tag determines the name of the docset feed. A documentation feed provides # an umbrella under which multiple documentation sets from a single provider # (such as a company or product suite) can be grouped. # The default value is: Doxygen generated docs. # This tag requires that the tag GENERATE_DOCSET is set to YES. DOCSET_FEEDNAME = "Doxygen generated docs" # This tag specifies a string that should uniquely identify the documentation # set bundle. This should be a reverse domain-name style string, e.g. # com.mycompany.MyDocSet. Doxygen will append .docset to the name. # The default value is: org.doxygen.Project. # This tag requires that the tag GENERATE_DOCSET is set to YES. DOCSET_BUNDLE_ID = org.doxygen.Project # The DOCSET_PUBLISHER_ID tag specifies a string that should uniquely identify # the documentation publisher. 
This should be a reverse domain-name style # string, e.g. com.mycompany.MyDocSet.documentation. # The default value is: org.doxygen.Publisher. # This tag requires that the tag GENERATE_DOCSET is set to YES. DOCSET_PUBLISHER_ID = org.doxygen.Publisher # The DOCSET_PUBLISHER_NAME tag identifies the documentation publisher. # The default value is: Publisher. # This tag requires that the tag GENERATE_DOCSET is set to YES. DOCSET_PUBLISHER_NAME = Publisher # If the GENERATE_HTMLHELP tag is set to YES then doxygen generates three # additional HTML index files: index.hhp, index.hhc, and index.hhk. The # index.hhp is a project file that can be read by Microsoft's HTML Help Workshop # (see: http://www.microsoft.com/en-us/download/details.aspx?id=21138) on # Windows. # # The HTML Help Workshop contains a compiler that can convert all HTML output # generated by doxygen into a single compiled HTML file (.chm). Compiled HTML # files are now used as the Windows 98 help format, and will replace the old # Windows help format (.hlp) on all Windows platforms in the future. Compressed # HTML files also contain an index, a table of contents, and you can search for # words in the documentation. The HTML workshop also contains a viewer for # compressed HTML files. # The default value is: NO. # This tag requires that the tag GENERATE_HTML is set to YES. GENERATE_HTMLHELP = NO # The CHM_FILE tag can be used to specify the file name of the resulting .chm # file. You can add a path in front of the file if the result should not be # written to the html output directory. # This tag requires that the tag GENERATE_HTMLHELP is set to YES. CHM_FILE = # The HHC_LOCATION tag can be used to specify the location (absolute path # including file name) of the HTML help compiler ( hhc.exe). If non-empty # doxygen will try to run the HTML help compiler on the generated index.hhp. # The file has to be specified with full path. # This tag requires that the tag GENERATE_HTMLHELP is set to YES. HHC_LOCATION = # The GENERATE_CHI flag controls if a separate .chi index file is generated ( # YES) or that it should be included in the master .chm file ( NO). # The default value is: NO. # This tag requires that the tag GENERATE_HTMLHELP is set to YES. GENERATE_CHI = NO # The CHM_INDEX_ENCODING is used to encode HtmlHelp index ( hhk), content ( hhc) # and project file content. # This tag requires that the tag GENERATE_HTMLHELP is set to YES. CHM_INDEX_ENCODING = # The BINARY_TOC flag controls whether a binary table of contents is generated ( # YES) or a normal table of contents ( NO) in the .chm file. # The default value is: NO. # This tag requires that the tag GENERATE_HTMLHELP is set to YES. BINARY_TOC = NO # The TOC_EXPAND flag can be set to YES to add extra items for group members to # the table of contents of the HTML help documentation and to the tree view. # The default value is: NO. # This tag requires that the tag GENERATE_HTMLHELP is set to YES. TOC_EXPAND = NO # If the GENERATE_QHP tag is set to YES and both QHP_NAMESPACE and # QHP_VIRTUAL_FOLDER are set, an additional index file will be generated that # can be used as input for Qt's qhelpgenerator to generate a Qt Compressed Help # (.qch) of the generated HTML documentation. # The default value is: NO. # This tag requires that the tag GENERATE_HTML is set to YES. GENERATE_QHP = NO # If the QHG_LOCATION tag is specified, the QCH_FILE tag can be used to specify # the file name of the resulting .qch file. The path specified is relative to # the HTML output folder. 
# This tag requires that the tag GENERATE_QHP is set to YES. QCH_FILE = # The QHP_NAMESPACE tag specifies the namespace to use when generating Qt Help # Project output. For more information please see Qt Help Project / Namespace # (see: http://qt-project.org/doc/qt-4.8/qthelpproject.html#namespace). # The default value is: org.doxygen.Project. # This tag requires that the tag GENERATE_QHP is set to YES. QHP_NAMESPACE = org.doxygen.Project # The QHP_VIRTUAL_FOLDER tag specifies the namespace to use when generating Qt # Help Project output. For more information please see Qt Help Project / Virtual # Folders (see: http://qt-project.org/doc/qt-4.8/qthelpproject.html#virtual- # folders). # The default value is: doc. # This tag requires that the tag GENERATE_QHP is set to YES. QHP_VIRTUAL_FOLDER = doc # If the QHP_CUST_FILTER_NAME tag is set, it specifies the name of a custom # filter to add. For more information please see Qt Help Project / Custom # Filters (see: http://qt-project.org/doc/qt-4.8/qthelpproject.html#custom- # filters). # This tag requires that the tag GENERATE_QHP is set to YES. QHP_CUST_FILTER_NAME = # The QHP_CUST_FILTER_ATTRS tag specifies the list of the attributes of the # custom filter to add. For more information please see Qt Help Project / Custom # Filters (see: http://qt-project.org/doc/qt-4.8/qthelpproject.html#custom- # filters). # This tag requires that the tag GENERATE_QHP is set to YES. QHP_CUST_FILTER_ATTRS = # The QHP_SECT_FILTER_ATTRS tag specifies the list of the attributes this # project's filter section matches. Qt Help Project / Filter Attributes (see: # http://qt-project.org/doc/qt-4.8/qthelpproject.html#filter-attributes). # This tag requires that the tag GENERATE_QHP is set to YES. QHP_SECT_FILTER_ATTRS = # The QHG_LOCATION tag can be used to specify the location of Qt's # qhelpgenerator. If non-empty doxygen will try to run qhelpgenerator on the # generated .qhp file. # This tag requires that the tag GENERATE_QHP is set to YES. QHG_LOCATION = # If the GENERATE_ECLIPSEHELP tag is set to YES, additional index files will be # generated, together with the HTML files, they form an Eclipse help plugin. To # install this plugin and make it available under the help contents menu in # Eclipse, the contents of the directory containing the HTML and XML files needs # to be copied into the plugins directory of eclipse. The name of the directory # within the plugins directory should be the same as the ECLIPSE_DOC_ID value. # After copying Eclipse needs to be restarted before the help appears. # The default value is: NO. # This tag requires that the tag GENERATE_HTML is set to YES. GENERATE_ECLIPSEHELP = NO # A unique identifier for the Eclipse help plugin. When installing the plugin # the directory name containing the HTML and XML files should also have this # name. Each documentation set should have its own identifier. # The default value is: org.doxygen.Project. # This tag requires that the tag GENERATE_ECLIPSEHELP is set to YES. ECLIPSE_DOC_ID = org.doxygen.Project # If you want full control over the layout of the generated HTML pages it might # be necessary to disable the index and replace it with your own. The # DISABLE_INDEX tag can be used to turn on/off the condensed index (tabs) at top # of each HTML page. A value of NO enables the index and the value YES disables # it. Since the tabs in the index contain the same information as the navigation # tree, you can set this option to YES if you also set GENERATE_TREEVIEW to YES. # The default value is: NO. 
# If you want full control over the layout of the generated HTML pages it might
# be necessary to disable the index and replace it with your own. The
# DISABLE_INDEX tag can be used to turn on/off the condensed index (tabs) at top
# of each HTML page. A value of NO enables the index and the value YES disables
# it. Since the tabs in the index contain the same information as the navigation
# tree, you can set this option to YES if you also set GENERATE_TREEVIEW to YES.
# The default value is: NO.
# This tag requires that the tag GENERATE_HTML is set to YES.

DISABLE_INDEX          = NO

# The GENERATE_TREEVIEW tag is used to specify whether a tree-like index
# structure should be generated to display hierarchical information. If the tag
# value is set to YES, a side panel will be generated containing a tree-like
# index structure (just like the one that is generated for HTML Help). For this
# to work a browser that supports JavaScript, DHTML, CSS and frames is required
# (i.e. any modern browser). Windows users are probably better off using the
# HTML help feature. Via custom stylesheets (see HTML_EXTRA_STYLESHEET) one can
# further fine-tune the look of the index. As an example, the default style
# sheet generated by doxygen has an example that shows how to put an image at
# the root of the tree instead of the PROJECT_NAME. Since the tree basically has
# the same information as the tab index, you could consider setting
# DISABLE_INDEX to YES when enabling this option.
# The default value is: NO.
# This tag requires that the tag GENERATE_HTML is set to YES.

GENERATE_TREEVIEW      = NO

# The ENUM_VALUES_PER_LINE tag can be used to set the number of enum values that
# doxygen will group on one line in the generated HTML documentation.
#
# Note that a value of 0 will completely suppress the enum values from appearing
# in the overview section.
# Minimum value: 0, maximum value: 20, default value: 4.
# This tag requires that the tag GENERATE_HTML is set to YES.

ENUM_VALUES_PER_LINE   = 4

# If the treeview is enabled (see GENERATE_TREEVIEW) then this tag can be used
# to set the initial width (in pixels) of the frame in which the tree is shown.
# Minimum value: 0, maximum value: 1500, default value: 250.
# This tag requires that the tag GENERATE_HTML is set to YES.

TREEVIEW_WIDTH         = 250

# When the EXT_LINKS_IN_WINDOW option is set to YES doxygen will open links to
# external symbols imported via tag files in a separate window.
# The default value is: NO.
# This tag requires that the tag GENERATE_HTML is set to YES.

EXT_LINKS_IN_WINDOW    = NO

# Use this tag to change the font size of LaTeX formulas included as images in
# the HTML documentation. When you change the font size after a successful
# doxygen run you need to manually remove any form_*.png images from the HTML
# output directory to force them to be regenerated.
# Minimum value: 8, maximum value: 50, default value: 10.
# This tag requires that the tag GENERATE_HTML is set to YES.

FORMULA_FONTSIZE       = 10

# Use the FORMULA_TRANSPARENT tag to determine whether or not the images
# generated for formulas are transparent PNGs. Transparent PNGs are not
# supported properly for IE 6.0, but are supported on all modern browsers.
#
# Note that when changing this option you need to delete any form_*.png files in
# the HTML output directory before the changes have effect.
# The default value is: YES.
# This tag requires that the tag GENERATE_HTML is set to YES.

FORMULA_TRANSPARENT    = YES
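# A hedged example, not a recommendation taken from this project: if the
# formula PNGs come out too small next to the surrounding text, one might raise
# the font size while keeping transparency, and then delete the old form_*.png
# files so they are regenerated. The value 14 below is an arbitrary
# illustration within the allowed 8..50 range.
#
# FORMULA_FONTSIZE       = 14
# FORMULA_TRANSPARENT    = YES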
# Enable the USE_MATHJAX option to render LaTeX formulas using MathJax (see
# http://www.mathjax.org) which uses client side Javascript for the rendering
# instead of using prerendered bitmaps. Use this if you do not have LaTeX
# installed or if you want formulas to look prettier in the HTML output. When
# enabled you may also need to install MathJax separately and configure the path
# to it using the MATHJAX_RELPATH option.
# The default value is: NO.
# This tag requires that the tag GENERATE_HTML is set to YES.

USE_MATHJAX            = NO

# When MathJax is enabled you can set the default output format to be used for
# the MathJax output. See the MathJax site (see:
# http://docs.mathjax.org/en/latest/output.html) for more details.
# Possible values are: HTML-CSS (which is slower, but has the best
# compatibility), NativeMML (i.e. MathML) and SVG.
# The default value is: HTML-CSS.
# This tag requires that the tag USE_MATHJAX is set to YES.

MATHJAX_FORMAT         = HTML-CSS

# When MathJax is enabled you need to specify the location relative to the HTML
# output directory using the MATHJAX_RELPATH option. The destination directory
# should contain the MathJax.js script. For instance, if the mathjax directory
# is located at the same level as the HTML output directory, then
# MATHJAX_RELPATH should be ../mathjax. The default value points to the MathJax
# Content Delivery Network so you can quickly see the result without installing
# MathJax. However, it is strongly recommended to install a local copy of
# MathJax from http://www.mathjax.org before deployment.
# The default value is: http://cdn.mathjax.org/mathjax/latest.
# This tag requires that the tag USE_MATHJAX is set to YES.

MATHJAX_RELPATH        = http://cdn.mathjax.org/mathjax/latest

# The MATHJAX_EXTENSIONS tag can be used to specify one or more MathJax
# extension names that should be enabled during MathJax rendering. For example
# MATHJAX_EXTENSIONS = TeX/AMSmath TeX/AMSsymbols
# This tag requires that the tag USE_MATHJAX is set to YES.

MATHJAX_EXTENSIONS     =

# The MATHJAX_CODEFILE tag can be used to specify a file with javascript pieces
# of code that will be used on startup of the MathJax code. See the MathJax site
# (see: http://docs.mathjax.org/en/latest/output.html) for more details. For an
# example see the documentation.
# This tag requires that the tag USE_MATHJAX is set to YES.

MATHJAX_CODEFILE       =

# When the SEARCHENGINE tag is enabled doxygen will generate a search box for
# the HTML output. The underlying search engine uses javascript and DHTML and
# should work on any modern browser. Note that when using HTML help
# (GENERATE_HTMLHELP), Qt help (GENERATE_QHP), or docsets (GENERATE_DOCSET)
# there is already a search function so this one should typically be disabled.
# For large projects the javascript based search engine can be slow, then
# enabling SERVER_BASED_SEARCH may provide a better solution. It is possible to
# search using the keyboard; to jump to the search box use <access key> + S
# (what the <access key> is depends on the OS and browser, but it is typically
# <CTRL>, <ALT>/